Notes to self

Ecto embeds have IDs by default

When you use Ecto’s embeds_one to embed another schema backed by JSONB, Ecto assigns the embedded record an ID and complains if you try to replace that record without it.

Imagine the following EmbeddedData embedded within Record:

defmodule Record do
  use Ecto.Schema

  schema "records" do
    embeds_one :data, EmbeddedData
  end
end

defmodule EmbeddedData do
  use Ecto.Schema

  embedded_schema do
    field :field, :string
    field :another_field, :string
  end
end

Although it is not specified in the embedded_schema macro call, Ecto adds an id field to the embedded data and populates it automatically. This is the default behaviour.

As the Ecto docs put it:

The embedded may or may not have a primary key. Ecto uses the primary keys to detect if an embed is being updated or not. If a primary is not present, :on_replace should be set to either :update or :delete if there is a desire to either update or delete the current embed when a new one is set.

And later:

Primary keys are automatically set up for embedded schemas as well, defaulting to {:id, :binary_id, autogenerate: true}

This tells us to expect the ID in a field named id, stored as a string (a binary in Elixir). With the default :binary_id type, the autogenerated value is typically a UUID.

If we want to create our Record together with its embedded data, we would most likely use cast_embed in the parent changeset as follows:
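A quick way to confirm the implicit primary key is to ask the schema itself. The following is a minimal sketch that assumes a standalone script using Mix.install (the version constraint is my assumption, not something from the docs quoted above) and the EmbeddedData module from earlier:

```elixir
# Standalone script; the Ecto version constraint is an assumption.
Mix.install([{:ecto, "~> 3.10"}])

defmodule EmbeddedData do
  use Ecto.Schema

  embedded_schema do
    field :field, :string
    field :another_field, :string
  end
end

# The primary key was never declared, yet the schema reports one.
IO.inspect(EmbeddedData.__schema__(:primary_key))
# [:id]

# Its type is the default for embedded schemas.
IO.inspect(EmbeddedData.__schema__(:type, :id))
# :binary_id

# The struct carries the field as well (nil until Ecto generates a value).
IO.inspect(Map.has_key?(%EmbeddedData{}, :id))
# true
```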

defmodule Record do
  ...
  def changeset(%Record{} = record, attrs \\ %{}) do
    record
    |> cast(attrs, [])
    |> cast_embed(:data)
  end
  ...
end

Once saved using such a changeset, the JSONB would look something like:

{
  "id": "generated-id-....",
  "field": "value",
  "another_field": "another value"
}

The embedded data field of the Record ends up including an id. But why is that important?

If we work further with our Record struct and try to update it by providing new data for the embed without its ID, Ecto won’t know how to continue and raises an error:

...
By default it is not possible to replace or delete embeds and
associations during `cast`. Therefore Ecto requires all existing
data to be given on update. Failing to do so results in this
error message.

If you want to replace data or automatically delete any data
not sent to `cast`, please set the appropriate `:on_replace`
option when defining the relation. The docs for `Ecto.Changeset`
covers the supported options in the "Associations, embeds and on
replace" section.

However, if you don't want to allow data to be replaced or
deleted, only updated, make sure that:

 * If you are attempting to update an existing entry, you
   are including the entry primary key (ID) in the data.

 * If you have a relationship with many children, at least
   the same N children must be given on update.

This happens because, without the ID, Ecto treats the incoming data as a new embed replacing the existing one, and replacing is not allowed by default.

And as José Valim puts it in one of the relevant GitHub issues:
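The behaviour can be reproduced without a database by building the structs by hand. This is a sketch under a few assumptions: the script is standalone and uses Mix.install, the record is a hand-made stand-in for one loaded from the database with a made-up ID, and EmbeddedData defines a changeset/2 function (cast_embed calls it by default, though the schema above does not show one).

```elixir
# Standalone script; the Ecto version constraint is an assumption.
Mix.install([{:ecto, "~> 3.10"}])

defmodule EmbeddedData do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :field, :string
    field :another_field, :string
  end

  # cast_embed/2 calls changeset/2 on the embedded module by default.
  def changeset(data, attrs), do: cast(data, attrs, [:field, :another_field])
end

defmodule Record do
  use Ecto.Schema
  import Ecto.Changeset

  schema "records" do
    embeds_one :data, EmbeddedData
  end

  def changeset(%Record{} = record, attrs \\ %{}) do
    record
    |> cast(attrs, [])
    |> cast_embed(:data)
  end
end

# Stand-in for a record loaded from the database (the ID is made up here).
id = Ecto.UUID.generate()
record = %Record{data: %EmbeddedData{id: id, field: "old"}}

# Without the embed's ID, the params count as a replacement and Ecto raises.
result =
  try do
    Record.changeset(record, %{"data" => %{"field" => "new"}})
  rescue
    e in RuntimeError -> {:raised, e.message}
  end

# With the ID, the same params update the existing embed.
updated =
  record
  |> Record.changeset(%{"data" => %{"id" => id, "field" => "new"}})
  |> Ecto.Changeset.apply_changes()

IO.inspect(result, label: "without id")
IO.inspect(updated.data, label: "with id")
```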

The reason why we require the ID is to know when you are replacing or keeping the same embed. In any case, it is not something we can change, as it would be a breaking change. But the ID should be generated by default.

So if we are using the primary key for our embed (the default behaviour) and we just want to update this JSONB, we need to preserve its ID.

To demonstrate, the following changeset will preserve this ID automatically if the original record is passed to it:

  ...
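  def changeset(%Record{} = record, attrs \\ %{}) do
    # ID of the existing embed, or nil if there is none yet
    id = if record.data, do: record.data.id

    # Assumes string-keyed attrs (e.g. controller params). cast/3 raises
    # on maps that mix atom and string keys, so the merged keys are strings too.
    data = attrs["data"] || %{}
    new_data = Map.merge(%{"id" => id}, data)
    attrs = Map.put(attrs, "data", new_data)

    record
    |> cast(attrs, [])
    |> cast_embed(:data)
  end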
  ...
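The ID-merging step can also be pulled out into a small pure function, which makes it easy to test on its own. The sketch below is plain Elixir; EmbedParams and put_embed_id are hypothetical names, not part of Ecto, and the params are assumed to be string-keyed, because Ecto's cast rejects maps that mix atom and string keys.

```elixir
defmodule EmbedParams do
  # Hypothetical helper, not part of Ecto: copies the ID of the existing
  # embed into string-keyed params unless the caller already sent one.
  def put_embed_id(attrs, key, %{id: id}) when is_binary(id) do
    case Map.get(attrs, key, %{}) do
      %{} = data -> Map.put(attrs, key, Map.put_new(data, "id", id))
      # Explicit nil or a non-map value: leave the params untouched.
      _other -> attrs
    end
  end

  # No existing embed, or one without an ID: nothing to preserve.
  def put_embed_id(attrs, _key, _embed), do: attrs
end

attrs = %{"data" => %{"field" => "new value"}}
existing = %{id: "generated-id", field: "value"}

with_id = EmbedParams.put_embed_id(attrs, "data", existing)
without_embed = EmbedParams.put_embed_id(attrs, "data", nil)

IO.inspect(with_id)
IO.inspect(without_embed)
```

Inside the changeset this would be called as EmbedParams.put_embed_id(attrs, "data", record.data) before cast.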

Having an ID can be handy in some cases, for example if you treat embeds as resources in a REST API. The JSON:API spec even requires a resource to have an ID.

On the other hand, you might not need it. In that case, turn off the primary key with:

defmodule EmbeddedData do
  use Ecto.Schema

  @primary_key false
  embedded_schema do
    field :field, :string
    field :another_field, :string
  end
end

Then tell Ecto to delete the original embed on update by setting the on_replace option to :delete:

defmodule Record do
  use Ecto.Schema

  schema "records" do
    embeds_one :data, EmbeddedData, on_replace: :delete
  end
end
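Putting both changes together, an update no longer needs an ID. This is a sketch under the same assumptions as before: a standalone script using Mix.install, a hand-built record standing in for one loaded from the database, and a changeset/2 on EmbeddedData that the post does not show. Note that the old embed is replaced rather than merged, so fields missing from the new params start from their defaults.

```elixir
# Standalone script; the Ecto version constraint is an assumption.
Mix.install([{:ecto, "~> 3.10"}])

defmodule EmbeddedData do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    field :field, :string
    field :another_field, :string
  end

  # cast_embed/2 calls changeset/2 on the embedded module by default.
  def changeset(data, attrs), do: cast(data, attrs, [:field, :another_field])
end

defmodule Record do
  use Ecto.Schema
  import Ecto.Changeset

  schema "records" do
    embeds_one :data, EmbeddedData, on_replace: :delete
  end

  def changeset(%Record{} = record, attrs \\ %{}) do
    record
    |> cast(attrs, [])
    |> cast_embed(:data)
  end
end

# Stand-in for a record loaded from the database.
record = %Record{data: %EmbeddedData{field: "old", another_field: "old too"}}

# No ID anywhere: the old embed is dropped and a new one is built.
updated =
  record
  |> Record.changeset(%{"data" => %{"field" => "new"}})
  |> Ecto.Changeset.apply_changes()

IO.inspect(EmbeddedData.__schema__(:primary_key), label: "primary key")
IO.inspect(updated.data.field, label: "field")
IO.inspect(updated.data.another_field, label: "another_field")
```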
