Notes to self

Alternative BigInt ID identifiers for Rails

Rails has used BigInt IDs for primary keys by default since version 5.1 (replacing regular Int) and has offered UUIDv4 as a supported alternative since version 6. But what are the alternatives, and what should you use in your next app?

BigInt

Going with the flow and defaulting to BigInt is the safest choice. But it’s not without drawbacks: you’ll leak how many records you have in the database, and you might not be happy with the variable length of the ID.

The good:

  • It’s the default, no extra work
  • Everything works as expected
  • You won’t run out of BigInt anytime soon
  • Short identifiers
  • Small indexes

The bad:

  • Leaking record count
  • Variable length
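
For a rough sense of why running out of BigInt is not a practical concern, here is a back-of-the-envelope calculation. The rate of one million inserts per second is an arbitrary, deliberately extreme assumption, the helper names are made up for this example, and the limits assume signed 32-bit and 64-bit integer columns.

```ruby
# Largest values of signed 32-bit (Int) and 64-bit (BigInt) columns
INT_MAX    = 2**31 - 1
BIGINT_MAX = 2**63 - 1

# Assumed insert rate, chosen only for illustration
INSERTS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR   = 60 * 60 * 24 * 365

# Whole minutes until a sequence reaches max_id at the assumed rate
def minutes_until_exhausted(max_id)
  max_id / INSERTS_PER_SECOND / 60
end

# Whole years until a sequence reaches max_id at the assumed rate
def years_until_exhausted(max_id)
  max_id / INSERTS_PER_SECOND / SECONDS_PER_YEAR
end

puts "Int lasts about #{minutes_until_exhausted(INT_MAX)} minutes"
puts "BigInt lasts about #{years_until_exhausted(BIGINT_MAX)} years"
```

At that rate a regular Int is used up in about half an hour, while BigInt lasts for hundreds of thousands of years.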

UUIDv4

UUIDv4 is the only alternative that works out of the box with Rails, but it’s not completely without drawbacks either. A UUID is a random string in a predefined format, which makes it a better URL identifier because it doesn’t leak your record count. But it’s not ordered, so things like Record.first and Record.last no longer return the oldest and newest records. You can fix this with implicit_order_column and sort on another column like created_at, but if you create records in bulk and they share a timestamp, you won’t know which one is the last one.
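
To see the ordering problem without a database, here is a small plain-Ruby sketch. It only simulates records as hashes; build_records and the position key are made up for this illustration and stand in for real rows and their creation order.

```ruby
require "securerandom"

# Simulate records created one after another, each with a random UUIDv4 key.
# :position is a hypothetical stand-in for the real creation order.
def build_records(count)
  Array.new(count) { |position| { id: SecureRandom.uuid, position: position } }
end

records = build_records(5)

# Sorting by the primary key mimics the default ordering behind Record.first
# and Record.last when the key is a random UUID.
order_by_id = records.sort_by { |record| record[:id] }
                     .map { |record| record[:position] }

puts order_by_id.inspect # almost always differs from [0, 1, 2, 3, 4]
```

Every ID has the same 36-character length, but the sort order tells you nothing about which record came first.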

The good:

  • Officially supported by Rails
  • Not exposing record count
  • Independent generation
  • Same length

The bad:

  • Bit longer keys
  • Bigger database indexes
  • Subtle issues because of not-ordered nature
  • ActiveStorage still using BigInt
  • Possibility of collision (but a small one)

To make Rails use UUIDs for new models, change the generator settings:

# config/initializers/generators.rb
Rails.application.config.generators do |g|
  g.orm :active_record, primary_key_type: :uuid
end

Make sure to read an excellent post on UUID by Pawel Urbanek.

Composite primary keys

If we want to keep BigInt, one obvious way to fix the leaking part of a BigInt primary key, and even to increase the total number of records you can store, is to make it a composite key based on the tenant or some other distinct thing in the database.

The good:

  • You can hide the total record count
  • Small indexes
  • Even more performant than BigInt at various things at scale

The bad:

  • ~10x slower inserts
  • Requires some other considerations (outside access, 3rd-party tools)

You can read a real-world example of Shopify using composite keys leading with a Shop ID that mentions all considerations and trade-offs.

You can also notice from the post that they simply kept id as a primary key in Rails (while having the composite one in the database):

class Order < ApplicationRecord
  self.primary_key = :id
end

This ID is still ordered, so your usual ActiveRecord querying should work.

UUIDv7

UUIDv7 is a new, suitable alternative to UUIDv4 that addresses its primary drawback. UUIDs are now generated based on time and are thus ordered. While this leaks the date of creation, it’s usually not important. Since it lacks support in databases, you would need to generate your IDs yourself.

The good:

  • Not exposing record count
  • Independent generation
  • Same length
  • Ordered as BigInt

The bad:

  • Bit longer keys
  • Bigger database indexes
  • Leaking creation time
  • A bit higher collision possibility
  • Not yet supported in many databases and tools
  • Extra gem

The way to implement it is to use the uuid column type but generate the IDs yourself. Using the uuid7 gem, this can look like the following:

Record.create!(id: UUID7.generate)

The gem repository also contains benchmarks on speed.
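
If you are curious what “generated based on time” means in practice, here is a minimal hand-rolled sketch using only the Ruby standard library. It is not the uuid7 gem’s implementation; uuid_v7_sketch and uuid_v7_time are hypothetical helpers that follow the UUIDv7 bit layout (a 48-bit millisecond timestamp, then version, variant, and random bits).

```ruby
require "securerandom"

# Hypothetical helper: builds a UUIDv7-style string for the given time.
def uuid_v7_sketch(time = Time.now)
  unix_ms = time.to_i * 1000 + time.nsec / 1_000_000 # 48-bit timestamp
  rand_a  = SecureRandom.random_number(1 << 12)       # 12 random bits
  rand_b  = SecureRandom.random_number(1 << 62)       # 62 random bits

  # timestamp | version 7 | rand_a | variant 0b10 | rand_b
  value = (unix_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
  hex = value.to_s(16).rjust(32, "0")
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join("-")
end

# Hypothetical helper: reads the creation time back out of the ID.
def uuid_v7_time(uuid)
  unix_ms = uuid.delete("-")[0, 12].to_i(16)
  Time.at(unix_ms / 1000, unix_ms % 1000, :millisecond).utc
end

older = uuid_v7_sketch(Time.now - 60)
newer = uuid_v7_sketch
puts older < newer       # IDs sort by creation time
puts uuid_v7_time(newer) # and anyone can read that time back
```

Because the timestamp sits in the leading bits, plain string comparison orders IDs by creation time, which is also exactly why the creation time leaks. IDs generated within the same millisecond are ordered only by their random bits in this sketch.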

NanoID

NanoID, with values like izkpm55j334u or z2n60bhrj7e8, is not too different from UUID but addresses the UUID user experience. The argument is that UUID is too long and cumbersome. NanoID is shorter, easy to select with a double click, easy to generate, and comes with a small risk of collision. However, since it’s pretty much an ordinary string, it’s mostly used as a secondary index on a column like public_id.

The good:

  • All the benefits of keeping BigInt where it matters
  • Avoiding the biggest drawback of BigInt
  • Option to influence the length of the ID

The bad:

  • Extra database column
  • Extra gem

PlanetScale built an ActiveModel concern you can copy & paste into your application. Have a look at their post on NanoID.
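
To get a feel for what such an ID is, here is a minimal sketch with the Ruby standard library only. It is neither the Nano ID gem nor PlanetScale’s concern; generate_public_id is a hypothetical helper, and the lowercase alphanumeric alphabet and the length of 12 are assumptions taken from the example values above.

```ruby
require "securerandom"

# Assumed alphabet: digits and lowercase letters, so a double click selects it all
PUBLIC_ID_ALPHABET = [*"0".."9", *"a".."z"].freeze

# Hypothetical helper: picks each character uniformly at random
def generate_public_id(size = 12)
  Array.new(size) do
    PUBLIC_ID_ALPHABET[SecureRandom.random_number(PUBLIC_ID_ALPHABET.size)]
  end.join
end

puts generate_public_id     # 12 characters
puts generate_public_id(21) # a longer ID lowers the collision risk
```

In an application you would still put a unique index on public_id and retry on the rare collision.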

Hashid

Hashids are hashes generated from the BigInt IDs that can be used in URLs. The word hash is a bit of a misnomer, since the idea is to convert the value back into a BigInt. This way you don’t need an extra column or database index for this second identifier. The best options for Rails include Hashids and Prefixedids, which is based on Hashids but includes a record type as a prefix (something like Stripe identifiers). There is a small performance hit for doing the conversion every single time. However, it’s likely negligible in most cases.

The good:

  • All the benefits of keeping BigInt where it matters
  • Avoiding the biggest drawback of BigInt
  • Option to influence the length of the ID

The bad:

  • Negligible performance hit
  • Extra gem

The Rails implementation relies on overriding to_param on a model which tells Rails what to use in URLs, but there is an option to skip the override if you want to be super explicit. The configuration is quite simple too:

class Account < ApplicationRecord
  has_prefix_id :acct,
    minimum_length: 32,
    override_find: false,
    override_param: false,
    salt: "",
    fallback: false
end
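
The reversible part is easier to grasp with a toy example. The sketch below is not the real Hashids algorithm and not what these gems do internally; ReversibleId is a hypothetical class that simply writes the integer in base 36 using an alphabet shuffled by a salt, which is enough to show why no extra column is needed.

```ruby
# Toy stand-in for the Hashids idea: NOT the real algorithm.
class ReversibleId
  ALPHABET = [*"a".."z", *"0".."9"].freeze

  def initialize(salt)
    # Turn the salt into a number and use it to shuffle the alphabet
    seed = salt.bytes.reduce(0) { |acc, byte| acc * 31 + byte }
    @alphabet = ALPHABET.shuffle(random: Random.new(seed))
    @index = @alphabet.each_with_index.to_h
  end

  # Integer ID -> short string for URLs
  def encode(id)
    raise ArgumentError, "id must be a non-negative Integer" unless id.is_a?(Integer) && id >= 0

    digits = []
    loop do
      id, remainder = id.divmod(@alphabet.size)
      digits.unshift(@alphabet[remainder])
      break if id.zero?
    end
    digits.join
  end

  # String from the URL -> original integer ID (raises KeyError on bad input)
  def decode(token)
    token.each_char.reduce(0) { |acc, char| acc * @alphabet.size + @index.fetch(char) }
  end
end

codec = ReversibleId.new("my secret salt")
token = codec.encode(12_345)
puts token
puts codec.decode(token)
```

Unlike this toy, where neighbouring IDs differ only in their last character, the real libraries scramble the output much more thoroughly. Either way it is obfuscation rather than encryption.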

Slugs

The last option is to maintain your own slug value in a separate database column as your second index, perhaps only for models that warrant it. Similar to NanoID, you would use an extra column and index, but depending on the model you can use whatever value fits best. This is almost required for models that should be searchable on the Internet, such as blog posts or knowledge base articles, but it can be valuable for anything that a user might copy or work with.

The good:

  • All the benefits of keeping BigInt where it matters
  • Option to generate SEO-friendly identifiers

The bad:

  • An extra column and index

You can read my post on writing a general slug concern for your Rails models.
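
As a rough idea of what generating a slug involves, here is a minimal plain-Ruby sketch. It is not the concern from that post; slugify and unique_slug are hypothetical helpers, and the list of taken slugs stands in for a uniqueness check against the database.

```ruby
# Hypothetical helper: turns arbitrary text into a URL-friendly slug
def slugify(text)
  text.downcase
      .gsub(/[^a-z0-9]+/, "-") # collapse anything non-alphanumeric into a dash
      .gsub(/\A-+|-+\z/, "")   # trim leading and trailing dashes
end

# Hypothetical helper: appends a numeric suffix until the slug is unused
def unique_slug(text, taken)
  base = slugify(text)
  candidate = base
  suffix = 2
  while taken.include?(candidate)
    candidate = "#{base}-#{suffix}"
    suffix += 1
  end
  candidate
end

puts slugify("Alternative BigInt ID identifiers for Rails")
puts unique_slug("Hello, World!", ["hello-world"])
```

This version is ASCII-only and drops accented letters. In a Rails app, String#parameterize from Active Support is likely a better starting point.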

Conclusion

There are many roads to Rome and there are no wrong answers. One of the easiest options regarding general compatibility, ease of use in apps new and old, and avoiding most problems is likely to use a hashid or a slug. New applications could perhaps go all-in on UUIDv7, but small issues here and there are to be expected. Check out my Twitter where you can see the results of my poll on using UUIDv7. The split is pretty close!

by Josef Strzibny