babbledabbler | 2 years ago

This was generally the reason I originally went with a numeric ID as the PK. It makes working with and analyzing the data easier, as well as cross-referencing relations.

For all my tables I have a base schema that looks something like this:

id: integer sequence PK
uuid: uuidV4
created_at: datetime
updated_at: datetime

My concern is what happens when I have to distribute the system in order to scale. The numeric IDs will then have to be replaced with the UUIDs, so I figure I might as well do it now.
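For illustration, the base schema above could be sketched roughly like this. It is only a sketch under assumptions: SQLite (via Python's built-in sqlite3) stands in for whatever database is actually used, the UUID is stored as text, and the helper names create_base_table and insert_row are hypothetical.

```python
import sqlite3
import uuid


def create_base_table(conn, table):
    """Create a table with the base columns: id, uuid, created_at, updated_at.

    The table name is interpolated into the SQL, so this is only safe for
    trusted, hard-coded names (an assumption of this sketch).
    """
    conn.execute(f"""
        CREATE TABLE {table} (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- local numeric PK
            uuid       TEXT NOT NULL UNIQUE,               -- globally unique ID
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
    """)


def insert_row(conn, table):
    """Insert a row; the app generates the UUIDv4, the DB assigns the integer id."""
    new_uuid = str(uuid.uuid4())
    cur = conn.execute(f"INSERT INTO {table} (uuid) VALUES (?)", (new_uuid,))
    return cur.lastrowid, new_uuid


conn = sqlite3.connect(":memory:")
create_base_table(conn, "car")
first = insert_row(conn, "car")
second = insert_row(conn, "car")
```

Every row ends up with both identifiers, so joins and ad hoc analysis can use the integer while anything leaving the database can use the UUID.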

gregwebs|2 years ago

Everything breaks at scale. In my experience most tables don't end up with more than a few million rows and will work fine with this. If you did want to transition a large table to be UUID-only, the nice thing about this approach is that you could do it with no downtime. If you are using a DB that only scales writes vertically, though (most DBs, including distributed DBs), then how are you actually going to scale the DB layer horizontally? CockroachDB (PostgreSQL-compatible) and TiDB (MySQL-compatible) are pretty much the only options there; look at their docs for how to set up your IDs.

babbledabbler|2 years ago

I'm not so much concerned with scaling in terms of volume: I expect to be able to handle millions of rows in a single DB, and beyond that it would be an implementation detail and fine-tuning. I'm more concerned about scaling in terms of complexity, and about keeping the system easy to reason about when more people and technologies are involved.

Let's say I have a <CAR>-[1:N]-<TRIP> relation in two tables in a relational DB. This works fine at first, even for millions of rows, as you say.

At some point in the future it makes sense to have these two entities managed by different teams, services, and DBs. Let's say TRIP becomes a whole feature-laden thing with fares, hotels, itineraries, and dates. So I need to take this local relation and move it to different services and a different DB.

If I had been using an integer PK/FK, this would be a more complicated migration than if I had used UUIDs.
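To make the extra migration work concrete, here is a rough sketch of the step an integer FK would force before TRIP could be moved out: adding a UUID reference to the trip table and backfilling it from the local join. The column name car_uuid and the use of SQLite are assumptions for illustration; a real migration on a large table would likely backfill in batches.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (
        id   INTEGER PRIMARY KEY,
        uuid TEXT NOT NULL UNIQUE
    );
    CREATE TABLE trip (
        id     INTEGER PRIMARY KEY,
        uuid   TEXT NOT NULL UNIQUE,
        car_id INTEGER NOT NULL REFERENCES car(id)  -- local integer FK
    );
""")

# Seed one car with three trips that reference it by integer id.
car_uuid = str(uuid.uuid4())
conn.execute("INSERT INTO car (id, uuid) VALUES (1, ?)", (car_uuid,))
conn.executemany(
    "INSERT INTO trip (uuid, car_id) VALUES (?, 1)",
    [(str(uuid.uuid4()),) for _ in range(3)],
)

# Step 1: add a nullable UUID reference next to the integer FK.
conn.execute("ALTER TABLE trip ADD COLUMN car_uuid TEXT")

# Step 2: backfill it from the local join while both tables still share a DB.
conn.execute("""
    UPDATE trip
    SET car_uuid = (SELECT uuid FROM car WHERE car.id = trip.car_id)
    WHERE car_uuid IS NULL
""")

# Step 3: verify nothing was missed before dropping car_id or moving the table.
missing = conn.execute(
    "SELECT COUNT(*) FROM trip WHERE car_uuid IS NULL"
).fetchone()[0]
rows = conn.execute("SELECT car_uuid FROM trip").fetchall()
```

Had trip referenced car by UUID from the start, steps 1 and 2 would not be needed; the reference would already be valid outside the original database.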

My assumption is that we would not want to have a sequenced integer key used in a distributed system.

In other words, if there's a possibility of needing to move to a distributed system, it seems a safer bet to use a UUID for the key from the beginning.
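A small sketch of why a sequenced integer key is awkward once there is more than one database: two independent instances each start their sequence at 1, while UUIDs generated independently do not clash in practice. The two in-memory SQLite databases and the names make_trip_service and add_trip are hypothetical stand-ins for separate services.

```python
import sqlite3
import uuid


def make_trip_service():
    """Create an independent database with its own integer sequence."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE trip (
            id   INTEGER PRIMARY KEY AUTOINCREMENT,
            uuid TEXT NOT NULL UNIQUE
        )
    """)
    return conn


def add_trip(conn):
    """Insert a trip and return (sequence id, uuid)."""
    trip_uuid = str(uuid.uuid4())
    cur = conn.execute("INSERT INTO trip (uuid) VALUES (?)", (trip_uuid,))
    return cur.lastrowid, trip_uuid


service_a = make_trip_service()
service_b = make_trip_service()

id_a, uuid_a = add_trip(service_a)
id_b, uuid_b = add_trip(service_b)

# Both sequences hand out the same first value, so the integer alone
# cannot identify a trip across services.
ids_collide = id_a == id_b

# Independently generated UUIDv4 values are, for practical purposes, unique.
uuids_collide = uuid_a == uuid_b
```

Coordinated sequences or ID ranges per node can avoid the clash, but that is extra machinery that the UUID column makes unnecessary.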