(no title)
benwilber0 | 10 months ago
Use bigint, never UUID. UUIDs are massive (2x a bigint) and now your DBMS has to copy that enormous value to every side of a relation.
It will bloat your table and indexes 2x for no good reason whatsoever.
Never use UUIDs as your primary keys.
bsder|10 months ago
This seems like terrible advice.
For the vast, vast, vast majority of people, if you don't have an obvious primary key, choosing UUIDv7 is going to be an absolute no-brainer choice that causes the least amount of grief.
Which of these is an amateur most likely to hit: crash caused by having too small a primary key and hitting the limit, slowdowns caused by having a primary key that is effectively unsortable (totally random), contention slowdowns caused by having a primary key that needs a lock (incrementing key), or slowdowns caused by having a key that is 16 bytes instead of 8?
Of all those issues, the slowdown from a 16 byte key is by far the least likely to be an issue. If you reach the point where that is an issue in your business, you've moved off of being a startup and you need to cough up real money and do real engineering on your database schemas.
sgarland|10 months ago
You can monitor and predict the growth rate of a table; if you don’t know you’re going to hit the limit of an INT well in advance, you have no one to blame but yourself.
Re: auto-incrementing locks, I have never once observed that to be a source of contention. Most DBs are around 98/2% read/write. If you happen to have an extremely INSERT-heavy workload, then by all means, consider alternatives, like interleaved batches or whatever. It does not matter for most places.
I agree that UUIDv7 is miles better than v4, but you’re still storing far more data than is probably necessary. And re: 16 bytes, MySQL annoyingly doesn’t natively have a UUID type, and most people don’t seem to know about casting it to binary and storing it as BINARY(16), so instead you get a 36-byte PK. The worst.
benwilber0|10 months ago
This kind of problem only exists in unsophisticated databases like SQLite. Postgres reserves whole ranges of IDs at once so there is never any contention for the next ID in a serial sequence.
gruez|10 months ago
"enormous value" = 128 bits (compared to 64 bits)
In the worst case this causes your m2m table to double, but I doubt this has a significant impact on the overall size of the DB.
TylerE|10 months ago
sgarland|10 months ago
tpm|10 months ago
sgarland|10 months ago
sgt|10 months ago
outside1234|10 months ago
benwilber0|10 months ago
hellojesus|10 months ago
Or you could preload a table of autoincremented bigints and then atomically grab the next value from there where you need a surrogate key like in a distributed system with no natural pk.
rowanseymour|10 months ago
LunaSea|10 months ago
varispeed|10 months ago
rowanseymour|10 months ago
hu3|10 months ago
Twice now I was called to fix UUIDs making systems crawl to stop.
People underestimate how important efficient indexes are on relational databases because replacing autoincrement INTs with UUIDs works well enough for small databases, until it doesn't.
My gripe against UUIDs is not even performance. It's debugging.
Much easier to memorize and type user_id = 234111 than user_id = '019686ea-a139-76a5-9074-28de2c8d486d'
zerr|10 months ago
hu3|10 months ago
Even Microsoft default sample database schema uses INT ids.
sgarland|10 months ago
mvdtnz|10 months ago
So your advice that you DEFINITELY won't need a BIGINT, well, that decision can come back to bite you if you're successful enough.
(You're probably thinking there's no way we have over 2 billion users and that's true, but it's also a bad assumption that one user row perfectly corresponds to one registered user. Assumptions like that can and do change.)
pyuser583|10 months ago
I’ll be 110 years old telling my great-grandchildren about how we used integers for primary keys, until reason arrived and we started using uuids.
And they’ll be like, “you weren’t one of those anti-vaxxers were you?”