top | item 45324292

(no title)

Why would exposing any primary key be bad for security? If your system's security *in any way* depends on the randomness of a database private key, you have other problems. It's not the job of a primary key to add to security. Not to mention that UUIDv7 has 6 random bytes, which, for the vast majority of web applications, even finance, is more than enough randomness. Just imagine how many requests an attacker would need to make to guess even one UUID (281 trillion possible combinations for 6 random bytes, and he also would need to guess the unix timestamp in ms correctly). The only scenario I can think of is that you use the primary as a sort of API key.

discuss

btown|5 months ago

One of the big things here is de-anonymization and account correlation. Say you have an application where users'/products' affiliation with certain B2B accounts is considered sensitive; perhaps they need to interact with each other anonymously for bid fairness, perhaps people might be scraping for "how many users does account X have onboarded" as metadata for a financial edge.

If users/products are onboarded in bulk/during B2B account signup, then, leaking the creation times of each of them with any search that returns their UUIDs, becomes metadata that can be used to correlate users with each other, if imperfectly.

Often, the benefits of a UUID with natural ordering outweigh this. But it's something to weigh before deciding to switch to UUIDv7.

8organicbits|5 months ago

> system's security in any way depends on the randomness of a database private key

Unlisted URLs, like YouTube videos are a popular example used by a reputable tech company.

> UUIDv7 has 6 random bytes

Careful. The spec allows 74 bits to be filled randomly. However you are allowed to exchange up to 12 bits for a more accurate timestamp and a counter of up to 42 bits. If you can get a fix on the timestamp and counter, the random portion only provides 20 bits (1M possiblities).

Python 3.14rc introduces a UUIDv7 implementation that has only 32 random bits, for example.

Basically, you need to see what your implementation does.

bearjaws|5 months ago

only 32bits, so 4 billion guesses per microsecond... Even if youtube has 1 million videos per microsecond you would never guess them before rate limits.

rhplus|5 months ago

The German Tank Problem springs to mind. While not precisely the same problem, it’s still a case where more information that necessary is leaked in seemingly benign IDs. For the Germans, they leaked production volumes. For UUID v7, you’re leaking timing and timestamps.

https://en.wikipedia.org/wiki/German_tank_problem

andy_ppp|5 months ago

The rest of the ID will be random enough that guessing it will take an extremely long time, unless all the tanks were inserted in the same microsecond of course. I’m not sure this is a security issue with UUID though!

Hizonner|5 months ago

Because anything that knows the primary key now knows the timestamp. The UUID itself leaks information. It's not that it's not adding security. It's that it's actually subtracting security.

andy_ppp|5 months ago

There’s every chance the API has timestamps on when it was inserted. Honestly I’d rather my data was ordered correctly than imagining the extremely rare situations that leaking the insert time is going to bring the world falling down. You usually want that information.

And I’m honestly not a fan of public services using primary keys publicly for anything important. I’d much rather nice or shorter URLs.

What might be an improvement is if you can look up records efficiently from the random bits of the UUID automatically, replacing the timestamp with an index.

dotancohen|5 months ago

The timestamp can be recovered from the UUID?

lucideer|5 months ago

> leaks information

It would have to leak sensitive information to be "subtracting security", which implies you're relying on timestamp secrecy to ensure your security. This would be one of the "other problems" the gp mentioned.

sz4kerto|5 months ago

Example: if user IDs are not random but eg Bigserial (autoincremented) and they're exposed through some API, then API clients can infer the creation time of said users in the system. Now if my system is storing eg health data for a large population, then it'll be easy to guess the age of the user. Etc. This is not a security problem, this is an information governance problem. But it's a problem. Now if you say that I should not expose these IDs - fine, but then whatever I expose is essentially an ID anyway.

andy_ppp|5 months ago

I really don’t think using primary keys publicly is ever good, just because UUID4 has allowed people to smash junk into the URL doesn’t mean it’s good for the web or the users over a slug or a cleaner ID.

unknown|5 months ago

[deleted]

echelon|5 months ago

Depends how much entropy is in your primary keys.

If your primary keys are monotonic or time based, bad actors can simply walk your API.