> UUIDv7 features a time-ordered value field derived from the widely implemented and well-known Unix Epoch timestamp source, the number of milliseconds
This just seems to be a way of creating a huge class of subtle bugs. Now, when two things happen to be created in the same millisecond, they may or may not be monotonically increasing.
Plenty of systems will end up accidentally depending on the ordering of the UUID's being the same order the UUID's were generated in. And that will hold true till the system hits production and suddenly there is enough load for that not to be true for a handful of records and the whole system fails.
TL;DR: Several new UUID versions have been standardized
UUIDv5 is meant for generating UUIDs from "names" that are drawn from, and unique within, some "namespace" as per Section 6.5.
UUIDv6 is a field-compatible version of UUIDv1 (Section 5.1), reordered for improved DB locality. It is expected that UUIDv6 will primarily be implemented in contexts where UUIDv1 is used.
UUIDv7 features a time-ordered value field derived from the widely implemented and well-known Unix Epoch timestamp source, the number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded. Generally, UUIDv7 has improved entropy characteristics over UUIDv1 (Section 5.1) or UUIDv6 (Section 5.6).
UUIDv8 provides a format for experimental or vendor-specific use cases. The only requirement is that the variant and version bits MUST be set as defined in Sections 4.1 and 4.2. UUIDv8's uniqueness will be implementation specific and MUST NOT be assumed.
The only explicitly defined bits are those of the version and variant fields, leaving 122 bits for implementation-specific UUIDs. To be clear, UUIDv8 is not a replacement for UUIDv4 (Section 5.4) where all 122 extra bits are filled with random data.
Background for the changes:
Many things have changed in the time since UUIDs were originally created. Modern applications have a need to create and utilize UUIDs as the primary identifier for a variety of different items in complex computational systems, including but not limited to database keys, file names, machine or system names, and identifiers for event-driven transactions.
> Some UUID implementations, such as those found in Python and Microsoft, will output UUID with the string format, including dashes, enclosed in curly braces.
No … Python doesn't emit them enclosed in curly braces?
> UUIDv7 features a time-ordered value field derived from the widely implemented and well-known Unix Epoch timestamp source, the number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded.
That seems like a rather vague way of addressing leap seconds for UUIDv7. For positive leap seconds, an 'exclusion' of that second would suggest that the millisecond counter is halted until the leap second is over, which doesn't seem ideal for monotonicity. And an 'exclusion' of a negative leap second hardly makes any conventional sense at all, with regard to the millisecond counter.
Contrast with the timestamp of UUIDv1/v6, where positive leap seconds can just be handled by incrementing the clock sequence.
I interpreted it to mean the timer is monotonic and ignores leap seconds completely. It does make it easy to implement wrong if your most convenient time API does implement leap seconds. (I don’t see why this would have anything to do with the millisecond timer? Leap seconds happen on the second.)
Depends on your problem domain. You can be Twitter/Discord sized and get away with 64 bits. When you start dedicating parts of your UUID to a timestamp the possibility of collisions does go way up since now a significant chunk of the UUID will be the same for everyone. But when you deploy this variant you aren't trying to make globally unique ids anymore, you're trying to make application unique ids. You are sill very unlikely to not also have a globally unique id because 128 bits gives a lot of room to play around.
For hash functions, maybe not anymore, given the birthday paradox/pigeon-hole principle and other math problems in bucketing inputs versus the attack patterns for breaking hash functions and causing intentional collisions. For mostly purely random entropy in uses like UUID (and IPv6) the classic answer is that it is still more overall space than "atoms in the visible universe".
They seem better geared for usage in databases as primary keys, specifically UUID versions 6 and onwards:
> Motivation. One area in which UUIDs have gained popularity is database keys. This stems from the increasingly distributed nature of modern applications. In such cases, "auto-increment" schemes that are often used by databases do not work well: the effort required to coordinate sequential numeric identifiers across a network can easily become a burden. The fact that UUIDs can be used to create unique, reasonably short values in distributed systems without requiring coordination makes them a good alternative, but UUID versions 1-5, which were originally defined by [RFC4122], lack certain other desirable characteristics [...]
> UUIDv6 is a field-compatible version of UUIDv1 (Section 5.1), reordered for improved DB locality. It is expected that UUIDv6 will primarily be implemented in contexts where UUIDv1 is used. Systems that do not involve legacy UUIDv1 SHOULD use UUIDv7 (Section 5.7) instead.
> Instead of splitting the timestamp into the low, mid, and high sections from UUIDv1, UUIDv6 changes this sequence so timestamp bytes are stored from most to least significant. That is, given a 60-bit timestamp value as specified for UUIDv1 in Section 5.1, for UUIDv6 the first 48 most significant bits are stored first, followed by the 4-bit version (same position), followed by the remaining 12 bits of the original 60-bit timestamp. [...]
> UUIDv7 features a time-ordered value field derived from the widely implemented and well-known Unix Epoch timestamp source, the number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded. Generally, UUIDv7 has improved entropy characteristics over UUIDv1 (Section 5.1) or UUIDv6 (Section 5.6).
The problem that this standard solves isn't a math problem. It's an engineering problem of defining (adding) UUID formats that are suitable for use in database keys (and some other things). Previous proposals had disadvantages for the use-case.
[+] [-] londons_explore|1 year ago|reply
This just seems to be a way of creating a huge class of subtle bugs. Now, when two things happen to be created in the same millisecond, they may or may not be monotonically increasing.
Plenty of systems will end up accidentally depending on the ordering of the UUID's being the same order the UUID's were generated in. And that will hold true till the system hits production and suddenly there is enough load for that not to be true for a handful of records and the whole system fails.
[+] [-] vhcr|1 year ago|reply
[+] [-] swyx|1 year ago|reply
[+] [-] htunnicliff|1 year ago|reply
UUIDv5 is meant for generating UUIDs from "names" that are drawn from, and unique within, some "namespace" as per Section 6.5.
UUIDv6 is a field-compatible version of UUIDv1 (Section 5.1), reordered for improved DB locality. It is expected that UUIDv6 will primarily be implemented in contexts where UUIDv1 is used.
UUIDv7 features a time-ordered value field derived from the widely implemented and well-known Unix Epoch timestamp source, the number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded. Generally, UUIDv7 has improved entropy characteristics over UUIDv1 (Section 5.1) or UUIDv6 (Section 5.6).
UUIDv8 provides a format for experimental or vendor-specific use cases. The only requirement is that the variant and version bits MUST be set as defined in Sections 4.1 and 4.2. UUIDv8's uniqueness will be implementation specific and MUST NOT be assumed.
The only explicitly defined bits are those of the version and variant fields, leaving 122 bits for implementation-specific UUIDs. To be clear, UUIDv8 is not a replacement for UUIDv4 (Section 5.4) where all 122 extra bits are filled with random data.
Background for the changes:
Many things have changed in the time since UUIDs were originally created. Modern applications have a need to create and utilize UUIDs as the primary identifier for a variety of different items in complex computational systems, including but not limited to database keys, file names, machine or system names, and identifiers for event-driven transactions.
[+] [-] pspeter3|1 year ago|reply
[+] [-] Two4|1 year ago|reply
[+] [-] shrimp_emoji|1 year ago|reply
[+] [-] azulster|1 year ago|reply
[+] [-] newprint|1 year ago|reply
[+] [-] deathanatos|1 year ago|reply
No … Python doesn't emit them enclosed in curly braces?
[+] [-] LegionMammal978|1 year ago|reply
That seems like a rather vague way of addressing leap seconds for UUIDv7. For positive leap seconds, an 'exclusion' of that second would suggest that the millisecond counter is halted until the leap second is over, which doesn't seem ideal for monotonicity. And an 'exclusion' of a negative leap second hardly makes any conventional sense at all, with regard to the millisecond counter.
Contrast with the timestamp of UUIDv1/v6, where positive leap seconds can just be handled by incrementing the clock sequence.
[+] [-] fanf2|1 year ago|reply
[+] [-] anamexis|1 year ago|reply
[+] [-] wrs|1 year ago|reply
[+] [-] ComplexSystems|1 year ago|reply
[+] [-] Spivak|1 year ago|reply
[+] [-] WorldMaker|1 year ago|reply
[+] [-] AaronFriel|1 year ago|reply
[+] [-] free_bip|1 year ago|reply
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] cachvico|1 year ago|reply
[+] [-] jcrites|1 year ago|reply
> Motivation. One area in which UUIDs have gained popularity is database keys. This stems from the increasingly distributed nature of modern applications. In such cases, "auto-increment" schemes that are often used by databases do not work well: the effort required to coordinate sequential numeric identifiers across a network can easily become a burden. The fact that UUIDs can be used to create unique, reasonably short values in distributed systems without requiring coordination makes them a good alternative, but UUID versions 1-5, which were originally defined by [RFC4122], lack certain other desirable characteristics [...]
> UUIDv6 is a field-compatible version of UUIDv1 (Section 5.1), reordered for improved DB locality. It is expected that UUIDv6 will primarily be implemented in contexts where UUIDv1 is used. Systems that do not involve legacy UUIDv1 SHOULD use UUIDv7 (Section 5.7) instead.
> Instead of splitting the timestamp into the low, mid, and high sections from UUIDv1, UUIDv6 changes this sequence so timestamp bytes are stored from most to least significant. That is, given a 60-bit timestamp value as specified for UUIDv1 in Section 5.1, for UUIDv6 the first 48 most significant bits are stored first, followed by the 4-bit version (same position), followed by the remaining 12 bits of the original 60-bit timestamp. [...]
> UUIDv7 features a time-ordered value field derived from the widely implemented and well-known Unix Epoch timestamp source, the number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds excluded. Generally, UUIDv7 has improved entropy characteristics over UUIDv1 (Section 5.1) or UUIDv6 (Section 5.6).
[+] [-] unknown|1 year ago|reply
[deleted]
[+] [-] posting_mess|1 year ago|reply
[deleted]
[+] [-] jcrites|1 year ago|reply
This is discussed in the "Update Motivation" section of the document: https://www.rfc-editor.org/rfc/rfc9562.html#name-update-moti...
[+] [-] sedatk|1 year ago|reply
maybe because we can’t come up with an unambiguous definition of “decent.”