Yes, but if the machine holding the sequential data receives 100x the traffic of the other machines, that can be worse than splitting the traffic evenly across all available machines.
If your database simply shards keys sequentially, it's going to get hotspots with a lot of key types, including plain old integer keys and timestamps, not just UUIDv7. In that case it would be fair to say that your database is doing it wrong.
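To make the hotspot concrete, here is a rough sketch of naive range sharding. The shard boundaries, key space, and the `range_shard` helper are all made up for illustration and are not taken from any particular database.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical setup: 4 shards over a 0..1_000_000 key space, split evenly.
# BOUNDARIES holds the sorted split points between adjacent shards.
BOUNDARIES = [250_000, 500_000, 750_000]

def range_shard(key: int, boundaries: list[int]) -> int:
    """Return the index of the contiguous key range that owns `key`."""
    return bisect_right(boundaries, key)

# A burst of monotonically increasing keys, e.g. auto-increment IDs
# or timestamps generated within a short window.
recent_keys = range(900_000, 900_100)

# Count how many of the new writes each shard receives.
load = Counter(range_shard(k, BOUNDARIES) for k in recent_keys)
print(load)  # every one of the 100 writes lands on shard 3
```

Whichever shard owns the "newest" range takes all the inserts while the other three sit idle, and that holds for any monotonically increasing key.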
Fortunately, there's no rule that says you should shard your keys using the sequential part up front.
One of the rules for generating randomness from environmental sources is to throw away the high bits and only use the low bits. Distributed databases should do the same if they want a good distribution.
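A minimal sketch of what "use the low bits" could look like, assuming a simplified UUIDv7-like key with a 48-bit millisecond timestamp on top and 80 random bits below. `fake_uuid7` is a hypothetical helper that skips the real version and variant bits, and the shard count is an arbitrary power of two.

```python
import random
from collections import Counter

NUM_SHARDS = 8  # assumed to be a power of two so a bit mask works

def fake_uuid7(ts_ms: int, rng: random.Random) -> int:
    """Hypothetical UUIDv7-like 128-bit key: timestamp high, random low.

    Real UUIDv7 also reserves version and variant bits; omitted here.
    """
    return (ts_ms << 80) | rng.getrandbits(80)

def shard_by_high_bits(key: int) -> int:
    """Pick a shard from the top 3 bits (the sequential part)."""
    return key >> (128 - 3)

def shard_by_low_bits(key: int) -> int:
    """Pick a shard from the bottom 3 bits (the random part)."""
    return key & (NUM_SHARDS - 1)

rng = random.Random(42)  # seeded so the run is repeatable
keys = [fake_uuid7(1_700_000_000_000 + i, rng) for i in range(10_000)]

high_load = Counter(shard_by_high_bits(k) for k in keys)
low_load = Counter(shard_by_low_bits(k) for k in keys)

print("high bits:", dict(high_load))  # all 10,000 keys on shard 0
print("low bits: ", dict(low_load))   # roughly even across the 8 shards
```

The catch is that keys that are adjacent in sort order now live on different shards, so you lose cheap ordered scans.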
What distributed databases shard on the low bits? How do they do something like a range query?
The closest I’ve ever heard of is sharding based on a hash (e.g. CockroachDB can do this on request[1]), but most distributed databases with strong consistency (Spanner descendants in particular) default to “doing it wrong”.
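On the range query question, a toy sketch of the usual trade-off with hash sharding: writes spread out, but a range scan has to ask every shard and merge the results. This is an illustration only and not how CockroachDB or any other database actually implements it; `shard_of`, `range_query`, and the shard count are hypothetical.

```python
import hashlib
import heapq

NUM_SHARDS = 4  # arbitrary bucket count for the sketch

def shard_of(key: int) -> int:
    """Map a key to a shard using a stable hash of its bytes."""
    digest = hashlib.sha256(key.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Each shard keeps its own keys in sorted order. Keys are inserted in
# ascending order here, so appending keeps every per-shard list sorted.
shards: list[list[int]] = [[] for _ in range(NUM_SHARDS)]
for key in range(1000):
    shards[shard_of(key)].append(key)

def range_query(lo: int, hi: int) -> list[int]:
    """Scatter-gather: scan [lo, hi) on every shard, then merge."""
    per_shard = [[k for k in shard if lo <= k < hi] for shard in shards]
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*per_shard))

print(range_query(100, 110))  # [100, 101, ..., 109], gathered from all shards
```

So range queries still work, they just cost a fan-out to every shard instead of a scan on one.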
kijin|2 years ago
johncolanduoni|2 years ago
[1]: https://www.cockroachlabs.com/docs/stable/hash-sharded-index...
stepanhruda|2 years ago
paulddraper|2 years ago
stepanhruda|2 years ago