sudhirj | 1 year ago

I asked the S3 team at re:Invent what “prefix” meant, and my current understanding is “whatever leading length of the key gives a reasonable cardinality for your objects”.

So if your keys are 2024/12/03/22-45:24 etc, I would expect the prefix to be the first 7 characters. If your keys are UUIDs I’d assume the first two or three. For ULIDs I’d assume the first 10. I think there’s a function that does statistical analysis on key samples to figure out reasonable sharding.
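
To make the "reasonable cardinality" idea concrete, here is a minimal sketch of counting distinct prefixes at each length over a sample of keys. It is not S3's actual algorithm, which isn't public; the function names, the `target` threshold, and `max_len` are all made up for illustration.

```python
def prefix_cardinality(keys, max_len=12):
    """Map each prefix length to the number of distinct prefixes in the sample."""
    return {n: len({k[:n] for k in keys}) for n in range(1, max_len + 1)}


def shortest_prefix_length(keys, target=16, max_len=12):
    """Return the shortest prefix length with at least `target` distinct
    prefixes, or None if no length up to max_len gets there.

    `target` is an arbitrary illustrative threshold, not a documented S3 value.
    """
    for n, distinct in prefix_cardinality(keys, max_len).items():
        if distinct >= target:
            return n
    return None


if __name__ == "__main__":
    sample = ["2024/12/03/a", "2024/12/04/b", "2024/11/03/c", "2023/01/01/d"]
    print(prefix_cardinality(sample))
    print(shortest_prefix_length(sample, target=3))
```

With date-style keys the count stays flat through the year digits and only grows once the prefix reaches the month or day, which is why a longer prefix is needed there than for UUIDs.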

tecleandor|1 year ago

Yep. It works similarly with Google Cloud Storage buckets. It seems like the indexing function they use for splitting/distributing/sharding access looks at your object keys and finds a common prefix to do this.

The problem with a date-based key like the one you used (which is very common) is that reads often cluster on the same date: for data analysis, for example, you read all the files from one day or week rather than files spread randomly across dates. All those files share the same prefix and end up in the same shard, which hurts performance until the load gets high enough that Google splits that index range and starts distributing your data across other shards.

For this reason they recommend planning your key names up front and breaking up that prefix with some sort of random hash at a sensible position in the key:

https://cloud.google.com/storage/docs/request-rate#naming-co...
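
A rough sketch of what that can look like, assuming you put a short hash of the original name at the front of the key. The `hashed_key` helper and the 6-character length are my own illustrative choices, not something the linked page prescribes:

```python
import hashlib


def hashed_key(key, hash_len=6):
    """Prepend a short, deterministic hash of the key so that objects from
    the same date no longer sort next to each other.

    hash_len=6 is an arbitrary illustrative choice.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:hash_len]}/{key}"


if __name__ == "__main__":
    for name in ["2024/12/03/22-45:24", "2024/12/03/22-45:25"]:
        print(hashed_key(name))
```

Because the hash is derived from the key itself, you can recompute the full object name from the original one. The trade-off is that you can no longer list everything for a given date with a single prefix query.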

jrochkind1|1 year ago

It would be nice if S3 provided similar public guidance. For instance:

> Adding a random string after a common prefix still allows auto-scaling to work, but…

No way to know if that's true of S3's algorithm too without them revealing it.

jrochkind1|1 year ago

I have never seen this explained, so thank you! Sounds like it's kind of "up to S3 and probably not predictable by you" -- which at least explains why it wasn't clear!

If you don't have "a lot" of keys, then you probably have only one prefix, maybe? Without them documenting the target order of magnitude of their shards?

sudhirj|1 year ago

I would assume so. The extreme case is just one key, which of course has only one partition. But see https://youtu.be/NXehLy7IiPM (re:Invent 2024 S3 deep dive): there’s still replication happening on single objects. So it’s still sort of sharded, but I do think key partitions exist where groups of keys share choke points based on sort order.