top | item 46341056

(no title)

qouteall | 2 months ago

In real-world business requirements it often need to read some data then touch other data based on previous read result.

It violates the "every transaction can only be in one shard" constraint.

For a specific business requirement it's possible to design clever sharding to make transaction fit into one shard. However new business requirements can emerge and invalidate it.

"Every transaction can only be in one shard" only works for simple business logics.

discuss

order

n2d4|2 months ago

I talk about these problems in the "How hard can sharding be?" section of the article — long story short, not all business requirements can be dealt with easily, but surprisingly many can if you choose a smart sharding key.

You can also still do optimistic concurrency across shards! That covers most of the remaining ones. Anything that requires anything more complex — sagas, 2PC, etc. — is relatively rare, and at scale, a traditional SQL OLTP will also struggle with those.

qouteall|2 months ago

Thanks for reply.

So in my understanding:

- The transactions that only touch one shard is simple

- The transactions that read multiple shards but only write shard can use simple optimistic concurrency control

- The transactions that writes (and reads) multiple shards stay complex. Can be avoided by designing a smart sharding key. (hard to do if business requirement is complex)

hinkley|2 months ago

Business people have a nasty habit of identifying two independent pieces of data you have and finding ideas to combine them to do something new. They aren’t happy until every piece of data is copied with every other piece and then they still aren’t happy because now everything is horrible because everything is coupled to everything.

rowanG077|2 months ago

in my experience most backends I have worked on people don't use the facilities of their database. They indeed simply hit the database two or more times. But that doesn't mean it's not possible to do better if you actually put more care in your queries. Most of the time multiple transactions can be eliminated. So I don't agree this is a business requirement complexity problem. It's a "it works so it's good enough" problem, or a "lazy developer" problem depending on how you want to frame it.

formerly_proven|2 months ago

This (along with n+1) is somewhat encouraged in business applications due to the prevalence of the repository pattern.

SoftTalker|2 months ago

Give each business or customer its own schema and you almost never need sharding.

n2d4|2 months ago

Yes, but you could also flip it the other way around — make the business or customer your sharding key, and you'll only need to manage one schema!