top | item 40551909

While it is fun to see how to creatively solve such issues, it does raise the question of manageability. When sharding data into loosely coupled (FDW) silos, it becomes tricky to make consistent backups, ensure locking mechanisms work when sharded data might sometimes be directly related, handle zone/region failures gracefully, prevent hot spots, perform multi-region schema changes reliably, etc. I suppose this pattern principally only works when the data is in fact not strongly related and the silos are quite independent. I wouldn't call that a distributed system at all, really. This may be a matter of opinion, of course.

It does give a "When all you have is a hammer..." vibe to me and begs the question: why not use a system that's designed for use-cases like this and do it reliably and securely? E.g.: https://www.cockroachlabs.com/docs/stable/multiregion-overvi... (Yes, I know full data domiciling requires something even more strict, but I currently don't know of any system that can transparently span the globe and stay performant while not sharing any metadata or caching between regions.)

tudorg|1 year ago

> It does give a "When all you have is a hammer..." vibe to me and begs the question: why not use a system that's designed for use-cases like this and do it reliably and securely?

(disclaimer: blog post author)

A reason would be that you want to stick to pure Postgres, for example because you want to use Postgres extensions, or prefer the liberal Postgres license.

It can also be a matter of performance: distributed transactions are necessarily slower, so if almost all the time you can avoid them by connecting to a single node that has all the data the transaction needs, you are going to get better performance.
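To make the single-node point concrete, here is a minimal sketch of the routing decision, not how the blog post implements it. The region names, connection strings, and the `plan_transaction` helper are all hypothetical; the only idea taken from the comment is that a transaction touching one node's data can skip the distributed path.

```python
# Hypothetical region-to-node map; names and DSNs are illustrative only.
SHARD_BY_REGION = {
    "eu": "postgres://eu-node/app",
    "us": "postgres://us-node/app",
}


def plan_transaction(regions_touched):
    """Decide whether a transaction can run on one node.

    Returns ("single-node", dsn) when every touched region lives on the
    same node, otherwise ("distributed", [dsns]) to flag the slower
    cross-node path.
    """
    if not regions_touched:
        raise ValueError("a transaction must touch at least one region")
    # Deduplicate the nodes that hold the data this transaction needs.
    nodes = sorted({SHARD_BY_REGION[region] for region in regions_touched})
    if len(nodes) == 1:
        return ("single-node", nodes[0])
    return ("distributed", nodes)
```

With a mapping like this, a request that only reads and writes "eu" rows connects straight to the EU node, and only the rare transaction spanning both regions pays the coordination cost.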

hazaskull|1 year ago

Hi there! Thank you for the post and your work on pgzx! Though it depends on the system (CockroachDB can place leaders on specific nodes to speed up local queries, it has global tables, and otherwise there are always follower reads), those are of course valid reasons. Admittedly, if you want to keep data "pinned", you're into manual placement rather than horizontal scaling, but that's nitpicking and there are pros and cons. I do enjoy the freedom of Postgres and am hopeful that its virtually prehistoric storage design becomes a non-issue thanks to tech such as Neon and OrioleDB. The option to decouple storage would provide wonderful flexibility for solutions like yours too, I think.