adastral | 1 year ago
I wonder if they were unaware of it or disregarded it for a reason. I am currently in a similar situation to the one described in the blog, trying to shard a massive Postgres DB.
gshulegaard|1 year ago
For example, it was only 2 years ago that Citus allowed joining data in "local" tables with data retrieved from distributed tables (https://www.citusdata.com/updates/v11-0). In the same major update, Citus enabled _any_ node to handle queries; previously, all queries (whether or not they modified data) had to go through the "coordinator" node in your cluster. That could turn into a pretty significant bottleneck, with ramifications for your cluster administration and for how you chose to shape your data (what goes into local tables, reference tables, or distributed tables).
Again, huge fan of Citus, but it's not a magic bullet that means you no longer have to think about scale when using Postgres. It makes it _much_ easier and adds some killer features that push complexity down the stack such that it is _almost_ completely abstracted from application logic. But you still have to be cognizant of it, sometimes even altering your data model to accommodate it.
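To make the reference vs. distributed table distinction concrete, here is a toy Python model of row placement. It is only a sketch, not Citus's actual placement algorithm: the worker names, the `events` and `countries` tables, and the CRC32-modulo hashing are all assumptions made for illustration. The idea it shows is that a distributed row lives on one worker, a reference row is copied to every worker, and so a join between the two can be answered on a single node.

```python
import zlib

# Hypothetical worker names; a real cluster would have its own topology.
WORKERS = ["worker-0", "worker-1", "worker-2"]


def worker_for(key: str) -> str:
    """Pick a worker by hashing the distribution-column value (toy scheme)."""
    return WORKERS[zlib.crc32(key.encode()) % len(WORKERS)]


class ToyCluster:
    def __init__(self):
        # Each worker holds a slice of "events" and a full copy of "countries".
        self.nodes = {w: {"events": [], "countries": {}} for w in WORKERS}

    def insert_event(self, tenant_id: str, country_code: str) -> None:
        # Distributed table: the row is stored on exactly one worker.
        self.nodes[worker_for(tenant_id)]["events"].append((tenant_id, country_code))

    def insert_country(self, code: str, name: str) -> None:
        # Reference table: the row is replicated to every worker.
        for node in self.nodes.values():
            node["countries"][code] = name

    def events_with_country(self, tenant_id: str):
        # The join runs entirely on the worker that owns this tenant's rows,
        # because that worker also has the full reference table.
        node = self.nodes[worker_for(tenant_id)]
        return [(t, node["countries"][c]) for t, c in node["events"] if t == tenant_id]
```

Usage: after `cluster.insert_country("DE", "Germany")` and `cluster.insert_event("acme", "DE")`, calling `cluster.events_with_country("acme")` touches only one worker. Deciding which tables are small and stable enough to replicate this way is the kind of data-model choice described above.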
gen220|1 year ago
It's hard to account for the value of benefits that have yet to accrue, but this kind of analysis, even if you pretty heavily discount that future value, tilts the ROI in favor of solutions like Citus, IMO. Especially if your time horizon is 5+ or 10+ years out.
Like you said, if they made this decision 3ish years ago, they would have had to put a lot of trust in that future value. A choice made today hinges less on that variable.
sgarland|1 year ago
junto|1 year ago
Maybe because CitusData was bought by Microsoft around the same time, so Microsoft could create “Azure Cosmos DB for Postgres Cluster”, yet another one of Microsoft’s typical product naming crapshoots.
victor106|1 year ago
Well said. I haven't seen any company as terrible as Microsoft at naming things. Anyone know why?
jabart|1 year ago
gen220|1 year ago
It'd be interesting to compare the expected capex of developing this in-house solution + the opex of maintaining it vs the same categories of expected costs for option (3) – because I imagine that's probably the most palatable option.
They also may have pre-paid for dedicated RDS instances for the next X years (before this horizontal scaling initiative began, to boot), as AWS allows companies to do this at a pretty steep discount rate, which would probably tilt them away from (3).
iamdanieljohns|1 year ago
_boffin_|1 year ago
dijit|1 year ago
Not sure what your definition of massive is, but I think Spanner would easily beat it.
Also, it's very use-case dependent. You can't "just use" Citus for everything: it's not quite as flexible as a bog-standard pgsql install because of the way it shards, so you have to be a tad more careful with your data model.
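A rough illustration of why the data model matters under hash sharding, written as a toy Python sketch rather than anything Citus-specific. The shard count, the `tenant_id` column, and the CRC32-modulo routing are assumptions for the example. A query that filters on the distribution column can go to one shard, while any other query has to fan out to all of them.

```python
import zlib
from typing import Dict, List

SHARD_COUNT = 4  # hypothetical; real deployments choose their own shard count


def shard_of(value: str) -> int:
    """Map a distribution-column value to a shard index (toy hashing scheme)."""
    return zlib.crc32(value.encode()) % SHARD_COUNT


def shards_to_query(distribution_column: str, filters: Dict[str, str]) -> List[int]:
    """Return the shards a query with these equality filters must touch."""
    if distribution_column in filters:
        # The filter pins the distribution column, so one shard is enough.
        return [shard_of(filters[distribution_column])]
    # No filter on the distribution column: every shard may hold matching rows.
    return list(range(SHARD_COUNT))
```

With `tenant_id` as the distribution column, `shards_to_query("tenant_id", {"tenant_id": "acme"})` returns a single shard, while `shards_to_query("tenant_id", {"created_at": "2024-01-01"})` returns all four. Picking a distribution column that most queries filter on is the sort of care a sharded setup asks for.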
adastral|1 year ago
gregors|1 year ago