Thews | 1 year ago

They slow to a crawl when you have huge tables with lots of versioned data and massive indexes on which maintenance can't complete in a reasonable amount of time, even on the fastest vertically scaled hardware. You run into issues partitioning the data and spreading it across processors, and spreading it across servers takes solutions that require engineering teams.
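
To make "maintenance" concrete, this is the kind of operation that stops finishing in a sane window once indexes reach multi-terabyte sizes (table and index names here are made up for illustration):

    -- Rebuilding a bloated index without blocking writes still has to scan
    -- and rewrite the entire index; on a multi-TB index this can run for days.
    REINDEX INDEX CONCURRENTLY orders_created_at_idx;

    -- A vacuum that needs to freeze old tuples has to read the whole heap.
    VACUUM (FREEZE, VERBOSE) orders;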

There are a large number of solutions for different kinds of data for a reason.

pritambaral|1 year ago

I have built "huge tables with lots of versioned data and massive indexes". This is false. I had no issues partitioning the data and spreading it across shards. On Postgres.

> ... takes solutions that require engineering teams.

All it took was an understanding of the data. And just one guy (me), not an "engineering team". Mongo knows only one way of sharding data. That one way may work for some use-cases, but for the vast majority of use-cases it's a Bad Idea. Postgres lets me do things in many different ways, and that's without extensions.
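
As one example, declarative partitioning is built into core Postgres, and you pick the scheme that matches how the data is actually accessed (the table and columns below are made up for illustration, not my actual schema):

    -- Range-partition versioned rows by time so old partitions can be
    -- detached or dropped instead of being vacuumed and reindexed forever.
    CREATE TABLE events (
        id          bigint      NOT NULL,
        tenant_id   int         NOT NULL,
        recorded_at timestamptz NOT NULL,
        payload     jsonb
    ) PARTITION BY RANGE (recorded_at);

    CREATE TABLE events_2024_01 PARTITION OF events
        FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

    -- Or hash-partition by tenant if the load is spread across tenants:
    -- ... PARTITION BY HASH (tenant_id);

And that's just partitioning within one node; how you shard across servers is a separate choice layered on top of it.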

If you don't understand your data, and you buy into the marketing bullshit of a proprietary "solution", and you're too gullible to see through their lies, well, you're doomed to fail.

This fear-mongering that you're trying to pull in favour of the pretending-to-be-a-DB that is Mongo is not going to work anymore. It's not the early 2010s.

Thews|1 year ago

Where did I ever say anything about Mongo?

I have worked with tables at this scale. It definitely is not a walk in the park with traditional setups. https://www.timescale.com/blog/scaling-postgresql-to-petabyt...

Data chunked into objects and distributed so it can be accessed by lots of servers, on the other hand, is no sweat.

I'd love to see how you handle database maintenance when your active data is over 100TB.

troupo|1 year ago

> They slow to a crawl when you have huge tables

Define "huge". Define "massive".

For a modern RDBMS, that starts at volumes that can't really fit on one machine (for some definition of "one machine"). I doubt Mongo would be very happy at that scale either.

On top of that, an analysis of the query plan usually shows trivially fixable bottlenecks.
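
Something as simple as this usually tells you where the time goes (table and column names made up for the sake of example):

    -- Time the query and show buffer usage; a sequential scan over millions
    -- of rows here usually means a missing or unusable index.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT *
    FROM events
    WHERE tenant_id = 42
      AND recorded_at >= now() - interval '1 day';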

On top of that, it also depends on how you store your versioned data (Wikipedia stores gzipped diffs, and runs on PHP and MariaDB).
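
The general idea of diff-based storage, sketched with made-up names (not MediaWiki's actual schema):

    -- Store a compressed diff against the previous revision instead of a
    -- full copy of the document for every version.
    CREATE TABLE revisions (
        doc_id     bigint NOT NULL,
        rev_no     int    NOT NULL,
        parent_rev int,
        diff_gz    bytea  NOT NULL,  -- gzip-compressed diff, produced by the application
        PRIMARY KEY (doc_id, rev_no)
    );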

Again, none of the claims you presented have any solid evidence in the real world.

Thews|1 year ago

Wikipedia is tiny data. You don't really start to see cost-scaling issues until your active data is a few hundred times larger and it changes enough that autovacuuming can't keep up.
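
Concretely, "can't keep up" looks like dead tuples piling up faster than autovacuum clears them; a quick way to see it:

    -- Tables where dead tuples keep accumulating while the last autovacuum
    -- is long past are bloating faster than autovacuum can clean them.
    SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10;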

I'm getting paid to move a database that size this morning.