WingNews

zihotki|1 year ago

Or is it? Jepsen reported a number of issues like "read skew, cyclic information flow, duplicate writes, and internal consistency violations. Weak defaults meant that transactions could lose writes and allow dirty reads, even downgrading requested safety levels at the database and collection level. Moreover, the snapshot read concern did not guarantee snapshot unless paired with write concern majority—even for read-only transactions."

That report (1) is 4 years old, so many things could have changed. But so far, every reviewed version has been faulty with regard to consistency.

1 - https://jepsen.io/analyses/mongodb-4.2.6

pipe_connector|1 year ago

Jepsen found a more concerning consistency bug than the above results when Postgres 12 was evaluated [1]. Relevant text:

> We [...] found that transactions executed with serializable isolation on a single PostgreSQL instance were not, in fact, serializable

I have run Postgres and MongoDB at petabyte scale. Both of them are solid databases that occasionally have bugs in their transaction logic. Any distributed database that is receiving significant development will have bugs like this. Yes, even FoundationDB.

I wouldn't avoid Postgres because of this problem, just as I wouldn't avoid MongoDB because they had bugs in a new feature. In fact, I'm more likely to trust a company that consistently pays to have its work reviewed in public.

1. https://jepsen.io/analyses/postgresql-12.3

endisneigh|1 year ago

That’s been resolved for a long time now (not to say that MongoDB is perfect, though).

vorticalbox|1 year ago

That is for mongo 4.x, but the latest stable is 6.0.7, whose notes mention "more resilient operations" and "additional data security".

https://www.mongodb.com/blog/post/big-reasons-upgrade-mongod...

throwup238|1 year ago

> I'm not sure what "with strong consistency benefits" means.

"Doesn't use MongoDB" was my first thought.

unknown|1 year ago

[deleted]

danpalmer|1 year ago

MongoDB had "strong consistency" back in 2013 when I studied it for my thesis. The problem is that consistency is a much bigger space than on or off, and MongoDB inhabited the lower classes of consistency for a long time while calling it strong consistency, which lost a lot of developer trust. Postgres has a range of options, but the default is typically consistent enough to make most use-cases safe, whereas Mongo's default wasn't anywhere close.

They also had a big problem trading off performance against consistency, to the point that for a long time (v1-2?) they ran in default-inconsistent mode to meet the numbers marketing was putting out. Postgres has never done this, partly because it doesn't have a marketing team, but again this lost a lot of trust.

Lastly, even with the stronger end of their consistency guarantees, and as they have increased their guarantees, problems have been found again and again. It's common knowledge that it's better to find your own bugs than have your customers tell you about them, but in database consistency this is more true than normal. This is why FoundationDB are famous for having built a database testing setup before a database (somewhat true). It's clear from history that MongoDB don't have a sufficiently rigorous testing procedure.

All of these factors come down to trust: the community lacks trust in MongoDB because of repeated issues across a number of areas. As a result, just shipping "strong consistency" or something doesn't actually solve the root problem, that people don't want to use the product.

pipe_connector|1 year ago

It's fair to distrust something because you were burned by using it in the past. However, both the examples you named -- Postgres and FoundationDB -- have had similar concurrency and/or data loss bugs. I have personally seen FoundationDB lose a committed write. Writing databases is hard and it's easy to buy into marketing hype around safety.

I think you should reconsider your last paragraph. MongoDB has a massive community, and many large companies opt to use it for new applications every day. Many more people want to use that product than FoundationDB.

nijave|1 year ago

Have you looked at versions in the last couple years to see if they've made progress?

throwaway2037|1 year ago

> my thesis

Can you share a link? I would like to read your research.

Izkata|1 year ago

> MongoDB has supported the equivalent of Postgres' serializable isolation for many years now.

That would be the "I" in ACID

> I'm not sure what "with strong consistency benefits" means.

Probably the "C" in ACID: Data integrity, such as constraints and foreign keys.

https://www.bmc.com/blogs/acid-atomic-consistent-isolated-du...

lkdfjlkdfjlg|1 year ago

> Pongo - Mongo but on Postgres and with strong consistency benefits.

I don't read this as saying it's "MongoDB but with...". I read it as saying that it's Postgres.

jokethrowaway|1 year ago

Have you tried it in production? It's absolute mayhem.

Deadlocks were common; it uses a system of retries if the transaction fails; we had to disable transactions completely.

Next step is either writing a writer queue manually or migrating to postgres.

For now we fly without transactions and fix the occasional concurrency issue.

pipe_connector|1 year ago

Yes, I have worked on an application that pushed enormous volumes of data through MongoDB's transactions.

Deadlocks are an application issue. If you built your application the same way with Postgres you would have the same problem. Automatic retries of failed transactions with specific error codes are a driver feature you can tune or turn off if you'd like. The same is true for some Postgres drivers.
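To make the retry behaviour concrete, here is a minimal sketch of the kind of loop a driver might run on your behalf. It is not any driver's actual implementation: `TransientTransactionError`, `run_with_retry`, and the backoff numbers are all hypothetical names and values chosen for illustration. The idea is only that errors marked as retryable are retried with backoff, and everything else propagates immediately.

```python
import random
import time


class TransientTransactionError(Exception):
    """Hypothetical stand-in for a driver's retryable transaction error."""


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call txn_fn until it succeeds or max_attempts is reached.

    Only TransientTransactionError is retried; any other exception
    propagates immediately, since retrying it would not help.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except TransientTransactionError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Exponential backoff with jitter so competing writers spread out.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

Setting `max_attempts=1` in this sketch is the equivalent of turning retries off: the first failure goes straight back to the application.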

If you're seeing frequent deadlocks, your transactions are too large. If you model your data differently, deadlocks can be eliminated completely (and this advice applies regardless of the database you're using). I would recommend you engage a third party to review your data access patterns before you migrate and experience the same issues with Postgres.
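One common way to remove a class of deadlocks at the application level, regardless of database, is to always acquire locks in the same order. The sketch below shows the idea with in-process locks rather than database rows; `Account` and `transfer` are hypothetical names, and this is one illustration of the general point, not necessarily the remodelling the comment has in mind.

```python
import threading


class Account:
    """Hypothetical record guarded by its own lock."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move amount from src to dst, locking both accounts in id order."""
    if src is dst:
        return  # nothing to do, and taking the same lock twice would block
    # Sorting by id gives every caller the same acquisition order, so two
    # opposite transfers cannot each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount
```

The same ordering rule carries over to rows or documents touched inside a transaction: if every code path updates them in a consistent key order, two transactions cannot end up waiting on each other in a cycle.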

threeseed|1 year ago

> Next step is either writing a writer queue manually

You can just use a connection pool and limit writer threads.

You should be using one to manage your database connections regardless of which database you are using.
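As a rough sketch of what "limit writer threads" can look like, the snippet below caps the number of concurrent writes with a semaphore. In a real application the connection pool's maximum size usually plays this role; `WriterGate` and `run_writes` are hypothetical names used here only to show the mechanism, and `write_fn` stands in for whatever actually talks to the database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class WriterGate:
    """Allows at most max_writers write operations to run at the same time."""

    def __init__(self, max_writers):
        self._slots = threading.BoundedSemaphore(max_writers)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous writers observed

    def write(self, write_fn, *args):
        with self._slots:  # blocks while max_writers writes are in flight
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return write_fn(*args)
            finally:
                with self._lock:
                    self._active -= 1


def run_writes(gate, write_fn, items, threads=8):
    """Submit one write per item from a thread pool, all funnelled through gate."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda item: gate.write(write_fn, item), items))
```

With `max_writers=1` this degenerates into a single serialized writer, which is roughly what a hand-written writer queue would give you.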
