top | item 8730220

hallmark | 11 years ago

Would you consider graph to be a fourth type of database, after relational, key value, and document/hierarchical?

Not really (see below). The article confuses a number of things and assumes only the simplest or most naive implementations.

A graph database is essentially a relational implementation that supports recursive joins. While not part of the strictest, most minimalist relational implementation in theory, most mature SQL databases support this to some extent. Some databases that support recursive joins will limit you to directed acyclic graphs.
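To make "recursive joins" concrete, here is a minimal sketch using SQLite's recursive common table expressions through Python's standard `sqlite3` module. The `edge` table, its `src`/`dst` columns, and the `reachable` helper are made up for illustration; this is one way to express a traversal in SQL, not a claim about how any particular graph database does it.

```python
import sqlite3

def reachable(edges, start):
    """Return the nodes reachable from `start` via a recursive join on an edge table.

    `edges` is a list of (src, dst) pairs; the schema is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?                      -- seed row: the start node
            UNION                         -- UNION (not UNION ALL) drops duplicates,
                                          -- so the recursion terminates on cycles
            SELECT edge.dst
            FROM edge JOIN reach ON edge.src = reach.node
        )
        SELECT node FROM reach WHERE node <> ?
        """,
        (start, start),
    ).fetchall()
    conn.close()
    return sorted(row[0] for row in rows)

if __name__ == "__main__":
    # The a -> b -> c -> a cycle is handled because UNION deduplicates rows.
    print(reachable([("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")], "a"))
```

Using `UNION ALL` instead would loop forever on the cyclic input, which is roughly the situation in which an engine would restrict you to acyclic graphs.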

A document database is a relational database implemented around materialized joins. It saves you from doing common joins dynamically. This is more restricted than a graph because it is a directed acyclic graph only. Again, most mature SQL databases support this to some extent or another.
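A small sketch of the "materialized join" idea, in plain Python: two flat relations are joined once, ahead of time, and stored as nested documents so that reads need no join. The `authors` and `posts` relations and the `materialize` function are hypothetical names chosen for the example.

```python
from collections import defaultdict

def materialize(authors, posts):
    """Pre-join `posts` into their author rows, yielding one nested document per author.

    Both inputs are lists of dicts standing in for relational rows (assumed schema:
    authors have `id` and `name`, posts have `author_id` and `title`).
    """
    posts_by_author = defaultdict(list)
    for post in posts:
        posts_by_author[post["author_id"]].append({"title": post["title"]})
    # The result is tree-shaped: each author document embeds its own posts.
    return {
        author["id"]: {
            "name": author["name"],
            "posts": posts_by_author.get(author["id"], []),
        }
        for author in authors
    }

if __name__ == "__main__":
    authors = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
    posts = [{"author_id": 1, "title": "t1"}, {"author_id": 1, "title": "t2"}]
    print(materialize(authors, posts))
```

The trade-off is visible even at this scale: fetching an author with all posts is a single lookup, but a query such as "all posts with a given title" now has to walk every document.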

A relational database that supports both materialized and recursive joins has the same expressiveness as a document and/or graph database. However, it trades some query performance in those individual cases for flexibility and performance in other cases that both document and graph databases are relatively poor at. Document and graph databases have existed since databases were first invented. Most of the implementation optimizations for documents and graphs are optionally available in good SQL databases.

Key-value databases are a different beast. Most people only consider primitive databases based on key equality relationships or ordered ranges, which are quite limited. You can also build them based on multidimensional spatial relationships (i.e. keys are hyper-rectangles that can be relatively addressed), which are equivalent in expressiveness to relational implementations. If you added recursive joins to implementations of the spatial variants, you'd again have something that could express all of the models mentioned.
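As a rough illustration of keys that are hyper-rectangles, the toy store below matches entries by intersection rather than by key equality. Everything here (`RectStore`, `overlaps`, the linear scan) is an assumption made for the sketch; a real engine would use a spatial index, and nothing in the comment specifies an implementation.

```python
def overlaps(a, b):
    """True if two hyper-rectangles intersect.

    Each rectangle is a sequence of (low, high) pairs, one per dimension,
    with inclusive bounds.
    """
    return all(lo1 <= hi2 and lo2 <= hi1 for (lo1, hi1), (lo2, hi2) in zip(a, b))

class RectStore:
    """Toy key-value store keyed by hyper-rectangles (linear scan, no index)."""

    def __init__(self):
        self._items = []

    def put(self, rect, value):
        self._items.append((tuple(rect), value))

    def query(self, rect):
        """Return the values whose key rectangle intersects `rect`, in insertion order."""
        return [value for key, value in self._items if overlaps(key, rect)]

if __name__ == "__main__":
    store = RectStore()
    store.put(((0, 10), (0, 10)), "A")
    store.put(((20, 30), (0, 5)), "B")
    store.put(((5, 5), (5, 5)), "P")  # a point is a degenerate rectangle
    print(store.query(((4, 6), (4, 6))))
```

Key equality and ordered ranges fall out as special cases: an exact-match lookup is a query with a degenerate rectangle, and a range scan is a rectangle that is wide in one dimension.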

In summary, these models are APIs; they are not fundamentally different things if the underlying database engine is adequate for the purpose. It is why, for example, PostgreSQL can be used quite effectively these days as a document database.

pm90|11 years ago

What about distributed databases? I agree with your point that Postgres is actually the best option for most use cases. But as a practicing programmer, when should I consider choosing something like MongoDB?

eigenrick|11 years ago

Ooh. Good question. In implementation, they end up looking a lot like key/value stores, since most of the ones I know are implemented as edge-vertex associations.
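A minimal sketch of what "edge-vertex associations" on top of a key/value store might look like, with a plain Python dict standing in for the store. The `("out", vertex)` key layout and the function names are invented for the example and are not taken from any specific product.

```python
from collections import deque

def add_edge(kv, src, dst):
    """Record the edge under a per-vertex key; `kv` is any dict-like key/value store."""
    kv.setdefault(("out", src), []).append(dst)

def traverse(kv, start):
    """Breadth-first traversal that issues one key lookup per visited vertex."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for nxt in kv.get(("out", vertex), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

if __name__ == "__main__":
    kv = {}
    for src, dst in [("a", "b"), ("a", "c"), ("b", "d"), ("d", "a")]:
        add_edge(kv, src, dst)
    print(traverse(kv, "a"))
```

Every hop is an independent lookup by key, so in this layout the store has no say in whether neighboring vertices end up near each other on disk or in memory.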

However, there is certainly a lot of specialized functionality on top of that. You can then turn around and apply both relational models and hierarchical models with them.

There are definitely some use cases for which I would heartily recommend a graph database over the others, so, yeah, it is another category. It is also something that should have been mentioned in this article. :)

jandrewrogers|11 years ago

A simple key-value model is a very low-scalability, low-performance method of implementing a graph database. The key to graph database performance is maintaining consistent locality over traversals, which is no small trick from a computer science standpoint. I know a lot of graph databases work using naive key-value lookups but it is not recommended.

Most modestly scalable graph databases are implemented as traditional relational-style engines with highly optimized equi-join recursion. The most scalable implementations use topological methods that are beyond the scope of this post but definitely not simple key-value designs.

mrkurt|11 years ago

The split between "how your DB stores the data" and "how you query it" is kind of interesting. Time series DBs are similar to graph DBs: implementation-wise they're not really distinct, but the specialized functionality makes them extra useful.