What PostgreSQL has over other open source SQL databases: Part II

[+] halayli|10 years ago|reply

I wish postgreSQL focuses more on clustering, sharding and replication and bundle these features in the product rather than asking us to use other thirdparty tools like slony and pgpoolII. Other than that, postgreSQL is one of the cleanest SQL implementations. It's source code is incredibly clean, consistent and easy to follow.

[+] lobster_johnson|10 years ago|reply

There are a bunch of commercial engines built on Postgres that implement sharding and parallel querying at huge scale: Amazon Redshift, Truviso, Netezza, ParAccel, Aster Data and CitusDB come to mind. It's a shame that none of this has tricked down into the open-source version.

There's Postgres-X2 (formerly Postgres-XC), sponsored by NTT, which implements fully consistent multimaster replication and partitioning. After many, many years apparently still isn't production-ready nor particularly scalable, and the likelihood that it will ever be merged into mainline Postgres is zero (its planner changes alone are apparently several hundred thousand lines of code).

More interestingly, there's Postgres-XL, which is apparently a merger of Postgres-XC and a different implementation called StormDB that was bought by a company called TransLattice. (TransLattice also sponsored or acquired an earlier project that went nowhere, Postgres-R.) Unlike XC/X2, they say they aim to contribute changes back to the mainline, and they also claim their distributed query model is superior. With the commerical support behind it (it's used as the basis of a commercial product), it's possible that this is something that will be usable. Unfortunately, it seems very quiet and not very open-sourcy; most of the development seems to be by just one guy [3], and nobody seems to be using it in production at this point.

[1] https://github.com/postgres-x2/postgres-x2

[2] http://www.postgres-xl.org/

[3] http://git.postgresql.org/gitweb/?p=postgres-xl.git;a=summar...

[+] rdtsc|10 years ago|reply

In general that is not something that can be easily slapped on later easily. There are some solutions but they feel like add-ons.

For software some things can be added on later, but some can't. Things like fault tolerance, distribution, security can't be easily bolted on after the product has already been developed (as those usually cut right through the whole stack).

People who like Postgres and make fun of NoSQL products for not having well ... SQL, usually forget that NoSQL isn't as much about not having SQL as about having a good distributed/scalable backend story.

[+] intellectable|10 years ago|reply

For replication instances run in a master-replica setup using Streaming Replication try:

http://www.repmgr.org/

repmgr is an open-source tool suite to manage replication and failover in a cluster of PostgreSQL servers. It enhances PostgreSQL's built-in hot-standby capabilities with tools to set up standby servers, monitor replication, and perform administrative tasks such as failover or manual switchover operations.

repmgr has provided advanced support for PostgreSQL's built-in replication mechanisms since they were introduced in 9.0, and repmgr 2.0 supports all PostgreSQL versions from 9.0 to 9.4.

https://github.com/2ndQuadrant/repmgr src: http://instagram-engineering.tumblr.com/post/13649370142/wha...

For Sharding try:

http://instagram-engineering.tumblr.com/post/10853187575/sha... http://rob.conery.io/2014/05/29/a-better-id-generator-for-po...

For clustering try:

https://github.com/smbambling/pgsql_ha_cluster/wiki/Building... https://wiki.postgresql.org/wiki/Replication,_Clustering,_an...

I too wish there existed a nice dashboard db management tool like the ones Rethinkdb and Couchbase provide out of the box.

[+] brianwawok|10 years ago|reply

For sure. One line FT with auto fail over would make PG so much more useful for me. I can't have a DB be a SPOF in an app. But the current PGpool etc situation is not ideal.

[+] lsc|10 years ago|reply

yeah. So I haven't set up a postgres or mysql server in more than five years, but I was seriously into both before that. In the early to mid-aughts, I was heavily into PostgreSQL when I had a choice, but spent a lot of time on MySQL when other people did the deciding.

So last week I was setting up a sql server for a side project I was doing with a friend, and went with MySQL just because I knew it would take me ten minutes to setup async replication, whereas with PostgreSQL, I'd have to spend a bunch of time figuring out what third party thing to use now.

Even just having simple master/slave asynchronous replication functionality built in would help a lot.

[+] chrisutz|10 years ago|reply

Having worked with MySQL for years, my mind was blown using Postgres for the first time this year (mainly due to HStore, CTEs, and transactional DDL statements). However, one constant pain point for me was the lack of an upsert. Glad to read Postgres will be getting it in 9.5!

[+] spacemanmatt|10 years ago|reply

I wrote an upsert generator a long time ago. It was one of the easier things to work around.

[+] chanux|10 years ago|reply

Part I of the series: https://www.compose.io/articles/what-postgresql-has-over-oth...

[+] Veratyr|10 years ago|reply

I needed to implement tagging a while ago, particularly looking up a row with multiple tags. As far as I know, the main way to do this is usually to put the tags in another table and join multiple times (once per tag) to find a row that has them all. This performed rather badly (100s of ms for 5 tags from memory).

But with Postgres, you can put a GIN index on an Array column. So I moved my tags into an array column and suddenly querying for every one of several million rows that had matched a set of 7 tags took ~10ms.

Postgres is awesome.

[+] b3n|10 years ago|reply

I don't know much about SQL, but are you sure you need multiple joins for that?

I'd do something like this (psuedocode) on the table with the tags:

  SELECT foreign-id, COUNT(*)
  FROM Tags
  WHERE tag IN ('foo', 'bar', 'baz')
  GROUP BY foreign-id
  HAVING COUNT(*) = 3

It does seem rather hacky though.

[+] garyclarke27|10 years ago|reply

Re GUI clients - I've tried most purchased several - I agree pgAdmin is quite clunky and bit ugly also crashes regularly on Mac but not windows in my experience, it's main problem is the query editor is so weak. I don't understand why Postico has so many positive comments, I bought it and was very disappointed, yes it looks pretty but its capability is pathetic, much weaker than pgAdmin, I was hoping its query editor would be better but this is weak also, does not even give line numbers when reporting errors so useless for large queries I often run. I've found I great solution though - Sublime Text an amazing this awesome query editor lighting fast - rock solid loads of great plugins for sql and postgres, auto complete, snippits, the search and replace ability alone makes it worth using compared to pgAdmin. The build system makes it easy to execute queries inside ST, I used to copy paste queries from Notepad++ into PgAdmin years ago before I discovered the joys of direct execution and immediate feedback possible in Sublime Text. See this link for details on how to set up - only takes a few minutes http://blog.code4hire.com/2014/04/sublime-text-psql-build-sy...

[+] dorfsmay|10 years ago|reply

Have you tried DBeaver (http://dbeaver.jkiss.org/)?

[+] chucksmash|10 years ago|reply

pgAdmin is unfortunately unusable on a Linux desktop with high pixel count.

The GUI support for the system DPI setting is very haphazard and leads to problems like query results where the line height of the text respects the setting but the height of the cell itself doesn't so you only see the middle two quarters of each glyph.

[+] ww520|10 years ago|reply

Postgres has come a long way. Some of these features are very cool, like the WITH RECURSIVE. I wonder what's the performance implication is. Often you can gauge the performance by looking at the query, like a table scan or using index. Assuming the parent topic column is indexed, would the recursive walk take INDEX + SCAN? Where the INDEX is the parent topic look up and the SCAN is the aggregate look up of the subtopic.

[+] simoncion|10 years ago|reply

Is the information provided by

  explain select ...

insufficient?

[+] djrobstep|10 years ago|reply

I absolutely love Postgres, for these and other reasons, but the one weak point is the official GUI client. It's garbage, and unfortunately it's one of the first things people ask me about when I encourage them to use Postgres.

[+] phunge|10 years ago|reply

I had the same sentiment. I eventually ditched the gui and went back to psql (the CLI) based on this advice: http://www.craigkerstiens.com/2013/02/13/How-I-Work-With-Pos...

The CLI is really really good for a CLI. Wagon is also good, my coworkers have been using it to pull data and like it a lot.

Overall love postgres, it's been 6 months (since starting a new job) and I'll never go back.

[+] reactor|10 years ago|reply

http://dbeaver.jkiss.org works pretty well for me

[+] DevX101|10 years ago|reply

I just bought Postico. Best Postgres client I've come across so far for OSX. Would recommend if you're on a mac.

[+] pacquiao882|10 years ago|reply

For Windows users, HeidiSQL is a very solid open source client for interacting with Postgres.

[+] phaedryx|10 years ago|reply

I've been using Valentina Studio lately and like it so far. Postico looks good too.

[+] saltedshiv|10 years ago|reply

I use Navicat, its fantastic.

[+] zenocon|10 years ago|reply

Use JetBrains 0xDBE -- still available now as early pre-release for free.

[+] ebbv|10 years ago|reply

This article and the previous one are interesting but it's just highlighting features Postgres has that MySQL/MariaDB don't have. I was expecting a "fair" comparison and that's not what these are.

[+] simoncion|10 years ago|reply

> [The article is] just highlighting features Postgres has that MySQL/MariaDB don't have. [That's not a fair comparison.]

I disagree. If one project has a significant, useful feature that another project does not, it behooves someone who is comparing the projects to mention this fact.

[+] dragonwriter|10 years ago|reply

> This article and the previous one are interesting but it's just highlighting features Postgres has that MySQL/MariaDB don't have.

That's exactly what the title says.

> I was expecting a "fair" comparison and that's not what these are.

Assuming by a "'fair' comparison" you mean "an article laying out the relative advantages and disadvantages of Postrgres and other open source databases", how on earth is that even a remotely reasonable expectation of an article titled "What PostgreSQL has over other open source SQL databases".

Its like walking into a clearly-labeled strip club and complaining that you expected fully-clothed, family-friendly entertainment.

[+] chmaynard|10 years ago|reply

Agreed. The article implies that PostgreSQL has everything the other open-source databases have, plus these additional hot features. If that's true, the author should make it explicit.

[+] resca79|10 years ago|reply

Good article, but according to the post Title, I'd like to read how each features make postgres better than mysql or other, in terms of performance for example. Nice to have a list of unique features of postgres but having many indexing feature doesn't mean to be better than other dbs.

P.s. Postgres is my favourite db

[+] spacemanmatt|10 years ago|reply

Many features supported by PostgreSQL but not by MySQL are the features that make PostgreSQL better for many purposes. I think you should compare the suitability of the database to the task, not databases to databases.

[+] jeltz|10 years ago|reply

Having many indexing features allows PostgreSQL to be useful for more types of users. Many of the indexing features are primarily useful for the GIS community, where they are used by PostGIS. BRIN indexes on the other hand will primarily help for data warehousing workloads by trading performance for reduced disk space.

And GIN indexes allows fast indexing of JSON documents and for PostgreSQL's full-text search (mostly useful for quickly adding search to simple applications).

93 comments