This is a common mistake people make, I think. Just because an ecosystem is smaller does not mean it is inferior (or less practical). In fact, the opposite is routinely true.
Is it harder to find tutorials for PostgreSQL? Yeah, maybe. Is it harder to get high-quality advice and resources from subject matter experts? No; in fact, it's far easier. I have used PostgreSQL for many years and in all of them every question I've had has been answered quickly and properly by somebody in #postgres who knows way more about postgres and SQL than I'd ever even want to know.
And what about hires? When a million people "know" MySQL it takes a lot more effort to find somebody who actually does. There may be fewer contractors for PostgreSQL but they are practically by definition better than the average MySQL contractor.
unlike learning a programming language, with databases, the domain knowledge is harder to discover and more risky if you don't understand it
Exactly. So should you trust the ecosystem that relies on lazy, bandwagon developers for the majority of its market share, or the one that has been silently doing shit the right way for years? There may be a whole lot of really skilled MySQL admins out there (I guess there must be given how many huge sites go to extreme lengths to make MySQL work for them), but I can find you a handful of grossly overqualified PostgreSQL admins in no time. The same can't be said for the former.
That's really interesting to know, actually. I've only been with companies that had MySQL installations (Facebook and Twitter) and so jumping in was really easy, but on the other hand I do like to do things right. A few questions, though:
0/ How would I be able to tell if the limitations of our DB are because of MySQL and would be solved by PostgreSQL? That is, common pain points in MySQL that we wouldn't recognize immediately until we switched? (Edit: mostly looking for performance-related, but anything is fine.)
1/ Is there a PostgreSQL consulting firm that has the same reputation that Percona does with MySQL?
2/ Is there a good place to start when there's an issue or question I have with PostgreSQL (aside from IRC)? With MySQL I've been either going to the docs, or to Google, but it's been spotty with either.
Ok. Being borderline trollish here, but if I read this article, this means to me "yeah. we are using a technically inferior solution and thus risking data loss and increasing the workload on our developers because our HR department gets more resumes if we search for an in-house database specialist".
Postgres usually does its thing and does it well. You usually don't get weirdness that requires you to even hire a specialist. And if you do, the consultants around are great and actually understand the problem.
As other people here have said: If you find someone listing PostgreSQL experience on their resume, you know they have the experience. If you find somebody listing MySQL, you'd have to check whether they are just listing it because they have heard of it or whether they really understand the problem domain.
So. DB specialists are settled. What about developers? PostgreSQL and MySQL are close enough in what SQL constructs they support (no wonder - MySQL provides a subset of SQL anyways), so coming from MySQL to PostgreSQL usually is trivial and the few things that are unsupported unix_timestamp() for example, you can write wrappers for if you really need them.
Of course, developers will at first find it strange that they can't, for example insert a 200 characters string into a varchar(10) column, but even the most braindead developer sooner or later will understand that silently truncating data usually is a bad thing.
No. The arguments listed in that article are just fear of the unknown and total non-issues (in german, there's this saying: "Was der Bauer nicht kennt, das frisst er nicht" - ever so true).
I'm glad I can work with PostgreSQL day-in, day-out. I'm glad I migrated that huge application from MySQL to Postgres back in 2000. I'm glad I did not touch MySQL for serious stuff ever since then.
I understand if you need to use MySQL if you want to run some software that runs only on MySQL. But if you are writing your own application, there's just no reason not to got with the real RDBMS.
I'm glad I migrated that huge application from MySQL to
Postgres back in 2000. I'm glad I did not touch MySQL for
serious stuff ever since then.
You should refrain from commenting on MySQL then. It's amazing (and sad) how many pro-PG-anti-MySQL folks don't even know basic things about MySQL but feel entitled to rant and call it "toy DB", "not real RDBMS", etc.
A few bugs/limitations of Mysql not shared by Postgres. All AFAIK.
* Integer columns with null constraints will silently translate nulls to 0s, rather than throwing an error (as constraints in all other contexts do). Imagine a database which, on violation of a foreign key constraint, just silently picked the first row in the target table - it's just insane.
* Transactions can't roll back schema changes, so if you have a Rails migration which fails on the application of a unique index in the midst of other operations, too bad. Your DB is stuck in the no-mans-land between migrations.
* Constraints are anemic relative to postgres: no integer range or string length constraints, no whitelist, blacklist, regex match, &c. &c.
There are certainly others I've run into, but the above indicate to me that the Postgres developers are working with/delivering a better code-base.
That's a feature of postgres rather than a limitation of mysql however. As far as I know the SQL Standard doesn't ask for transactional DDL, Oracle has pretty much no support for transactional DDL (in 10g anyway, each DDL statement is executed in its own transaction and any pending transaction is COMMITted before a DDL), I have no idea about SQL Server.
Is quora flooding their own site with self-posed questions trying to get publicity and rankings? The only time I've ever heard of it is from the oxygen of publicity HN seems to give it.
Nah. You can usually (but not always) tell who asks the question by checking the first person who's following the question, or by checking the question history by clicking on the timestamp.
It's interesting to see how PostgresSQL didn't get nowhere near the popularity of MySQL. I think a few major things contributed to this:
1) MySQL focused on performance first while PostgresSQL positioned themselves as "feature-complete."
Of course, that was years ago and in the meantime they became comparable in both aspects. MySQL+InnoDB is ACID compliant and Postgres matches the InnoDB performance.
Sure, Postgres might still have the upper hand in the variety of features but MySQL has MyISAM. If you don't need transactions you can get a significant performance boost over Postgres. Speed IS a feature and one could argue it's more important to many than all the nice extra things Postgres has.
2) MySQL was compatible with mSQL so they got a pretty good user base right from the start. mSQL later became obsolete.
3) Inertia. Even if Postgres would be net superior to MySQL (think feature complete but faster than MyISAM) it would still take a lot of time to catch up, if ever. There are tons of tools already developed for MySQL and a huge knowledge base around it. Just like C++, Windows, or any other big technology, there's a lot of inertia to it.
I think having a "launch" at the right time might have had some impact also. MySQL launched in 1995, right around the time that the idea of generating hypertext dynamically was taking off, so a bunch of web stuff was looking for a lightweight, fast DB without necessarily a lot of features. Postgres had been around since the 1980s as a research project, and in 1994 was publicly released as an open-source project and added SQL support (rather than its own query language), but it was sort of off the internet's radar as a research project mostly known in academia. The official launch of the rebranded PostgreSQL was in 1997, but that might've been a bit late already, since 1995-1997 were a fairly crucial two years for the LAMP stack's birth.
MySQL being easier to install on Windows from 1995 to 2005 might also have had something to do with it. PostreSQL had to be run in Cygwin until release 8.0 in 2005.
Most of MySQL's momentum I'd say would be because Slashdot used it. Because of course, a database that is (only) suitable for a scenario where reads outnumber writes a few thousand to one, and a comment getting lost is actually no big deal is obviously suitable for your OLTP app...
There are a number of companies that provide support for PostgreSQL. Enterprise DB is probably the most prominent of them. Kind of funny that they didn't have a good technical reason not to use PostgreSQL, just that the community was smaller.
Postgres 9, with streaming replication and easy to setup bullet proof read only slave replication will probably move momentum in the direction of PostgreSQL. The replication was always easier to setup on MySQL, which gave it an advantage.
This article is an example of choosing easy option over the better option. From a startup's perspective, it's completely understandable. Thousands of cheap MySQL admins, or a few hundred expensive PostgreSQL admins?
As for the "nobody uses PostgreSQL" argument; Facebook and Twitter aren't an endorsement of MySQL because any Linux hosting provider out there will have MySQL installed BY DEFAULT. PostgreSQL would require you to upgrade to a more expensive account (for root access) and install it yourself. As a result, all the hobbyists learn on the free and always available MySQL when building "The Next Big Thing." Remember, Facebook was started in a college dorm room.
Those who commented in favor of postgres: What are some of the biggest deployments of postgres? Facebook has a gargantic deployment of MySQL and it just works. The author of the answer on Quora mentions reliable scalability as the most important factor for their MySQL choice on another answer here: http://bit.ly/dm6HtQ.
What is the biggest system that postgres is deployed in?
Yahoo builds two petabyte PostgreSQL database
James Hamilton writes about Yahoo's "over 2 petabyte repository of user click stream and context data with an update rate for 24 billion events per day".
It apparently is built on top of a modified version of PostgreSQL and runs on about 1k machines. In his post, James speculates on the details of the internals. Very interesting.
A better question would be: who uses MySQL as anything but a glorified key-value store when reaching large sizes? I personally have been involved in trying to use MySQL as a "real database" in the 10-100TB range and let me tell you, it's not pretty. I'm not sure about the open source PostgreSQL, but I know Greenplum has petabyte level warehouses running on a distributed version of it.
A few observations:
1) MyISAM's performance is highly dependent on certain idiosyncrasies of a lot of applications. Using MyISAM in this day and age is a very bad idea. InnoDB at least gets closer to real database behavior.
2) The "query optimizer" is insulting at best and actively impeding getting things right if you use it for much more than simple queries. Something that's more along the lines of what really large databases (as opposed to KV stores) get used for can implode the server.
Personally, I think too many people try and stick things in relational databases that don't belong there simply because they've got the hammer in their hand and it's easier than pulling out a screwdriver.
Database abstraction isn't, or at least isn't practical.
Small projects don't need an interchangeable database. Bigger projects won't let you get away without making assumptions about the database. Nothing but the basics work the same between SQLite and ORACLE.
Depends on your abstraction system. If we're talking about SQL, the incompatibilities abound. And ORM's or other libraries tend to be either very, very complex or tend to implement everything in the most simple way. Both not very good for performance. Never mind that most frameworks don't have the same level of support for every DB (open source libraries tend to have very limited Oracle/SQL Server support)
And this is only talking about queries, once we get to triggers, constraints or even stored procedures…
[+] [-] tdavis|15 years ago|reply
Is it harder to find tutorials for PostgreSQL? Yeah, maybe. Is it harder to get high-quality advice and resources from subject matter experts? No; in fact, it's far easier. I have used PostgreSQL for many years and in all of them every question I've had has been answered quickly and properly by somebody in #postgres who knows way more about postgres and SQL than I'd ever even want to know.
And what about hires? When a million people "know" MySQL it takes a lot more effort to find somebody who actually does. There may be fewer contractors for PostgreSQL but they are practically by definition better than the average MySQL contractor.
unlike learning a programming language, with databases, the domain knowledge is harder to discover and more risky if you don't understand it
Exactly. So should you trust the ecosystem that relies on lazy, bandwagon developers for the majority of its market share, or the one that has been silently doing shit the right way for years? There may be a whole lot of really skilled MySQL admins out there (I guess there must be given how many huge sites go to extreme lengths to make MySQL work for them), but I can find you a handful of grossly overqualified PostgreSQL admins in no time. The same can't be said for the former.
[+] [-] bdr|15 years ago|reply
[+] [-] catch23|15 years ago|reply
[+] [-] astrofinch|15 years ago|reply
- Programming languages have libraries
- Frameworks have plugins
- Operating systems have compatibility issues
The best code is the code you don't have to write.
[+] [-] 146|15 years ago|reply
0/ How would I be able to tell if the limitations of our DB are because of MySQL and would be solved by PostgreSQL? That is, common pain points in MySQL that we wouldn't recognize immediately until we switched? (Edit: mostly looking for performance-related, but anything is fine.)
1/ Is there a PostgreSQL consulting firm that has the same reputation that Percona does with MySQL?
2/ Is there a good place to start when there's an issue or question I have with PostgreSQL (aside from IRC)? With MySQL I've been either going to the docs, or to Google, but it's been spotty with either.
[+] [-] dangelo|15 years ago|reply
http://www.quora.com/How-reliable-is-Postgres-replication
that's one of the biggest issues scaring people away from Postgres.
[+] [-] lzimm|15 years ago|reply
[+] [-] pilif|15 years ago|reply
Postgres usually does its thing and does it well. You usually don't get weirdness that requires you to even hire a specialist. And if you do, the consultants around are great and actually understand the problem.
As other people here have said: If you find someone listing PostgreSQL experience on their resume, you know they have the experience. If you find somebody listing MySQL, you'd have to check whether they are just listing it because they have heard of it or whether they really understand the problem domain.
So. DB specialists are settled. What about developers? PostgreSQL and MySQL are close enough in what SQL constructs they support (no wonder - MySQL provides a subset of SQL anyways), so coming from MySQL to PostgreSQL usually is trivial and the few things that are unsupported unix_timestamp() for example, you can write wrappers for if you really need them.
Of course, developers will at first find it strange that they can't, for example insert a 200 characters string into a varchar(10) column, but even the most braindead developer sooner or later will understand that silently truncating data usually is a bad thing.
No. The arguments listed in that article are just fear of the unknown and total non-issues (in german, there's this saying: "Was der Bauer nicht kennt, das frisst er nicht" - ever so true).
I'm glad I can work with PostgreSQL day-in, day-out. I'm glad I migrated that huge application from MySQL to Postgres back in 2000. I'm glad I did not touch MySQL for serious stuff ever since then.
I understand if you need to use MySQL if you want to run some software that runs only on MySQL. But if you are writing your own application, there's just no reason not to got with the real RDBMS.
/rant.
[+] [-] rimantas|15 years ago|reply
[+] [-] Empact|15 years ago|reply
* Integer columns with null constraints will silently translate nulls to 0s, rather than throwing an error (as constraints in all other contexts do). Imagine a database which, on violation of a foreign key constraint, just silently picked the first row in the target table - it's just insane.
* Transactions can't roll back schema changes, so if you have a Rails migration which fails on the application of a unique index in the midst of other operations, too bad. Your DB is stuck in the no-mans-land between migrations.
* Constraints are anemic relative to postgres: no integer range or string length constraints, no whitelist, blacklist, regex match, &c. &c.
There are certainly others I've run into, but the above indicate to me that the Postgres developers are working with/delivering a better code-base.
[+] [-] masklinn|15 years ago|reply
That's a feature of postgres rather than a limitation of mysql however. As far as I know the SQL Standard doesn't ask for transactional DDL, Oracle has pretty much no support for transactional DDL (in 10g anyway, each DDL statement is executed in its own transaction and any pending transaction is COMMITted before a DDL), I have no idea about SQL Server.
[+] [-] weego|15 years ago|reply
[+] [-] 146|15 years ago|reply
[+] [-] scg|15 years ago|reply
1) MySQL focused on performance first while PostgresSQL positioned themselves as "feature-complete."
Of course, that was years ago and in the meantime they became comparable in both aspects. MySQL+InnoDB is ACID compliant and Postgres matches the InnoDB performance.
Sure, Postgres might still have the upper hand in the variety of features but MySQL has MyISAM. If you don't need transactions you can get a significant performance boost over Postgres. Speed IS a feature and one could argue it's more important to many than all the nice extra things Postgres has.
2) MySQL was compatible with mSQL so they got a pretty good user base right from the start. mSQL later became obsolete.
3) Inertia. Even if Postgres would be net superior to MySQL (think feature complete but faster than MyISAM) it would still take a lot of time to catch up, if ever. There are tons of tools already developed for MySQL and a huge knowledge base around it. Just like C++, Windows, or any other big technology, there's a lot of inertia to it.
[+] [-] _delirium|15 years ago|reply
[+] [-] mncaudill|15 years ago|reply
[+] [-] jap|15 years ago|reply
[+] [-] gaius|15 years ago|reply
[+] [-] dman|15 years ago|reply
[+] [-] zende|15 years ago|reply
HN must have stripped out the ' from the link when I submitted it
[+] [-] narrator|15 years ago|reply
Postgres 9, with streaming replication and easy to setup bullet proof read only slave replication will probably move momentum in the direction of PostgreSQL. The replication was always easier to setup on MySQL, which gave it an advantage.
[+] [-] efsavage|15 years ago|reply
(not that there aren't a fair number of postgres experts, but there's no question there are far more competent mysql experts)
[+] [-] masklinn|15 years ago|reply
[+] [-] jinushaun|15 years ago|reply
As for the "nobody uses PostgreSQL" argument; Facebook and Twitter aren't an endorsement of MySQL because any Linux hosting provider out there will have MySQL installed BY DEFAULT. PostgreSQL would require you to upgrade to a more expensive account (for root access) and install it yourself. As a result, all the hobbyists learn on the free and always available MySQL when building "The Next Big Thing." Remember, Facebook was started in a college dorm room.
[+] [-] blueberry|15 years ago|reply
[+] [-] vark|15 years ago|reply
Yahoo builds two petabyte PostgreSQL database James Hamilton writes about Yahoo's "over 2 petabyte repository of user click stream and context data with an update rate for 24 billion events per day".
It apparently is built on top of a modified version of PostgreSQL and runs on about 1k machines. In his post, James speculates on the details of the internals. Very interesting.
[+] [-] petrilli|15 years ago|reply
A few observations:
1) MyISAM's performance is highly dependent on certain idiosyncrasies of a lot of applications. Using MyISAM in this day and age is a very bad idea. InnoDB at least gets closer to real database behavior.
2) The "query optimizer" is insulting at best and actively impeding getting things right if you use it for much more than simple queries. Something that's more along the lines of what really large databases (as opposed to KV stores) get used for can implode the server.
Personally, I think too many people try and stick things in relational databases that don't belong there simply because they've got the hammer in their hand and it's easier than pulling out a screwdriver.
[+] [-] jeltz|15 years ago|reply
Skype uses PostgreSQL for their VOIP services.
Yahoo! has a customized PostgeSQL for their data warehousing storing a couple of petabytes of data.
I believe IMDB is another quite prominent user.
[+] [-] code_duck|15 years ago|reply
[+] [-] scg|15 years ago|reply
Small projects don't need an interchangeable database. Bigger projects won't let you get away without making assumptions about the database. Nothing but the basics work the same between SQLite and ORACLE.
[+] [-] mhd|15 years ago|reply
And this is only talking about queries, once we get to triggers, constraints or even stored procedures…
[+] [-] drivingmenuts|15 years ago|reply
[+] [-] unknown|15 years ago|reply
[deleted]
[+] [-] joshu|15 years ago|reply
specifically, interviewees that inform you you should be using postgresql can be immediately rejected.
"do you use mysql or postgresql?" "oh, mys-" "YOU SHOULD SWITCH TO POSTGRESQL!!"
[+] [-] andrewvc|15 years ago|reply
[+] [-] joshu|15 years ago|reply