I'm using the AWS stack for http://www.soundslice.com/ and I've been using MySQL instead of Postgres, purely because my hatred for MySQL is less than my hatred for being a sysadmin. It was a tradeoff, and I miss Postgres dearly every time I use MySQL.
Yea I was using MySQL on RDS just to escape the Heroku monolith. Funny that Heroku announced a new Postgres pricing model that is VERY expensive compared to old setup (https://news.ycombinator.com/item?id=6712570) just in time to be wiped off the map by RDS.
I've heard about PostgreSQL and know that HN community raves about it, but am currently using RDS with MySQL.
Does it make sense to migrate to PostgreSQL, I don't have a lot of data as I'm in the early stage?
What are the primary advantages that PostgreSQL provides over MySQL?
Any advise/pointers is appreciated.
Perhaps you mistakenly insert "2013-10-32" into a date column. MySQL will silently convert this to "0000-00-00" (!!). Postgres will raise an error.
Perhaps you make an error in a transaction. MySQL lets you keep doing subsequent things in the transaction. Postgres treats the transaction as invalid and forces you to start over.
Of course, there are things you can do to make MySQL less horrible, and this is a generalization. But Postgres is just more respectful and more solid.
Oh, and PostGIS (Postgres' geo add-on) is by far the best open-source geospatial database. If you're doing anything with geographic queries, you need to be using it. MySQL's stuff is laughable in comparison.
Context: I've dealt extensively with both databases, both from the perspective of a framework author (Django) and a developer making products. I've used both databases on and off since 2001.
I think the biggest advantages are trust and flexibility.
Trust, because PostgreSQL language semantics are cleaner, closer to the SQL standard, and less likely to surprise you. And postgresql just has a good reputation for traditional engineering quality.
Flexibility, because it offers a lot of APIs and features that can be very useful to adapt your application as needs change. You don't have to go crazy with features, but even simple apps can benefit a lot from prudent use of them -- a trigger here, a foreign table there, or LISTEN/NOTIFY (for cache invalidation) can just save you a huge amount of work and make the system more robust overall. The extension mechanism is very powerful.
Before making any big decisions, do a trial migration and see what you think.
- More robust, fewer crashes, less corruption of data
- More features (JSON data type, partial indexes, function/expression indexes, window functions, CTEs, hstore, ranges/sequences/sets, too many to list)
- More disciplined (doesn't do things like auto-truncate input to get it to fit into a column)
- Not owned by Oracle, it's actively developed, regular major release schedule, etc.
- Better Python driver (don't know about other languages)
- Choice of languages for database functions/procedures (Python, JS, etc.)
The basic answer is (from what I can tell) that postgres ships with sane defaults. Lots of little gotchas exist in mysql that experienced mysql dbas know to deal with and avoid. On the more sophisticated side, pgSQL seems to have a focus on being "Really Awesome For Experts" where MySQL seems to be focusing on "Being Easy to Get Started".
I've been using pgSQL for my side project and for my "personal tooling" at work, and I can honestly say that it's just as easy for me as MySQL was for the same sort of things.
> Does it make sense to migrate to PostgreSQL, I don't have a lot of data as I'm in the early stage? What are the primary advantages that PostgreSQL provides over MySQL? Any advise/pointers is appreciated.
If MySQL works for you, there is no reason to change. Some people have quite a dependency on Postgres (hstore, JSON, transactional DDL, pubsub, PLPGSQL etc.)
Just DDL transaction make life so much easier but there is much more like Materialized Views, Advanced Data type, Efficient JOIN, pg_stat_statements, PL, full text search, GIS features (pretty useful to do near by query), fuzzy search, ISO SQL standard compliant, Piece of cake replication.
Do some search and give it a try one weekend.. Trying it is loving it.
Hating to pile on, but doing so anyways, one of the overlooked things I'd argue is that postgresql tries much harder to let you do neat things with your data without much hassle. Needing to write a custom function to process data? Not only does PostgreSQL have support for PL/PgSQL, but it's made in a way that has let people write plugins to python, perl, v8, tcl, etc, so you can do that data transformation on the database instead of shipping it back and forth to another server with the latency that brings. The recent version also brings read/write support to redis, so you have have your updates trigger redis to store new values, etc. Plus, being that it's not controlled by Oracle, there's much less a feeling of the community being held back; MySQL feels very much like Oracle's tying one hand back, so that when you get to the size to need a "real" database, Oracle will gladly step in, at their standard consulting fees, of course.
I have but one suggestion. Unless RDS for PostgreSQL is drastically different from RDS for MySQL, you still need a DB administrator.
RDS removes remarkably little of the pain of running a database instance (most of the pain that's removed is just the up front setup), and ends up adding a lot of inconveniences for your day-to-day operations.
Also don't count on their replication as your backup.
> Also don't count on their replication as your backup.
This statement seems like a complete non-sequitur: does anyone credible recommend replication as a backup strategy? It's like dinging a server vendor because you can't rely on RAID as a backup plan.
Any link to a page providing more info on those quircks ? I'm deciding between a traditionnal ec2 instance running postgres or rds at the very precise moment !
I think this is great news... PostgreSQL is definitively my favorite open-source database. It's also nice to see Amazon get into the game, as hosted pg options have been fairly limited. I am slightly disappointed to not see the server-side JS support baked in, and that apparently you can't do reads from distributed replicas. Just the same, I think there will be a lot of progress in this area.
Administering databases is a full-time type of responsibility. Yes, you can get pretty sane defaults, and up and running without much difficulty with MS-SQL, and mySQL has been a defacto standard in the LAMP stack. That said PostreSQL has been a rock solid RDBMS. The commercial extensions for replication have been cumbersome and expensive. Here's hoping that AWS will grow/expand the replications options/features, and that they'll grow to include JS procs as that feature stabilizes.
It looks like the Multi-AZ setup is using block level replication such as DRBD instead of the built-in replication:
> Database updates are made concurrently on the primary and standby resources to prevent replication lag.
Makes me feels better for setting up my own pg cluster on EC2 a week ago, which does allow reads from the replication slave. Plus, I can provision <1000 IOPS (provisioned IOPS is damn expensive with AWS), and get to use ZFS.
From a very quick glance at the small end the Amazon options are a little more than half the price (Ireland single AZ instance pricing and even less reserved) but multi-AZ options are probably very similar.
I don't actually know whether Heroku can failover across an AZ failure.
Yeah I was disappointed to see this.. Can you configure pgpool externally to talk to the replicas or are those multi-az instances a black box? I've never used RDS before so I'm curious as to how much flexibility is afforded.
Hmm, interesting development. But am I the only one to see a huge gap in services between Amazon and Heroku, and one which most certainly bodes bad for Heroku? Specifically, with Heroku's recent PG 2.0 service, their 'production grade' Standard plan ranges from $50 - $3500, but "Up to 1 hr downtime per mo." ?!?1! Huh, really, with up to an hour downtime /month, you couldn't be serious about the product that runs atop this tier - a non starter me any most other SaaS businesses. Heroku's cheapest "high availability option starts at $200/month, still with "Up to 15 minutes of downtime per month."... still, this is a concern for me...
Now, comparing to Amazon, their '1.7 GB memory Small DB, 1-year reserved, multi-region' is around $28/month (with storage & transfer for my app no more $35/ month). The equivalent Heroku plan, Tengu (1.7 GB mem) STARTS at $350/month!!! Wow, not I'm really rethinking my hosting platform.... Amazon looks more attractive, even if I have to do a bit of sys admin for my web server/cloud server.
Heroku will still be my go-to for quick tests and experiments. But for larger projects it would be foolish to not consider PostgresRDS + Docker on AWS. *assuming they both leave beta.
this is awesome!
still, i miss one essential feature - compared to mysql on aws and heroku/postgres: there is no replication for read replicas. yes, you can deploy a "hot standby" replica in another availability zone for failover, but you cant read from it.
[+] [-] adrianh|12 years ago|reply
I'm using the AWS stack for http://www.soundslice.com/ and I've been using MySQL instead of Postgres, purely because my hatred for MySQL is less than my hatred for being a sysadmin. It was a tradeoff, and I miss Postgres dearly every time I use MySQL.
This new Amazon offering solves that.
I wrote a little more about my AWS setup here: http://www.holovaty.com/writing/aws-notes/
[+] [-] fat0wl|12 years ago|reply
[+] [-] sync|12 years ago|reply
FYI, it doesn't look your payment modal works on smaller browser windows: https://www.monosnap.com/image/UvJbxMQEwkzqLH5btNJ7a9G6Q ... no visible pay button, and the modal itself scrolls when you scroll the page.
[+] [-] mmcclellan|12 years ago|reply
[+] [-] pc86|12 years ago|reply
[+] [-] coolrhymes|12 years ago|reply
[+] [-] saltyknuckles|12 years ago|reply
[+] [-] yumraj|12 years ago|reply
Does it make sense to migrate to PostgreSQL, I don't have a lot of data as I'm in the early stage? What are the primary advantages that PostgreSQL provides over MySQL? Any advise/pointers is appreciated.
[+] [-] adrianh|12 years ago|reply
Perhaps you mistakenly insert "2013-10-32" into a date column. MySQL will silently convert this to "0000-00-00" (!!). Postgres will raise an error.
Perhaps you make an error in a transaction. MySQL lets you keep doing subsequent things in the transaction. Postgres treats the transaction as invalid and forces you to start over.
Perhaps you want to add a column to a table that has millions of rows. With MySQL, you'll be waiting a looooong time (see http://stackoverflow.com/questions/463677/alter-table-withou...). With Postgres, it takes about a second.
Of course, there are things you can do to make MySQL less horrible, and this is a generalization. But Postgres is just more respectful and more solid.
Oh, and PostGIS (Postgres' geo add-on) is by far the best open-source geospatial database. If you're doing anything with geographic queries, you need to be using it. MySQL's stuff is laughable in comparison.
Context: I've dealt extensively with both databases, both from the perspective of a framework author (Django) and a developer making products. I've used both databases on and off since 2001.
[+] [-] jeffdavis|12 years ago|reply
I think the biggest advantages are trust and flexibility.
Trust, because PostgreSQL language semantics are cleaner, closer to the SQL standard, and less likely to surprise you. And postgresql just has a good reputation for traditional engineering quality.
Flexibility, because it offers a lot of APIs and features that can be very useful to adapt your application as needs change. You don't have to go crazy with features, but even simple apps can benefit a lot from prudent use of them -- a trigger here, a foreign table there, or LISTEN/NOTIFY (for cache invalidation) can just save you a huge amount of work and make the system more robust overall. The extension mechanism is very powerful.
Before making any big decisions, do a trial migration and see what you think.
[+] [-] gdulli|12 years ago|reply
- More robust, fewer crashes, less corruption of data
- More features (JSON data type, partial indexes, function/expression indexes, window functions, CTEs, hstore, ranges/sequences/sets, too many to list)
- More disciplined (doesn't do things like auto-truncate input to get it to fit into a column)
- Not owned by Oracle, it's actively developed, regular major release schedule, etc.
- Better Python driver (don't know about other languages)
- Choice of languages for database functions/procedures (Python, JS, etc.)
- Better partitioning support
- Better explain output, explain analyze, buffers
- Multiple indexes allowed per table in a query
[+] [-] pnathan|12 years ago|reply
The basic answer is (from what I can tell) that postgres ships with sane defaults. Lots of little gotchas exist in mysql that experienced mysql dbas know to deal with and avoid. On the more sophisticated side, pgSQL seems to have a focus on being "Really Awesome For Experts" where MySQL seems to be focusing on "Being Easy to Get Started".
I've been using pgSQL for my side project and for my "personal tooling" at work, and I can honestly say that it's just as easy for me as MySQL was for the same sort of things.
[+] [-] the_mitsuhiko|12 years ago|reply
If MySQL works for you, there is no reason to change. Some people have quite a dependency on Postgres (hstore, JSON, transactional DDL, pubsub, PLPGSQL etc.)
[+] [-] rachbelaid|12 years ago|reply
Do some search and give it a try one weekend.. Trying it is loving it.
[+] [-] Sanddancer|12 years ago|reply
[+] [-] rektide|12 years ago|reply
[+] [-] unknown|12 years ago|reply
[deleted]
[+] [-] falcolas|12 years ago|reply
RDS removes remarkably little of the pain of running a database instance (most of the pain that's removed is just the up front setup), and ends up adding a lot of inconveniences for your day-to-day operations.
Also don't count on their replication as your backup.
OK. That's two suggestions, but I think it's OK.
[+] [-] acdha|12 years ago|reply
This statement seems like a complete non-sequitur: does anyone credible recommend replication as a backup strategy? It's like dinging a server vendor because you can't rely on RAID as a backup plan.
[+] [-] bsaul|12 years ago|reply
[+] [-] cbsmith|12 years ago|reply
[+] [-] tracker1|12 years ago|reply
Administering databases is a full-time type of responsibility. Yes, you can get pretty sane defaults, and up and running without much difficulty with MS-SQL, and mySQL has been a defacto standard in the LAMP stack. That said PostreSQL has been a rock solid RDBMS. The commercial extensions for replication have been cumbersome and expensive. Here's hoping that AWS will grow/expand the replications options/features, and that they'll grow to include JS procs as that feature stabilizes.
[+] [-] j-kidd|12 years ago|reply
> Database updates are made concurrently on the primary and standby resources to prevent replication lag.
Makes me feels better for setting up my own pg cluster on EC2 a week ago, which does allow reads from the replication slave. Plus, I can provision <1000 IOPS (provisioned IOPS is damn expensive with AWS), and get to use ZFS.
[+] [-] frakkingcylons|12 years ago|reply
[+] [-] josephlord|12 years ago|reply
I don't actually know whether Heroku can failover across an AZ failure.
[+] [-] thinkalone|12 years ago|reply
[+] [-] rachbelaid|12 years ago|reply
Not all Pl are available, and it misses the PL/V8 and PL/Python at least.
And it seems that all fdw (Foreign Data Wrapper) extensions are missing.
But it's a great start, I'm looking forward to try.
If anybody know if we can still access the WAL log then it will be very useful
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_P...
[+] [-] aschreyer|12 years ago|reply
EDIT: you are right, PL/Python is supported as well. I only read the "Language Extensions :PL/Perl, PL/pgSQL, PL/Tcl" part at the top.
[+] [-] elchief|12 years ago|reply
[+] [-] onedognight|12 years ago|reply
[+] [-] skyebook|12 years ago|reply
[+] [-] scosman|12 years ago|reply
[+] [-] ptio|12 years ago|reply
Here is a related whitepaper if you still want to setup PostgreSQL yourself: http://media.amazonwebservices.com/AWS_RDBMS_PostgreSQL.pdf
[+] [-] revetkn|12 years ago|reply
Edit: It does, see http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_P...
[+] [-] noviceapp|12 years ago|reply
Now, comparing to Amazon, their '1.7 GB memory Small DB, 1-year reserved, multi-region' is around $28/month (with storage & transfer for my app no more $35/ month). The equivalent Heroku plan, Tengu (1.7 GB mem) STARTS at $350/month!!! Wow, not I'm really rethinking my hosting platform.... Amazon looks more attractive, even if I have to do a bit of sys admin for my web server/cloud server.
[+] [-] frankwiles|12 years ago|reply
[+] [-] tzaman|12 years ago|reply
[+] [-] mike_herrera|12 years ago|reply
[+] [-] jakozaur|12 years ago|reply
[+] [-] caioariede|12 years ago|reply
I expect this helps you.
[1] https://pypi.python.org/pypi/py-mysql2pgsql/0.1.5
[+] [-] MrGando|12 years ago|reply
[+] [-] runako|12 years ago|reply
[+] [-] michaelbuckbee|12 years ago|reply
[+] [-] NatW|12 years ago|reply
If/when a Heroku RDS plugin for Postgresql arrives, competing benchmarks, a cost calculator, pros and cons would be very interesting, indeed.
Heroku pricing: https://addons.heroku.com/heroku-postgresql
Amazon pricing: http://aws.amazon.com/rds/postgresql/
[+] [-] eonil|12 years ago|reply
Though it was free test version, I surprised to see my database is living together with many other people. (`\l`)
Your database is available publicly, and you have only minimal security.
[+] [-] jeffdavis|12 years ago|reply
[+] [-] philliphaydon|12 years ago|reply
This is awesome news! I hope we can move to it at work!
[+] [-] bmaeser|12 years ago|reply
[+] [-] simonebrunozzi|12 years ago|reply
[+] [-] spwestwood|12 years ago|reply