Like every year before, the Postgres team has blessed us with an early Christmas present. And like every release post before, I'd like to use this opportunity to thank the team for the awesome job they are doing year after year.
It's not just the database itself (and that's awesome in its own right), but also all the peripheral stuff: the documentation is seriously amazing and very complete, and the tools that come with the database are really good too (like psql, which I still prefer to the various UIs out there).
Code-wise, I would recommend that anybody have a look at their git repo and the way they write commit messages: they are a pleasure to read and really explain what's going on. If everybody wrote commit messages like this, we'd be in a much better place where code archaeology is concerned.
Patches from the community are always patiently reviewed and, contrary to many other projects, even new contributors aren't required to have a thick skin or a flame-retardant suit. The only thing required is a lot of patience, as the level of quality required for a patch to go in is very, very high.
Finally, there's #postgresql on Freenode where core developers spend their time patiently helping people in need of support. Some questions could be solved by spending 30 seconds in the (as I said: excellent) manual and some of them point to really obscure issues, but no matter what time it is: Somebody in #postgresql is there to help you.
I think there's no other free software project out there that just gets everything right: a very friendly community, awesome documentation, awesome tools, and of course an awesome product to begin with.
Huge thanks to everybody involved.
Also: Huge YAY for jsonb - I have many, many things in mind I can use that for and I have been looking forward to this for a year now.
> we'd be in a much better place where code archaeology is concerned.
This sounds like a great setup for a sci-fi novel. 500 years into the future, the infrastructure their distant ancestors coded has begun to fail. Now Biff Miffington, code-archaeologist, must sift through millions of forgotten messages using a mysterious tool remembered only as "git." Its interface is arcane and the remaining messages broken, tainted by the destructive influence of Mountain Dew and Cheetos. Will he unravel the mystery that's causing Candy Crush Saga MCCXXXI to kill its users?
Edit: On what OP actually said, I'd also like to say that Postgres is an awesome product.
In addition to those peripheral benefits there's also the intangibles. Mainly that worrying about database stability is just not part of my life the way it was five years ago before I'd fully migrated to PostgreSQL.
Well said, @pilif. I've been using PostgreSQL since the early 6.x days as the back end to web sites. As time went on, I developed an Accounting & ERP software suite in C++ with a thin wrapper around libpq. We've pretty much standardized on PostgreSQL for the things you just said: "friendly community, awesome documentation, awesome tools, and of course an awesome product...."
I literally have a multi-million dollar business & product because of it. And without it, we would be stuck with Oracle or MS-SQL.
And we just recently added full text search within our product using the PostgreSQL full text add-on. My customers absolutely love the feature and they love us because of this.
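For anyone curious what that looks like, here's a minimal sketch of Postgres full text search (the `invoices` table and its columns are hypothetical, not from the parent comment):

```sql
-- to_tsvector/to_tsquery are the built-in full text search primitives
SELECT id, description
FROM invoices
WHERE to_tsvector('english', description)
      @@ to_tsquery('english', 'overdue & payment');

-- a GIN expression index keeps this fast on larger tables
CREATE INDEX invoices_fts_idx ON invoices
  USING gin (to_tsvector('english', description));
```

The expression index matters: without it, every query re-parses every row's text.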
+1 for #postgresql help. Recently I had a really weird problem, and after a few hours with forums and manuals I went there to ask for help. They saved me I don't know how many more hours; I would almost certainly never have found the completely unexpected answer on my own. Thanks!
Since you're planning on doing some conceptual work with JSONB, just a heads-up on one gotcha: duplicate keys are not preserved - jsonb keeps only the last value for a given key. I.e.:

{
    "task": "do stuff",
    "task": "do other stuff"
}

Duplicate keys are sometimes useful when you have front-end data with N entries but a form that serializes to an object instead of an array. There are other use cases too.
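To make the gotcha concrete, here's how the two types treat that input (behavior as of 9.4):

```sql
-- json stores the input text verbatim, duplicates and all
SELECT '{"task": "do stuff", "task": "do other stuff"}'::json;
-- => {"task": "do stuff", "task": "do other stuff"}

-- jsonb parses to a binary form and keeps only the last value per key
SELECT '{"task": "do stuff", "task": "do other stuff"}'::jsonb;
-- => {"task": "do other stuff"}
```

So if your serialized form data relies on repeated keys, cast to jsonb loses data silently.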
Just want to say to the PostgreSQL and EnterpriseDB guys that it's always great to see the progress on this. My hopes for 9.5/10 are that we will see PLV8 and replication baked into the actual release.
PLV8 is such a natural fit with the new JSON(B) types that it's probably going to become the most used extension with that data type... And imho sorely missing from the out of the box experience. I'm glad that they've concentrated on getting the data structure and storage right first. Hopefully we'll see this in vNext.
As to replication, I understand that this is part of EnterpriseDB's business model; all the same, not having the basic replication pieces baked in leaves Postgres lacking compared to other databases. Even if the graphical tooling were commercial-only, and all the knob-frobbing via config or command line stayed more complex, having it in the box is a must imho. I actually really like how MongoDB handles their replica sets, and where RethinkDB is going with this as well. Though those aren't primarily transactional SQL databases, it's a must-have feature these days. Replication with automagic failover is a feature that has moved past enterprise-only.
One last piece would be built-in functions similar to the String.prototype.normalize that was added to JavaScript, so that input strings could be normalized more easily for comparison/indexing - though PLV8 support could/would bring this readily.
All the same, thanks for all of your hard work, and I look forward to the future of PostgreSQL.
Yes, replication with auto failover is far from enterprise-only today. If you pay $5/month you can get a SQL database in Azure running on 3 nodes with auto failover: one replica is updated synchronously and the second asynchronously. I wouldn't choose a database today unless setting up something like that is obvious and trivial.
JSONB is getting a lot of attention (and deservedly so), but logical decoding is much more exciting to me. Being able to capture Postgres transactions and put them into a durable log (like Kafka) for downstream processing is a fundamental tool needed to build a unified logging architecture. If you've worked with Hadoop you've probably tried to approximate this by hand by taking regular snapshots of your database or something, but this is much, much saner. Really exciting. Great work Postgres team!
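A minimal sketch of what logical decoding looks like with the test_decoding example plugin that ships with 9.4 (requires wal_level = logical and a free replication slot; the slot and table names are hypothetical):

```sql
-- create a logical replication slot using the example output plugin
SELECT * FROM pg_create_logical_replication_slot('kafka_feed', 'test_decoding');

-- ...transactions happen...
INSERT INTO events (payload) VALUES ('hello');

-- consume the decoded changes; a real consumer would ship these to Kafka
SELECT data FROM pg_logical_slot_get_changes('kafka_feed', NULL, NULL);
```

A production setup would use a purpose-built output plugin rather than test_decoding's textual format, but the slot mechanics are the same.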
Wow, that's fantastic. My queries are littered with that type of construct. Unfortunately, we are using SQL Server and Microsoft hardly ever makes any developer-friendly enhancements to T-SQL.
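The parent comment isn't quoted here, but this presumably refers to 9.4's new FILTER clause for aggregates, which replaces the old CASE trick (table/column names hypothetical):

```sql
-- before 9.4: conditional aggregation via CASE
SELECT sum(CASE WHEN status = 'error' THEN 1 ELSE 0 END) AS errors
FROM events;

-- 9.4: the SQL-standard FILTER clause, equivalent but far more readable
SELECT count(*) FILTER (WHERE status = 'error') AS errors
FROM events;
```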
Switched over to PostgreSQL for a personal project for the json datatype. Great if you want some mongo-esque document storage without losing out on having a relational database.
That info is for PostgreSQL 9.3 (and 9.2). 9.4 comes with other enhancements if you're using json. jsonb in particular opens up whole new possibilities if you're doing heavier work with json data.
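For example, jsonb values can be indexed for containment queries, which plain json can't do (table/column names hypothetical):

```sql
-- a GIN index over a jsonb column enables fast containment queries
CREATE INDEX docs_doc_idx ON docs USING gin (doc);

-- find documents containing this sub-structure, using the index
SELECT * FROM docs WHERE doc @> '{"tags": ["postgres"]}';
```

With plain json you'd be stuck with sequential scans or hand-rolled expression indexes per field.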
Next year I really need to switch from MSSQL to Postgres. The work that the Postgres team have done in the last 2-3 years is really amazing.
They are also clearly reaping the benefits of some very smart architectural decisions, and that gives me the confidence that they will be able to continue innovating in the coming years.
There is actually a lot of overlap in the feature set. When I moved from MSSQL to a gig that was all Postgres it was fairly smooth to do so, the main differences being OS related. Thus you don't need to switch over entirely to add a useful tool to the toolbox.
MSSQL has some advantages in parallel query execution and data warehousing enhancements - if you can afford the licensing for the latter. The other view on that is that you can spin up as many Postgres production instances as you have hardware for without impacting the software budget.
It looks like PostgreSQL is on track to slowly succeed MySQL as the de-facto open source database.
Microsoft tentatively seems to be settling on them as the preferred RDBMS for non-Windows platforms [1]:
> Within ASP.NET 5 our primary focus is on SQL Server, and then PostgreSQL to support the standard Mac/Linux environment.
I use EF+SQL Server and they're very much complementary and provide an excellent developer experience. NHibernate+SQL Server is woeful unless you want to use the loosely-typed Criteria stuff. NH's LINQ provider is terrible and it gets confused at the drop of a hat (call Distinct and then OrderBy? "I'm sorry Dave, I'm afraid I can't do that"). At this point I'm convinced only MS know how to write LINQ providers that won't fall over the moment you try to do something useful with them.
Microsoft writing a LINQ provider for PgSql is a great thing for running .NET code on non-Windows platforms.

[1] http://blogs.msdn.com/b/adonet/archive/2014/12/02/ef7-priori...
> Microsoft writing a LINQ provider for PgSql is a great thing for running .NET code on non-Windows platforms.
I've heard nothing of this kind with respect to Npgsql, which is the current .NET provider for PostgreSQL. Microsoft has never spent a minute on EF support in Npgsql, so the fact that they'd bother now is a first. The thing is that to support EF in Npgsql (or any other ADO.NET provider), the provider has to contain a command interpreter which interprets the command trees coming from EF's LINQ provider and uses them to create SQL statements. This isn't simple at all, and as the command trees change with EF7, it will be a struggle for MS to get many ADO.NET providers to support EF7 at the start.

Microsoft's only great LINQ provider is the one in Linq to Sql: it is able to handle a tremendous number of edge cases. The thing with LINQ providers is that a general provider only gets you so far: a tremendous number of cases are 'special cases' which need their own path from expression tree to specific SQL. E.g. ctx.A.Select(a=>a.B.Cs); gives a set of sets of C instances. To materialize this, you have to know which C rows belong to which set (you have to group them by B, which isn't in the projection). Linq to Sql has a specific piece of code for this: it produces a specially crafted SQL query containing an extra column so the materializer knows which C rows to group together. EF doesn't; it just gets a big joined soup.

The irony is that EF7's LINQ provider will be built on Relinq, which is also the base of NHibernate's LINQ provider; they didn't re-use the Linq to Sql provider, which is kind of odd considering Linq to Sql's is pretty db-agnostic.
Writing a linq provider isn't simple btw. It took me a full year full time to write the one for LLBLGen Pro.
> At this point I'm convinced only MS know how to write LINQ providers that won't fall over the moment you try to do something useful with them.
I would argue Revenj + Postgres provide a much better developer experience. But as you said, it's not written by Microsoft, so that attitude doesn't help it.

I'm 100% sure Microsoft can't write a LINQ provider which actually understands Postgres and can use it to the fullest (as Revenj can).
I'm really pumped about the update to GIN indexes, and the ability to concurrently update materialized views. Both enhancements are huge for the postgres ecosystem, and especially for productionizing postgres databases.
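Concretely, 9.4 lets you refresh a materialized view without locking out readers, provided the view has a unique index (the view and table names here are hypothetical):

```sql
CREATE MATERIALIZED VIEW daily_stats AS
  SELECT day, count(*) AS hits FROM page_views GROUP BY day;

-- CONCURRENTLY requires a unique index on the materialized view
CREATE UNIQUE INDEX daily_stats_day_idx ON daily_stats (day);

-- readers are no longer blocked while the refresh runs
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_stats;
```

Before 9.4, REFRESH took an exclusive lock, which made materialized views awkward for anything user-facing.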
Postgres is about to be the new hotness. I mentioned to our hosting provider that we were looking into moving our in-house Oracle and MySQL to Postgres (off Oracle because it's expensive, off MySQL 'cos it's shit) and he said more than a few customers were looking into this precise move.
We're just getting into PG now, and it's just really nice to set up and use. I really wish more web stuff properly supported PG and didn't pretty much require MySQL.
Only someone with a poor understanding of the industry would think that. Mongo is killing it in the enterprise, and their partnerships with Teradata et al are only going to cement their position for likely decades to come.
Plus MongoDB's pluggable engine approach will definitely breathe some new life into it.
http://www.percona.com/doc/percona-server/5.5/performance/ha...

http://www.slideshare.net/akirahiguchi/handlersocket-2010062...

It might go against the "no transaction" crowd, but seems useful for performance-critical needs. I'm scheduling a bit of testing time with it next week to see if it's something I'd roll out in production (Maria 10 system).
Great news! I'd love to move over to this from MongoDB for a project that has high uptime requirements. While I think the JSONB support can really replace the document storage, does PG have a solution for high availability (like replica sets) in the works?
I'm newer to Postgres so am not sure. Replica Sets are the killer feature for me, more so than just storing JSON documents. I'd appreciate if someone can chime in. I've done some googling but there seem to be multiple strategies for replication.
Hey there. I gave a talk at PGConfNYC '14 earlier this year about our solution to HA using open source solutions. The animations and stuff don't really work too well on Slideshare, but it should point you in the right direction:

http://www.slideshare.net/TeamARIN/building-a-high-availabil...
pgSQL doesn't have anything built in for fencing, failing-over, etc. by default, but by using stuff like Pacemaker, you can get the job done with a little elbow grease.
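For reference, the streaming-replication pieces that are in the box look roughly like this (9.4-era settings; hosts and credentials are hypothetical) - the fencing and failover orchestration on top is what Pacemaker et al. provide:

```ini
# postgresql.conf on the primary
wal_level = hot_standby
max_wal_senders = 3

# recovery.conf on the standby (its presence puts the server in recovery)
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=replicator password=secret'
```

Postgres handles the WAL shipping and hot standby; deciding *when* to promote a standby, and making sure the old primary stays down, is the part you still assemble yourself.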
There are some open-source options, and several closed source options for this... parts are closer to baked in with 9.4, but being realistic, you need to defer to a commercial option.
EnterpriseDB pricing for this isn't too bad (about $7k/cpu-socket/year), which is a lot less than MS-SQL, DB2 or Oracle for most uses... but it's imho a feature that should be in the box.
http://www.postgresql.org/docs/9.4/static/release-9-4.html

My favourite parts:

Allow views to be automatically updated even if they contain some non-updatable columns
Allow control over whether INSERTs and UPDATEs can add rows to an auto-updatable view that would not appear in the view. This is controlled with the new CREATE VIEW clause WITH CHECK OPTION.
Allow security barrier views to be automatically updatable
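The WITH CHECK OPTION piece in action (table and view names hypothetical):

```sql
CREATE VIEW active_users AS
  SELECT * FROM users WHERE active
  WITH CHECK OPTION;

-- allowed: the new row is visible through the view
INSERT INTO active_users (name, active) VALUES ('ann', true);

-- rejected in 9.4: the updated row would no longer appear in the view
UPDATE active_users SET active = false WHERE name = 'ann';
```

Without the check option, that UPDATE would succeed and the row would silently vanish from the view.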
What's a good place to suggest a PostgreSQL [json] improvement? My message was intercepted when posting to pgsql-performance. A major one at the moment is that with OFFSET, the select projection is still evaluated for the discarded rows (it's common to use OFFSET for paging). Under normal circumstances this isn't a problem, but when doing a json operation such as reading a field with ->>, it causes major performance degradation. (Of course, this optimization can only apply to immutable functions.) There are several workarounds, but if PostgreSQL wants to win back some nosql heads it should be straightforward.

In addition, updating a json field isn't straightforward; these operations should be supported by first-class built-in functions.

It's getting close, but it's not quite a nosql killer yet if they're targeting people who didn't originally come from an RDBMS background.
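A sketch of one possible workaround (not necessarily the one the author had in mind; schema hypothetical): paginate on a cheap key first, and only apply the json projection to the surviving rows.

```sql
-- naive version: doc->>'name' may be evaluated for the discarded rows too
SELECT doc->>'name' FROM docs ORDER BY id LIMIT 20 OFFSET 1000;

-- workaround: restrict to the page first, project afterwards
SELECT d.doc->>'name'
FROM docs d
JOIN (SELECT id FROM docs ORDER BY id LIMIT 20 OFFSET 1000) page
  USING (id);
```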
Another HUGE Thank you from me too.
I will have to look into this community more. I've been burned by some OSS communities before.
We also have a count_if() function: count_if(bar)
http://prestodb.io/docs/current/functions/conditional.html#i...
http://prestodb.io/docs/current/functions/aggregate.html#cou...
http://clarkdave.net/2013/06/what-can-you-do-with-postgresql...
Disclaimer: I consult for MongoDB Inc.
http://www.dragonflybsd.org/performance/
http://goessner.net/articles/JsonPath/
http://blog.redfin.com/devblog/2012/03/json_in_postgres.html
That article was written before JSON/JSONB showed up, but the idea remains the same.
I didn't have plv8 installed, so I did some plumbing code in plpython. plv8 would be more suitable though.
https://github.com/jvinet/pg-jsonpath
I ask this question every year and PostgreSQL still hasn't delivered it. If it exists, there's hardly any documentation on it.
http://slony.info/