For those stuck on older versions of Postgres, I highly recommend paying the downtime to upgrade. Going from 9.x to 11 will get you a measurably large performance gain for free.
Out of interest (SQL Server guy mainly, so I only partly keep up with what other engines are doing): what changes in 10.x and 11.x significantly affect performance, without making changes to your own code/configuration to use new features?
I maintain a couple of MySQL-based applications. I don't really use any features outside of "standard SQL": is there a reason to switch over to Pg? I haven't used Pg before and usually default to MySQL.
begin;
alter table foos add answer int not null default 42;
alter table foos drop column plumbus;
update foos set name = upper(name);
create table bars (t serial);
drop table dingbats;
rollback; -- Or, of course, commit
What's the benefit? Atomic migrations. You can create, alter, drop tables, update data, etc. in a single transaction, and it will either commit complete if all the changes succeed, or roll back everything.
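The rollback behavior above can be demonstrated without a Postgres server: SQLite is one of the few other engines with transactional DDL, so Python's stdlib sqlite3 can stand in for a psql session here (a rough sketch, not a claim about psycopg2's API):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit mode; we issue BEGIN/ROLLBACK ourselves
conn.execute("CREATE TABLE foos (name TEXT)")
conn.execute("INSERT INTO foos VALUES ('plumbus')")

conn.execute("BEGIN")
conn.execute("ALTER TABLE foos ADD COLUMN answer INT DEFAULT 42")  # DDL
conn.execute("CREATE TABLE bars (t INTEGER)")                      # DDL
conn.execute("UPDATE foos SET name = upper(name)")                 # DML
conn.execute("ROLLBACK")  # every change above is undone, DDL included

tables = {r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")}
cols = [c[1] for c in conn.execute("PRAGMA table_info(foos)")]
```

After the rollback, `bars` doesn't exist, `foos` has no `answer` column, and the row is still lowercase: the whole migration either lands or it doesn't.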
This is not possible in MySQL, or almost any other database [1], including Oracle — DDL statements aren't usually transactional. (In MySQL, I believe a DDL statement implicitly commits the current transaction without warning, but I could be wrong.)
Beyond that, I'd mention: PostGIS, arrays, functional indexes, and window functions. You may not use these things today, but once you discover them, you're bound to.
[1] https://wiki.postgresql.org/wiki/Transactional_DDL_in_Postgr...
At my PHP-shop company, most projects are limited to MySQL 5.7 (legacy reasons, dependency reasons, boss-likes-MySQL reasons...). They are all handicapped by MySQL's feature set, and can't update to 8 yet. If they had used Postgres some years ago, they would have gotten:
- JSON columns (MySQL has had a JSON type since 5.7, but I doubt it's as good as Postgres's jsonb)
- Window functions (available in MySQL 8.x only, while Postgres has had them since the 9.x series)
- Materialized views: views stored physically, like a table, that can hold aggregated, pre-calculated data (sums, counts, ...)
- Indexing on function expressions
- Better query plan explanation
- A robust security model with role inheritance that supports both column- and row-level policies.
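For anyone who hasn't met window functions: they compute aggregates over a "window" of rows without collapsing the rows themselves. The same SQL runs on Postgres (which has had window functions since 8.4); here it's illustrated with stdlib sqlite3, which needs SQLite >= 3.25 under the hood (any reasonably recent Python build):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INT)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("east", 30), ("west", 5)])

# Each row keeps its identity, but also sees its region's total:
rows = conn.execute("""
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, amount
""").fetchall()
```

A plain `GROUP BY` would give you one row per region; the window version gives you every sale alongside its region's total, which is exactly what reporting queries tend to want.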
- PLV8/PLPython/C functions/etc (with security!)
- TimescaleDB
- Better JSON query support
- Foreign Data Wrappers
- Better window function support
- A richer extension ecosystem (IMO)
Honestly, at this point I wouldn't use MySQL unless you only care about slightly better performance for very simple queries and simpler multi-master scaling/replication. Even saying that, if you don't need that simple multi-master scaling RIGHT NOW, improvements to the Postgres multi-master scaling story are not too far off on the roadmap, so I would still choose PG in that case.
The benefits are better defaults in terms of data reliability. Hard to say if migration is worth it to you without a lot more details, but I'd definitely recommend trying it in a new project.
I can't remember the last time I started a project using MySQL; maybe it is catching up, but PITR, partitioned tables and document-oriented columns are some features I use a lot. PostGIS also used to be stronger than the MySQL counterpart.
Running 10.7 and 10.6 on two production applications with Heroku. Thinking about moving to 11 to ensure support for the long run as I rarely need to touch this and it's very stable but would like to minimize any headaches in the future.
Any complications or hiccups I need to worry about moving from 10 to 11?
Per Heroku Docs: By supporting at least 3 major versions, users are required to upgrade roughly once every three years. However, you can upgrade your database at any point to gain the benefits of the latest version.
How do you manage failover and replication? At my previous job this was done by a consultant. Is this doable on a self-hosted setup?
Thank you in advance.
PostgreSQL has replication built in now. I set it up at work, and it replicates reliably, in a fraction of a second. I've never had to fail over, but it seems straightforward to do so. The only hard part was following Postgres's documentation to set it all up; it seemed a bit scattered to me, and I had to jump around to different sections before I put it all together in my mind.
https://github.com/citusdata/citus
We've been doing replication for 3+ years with Zalando's Patroni. It works great. We run both Postgres and Patroni in Docker. At first it was Patroni with Consul, and right now it's Patroni with the Kubernetes store (it stores the leader in an Endpoint). Highly recommended. There are other popular tools for this; it's just a preference.
Question from a Python web developer. (Django mainly, exploring Flask presently)
For a complex web-app, would you suggest an ORM (looking at SQLAlchemy) or a custom module with hand written queries and custom methods for conversion to python objects?
My app has a lot of complex queries, joins, etc. and the data-model is most likely to change quite a bit as the app nears production. I feel using an ORM is an unnecessary layer of abstraction in the thinking process. I feel comfortable with direct SQL queries, and in some cases, want to directly get JSON results from PGSQL itself.
Would that be a good idea, and more importantly, scalable?
Note: My app will be solely developed by me; I'm not expecting to have a team or even another developer work on it.
Use both. Much of your business logic will be as simple as querying by id or filtering/sorting by a couple of columns, and a smart ORM will handle fetching relationships without hitting the N+1 problem.
For advanced queries, you can write raw SQL
The way I see it, an ORM has three useful features:
- A migration/seed mechanism (you will need it anyway)
- A schema definition for mapping tables to object
- A query builder
If you feel that an ORM is too heavy, you can reach for just the query builder.
SQLAlchemy's query builder maps pretty much 1:1 to SQL, and for me it beats the alternative (using text interpolation to compose queries).
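To make "query builder" concrete: at its smallest, it's just code that composes SQL text plus bound parameters, so queries can be assembled programmatically instead of by string interpolation. A toy sketch (not any particular library's API):

```python
def build_select(table, columns=("*",), where=None, order_by=None):
    """Compose a SELECT statement and its tuple of bound parameters.

    `where` is a list of (expression, value) pairs, e.g. ("id = ?", 1),
    so values always travel separately from the SQL text.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params = []
    if where:
        sql += " WHERE " + " AND ".join(expr for expr, _ in where)
        params += [value for _, value in where]
    if order_by:
        sql += f" ORDER BY {order_by}"
    return sql, tuple(params)
```

For example, `build_select("foos", ("id", "name"), where=[("id = ?", 1)])` composes `SELECT id, name FROM foos WHERE id = ?` with params `(1,)`. Real builders (SQLAlchemy Core, etc.) add typing, quoting, and dialect handling on top of this same idea.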
SQLAlchemy is good for fairly straightforward queries, where you mainly need to do "select * from ..." and want to pull down related rows from another table (via a foreign key in the first table) as a separate query. It's also good if you have a lot of junior devs who don't know SQL and you want to encapsulate complex SQL logic for them.
If you're doing anything more complex than these basic sorts of queries and subqueries, or your developers are proficient in SQL, using even a very good ORM like SQLAlchemy is going to be a step down.
Since you say you're doing this all yourself, and since SQL is probably the most ubiquitous programming language (in terms of the percentage of jobs requiring it, not total LOC), the learning opportunities there are more valuable, so I would go direct.
I have used Postgres with both Django and Flask quite a bit now.
Since you're probably used to dealing with and migrating your tables manually, I would keep custom SQL for all your complex operations, and use SQLAlchemy for doing basic insert/update/select. Django also has an "unmanaged" mode where you can create a model and it will avoid trying to create a migration to create the table.
Of course, you have to manually update the model if you manually change your DDL.
Watch out for differences in how you serialize data from Django/SQLAlchemy models vs. raw dicts from psycopg2.
I like to organize my SQL by keeping each query in a separate .sql file and writing a little wrapper that fetches the file (bonus points for caching it) and then executes it. I'm not a fan of lots of inline SQL mixed with Python.
Overall I think it's a great and powerful setup!
cur.execute(query, {'foo': bar})
Passing values directly into cur.execute is the best way to prevent SQL injection as well, since the driver sanitizes the input params when running.
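A minimal version of that per-file wrapper, using only the stdlib (sqlite3 stands in for psycopg2 here, and the `sql/` directory and query name are hypothetical):

```python
import sqlite3
from functools import lru_cache
from pathlib import Path

SQL_DIR = Path("sql")  # hypothetical directory holding one query per .sql file

@lru_cache(maxsize=None)
def load_query(name: str) -> str:
    """Fetch (and cache) the text of sql/<name>.sql."""
    return (SQL_DIR / f"{name}.sql").read_text()

def run_query(conn, name, params=None):
    # Passing params separately lets the driver handle quoting,
    # which also guards against SQL injection.
    return conn.execute(load_query(name), params or {}).fetchall()
```

The `lru_cache` means each file is read from disk only once per process; the queries themselves stay in plain .sql files where editors and linters can see them.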
Fellow Python-Postgres web dev here. (Pyramid is my framework of choice, check it out!)
I'm developing a web application that uses SQLAlchemy. The ORM has been a huge boon for CRUD functionality. We also have some very complicated reporting features and use SQLAlchemy's query builder almost exclusively. I find that the query builder maps very cleanly to SQL, so I can still "think" in SQL while writing and reading it. And the query builder makes complex query composition easier to manage.
I've found that 'sqlalchemy.sql.text' works well for complex queries that don't need to be built up incrementally, and the fluent sql interface is great otherwise. Also, reflection can be really useful when working with existing databases, and for maintenance scripts that might not need to be tied directly to your model.
SQLAlchemy provides more than just the ORM... I actually wish the docs were structured differently to better emphasize that in search results, etc.
You can use SQLAlchemy Core for SQL generation and execution, without using its ORM. This lets you build queries from reusable Python objects rather than strings, and use Alembic for DB migrations, while still retaining control over the generated SQL.
Hard to say, but don't forget about migration support, which is quite helpful.
Totally wish we could upgrade but for some reason AWS have still not implemented any upgrade path for Aurora PostgreSQL other than dump and reimport despite apparently working on it for a year...
MySQL has Galera: is there a multi-master option for Pg?
I know of BDR, earlier versions of which are open source, but there hasn't been much movement with Pg 10 or 11 AFAICT.
We don't do anything complicated, but simply want two DBs (with perhaps a quorum system) that has a vIP that will fail-over in case one system goes down (scheduled or otherwise).
Galera provides this in a not-too-complicated fashion.
PostgreSQL has had logical replication built in since version 10. This allows you to replicate specific tables between multiple databases, accepting writes on each; you do need a strategy for resolving conflicts when they occur.
Impressive, and an upgrade of 10.x is now in process, easily and quickly, thanks to the Postgres PGDG Debian/Ubuntu repos. BUT do not choose the metapackage postgres! Under Ubuntu at least, upgrading the metapackage pulls in an entire new 11+ server without confirmation. Why is this tolerated? Genuinely annoying.
Keep in mind that the "official" Docker images are "official" in the sense of Docker Inc. marking them as official, not in the sense of "the upstream provides these". This is the repo for the Dockerfiles: https://github.com/docker-library/postgres and it begins with:
> This is the Git repo of the Docker "Official Image" for postgres (not to be confused with any official postgres image provided by postgres upstream)
I'm not a database guy so have no clue, but why are there so many versions receiving support? Is there just that much legacy crap they can't get away from, like Python?
People are slow to upgrade database systems, as it can take a lot of regression testing to make absolutely sure your applications don't rely on unsupported/undocumented/undefined behaviours, and are compatible with the newest release (or unaffected by officially acknowledged breaking changes). This is especially true in enterprise systems. Even if developers upgrade quickly, their clients with on-prem installations may not. That means that to be taken seriously in some arenas, you need to support your major and minor releases for quite some time.
Supporting five versions is no more than MS do: the currently supported SQL Server versions are 2017, 2016sp2, 2016sp1, 2014sp3, 2014sp2, 2012sp4, 2008R2sp2 and 2008sp3. 2008sp3, 2008R2sp2, and 2016sp1 will hit their final EOL in a couple of months, taking SQL Server's supported list back down to 5 too.
I expect other significant DB maintainers have similar support life-time requirements for much the same reasons, though I'll leave researching who does[n't] as an exercise for the reader.
With databases being often mission critical, the PostgreSQL people decided heroically to support major versions for 5 years -- and as they come out with a new major version every year, minor updates come out for 5 different branches.
Note the recent versioning change: 9.4, 9.5, and 9.6 were the previous three major-version branches, and the last two are 10 and 11.
Moving between major versions of Postgres requires downtime proportional to the size of the database. Supporting older versions allows users to go many years without having to do this.
Postgres users actually generally upgrade faster than those using other databases because there are a lot of new features each year. But once your database gets huge then upgrading still becomes a pain, so that's why they keep providing security support and bug fixes for older versions as well.
Well-supported older releases of the database engine, with clearly defined migration documentation and tooling, are the hallmark of a successful open-source software ecosystem.
Because it mirrors and supports the reality of the business world.
Every organization, large or small, makes a "grow/invest", "maintain", or "divest" decision each year for each of its product/service lines.
It does not matter whether it is software or kielbasa.
Postgres is exceptional, and supports the first two.
Under semantic versioning, each major-version change signals a breaking change. If your application would break, you may not see updating as a business priority, because it "works" as it is. Minor version changes, however, can include anything that doesn't break, so security patches can hopefully land in every officially supported major version.
There's legacy crap everywhere: all languages, databases, and versions, sometimes supported for 10+ years.
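The upgrade-risk rule in that comment reduces to comparing major numbers; note that pre-10 Postgres did not strictly follow semver (9.5 to 9.6 was itself a major release). A sketch under plain semver assumptions:

```python
def may_break(current: str, target: str) -> bool:
    """Under semver, only a major-version bump can introduce breaking changes."""
    def major(version: str) -> int:
        return int(version.split(".")[0])
    return major(target) > major(current)
```

So moving from 10.1 to 10.7 should be safe to apply (security patches included), while 10.x to 11.x deserves a regression-testing pass first.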