PostGIS – Spatial and Geographic Objects for PostgreSQL

[+] Ahmed90|5 years ago|reply

4 years ago one of my clients wanted to "donate" a system for the local fire department to help them do a quick proximity search to find the fire hydrants and quickly choose the healthy one near the fire.

And since it was charity and had a bunch of private data Google was not an option ($$$$), so I (just a full-stack developer back then) was like "listen I have no idea what is this GIS stuff, but I'll give it a try", after a quick research boom PostGIS, reading the docs and testing things, plus QGIS to visualize and to help understand it better 10/10!

That was my unintentional "career" shift, thanks to PostGIS I'm now a senior dev at the largest food delivery company (local), specialized in GIS and realtime data driven systems using PostGIS everyday lol

[+] tomrod|5 years ago|reply

Had a similar career shift after helping local public health authorities map COVID-19!

[+] ntechau|5 years ago|reply

[deleted]

[+] jpgvm|5 years ago|reply

PostGIS is one of those rare examples of highly specialised software that is both OSS and best-in-class. Usually OSS is relegated for highly commoditized software but GIS is anything but even after all these years.

I can't really think of many others? Maybe OptaPlanner would be another candidate.

[+] jandrewrogers|5 years ago|reply

Yes, among off-the-shelf GIS databases, PostGIS is generally as good or better than anything else on the market. More capable systems exist, particularly in terms of scale and performance, but those are all bespoke.

[+] leetrout|5 years ago|reply

Is OptaPlanner best in class? (No snark) — It might be the best in Java but not sure more broadly.

[+] gardaani|5 years ago|reply

Here's an interesting performance comparison of PostgreSQL with PostGIS and MongoDB. PostgreSQL outperformed MongoDB in almost all their cases. https://link.springer.com/article/10.1007/s10707-020-00407-w

I've decided to use PostgreSQL in my projects.

[+] tcbasche|5 years ago|reply

Thanks for sharing this! I've been thinking about alternatives to PostGIS to handle larger datasets (millions) and nothing seems to come close to the level of functionality, performance and community support of Postgres

[+] dunefox|5 years ago|reply

Everytime Postgres comes up I'm impressed.

[+] nixpulvis|5 years ago|reply

I've been using PostGIS a bit for a toy project with Elite: Dangerous star system data. It's been a hoot, but I do worry that I'm going to start having trouble optimizing my queries. Finding all the systems within say 20 Ly of our sun can take upwards of a few seconds, and I've already added a GIST index on the positions.

I also find it a bit strange how 3D feels kinda tacked on, but it makes sense when you realize most maps are in fact 2D.

Anyway, my 2-cents of experience. If anyone has some good advise for > million row, 3D spacial optimizations for PostGIS, please let me know.

[+] estsauver|5 years ago|reply

The only thing I can recommend off hand is to try and simplify your geometries a bit for searching if they’re complex. Searching along the border of a complicated shape is much harder than searching around a square, it has to do a lot more calculations.

[+] nicois|5 years ago|reply

Is it possible you haven't added the bounding box for your geometries? The GIST index will use your bounding box to optimise the queries, but only if it finds one. Also make sure your query is actually using the index.

[+] gfiorav|5 years ago|reply

You can subdivide large geometries. You can decide how much by controlling the amount of vertices a polygon can have before it’s chopped away.

Intersections with large polygons will perform faster if subdivided.

[+] billfruit|5 years ago|reply

What kind of front end do you use. Usually the spatial db is only a part of the solution, and the GIS information needs to displayed and functionality for interaction with maps and objects needs to provided. What framework would pair well with PostGis in the open source world?

[+] tcbasche|5 years ago|reply

My suggestion might be to a precalculated step to split "the universe" into grid areas divided by geometry complexity (i.e. population density gridding) then process these reference areas in parallel. How you do the latter is probably the interesting part!

[+] jhgb|5 years ago|reply

> Finding all the systems within say 20 Ly of our sun can take upwards of a few seconds, and I've already added a GIST index on the positions.

Doesn't that sound a bit pathological? It should be nowhere near that slow. Try implementing a VP tree over that data.

[+] fdr|5 years ago|reply

Hmm, I think you are looking for SP-GiST (unbalanced space partitioned GiST), unless you meant that implicitly. It can be much faster for...space partitioning.

[+] 7952|5 years ago|reply

You could use a clustered index

[+] yxhuvud|5 years ago|reply

Hmm. That sounds like a lot of time. How does your query look like for that?

[+] cwmma|5 years ago|reply

there is a specific command you need to use to create a 3d index

CREATE INDEX [indexname] ON [tablename] USING GIST ([geometryfield] gist_geometry_ops_nd);

by default postgis just creates 2d ones

[+] thom|5 years ago|reply

PostGIS is loads of fun with sports data. Being able to find similar passages of play just using ST_MakeLine and ST_FrechetDistance feels like magic when you first do it in all of five lines of code.

Spatial tools are a bit like logic programming in that they’re very slightly esoteric. But once you know they’re the right hammer for some nails, they’ll save you lots of time and effort over your life, and let you express some ideas you might otherwise struggle with.

[+] another-dave|5 years ago|reply

This sounds really interesting. Was thinking it would be cool to build a hobby project in this space, but wasn't sure how to get started.

Do you have any recommended sources for sample datasets to play around with?

[+] stdbrouw|5 years ago|reply

FWIW, even though PostGIS is pretty great, if your use-case is primarily offline analysis and you don't need the data to be permanently accessible or writable, consider not using a database at all. You can do a lot, a lot faster with e.g. https://shapely.readthedocs.io/en/stable/manual.html and/or https://geopandas.org/.

[+] twelvechairs|5 years ago|reply

Ultimately for spatial functions both PostGIS and Shapely/Geopandas are using GEOS as the base library, so neither are 'a lot faster' for analysis at a fundamental level. Lots of queries to a database brings some speed loss but this isn't necessary for the vast majority of use cases. OTOH PostGIS is in many cases much more finely tuned than shapely/geopandas for speed, scale and parallelisation (it is a much more highly maintained library) so can actually be faster.

Functionality is a different thing. PostGIS has a lot more baked in functionality than shapely. But shapely/geopandas exposes the depth of python and its libraries which allow much more extensive customisation of how to work with data.

Lots of tradeoffs and overlapping use cases - I just wanted to add some depth to this discussion.

[+] rbranson|5 years ago|reply

Yes, even for on-disk read-only use cases, shapefiles are almost always much faster. Overhead from serialization, network, etc often dominates the performance of GIS.

[+] jimktrains2|5 years ago|reply

I'd argue that being able to use sql to more easily combine ans filter datasets, including non-geospatial ones is still very useful in the circumstances you described.

[+] tcbasche|5 years ago|reply

I'm not sure what 'a lot faster' really means in this context.

Honestly, I've found using Spatialite queries to be orders of magnitude faster for analysis than shapely or geopandas. The latter typically imply row-by-row selection and manipulation for starters.

If you can wrangle the data into a geopackage first it's super easy to run queries over the data and extract what you need.

[+] jononor|5 years ago|reply

Do those assume/require that all the data will fit into memory?

[+] twelvechairs|5 years ago|reply

A couple of excellent GIS related links I would recommend to anyone interested in this area.

Talks and writing by PostGIS co-creator Paul Ramsey (who is incredible): http://blog.cleverelephant.ca/writings

The mapscaping podcast - a lot of intros to the complex and overlapping worlds of GIS: https://mapscaping.com/blogs/the-mapscaping-podcast

[+] craigkerstiens|5 years ago|reply

Paul also has quite a few posts on the Crunchy Data blog as well - https://blog.crunchydata.com/blog/author/paul-ramsey

[+] cwmma|5 years ago|reply

Talks by Paul Ramsey are usually a highlight of any GIS conference.

[+] willtemperley|5 years ago|reply

It's good software (I've used it more than a decade), however I found GEOS to be a sticking point. When using it on very large polygons, e.g. 10k to 1 million vertices, memory leaks are not uncommon and performance drops off considerably. Debugging SQL -> C -> C++ is not fun and hacking C++ geometry code when it's not part of your normal work is nigh on impossible. I've found the ESRI geometry API for Java to be by far the best geometry API out there. Harder to use initially and obviously JVM specific but faster and more reliable. It's a very good fit for Hadoop / Spark or other JVM applications. Ignore the brand name, I'm not affiliated and it's FOSS with an Apache license.

https://github.com/Esri/geometry-api-java

[+] sabman|5 years ago|reply

https://getgeodb.com is a hosted CartoDB which is an open source stack built on PostGIS but its notoriously hard to run and configure. So we decided to run it as a service for others... we find it very useful to build apps on top of PostGIS since running them requires configuring middleware like tileservices, data APIs and importing data.

would love any feedback :D ... you can also hit me up on twitter https://twitter.com/sabman

[+] ascorbic|5 years ago|reply

Next week is PostGIS's 20th birthday

[+] zeeZ|5 years ago|reply

I've toyed with PostGIS to make a poster from OSM data by selecting streets that intersect with one of Tokyo's special wards[0]. Sadly I haven't found any excuses to play with it more since.

[0]: https://christian.rinjes.me/posts/2020-11-25-tokyo-street-ar...

[+] mooneater|5 years ago|reply

Lets hear some love for Refractions Research out of Victoria BC, who gave us PostGIS! Great folks, very helpful, and a wonderful open source product.

http://www.refractions.net/

[+] motiejus|5 years ago|reply

I am writing an open-source line simplification algorithm[1] with PostGIS. So far it's a few hundred lines of SQL-like code[2], but it will grow.

I have never written anything serious in plpgsql, so this is my first non-trivial project. Writing in PostGIS is a strange mix of SQL (fully declarative) and a "real" programming language with variables, arrays, functions and loops. The mix is very interesting and requires getting used to. But doable.

Things I like so far:

- geometric functions are very robust and well documented. Reference manual[3] is amazing. Creating/editing/iterating over lines and polygons, detecting their features, is a breeze.

- good integration with QGIS -- I can see the result graphically on my desktop system very easily.

- I can even write tests[4].

Things I am worried about or don't like as much:

- The SQL/procedural mix requires a lot of getting used to, and sometimes "simple" things in a "real" programming language becomes complex here. E.g. I would love to have mutable linked-lists. Or moving data between "sql-world" (rows) to "plpgsql-world" (arrays and data structures) is non-obvious, and best references are, unfortunately, stackoverflow and scripts of others'.

- My program will run on a large data set (e.g. all rivers in a country or a continent). I will want to optimize it. There is an out-of-tree profiler[5], but "unofficial, out of tree" always adds risk.

- Biggest one: deep copies everywhere. Since plpgsql does not do any memory management, it is deep-copying everything. Sometimes (often in this algorithm!) I want to add a vertex to a line (=river bend); that always requires a full copy of the whole bend. When I know it will not be used and will be safe, I would like to say "I am mutating this geometry and I don't want a deep copy".

In general, I like it for algorithms. Though the moment it needs to hit performance-sensitive production, I believe I will re-do it in C (which, looking at the postgis source code, is quite write-able too).

[1]: https://www.tandfonline.com/doi/abs/10.1559/1523040987824417...

[2]: https://github.com/motiejus/stud/blob/master/IV/wm.sql

[3]: https://postgis.net/docs/reference.html

[4]: https://github.com/motiejus/stud/blob/master/IV/tests.sql

[5]: https://github.com/okbob/plpgsql_check

Edit: formatting

[+] atonse|5 years ago|reply

Regarding deep copies, have you considered writing it as a plugin in something like rust?

https://github.com/zombodb/pgx

[+] pella|5 years ago|reply

- https://planet.postgis.net/ ( postgis related posts )

- https://planet.postgresql.org/ ( postgresql related posts )

- https://github.com/postgis/docker-postgis ( docker images for postgis ; alpine+debian)

[+] dbreunig|5 years ago|reply

Relatedly: the appendix section "SQL Primer" in Obe and Hsu's 'PostGIS in Action' is my favorite quick intro to SQL and the first stop I give anyone looking to learn about the language.

https://www.manning.com/books/postgis-in-action-second-editi...

[+] brianzelip|5 years ago|reply

A couple of recent and insightful podcast episodes related to PostGIS:

1. https://talkpython.fm/episodes/show/295/gis-python

2. https://changelog.com/podcast/417 (postgres)

[+] billfruit|5 years ago|reply

What would be a good front end framework to use with PostGIS with capable rendering of GiS artifacts to make a compete desktop GIS application?

The only viable open source route I am seeing is through some kind of QGIS extensions. Is there something better for that purpose?

[+] samstave|5 years ago|reply

R seems like a great idea, with building shiny apps...

[+] hnedeotes|5 years ago|reply

Used it a few times and it's pretty amazing what you can do with it out of the box.

[+] ruffrey|5 years ago|reply

I helped build a relatively successful SaaS offering for finance atop PostGIS with MapServer and Mapproxy+Redis. The stack ran circles around ArcGis. Not even a close comparison.

[+] poxrud|5 years ago|reply

I've used PostGIS + maxmind datasets for my geolocation needs, instead of using SaaS based services, like Google's geolocation API.

87 comments