Geospatial data science with Julia

159 points| juliohm | 2 years ago |juliaearth.github.io

53 comments

jstrickshire|2 years ago

I have a passion project 4x4anarchy.com that operates with a Python-MariaDB system for querying map data by latitude and longitude, transforming it into GeoJSON for map display. The website deals with sizable tables, approximately 1 GB in size. I've made extensive optimizations, relying on well-structured indexes, caching mechanisms, and query optimization to enhance performance.

Given these circumstances, how might incorporating Julia and a geospatial database such as PostGIS further optimize geospatial data retrieval and presentation, especially when dealing with large datasets and intricate geospatial operations?
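
For readers following along, the "rows to GeoJSON" step described above can be sketched roughly as follows. This is a minimal Python illustration, not the site's actual code: the function name `rows_to_geojson` and the `(name, lat, lon)` row layout are assumptions made for the example.

```python
import json


def rows_to_geojson(rows):
    """Convert (name, lat, lon) rows into a GeoJSON FeatureCollection dict.

    The row layout is hypothetical; a real query would return whatever
    columns the table defines.
    """
    features = []
    for name, lat, lon in rows:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        })
    return {"type": "FeatureCollection", "features": features}


# Example: serialize one made-up row for delivery to a map client.
geojson_text = json.dumps(rows_to_geojson([("Trailhead", 39.5, -105.25)]))
```

The longitude-first ordering is the detail that most often trips people up when rows come out of the database as latitude, longitude.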

gabegm|2 years ago

It would depend on where most of the processing is happening.

PostGIS gives you the benefit of spatial indexes which are extremely performant.

I've seen Python geospatial applications that took hours to finish processing take only a few minutes once the work was shifted onto PostGIS.

If you're also doing a lot of processing in Python, exploring other languages could also help. In the case of Julia you get a typed language that's also JIT compiled.

tony_cannistra|2 years ago

Interesting! I work on a very similar product.

I don't know Julia well, but I definitely would suggest exploring whether PostGIS can help improve the speed of your DB queries.

I'd also consider how you deliver your geospatial data to your clients -- I'm not sure GeoJSON is your best bet. Protobuf tiles might be better for your use-case (e.g. the Mapbox Vector Tiles spec).
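
To make the tile idea concrete, here is a small Python sketch of how a latitude/longitude pair maps to a tile address. It assumes the common XYZ ("slippy map") scheme on Web Mercator, which vector tiles are typically served under; the function name `latlon_to_tile` is made up for the example, and encoding the tile contents as Protobuf is not shown.

```python
import math


def latlon_to_tile(lat, lon, zoom):
    """Return the (x, y) tile containing a point in the XYZ tiling scheme.

    Assumes Web Mercator, where each zoom level splits the world into
    2**zoom by 2**zoom tiles and y grows southward from the top edge.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges (e.g. lon == 180) stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y
```

A client then requests only the tiles covering its viewport instead of one large GeoJSON payload.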

benzofuran|2 years ago

Cool site! Any chance of adding a simple KMZ export for offline use for a given area of interest?

fiedzia|2 years ago

If all you do is "find records within x miles from lat,lon", Solr/Elasticsearch is the best solution. I think it can match a shape too.
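
For reference, the "within x miles" query boils down to a great-circle distance filter. The Python sketch below is a brute-force illustration using the haversine formula on a spherical Earth; it is not how Solr or Elasticsearch implement it (they use indexes to avoid scanning every record), and the function names and the `(lat, lon)` tuple layout are assumptions for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(points, lat, lon, miles):
    """Return the (lat, lon) points no farther than `miles` from the center.

    This scans every point; a real system would use a spatial index first.
    """
    return [p for p in points
            if haversine_miles(lat, lon, p[0], p[1]) <= miles]
```

The linear scan is fine for small tables, but at the scale discussed in this thread the index is what makes the query fast.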

alekseiprokopev|2 years ago

Nice thing about Julia is that you randomly find cool projects like this.

nraynaud|2 years ago

Be mindful that most of Julia's geometry code is a wrapper around libGEOS (C version) and libGDAL, which means you can't easily extend the algorithms; everything is behind a black box on the C side. Source: I worked in the field last year, and I have a small patch in LibGEOS.jl.

beeburrt|2 years ago

In the preface you list:

- Generate high-performance code

- Specialize on multiple arguments

- Evaluate code interactively

- Exploit parallel hardware

> This list of requirements eliminates Python, R and other mainstream languages used for data science.

Can you elaborate on why/how? Awesome work, by the way.

wodenokoto|2 years ago

Python and R do not generate high-performing code. At best they generate calls to high-performing code.

ekianjo|2 years ago

R can exploit parallel hardware just fine with parallel, future and other packages like mirai. The problem is that execution speed is going to be a bottleneck for anything large, and once you need that level of optimization, R may not be the best language for the job. But it depends a lot on the use case.

juliohm|2 years ago

Geospatial Data Science with Julia presents a fresh approach to data science with geospatial data and the Julia programming language. It contains best practices for writing clean, readable and performant code in geoscientific applications involving sophisticated representations of the (sub)surface of the Earth such as unstructured meshes made of 2D and 3D geometries.

eigenket|2 years ago

Are you a bot? Why did you copy and paste the top paragraph of the linked page?