top | item 13284741

Tile38 – Realtime geofencing and geospatial index, v1.7.0

130 points| tidwall | 9 years ago |github.com | reply

26 comments

order
[+] tunesmith|9 years ago|reply
I wonder if a kind of pony-express messaging app could be done with something like this. Like pick a friend in another state, write a message, carry it around in memory hoping that you'll be within a quarter mile of someone else running the app, and see how long it would take for the message to get delivered.
[+] Bedon292|9 years ago|reply
I love it when geo stuff shows up here. Its always fun to see what people are doing.

I see that it is in Memory, but are there any practical limits? How much memory is recommended for what size datasets?

And it doesn't look like the documentation has an examples as to how to load data in? Do I need to convert to Web Mercator, or can I give it WGS84 and have it convert? Also, any plans to allow for pure WGS84? I never actually use Web Mercator, so would rather a system that uses WGS84 natively.

[+] tidwall|9 years ago|reply
> I see that it is in Memory, but are there any practical limits?

Yes. The in-memory database is basically a big btree + a big rtree. These trees contain many pointers to geo objects and depending on the complexity of the objects, it may take up a lot of memory. A 16GB machine can typically store 100 million+ points.

> How much memory is recommended for what size datasets?

Depends on the type of object and the length of the keys, but I would conservatively use 128-bytes per point as a general rule. So 16GB machine should support ~135 million points.

> And it doesn't look like the documentation has an examples as to how to load data in?

Please check out http://tile38.com for extra info and docs on commands. Tile38 uses the Redis protocol (https://redis.io/topics/protocol), so it's importing data is similar to https://redis.io/topics/mass-insert.

> Do I need to convert to Web Mercator, or can I give it WGS84 and have it convert? Also, any plans to allow for pure WGS84?

All coordinates are in WGS84 lat/lon, so you should be fine. Under-the-hood the calculations and such are in EPSG:3857 / WGS 84 Web Mercator.

[+] sandking|9 years ago|reply
Looks pretty neat! Since OP is the author, can I ask: how many simultaneous concurrent (client) connections does this support, how many simultaneous moving objects and at what update frequency? Do these limits change if the shapes get complicated?
[+] tidwall|9 years ago|reply
> how many simultaneous concurrent (client) connections does this support?

The client connections are very lightweight. The protocol library is based on https://github.com/tidwall/redcon. I don't have exact numbers but this link may be helpful (which compares Redcon to Redis): https://simongui.github.io/2016/10/24/benchmarking-go-redis-.... I've seen it handle 16K+ client connection without issue, but typically a client library that supports pooling should be used (like Redigo).

> how many simultaneous moving objects and at what update frequency?

Depends on the complexity of the objects, the number of nearby objects, and the server hardware. Each collection of objects consists of one btree and one rtree for key/id lookups and spatial queries respectively. So updating a single point is like O(log(n)+log(m)). An in-memory database like Tile38 can achieve 100K+ per second. Utilizing network pipelining can help out too. FYI, I use a 4-core server with 8GB ram for testing.

In the case of a roaming geofence (http://tile38.com/topics/roaming-geofences), following each update there an additional query of the rtree O(log(n)) to retrieve nearby neighbors. If there's a lot of nearby objects, this can result in a lot of data sent over the network. But in general, a collection of millions of points spread over a continent with 100K+ updates per second should work, while keeping with realtimelyness.

> Do these limits change if the shapes get complicated?

Yes. Complicated objects such are MultiPolygons may require additional calculations like raycasting.

[+] realrocker|9 years ago|reply
Neat. Seems like a good weekend is coming up. I was wondering how was the gifs/images done? I would really like to see the code if it was done programmatically.
[+] tidwall|9 years ago|reply
Some programming, some design tools.

I programmatically simulated the movements as paths at 30fps into a series of data points, then imported into Animate CC and tweaked the tweening by hand, then exported to a series of PNGs, then imported PNGs into Photoshop as a movie sequence, finally exported as gif.

The superior tweening capabilities of Animate CC and the excellent gif compression of Photoshop makes tuning the animation super easy.

Perhaps next time I'll programmatically draw gif frames, for the fun of it.

[+] Fiahil|9 years ago|reply
Neat! Does someone already used it in production? I'm wondering if it could be useful to replace a medium sized postgres database. I'm trying to replace some clustering currently done with Elasticsearch with postgis, but it's really slow (6m points * 700 boxes, it's taking 20 minutes to scan the table).
[+] tidwall|9 years ago|reply
Yes, it is being used in production.

General purpose spatial queries with Tile38 are very fast. Likely much quicker than PostGIS, but PostGIS will be able to store much more data. It's a tradeoff for sure.

If your dataset isn't terribly big and performance is what you're after then I recommend giving Tile38 a try.

[+] 3adawi|9 years ago|reply
how fast is within? does it use ray tracing?
[+] bpicolo|9 years ago|reply
Depends on which method you use I assume. For geohashing, WITHIN should be best case O(1) and worst case O(n) depending on specificity you care about, where n is the number of geohash cells you need to generate to match the bounding box you care about (which is typically pretty small unless you want, like, a 1km box with 1cm granularity)

Frontloading computation into generating smart data structures give you better choices here.

[+] tidwall|9 years ago|reply
Sorry, my previous answer was not very descriptive.

The WITHIN call will be on average O(log(N)) because all objects are stored in an rtree. But it's usually very very fast to locate objects based on their outermost bounding area. The complexity of the object will add to the duration. Simple points are super fast, like sub microsecond fast in many cases. A MultiPolygon can be much slower.

Tile38 uses a modified raycasting algorithm, not raytracing for it's polygon detections.