top | item 46922679

(no title)

cpa | 23 days ago

The big reason is that H3 is data independant. You put your data in predefined bins and then join on them, whereas kd/r trees depend on the data and building the trees may become prohibitive or very hard (especially in distributed systems).

discuss

order

mgaunard|23 days ago

Indices are meant to depend on the data yes, not exactly rocket science.

Updating an R-tree is log(n) just like any other index.

vouwfietsman|23 days ago

I think the key is in the distributed nature, h3 is effectively a grid so can easily be distributed over nodes. A recursive system is much harder to handle that way. R-trees are great if you are OK with indexing all data on one node, which I think for a global system is a no-go.

This is all speculation, but intuitively your criticism makes sense.

Also, mapping 147k cities to countries should not take 16 workers and 1TB of memory, I think the example in the article is not a realistic workload.

cpa|23 days ago

To add to sibling comment, if you have streaming data you have to update the whole index every time with r/kd trees whereas with H3 you just compute the bin, O(1) instead of O(log n).

Not rocket science but different tradeoffs, that’s what engineering is all about.

rockinghigh|22 days ago

How do you join two datasets using r-trees? In a business setting, having a static and constant projection is critical. As long as you agree on zoom level, joining two datasets with S2 and H3 is really easy.