(no title)
macmccann | 8 years ago
"Data size matters in Elasticsearch; a large index means too much data, caching churn, and poor response time. When query volumes are large, ELK nodes become overloaded, causing long garbage collection pauses or even system outages.
To address this, we switched to hexagonal queries, dividing our maps into hexagonal cells. Each hexagonal cell has a string ID determined by the hexagon resolution level. A geodistance query can be roughly translated to a ring of hexagon IDs; although a hexagonal ring is not a circular surface, it is close enough for our use case. Due to this adjustment, our system’s query capacity more than tripled."
jandrewrogers|8 years ago
For example, "equal area" subdivision of the surface is not a particularly useful property. This seems like it is wrong on its face such that most people try to achieve it but you have to remember that equal area only matters if your data is uniformly and predictably distributed. Geospatial data models are neither in a pretty severe way as a rule, which means you'll have to deal with data load asymmetry regardless via some other mechanism. If you can ignore equal area because it is handled by some other mechanism, it opens up other possible surface decompositions that have much stronger properties in other ways.
Compactness of representation and cost of computing relationships on the representation has a large impact on real-world index performance that is frequently ignored. The poorness of lat/lon grids in this regard is often significantly underestimated. A singularity-free 3D DGGS can have an indexing structure that is an order of magnitude smaller than a basic 2D lat/long grid for addressing, and the storage on disk of records encoded in these DGGS are also significantly smaller. This all adds up to a lot of performance.
Hexagonal grids tend to work particularly well for visualization. However, they do have their own significant weaknesses e.g. it is typically not a good representation for join operations and they are relatively expensive to search at large scales relative to some other DGGS tessellation systems.
espeed|8 years ago
refset|8 years ago
dvanduzer|8 years ago
There are several options for partitioning your grid, and the geometric consequences on your index. This overview in particular should be accessible even without a deep GIS background: http://webpages.sou.edu/~sahrk/sqspc/pubs/xuzhouKeynote13.pd...
elihu|8 years ago
https://wiki.haskell.org/IcoGrid
My tiling had a fixed resolution that you specify up-front; I didn't really consider how to make my grid hierarchical.
scdoshi|8 years ago
Squares divide into sub-squares exactly, but hexagon's don't sub-divide into hexagon's neatly.[2]
Can someone explain the advantages of hexagons over squares in this use case?
[1]https://code.google.com/archive/p/s2-geometry-library/ [2]https://www.illustrativemathematics.org/content-standards/ta...
jacobolus|8 years ago
dvanduzer|8 years ago
edit: Oops, that only explains the tiling. For explanations of how the hierarchy descends, Dr. Sahr produced some GIFs that give a little bit of an overview of how the resolutions overlay: http://webpages.sou.edu/~sahrk/dgg/images/topogif/topogif.ht...
malandrew|8 years ago
This means you can better simulate dynamic systems where there is flow between cells.
adrianratnapala|8 years ago
jacobolus|8 years ago
Guidii|8 years ago
Squares don't pack as closely as hexes, meaning that you can get 18% more hexes into a space than squares for a given perimeter-size. Squares also degenerate badly over the surface of the sphere.... you can't keep a consistent size as you change latitude, and they turn into triangles when you reach the poles.
timmaxw|8 years ago
b0rsuk|8 years ago
unknown|8 years ago
[deleted]