Nice post! Couple things that might be useful:

1. While JCH will usually be the most performant hashing method, naively removing a node affects every node of higher order, because nodes are addressed by index and everything above the removed one shifts down. This makes the logic of node deletions somewhat more complex than (say) Discord's hash ring. This is why JCH is more common for long-term, distributed, redundant storage -- where the topology changes far less frequently.
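To illustrate the deletion problem, here is a rough sketch, assuming JCH means jump consistent hash (Lamping and Veach). The node names, key names, and the helpers `key64` and `moved_keys` are made up for the demo. It counts how many keys change owner when the last node is dropped versus when a middle node is dropped and the list is simply renumbered.

```python
import hashlib

def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: map a 64-bit key to a bucket in [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, masked to emulate uint64 overflow
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b

def key64(name: str) -> int:
    """Hypothetical helper: derive a stable 64-bit integer from a string key."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")

def moved_keys(keys, before, after):
    """Count keys whose owning node differs between two node lists."""
    count = 0
    for k in keys:
        h = key64(k)
        if before[jump_hash(h, len(before))] != after[jump_hash(h, len(after))]:
            count += 1
    return count

nodes = ["a", "b", "c", "d", "e"]            # made-up node names
keys = [f"key-{i}" for i in range(10_000)]   # made-up keys

# Dropping the last node: only keys that lived on "e" have to move.
moved_last = moved_keys(keys, nodes, nodes[:-1])

# Dropping "b" and renumbering: "c", "d" and "e" all shift down one index,
# so keys on those nodes get reassigned too.
moved_middle = moved_keys(keys, nodes, ["a", "c", "d", "e"])

print(moved_last, moved_middle)
```

Removing the last node should move roughly a fifth of the keys, while the naive middle removal moves far more. Avoiding that needs extra bookkeeping on top of the hash, which a ring does not require.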

2. For sharding, what makes distribution hard is not so much the hashing but consensus on the cluster state -- this is the hidden problem. Bryan Hunter's talk on Waterpark (https://youtu.be/9qUfX3XFi_4) is an excellent example of what you can do when you can set things up so that the topology is fixed. In fact, this approach makes things so straightforward that it is shared by Riak, where the number of vnodes is fixed.

However, if you have a rapidly changing topology (like several Kubernetes clusters that are frequently scaling up and down), you often need some sort of consensus mechanism to make sure every node has a consistent view of the cluster. In my experience, this usually ends up being the most complex part of the distribution problem to solve.
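To make the fixed-topology idea from point 2 concrete, here is a minimal sketch. It is not Riak's or Waterpark's actual implementation. `NUM_VNODES`, the round-robin `assign` function, and the node names are all assumptions for illustration. The point is that the key-to-vnode mapping never changes; only the vnode-to-node table does, and that table is the piece of state the cluster has to agree on.

```python
import hashlib

NUM_VNODES = 64  # assumed value; fixed for the lifetime of the cluster

def vnode_for(key: str) -> int:
    """Map a key to a vnode. Depends only on the key, never on cluster size."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VNODES

def assign(nodes):
    """Naive round-robin vnode -> node table (hypothetical placement policy)."""
    return {v: nodes[v % len(nodes)] for v in range(NUM_VNODES)}

def owner(key: str, assignment) -> str:
    """Look up the physical node that currently owns a key's vnode."""
    return assignment[vnode_for(key)]

table_small = assign(["node-1", "node-2", "node-3"])
table_large = assign(["node-1", "node-2", "node-3", "node-4"])

# The vnode for a key is the same under both tables; only its owner may differ.
print(vnode_for("user:42"), owner("user:42", table_small), owner("user:42", table_large))
```

Two nodes holding different copies of the assignment table would route the same key to different owners, which is where the consensus mechanism comes in. A real placement policy would also try to move as few vnodes as possible on a membership change, which this round-robin version does not.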

artellectual|2 years ago

Will definitely check out Discord’s hash ring!

Thank you for your feedback.