I've implemented CHAMP [0] for Peergos in the IPLD/IPFS setting, largely based on the Steindorfer paper. One improvement, which whyrusleeping from IPFS came up with, is to allow a small number of hash collisions at a level before pushing entries down to the next level.
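A rough sketch of that tweak (hypothetical names and threshold, not Peergos's actual code): each slot holds a small bucket of entries whose hashes collide on this level's bit fragment, and a child node is only created once the bucket overflows.

```python
# Sketch only: tolerate up to MAX_BUCKET colliding entries per slot before
# pushing them down a level.  Names and constants are illustrative.
MAX_BUCKET = 3   # assumed tuning threshold, not a value from the paper
BITS = 5         # bits of hash consumed per level

def insert(node, key, value, hash_, depth):
    """node is a dict: slot -> bucket (list of entries) or child node (dict).
    A real implementation must also bottom out when hash bits run out."""
    idx = (hash_ >> (depth * BITS)) & ((1 << BITS) - 1)
    bucket = node.setdefault(idx, [])
    if isinstance(bucket, dict):              # slot already split: recurse
        insert(bucket, key, value, hash_, depth + 1)
        return node
    for i, (k, _h, _v) in enumerate(bucket):  # existing key: update in place
        if k == key:
            bucket[i] = (key, hash_, value)
            return node
    bucket.append((key, hash_, value))
    if len(bucket) > MAX_BUCKET:              # too many collisions: push down
        child = {}
        for k, h, v in bucket:
            insert(child, k, v, h, depth + 1)
        node[idx] = child
    return node
```

The trade-off is fewer, fatter nodes near the root (fewer fetches, which matters when every node is a network round-trip as in IPFS) at the cost of a short linear scan within a bucket.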
I haven't implemented it myself, though I mean to try one day. But I've spoken to Michael before and have used his implementation: https://github.com/usethesource/capsule
Can someone explain the reasoning for using a HAMT here? I have played around with hash tables quite a bit, but although HAMTs are cool and much more memory efficient, they are likely not faster and will at some point suffer from more severe memory fragmentation than hash tables.
Is the concern speed or memory? Because speed would seem strange: after changing to the new architecture, they should see a strong decrease in messages exchanged anyway, right?
If they are cache friendly then they can be very fast. The papers by Phil Bagwell (may he rest in peace) are very rich. Lots of implementation detail. Check them out!
Maybe I'm just missing it, but I don't see in this description any explanation of how hash collisions are handled, i.e., when two keys map to the same hash value. Do you iterate through an infinite series of secondary hash functions when you run out of distinguishing bits, or do you bottom out at a table that you scan linearly?
Overall, it seems like the algorithm is "maintain a trie on the hash values, using an auxiliary bitmap and popcnt to determine the index within any trie node." Does that miss anything?
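The bitmap/popcount part of that summary can be sketched as follows (illustrative names only): a node stores a 32-bit occupancy bitmap plus a dense array holding only the occupied slots, and the array index of logical slot `s` is the popcount of the bits strictly below `s`.

```python
# Sketch of bitmap + popcount indexing in a HAMT node.  A node holds a
# 32-bit bitmap and a dense `entries` array with one element per set bit.

def array_index(bitmap: int, slot: int) -> int:
    # number of occupied slots below `slot` == position in the dense array
    return bin(bitmap & ((1 << slot) - 1)).count("1")

def get(bitmap, entries, slot):
    if not (bitmap >> slot) & 1:   # bit clear: nothing stored at this slot
        return None
    return entries[array_index(bitmap, slot)]
```

This is what lets a node with only three children allocate a three-element array instead of a sparse 32-slot one.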
omginternets | 5 years ago
Better yet, has anybody here implemented one before, and might I pick your brain?
[0] https://michael.steindorfer.name/publications/oopsla15.pdf
ianopolous | 5 years ago
[0] https://github.com/Peergos/Peergos/blob/master/src/peergos/s...
Apanatshka | 5 years ago
joshlemer | 5 years ago
eeperson | 5 years ago
EDIT: It looks like it did make it in: https://github.com/scala/scala/commit/d5ae93e1b35fac4002cfbf...
[0] https://github.com/scala/collection-strawman/issues/192
neutronicus | 5 years ago
https://github.com/arximboldi/immer
majke | 5 years ago
benibela | 5 years ago
https://github.com/benibela/hamt
invisiblerobot | 5 years ago
huhnmonster | 5 years ago
Or are Go's maps inefficient? Genuinely curious.
fsloth | 5 years ago
masklinn | 5 years ago
They’re slower and less efficient than a mutable hashmap in the single-version case, but far more efficient if you need to keep the “old revisions” around.
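The “old revisions” point is the crux: in a path-copied trie, an update allocates one new node per level on the path to the key and shares every untouched subtree with the previous version. A minimal sketch (a fixed-depth 4-way trie for 8-bit keys, illustrative only and not any particular library):

```python
# Sketch of persistence via path copying: assoc() returns a NEW trie and
# never mutates its argument, so old versions remain valid and cheap.

def assoc(node, key, value, depth=0):
    """Return a new 4-way trie with `key` set; `node` is left untouched."""
    if node is None:
        node = (None,) * 4
    slot = (key >> (2 * depth)) & 3       # consume 2 bits of the key per level
    children = list(node)
    if depth == 3:                        # leaves sit at fixed depth (8-bit keys)
        children[slot] = value
    else:
        children[slot] = assoc(node[slot], key, value, depth + 1)
    return tuple(children)                # only the path to the key is copied

def lookup(node, key, depth=0):
    if node is None:
        return None
    slot = (key >> (2 * depth)) & 3
    return node[slot] if depth == 3 else lookup(node[slot], key, depth + 1)

v1 = assoc(None, 5, "old")
v2 = assoc(v1, 5, "new")   # v1 is still fully usable afterwards
```

With a mutable hashmap, keeping an old revision means copying the whole table; here it costs a handful of small nodes, which is why persistent maps win once you need history.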
jsdavo | 5 years ago
sfink | 5 years ago