I'd love to see a size comparison of the DWARF encoding vs the binary search map. I have a strong suspicion that there's a neat perfect hash solution to this problem - perfect hashes can encode keys in single digit #bits, and you get faster lookups.
Let me check! I don't our approach to be significantly more compact than DWARF's. DWARF is very compact despite being so much more expressive.
Funny that you mentioned perfect hashing, I thought about this, but didn't go for this approach as it would have too many drawbacks for our use-case:
- It would further increase the complexity of the unwinder, which is already not trivial. Ideally, we would like to reduce the surface area of things that can go wrong.
- Implementing perfect hashing in BPF might be tricky for several reasons. First it might take quite a bit of precious instructions, but also, it would force us to ship a compiler to generate the BPF code in the fly rather than shipping it pre-compiled. We really want to avoid this.
- Last, it would force us to generate entries for every program counter, while thanks to using binary search we can omit redundant entries. DWARF expressions would suffer from a similar issue.
javierhonduco|3 years ago
Funny that you mentioned perfect hashing, I thought about this, but didn't go for this approach as it would have too many drawbacks for our use-case:
- It would further increase the complexity of the unwinder, which is already not trivial. Ideally, we would like to reduce the surface area of things that can go wrong.
- Implementing perfect hashing in BPF might be tricky for several reasons. First it might take quite a bit of precious instructions, but also, it would force us to ship a compiler to generate the BPF code in the fly rather than shipping it pre-compiled. We really want to avoid this.
- Last, it would force us to generate entries for every program counter, while thanks to using binary search we can omit redundant entries. DWARF expressions would suffer from a similar issue.