They were only touched on (and just barely) in my CS education, so don’t feel too left out. Spend an evening or two on the Wiki for Probabilistic data structures[0]. With a CS education you should have the baseline knowledge to find them really fascinating. Enjoy!
Oh, and I don’t find myself actually implementing any of these very often or knowing that they are in use. I occasionally use things like APPROX_COUNT_DISTINCT in Snowflake[1], which is a HyperLogLog (linked in the Wiki).
I didn’t even hear of Bloom filters until Google posted an article about their use of them back around 2000, which is at least 25 years after they were invented. Anger was my emotion, not impostor syndrome. I went to a top ten school and none of my profs thought to mention them. Had to learn AVL trees twice though. Which I’ve used fuck-all.
they're common in databases and performance instrumentation of various kinds (as are other forms of data structure "sketch" like count sketches) but not as common outside those realms.
i've gotten interview questions best solved with them a few times; a Microsoft version involved spell-checking in extremely limited memory, and the interviewer told me that they'd actually been used for that back in the PDP era.
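For anyone who hasn't seen one, here is roughly what the spell-checking trick looks like as a Bloom filter. This is only a minimal sketch, not what Microsoft or any PDP-era program actually ran: the `BloomFilter` class, its parameters, and the SHA-256 double-hashing scheme are all assumptions picked for illustration. The sizing uses the standard formulas for bit count and hash count given an expected item count and a target false-positive rate.

```python
import hashlib
import math


class BloomFilter:
    """Hypothetical illustration: set membership with no false negatives."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        n, p = expected_items, false_positive_rate
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(8, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, word: str):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never 0
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, word: str) -> None:
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word: str) -> bool:
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(word)
        )


# Usage: load a (toy) dictionary, then check words against it.
dictionary = ["bloom", "filter", "sketch", "probabilistic"]
speller = BloomFilter(expected_items=len(dictionary))
for w in dictionary:
    speller.add(w)
print("filter" in speller)  # a word that was added is always reported present
```

The point for the interview question: the filter stores only bits, never the words themselves, so at a 1% false-positive target it needs about 10 bits per dictionary word. The cost is that a misspelling occasionally slips through as "probably a word", while a correctly spelled dictionary word is never flagged.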
My education didn't touch upon it but I've been grilled on it multiple times in interviews.
I learned about them after the first time I got grilled and rejected. Sucks to be the first company that grilled me about it. Thanks for the tip, though; you just didn't stick around long enough to see how fast I learn.
That curriculums didn’t have distributed computing classes when I was in school (mine did, but few took it) made some sense. That modern coursework omits it is unconscionable.
benmanns|7 days ago
[0]: https://en.wikipedia.org/wiki/Category:Probabilistic_data_st...
[1]: https://docs.snowflake.com/en/sql-reference/functions/approx...
hinkley|7 days ago
vyr|7 days ago
dheera|7 days ago
jb3689|7 days ago
hinkley|7 days ago
on_the_train|7 days ago