I learned Hamming codes in school this summer (digital systems). It was probably the single most magical thing (even more than flip-flops! Loops make memory!) I was exposed to in that class.
I remember upon learning it someone went "wow, who came up with that". The prof answered dryly "some genius named Hamming".
Sort of a probabilistic hash, where you trade space for accuracy. But it's also like a memory function: it can (probabilistically) remember whether it has seen a piece of data before.
What I love about bloom filters is that when checking to see if something is in a bloom filter, you can only get false positives, not false negatives. That property is just awesome to me.
I love that they're fast enough to use in hardware. Some of the newer schemes for hardware-accelerated transactional memory use Bloom filters to probabilistically detect when two transactions are using the same memory addresses. This is so much easier and more efficient than previous approaches, which had to use content-addressable memories and expensive broadcast operations.
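For anyone who wants to poke at that false-positive/no-false-negative property, a Bloom filter fits in a few lines. This is an illustrative Python sketch; the bit count, hash count, and salted-SHA-256 hashing are arbitrary choices of mine, not from any particular library:

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0                      # a big int doubles as the bit array

    def _positions(self, item):
        # derive k independent-ish positions by salting one hash function
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos          # can only SET bits, never clear

    def might_contain(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

Since add() only ever sets bits, a lookup of something you inserted always succeeds; a lookup of something absent succeeds only if other insertions happen to have set all of its positions. Hence false positives, but never false negatives.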
Singular value decomposition. It continually amazes me how many problems -- from noise reduction to Netflix prediction -- SVD can be usefully applied to.
I still do not completely understand it. I see the math, but only as symbols I can memorize rather than something intuitive (the way I can picture a cursor moving through a state tree during a binary search, for instance).
Compressed sensing is exciting - reconstructing a high-resolution signal from far fewer measurements than you'd think necessary, by exploiting sparsity. It was discussed in Wired (http://www.wired.com/magazine/2010/02/ff_algorithm/all/1) together with a compelling example (minimizing the time a young patient spends in an MRI machine).
That sounds far too good to be true. I guess if you don't care about the fine details then it's a decent technique, but to take the MRI example: what if the thing that was wrong was only visible in those small details?
For me, it's boosting. Basically, if you make the assumption that your weak classifier can achieve X% correct on any random distribution over a data set, then you can create an ensemble of weak learners that together get Y% correct on the same set, where Y > X.
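To make that concrete, here's a toy AdaBoost run in plain Python. The 1-D dataset (labels + + + - - - - + + +) can't be separated by any single threshold stump, which tops out at 70% accuracy, yet three rounds of reweighting produce a weighted vote that classifies it perfectly. The dataset and round count are invented for illustration:

```python
import math

# labels: + + + - - - - + + +  -- no single threshold separates them
data = [(x, 1 if x in (0, 1, 2, 7, 8, 9) else -1) for x in range(10)]

def stump_predict(threshold, polarity, x):
    return polarity if x > threshold else -polarity

def train_adaboost(data, rounds=3):
    n = len(data)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # exhaustively pick the stump with the lowest weighted error
        best = None
        for threshold in [x - 0.5 for x, _ in data] + [data[-1][0] + 0.5]:
            for polarity in (1, -1):
                err = sum(w for (x, y), w in zip(data, weights)
                          if stump_predict(threshold, polarity, x) != y)
                if best is None or err < best[0]:
                    best = (err, threshold, polarity)
        err, threshold, polarity = best
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-12))
        ensemble.append((alpha, threshold, polarity))
        # reweight: misclassified points get heavier, the rest lighter
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, polarity, x))
                   for (x, y), w in zip(data, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    vote = sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble)
    return 1 if vote >= 0 else -1
```

Round 1 here picks a left-edge stump, round 2 a right-edge stump, and round 3 (after the reweighting) a constant classifier; the weighted vote of the three is correct everywhere.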
Please don't post links through bit.ly and other URL indirection sites. It hurts the web by making it much more difficult to follow the link in the case that an intermediary goes out of business. Hacker News and your post will probably be around in a few years, but will bit.ly? (What's their business model, exactly...?)
It's been discussed here before, and no doubt it'll be cited numerous times in the comments of that reddit post, but it was the first time I'd ever seen such trickery. When I got into low level DSP programming I saw many more examples of clever hacks, but this is the one which sticks in my mind above all of those.
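If I'm reading the thread right, the trick in question is Quake III's fast inverse square root (the Beyond3D link elsewhere in this thread). A Python rendition, using struct to stand in for C's pointer-cast bit reinterpretation:

```python
import struct

def fast_inv_sqrt(x):
    # reinterpret the float32's bits as an unsigned integer
    i = struct.unpack('<I', struct.pack('<f', x))[0]
    # the magic constant: a surprisingly good first guess at 1/sqrt(x)
    i = 0x5f3759df - (i >> 1)
    y = struct.unpack('<f', struct.pack('<I', i))[0]
    # one Newton-Raphson step refines the guess to within ~0.2%
    return y * (1.5 - 0.5 * x * y * y)
```

The shift-and-subtract works because a float's bit pattern, read as an integer, is roughly a scaled-and-offset logarithm of the value, and log(1/sqrt(x)) = -log(x)/2.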
My mind is blown by the algorithm for matching with mismatches which I present in the first chapter of my thesis. It shouldn't be, given that I discovered this algorithm -- but somehow "I take the Fourier Transform of the FreeBSD kernel" sounds more like the punch line to a joke than the first step in an algorithm.
At least to me, it seems hard to imagine that multiplying two N-bit integers could be done with fewer than O(N^2) operations. Schönhage-Strassen lets you do it using O(N log N log log N) operations!
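Schönhage-Strassen needs FFTs over rings to get all the way there, but the first crack in the O(N^2) intuition is much smaller: Karatsuba's trick replaces four recursive multiplications with three, giving O(N^1.585). A decimal-digit version in Python:

```python
def karatsuba(x, y):
    # multiply nonnegative integers using 3 recursive multiplications
    # instead of the schoolbook 4
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    high_x, low_x = divmod(x, 10 ** m)
    high_y, low_y = divmod(y, 10 ** m)
    z0 = karatsuba(low_x, low_y)
    z2 = karatsuba(high_x, high_y)
    # (low+high)(low'+high') - z0 - z2 recovers the cross terms in ONE multiply
    z1 = karatsuba(low_x + high_x, low_y + high_y) - z0 - z2
    return z2 * 10 ** (2 * m) + z1 * 10 ** m + z0
```

The same "trade multiplications for additions" idea, pushed through polynomial evaluation and ultimately FFTs, is what leads to the N log N log log N bound.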
High dimensional work is bloody hard, and this algorithm works amazingly well. I've spoken with Lenstra (one of them) and he's amazingly insightful on these things. He helped to crystalise my understanding of why high-dimensional spheres should be thought of as "spikey," rather than "round."
Agreed, but while the notation's simple, the iterative effect is hard to follow. It's like a spinning rod: each iteration the angle doubles and the length is squared, and then a rod of fixed length and angle is added (if the spinner is facing away, the sum gets shorter; if towards, longer). Iterate until the length goes over 2. If it never does, the point is in the Set (or maybe you just need to iterate more...). [That's my current understanding.]
It's pretty clear to me that I can't visualize what it will do over a few iterations (apart from simple cases) - and I sure can't see how it would produce self-similar (i.e. fractal) shapes. Agreed that that's amazing.
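The spinning rod is exactly the complex-number iteration z -> z^2 + c (squaring a complex number doubles its angle and squares its length; adding c is the fixed rod). A membership test is a few lines; the iteration cap is the arbitrary cutoff behind the "maybe you need to iterate more" caveat:

```python
def in_mandelbrot(c, max_iter=100):
    z = 0
    for _ in range(max_iter):
        z = z * z + c          # double the angle, square the length, add the rod
        if abs(z) > 2:
            return False       # escaped: definitely NOT in the set
    return True                # survived so far: presumed in the set
```

For example, c = -1 cycles forever between 0 and -1 (inside), while c = 1 runs 0, 1, 2, 5, 26, ... and escapes immediately.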
It lets you search a completely unstructured N-item search space using the square root (!) of N queries, not the N queries you'd think were necessary.
Also, the algorithm is so simple that once you know it's possible, and provided you're very comfortable with basic quantum mechanics, it's almost trivial to find the algorithm.
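Grover's search is also small enough to simulate with plain linear algebra. A toy statevector sketch for N = 4 (2 qubits); the single iteration that floor((pi/4)*sqrt(N)) prescribes already drives the marked item's probability to exactly 1 at this size:

```python
import numpy as np

n_qubits = 2
N = 2 ** n_qubits
target = 3                                   # the marked item we're searching for

state = np.full(N, 1 / np.sqrt(N))           # uniform superposition |s>

oracle = np.eye(N)
oracle[target, target] = -1                  # flip the sign of the marked amplitude

s = np.full(N, 1 / np.sqrt(N))
diffusion = 2 * np.outer(s, s) - np.eye(N)   # inversion about the mean

for _ in range(int(np.pi / 4 * np.sqrt(N))): # ~sqrt(N) Grover iterations
    state = diffusion @ (oracle @ state)

probabilities = state ** 2
```

After the oracle, the amplitudes are (0.5, 0.5, 0.5, -0.5) with mean 0.25; reflecting each amplitude about that mean sends everything to 0 except the marked item, which lands at 1.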
Ok, not quite an algorithm, but the PCP theorem (http://en.wikipedia.org/wiki/PCP_theorem) is easily the most mind-blowing result in comp sci in the past 20 years.
PCPs are true magic. I once learned how to construct a PCP-proof with verification using just 7 bits, but it still feels like magic to me. The 3-bit variants are just crazy.
The Burrows-Wheeler transform used in bzip2 is truly mind-blowing.
Many algorithms are ordinary genius (eventually you would have come up with it because the problem space dictates the solution), but this one is extraordinary genius (mind-blowingly original and non-obvious). Every time I see it, I'm amazed that it works.
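A naive quadratic version (real implementations use suffix arrays) shows both halves of the magic: the transform clusters similar characters together, and yet it inverts exactly:

```python
def bwt(s):
    s += '\0'                                  # unique sentinel marks the end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return ''.join(rot[-1] for rot in rotations)  # last column of sorted rotations

def inverse_bwt(last_column):
    table = [''] * len(last_column)
    for _ in range(len(last_column)):
        # prepend the last column and re-sort; the table converges on
        # the sorted rotations of the original string
        table = sorted(c + row for c, row in zip(last_column, table))
    return next(row for row in table if row.endswith('\0'))[:-1]
```

The clustered runs in the output are what make a simple move-to-front plus run-length coder (as in bzip2) so effective afterward.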
Welzl's algorithm for minimum enclosing disk does it for me. Pick a point, recursively find the disk without that point, and check whether the point is inside - if not, it must be part of the definition of the minimum enclosing disk (forced to lie on its boundary). Randomized, simple, clean and expected O(n).
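A direct (if unoptimized) transcription of that idea: the boundary set holds the points forced onto the circle (at most 3), and the helpers are the standard two-point and circumcircle constructions. In practice you would random.shuffle(points) first; the expected-O(n) analysis assumes a random insertion order.

```python
def circle_two(p, q):
    # smallest circle through two points: they span a diameter
    cx, cy = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    r = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 / 2
    return (cx, cy, r)

def circle_three(a, b, c):
    # circumcircle of three points (None if collinear)
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    ux = ((ax*ax + ay*ay) * (by - cy) + (bx*bx + by*by) * (cy - ay)
          + (cx*cx + cy*cy) * (ay - by)) / d
    uy = ((ax*ax + ay*ay) * (cx - bx) + (bx*bx + by*by) * (ax - cx)
          + (cx*cx + cy*cy) * (bx - ax)) / d
    return (ux, uy, ((ax - ux) ** 2 + (ay - uy) ** 2) ** 0.5)

def in_circle(circle, p, eps=1e-9):
    return (circle is not None and
            (p[0] - circle[0]) ** 2 + (p[1] - circle[1]) ** 2
            <= (circle[2] + eps) ** 2)

def trivial(boundary):
    if not boundary:
        return (0.0, 0.0, 0.0)
    if len(boundary) == 1:
        return (boundary[0][0], boundary[0][1], 0.0)
    if len(boundary) == 2:
        return circle_two(*boundary)
    return circle_three(*boundary)

def welzl(points, boundary=()):
    if not points or len(boundary) == 3:
        return trivial(list(boundary))
    p, rest = points[0], points[1:]
    disk = welzl(rest, boundary)
    if in_circle(disk, p):
        return disk
    # p lies outside the disk of the others, so it must be on the boundary
    return welzl(rest, boundary + (p,))
```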
CORDIC (http://en.wikipedia.org/wiki/Cordic), used to efficiently calculate trig functions with only very basic hardware requirements (add, sub, shift, and table lookup).
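Here's a floating-point sketch of CORDIC's rotation mode. Real implementations keep x, y, z in fixed point so the multiplications by 2^-i become arithmetic shifts; the 16-iteration count and the float arithmetic here are illustrative choices:

```python
import math

ITERS = 16
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERS)]   # the lookup table

# combined gain of all the micro-rotations, folded into the starting vector
K = 1.0
for i in range(ITERS):
    K *= 1 / math.sqrt(1 + 2.0 ** (-2 * i))

def cordic_sin_cos(theta):
    # rotate (K, 0) toward angle theta, driving the residual z to zero
    # using only add/sub and power-of-two scaling
    x, y, z = K, 0.0, theta
    for i in range(ITERS):
        d = 1.0 if z >= 0 else -1.0
        x, y, z = (x - d * y * 2.0 ** -i,
                   y + d * x * 2.0 ** -i,
                   z - d * ANGLES[i])
    return y, x        # (sin(theta), cos(theta))
```

Each step rotates by a fixed angle atan(2^-i), choosing the direction that shrinks the remaining angle; after 16 steps the residual is below atan(2^-15).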
. . . relatedly, from Jim Blinn, from Marvin Minsky -- drawing a circle (near enough), or doing incremental rotation, using just integer add, sub, and two shifts:
N = 128 * 3.14159
X = 1000
Y = 0
MOVE(X,Y)
FOR I = 1 TO N
X = X - (Y >> 6)
Y = Y + (X >> 6)
DRAW(X,Y)
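For anyone who wants to run it, a direct Python transcription of the loop above. The crucial quirk is that Y is updated with the already-updated X, which is what keeps the orbit closed instead of spiraling outward:

```python
points = []
x, y = 1000, 0
n = int(128 * 3.14159)        # ~ one full revolution in steps of 2^-6 radians
for _ in range(n):
    x -= y >> 6               # x uses the OLD y...
    y += x >> 6               # ...but y uses the NEW x -- the "mistake" that
    points.append((x, y))     # turns an outward spiral into a closed ellipse
```

With both shifts by 6, each step is roughly a rotation by 1/64 radian, so about 128*pi steps trace the full circle.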
Some of the sorting algorithms. As somebody learning them, it's almost entirely opaque how anybody could have come up with them.
It seems like they must have sprung forth whole cloth to their inventors in the shower... it's almost impossible to have developed some of them iteratively, because even small changes to the algorithms produce terrible results. I remember thinking over and over again, "how the hell could somebody come up with this?"
Searching, in comparison, looks very engineered, very studied - something that most people could come up with given need, motivation and time.
The Genetic Algorithm. It totally blew my mind when I read about its successful application to evolving satellite antennas for JPL, and the automated design of an analog-to-digital signal converter that relies on a component that isn't actually connected to anything else.
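The loop itself is tiny; all the interesting behaviour comes from selection pressure. A toy "weasel"-style sketch in Python, where the target string, population size, and mutation rate are arbitrary choices of mine:

```python
import random

random.seed(1)
TARGET = "HELLO WORLD"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def fitness(s):
    # number of characters already matching the target
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s, rate=0.05):
    return ''.join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in s)

def crossover(a, b):
    cut = random.randrange(len(a))
    return a[:cut] + b[cut:]

population = [''.join(random.choice(ALPHABET) for _ in TARGET)
              for _ in range(200)]
for generation in range(200):
    population.sort(key=fitness, reverse=True)
    if population[0] == TARGET:
        break
    parents = population[:50]          # selection: keep the fittest quarter
    population = parents + [mutate(crossover(random.choice(parents),
                                             random.choice(parents)))
                            for _ in range(150)]
```

Selection, crossover, and mutation are the whole algorithm; the evolved-antenna and circuit results come from swapping this string fitness function for a physics simulation.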
jgrahamc | 15 years ago
http://en.wikipedia.org/wiki/Hamming(7,4)
I got interested in these codes because of the problems involved with deep space communication.
http://en.wikipedia.org/wiki/Error_detection_and_correction#...
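Hamming(7,4) is compact enough to write out in full. Each parity bit sits at a power-of-two position and covers exactly the positions whose index has that bit set, so the three failed parity checks literally spell out the binary address of the flipped bit:

```python
def hamming_encode(d):
    # d = [d1, d2, d3, d4]; parity bits go at positions 1, 2, 4 (1-indexed)
    p1 = d[0] ^ d[1] ^ d[3]        # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]        # covers positions 2, 3, 6, 7
    p3 = d[1] ^ d[2] ^ d[3]        # covers positions 4, 5, 6, 7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def hamming_decode(code):
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3    # 1-indexed position of the flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1           # correct the single-bit error
    return [c[2], c[4], c[5], c[6]]
```

Flip any single bit of a codeword and the decoder silently repairs it; that self-addressing syndrome is the part that felt like magic in class.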
mayank | 15 years ago
And the reasons why suffix trees blow my mind:
(1) Knuth called it "Algorithm of the year 1973" (possibly because it beat an intuitive lower bound that he had for a problem I can't remember).
(2) It's relatively new for a "core", first-principles algorithm. Ukkonen's algorithm is from 1995, although there are earlier versions.
(3) This BOOK is largely devoted to the many, many applications of suffix trees: http://www.amazon.com/Algorithms-Strings-Trees-Sequences-Com... It should be required reading for anyone interested in advanced algorithms.
(4) Longest common substring in linear time.
(5) Lempel-Ziv decomposition in linear time.
(6) Longest repeated substrings in linear time.
And too many more to list.
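As a concrete taste of (6): with a suffix tree it's linear time, but even a naive sorted-suffixes stand-in makes the idea visible - the longest repeat is the longest common prefix of two neighbours in suffix order:

```python
def longest_repeated_substring(s):
    # naive suffix-array version: the sort makes this superlinear, but the
    # structure is the same one a suffix tree exploits in O(n)
    suffixes = sorted(s[i:] for i in range(len(s)))
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1                      # length of the common prefix
        if n > len(best):
            best = a[:n]
    return best
```

In a suffix tree the same answer falls out as the deepest internal node, with no sorting or pairwise comparison at all.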
gamache | 15 years ago
http://en.wikipedia.org/wiki/Singular_value_decomposition
cdavid | 15 years ago
Someone else already mentioned compressed sensing, which expands on some of those ideas. Terence Tao had a pretty good presentation on the topic: http://terrytao.files.wordpress.com/2009/08/compressed-sensi...
endtime | 15 years ago
Question 3 here will walk you through the proof: http://bit.ly/cQ03na
Figs | 15 years ago
For future reference, the unshortened version of his link is: http://www.stanford.edu/class/cs221/handouts/cs221-ps2.pdf
smcl | 15 years ago
http://www.beyond3d.com/content/articles/8/
michael_nielsen | 15 years ago
http://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen...
RiderOfGiraffes | 15 years ago
http://en.wikipedia.org/wiki/Lenstra%E2%80%93Lenstra%E2%80%9...
cosbynator | 15 years ago
Not enough people know about it! http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46....
ardit33 | 15 years ago
Think about it: every time you get a Google Maps direction/route, one of them is in play.
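Presumably "one of them" means a shortest-path algorithm in the Dijkstra/A* family. A minimal heap-based Dijkstra over a made-up toy graph (real routing engines layer many optimizations on top):

```python
import heapq

def dijkstra(graph, start):
    # graph: {node: [(neighbour, edge_cost), ...]}
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                    # stale queue entry, skip it
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd            # found a shorter route to v
                heapq.heappush(heap, (nd, v))
    return dist
```

A* is the same loop with the heap keyed on distance-so-far plus a heuristic estimate of the distance remaining.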
varjag | 15 years ago
http://en.wikipedia.org/wiki/Rapid_Spanning_Tree_Protocol