item 3431709

Damn Cool Algorithms: Fountain Codes

306 points | nl | 14 years ago | blog.notdot.net

62 comments

[+] tikhonj|14 years ago|reply
This sounds really similar to secret sharing[1]. While it doesn't do exactly the same thing, a simple secret sharing algorithm is also interesting to read about now: it's a different algorithm that chunks up and reassembles data in a similar way, but with different properties.

One difference is that with secret sharing, unless you have enough chunks, you have no information at all about the result. (That's why it's called secret sharing: you can share a secret with a bunch of people without any of them knowing it unless a minimum number get together.) The neat trick is that any sufficiently large subset of the chunks will get you all of the information.

[1]: http://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing

In fact, the algorithm (at a high level) is exceptionally simple: encode your data as the coefficients of a polynomial of degree n and transmit it as a bunch of points on the polynomial. You then need n+1 points to recover the polynomial, but it doesn't matter which particular points you have.

What I love about this algorithm is that it is useful, nontrivial and can be explained in a single sentence (by a better writer than I :)).
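The polynomial trick tikhonj describes can be sketched in a few lines. This is an illustrative toy, not a hardened implementation: the prime, threshold, and share count are arbitrary choices, and the secret is assumed to fit in a single field element.

```python
# Toy Shamir-style secret sharing sketch (illustrative only).
# The secret is the constant term of a random degree-(k-1) polynomial
# over a prime field; any k of the n shares recover it by Lagrange
# interpolation at x = 0.
import random

P = 2**31 - 1  # a prime; the field size here is an arbitrary choice

def make_shares(secret, k, n):
    # coefficients: the secret plus k-1 random ones
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    def poly(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def recover(shares):
    # Lagrange interpolation at x = 0 over GF(P);
    # pow(den, P-2, P) is the modular inverse via Fermat's little theorem
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

Any k of the n shares reconstruct the same polynomial, hence the same secret, while k-1 shares are consistent with every possible secret.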

[+] zacharyvoase|14 years ago|reply
Another way of phrasing the same concept:

An infinite number of Euclidean planes can be drawn through a single point, but you only need three of them (any three in general position) to find the original point.

[+] epaulson|14 years ago|reply
Alas, digital fountains are tied up in patents, and people have been leery:

http://www.eecs.harvard.edu/~michaelm/postscripts/itw2004.pd...

[+] michaelf|14 years ago|reply
I know there are lots of anti-patent folks on HN, but if anything deserves patent protection, digital fountains certainly do. This is useful, novel, and certainly non-obvious stuff. And really quite beautiful, IMHO.

It's always bothered me to read about inventors who discovered beautiful and amazing things, only to die penniless while powerful interests reaped all of the financial benefits. Hopefully the people who discovered these codes have been richly rewarded for their work.

[+] jleader|14 years ago|reply
Why does the described algorithm only decode transmitted blocks when the receiver has decoded all but one of the constituent blocks? For example, if the receiver has received the following 3 blocks:

  1. A ^ B ^ C
  2. A ^ B
  3. B ^ C
it can combine #1 and #2 to decode C, then combine C and #3 to decode B, then combine B and #2 to decode A.

The algorithm as described would in this situation make the receiver wait until it had received a single-block message (A, B, or C on its own) before decoding anything. This strikes me as inefficient.

Edited to add: I think this is analogous to forward vs. backward chaining: you can either start by combining single blocks with composite blocks to simplify them, or start by combining blocks whose component block lists differ by only a single block. Or you could apply both, which should get the greatest simplification from the smallest amount of input.
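jleader's example can be checked directly. A tiny sketch, using arbitrary byte values for the three source blocks:

```python
# Working through jleader's example with concrete (arbitrary) byte values:
# the three received messages are XOR combinations of source blocks A, B, C.
A, B, C = 0x11, 0x22, 0x44

m1 = A ^ B ^ C   # received block 1: A ^ B ^ C
m2 = A ^ B       # received block 2: A ^ B
m3 = B ^ C       # received block 3: B ^ C

# No degree-1 block was received, yet everything is decodable:
C_dec = m1 ^ m2          # (A^B^C) ^ (A^B) = C
B_dec = C_dec ^ m3       # C ^ (B^C) = B
A_dec = B_dec ^ m2       # B ^ (A^B) = A

assert (A_dec, B_dec, C_dec) == (A, B, C)
```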

[+] baruch|14 years ago|reply
Normally one would use some distribution over the number of source blocks encoded into each transmitted block: say, 30% single blocks, 30% double blocks, 15% triple, and so on, down to a relatively small number of high-order encoded blocks. This means that while you can find a solution the way you posted, it's a rare case that you have no single block to start decoding from. And since in any case you need to receive a bit over 100% of the blocks before you can decode anything, there is no real reason to go to exquisite lengths to find the perfect solution when the trivial case works 100% of the time (well, 99.9999% if you care to be accurate :-)

The algorithm is also not primarily about computational efficiency; checking at any point whether you can decode requires quite a bit of work anyhow. Its main advantage is its relatively low complexity and high efficiency on the transmission line.

[+] foobarqux|14 years ago|reply
"Systematic code" is the term of art in coding theory. There are systematic fountain codes; the Raptor code adopted by 3GPP is systematic.
[+] sown|14 years ago|reply
Where do people keep these algorithms around? When I've asked in the past about reference guides for stuff like this, people throw Knuth's "Art of Computer Programming" at me, which is a neat book but doesn't cover esoteric, useful, but hard-to-find stuff like this.

Where do I find stuff like this?!

[+] repiret|14 years ago|reply
What's the advantage of fountain codes over better-known, deterministic forward error correction techniques like Reed-Solomon codes?

The D.J.C. MacKay paper linked from the article lists applications in broadcast and data storage, but Reed-Solomon would work equally well in both of those applications.

Hypothesis: With Reed-Solomon and friends, if you're unlucky and lose the /right/ n blocks, for small n, you can't recover your data. Is that n larger for LT? If I could maliciously corrupt your blocks, would I have to corrupt more of them with LT?

Hypothesis: If I randomly corrupt blocks with Reed-Solomon, does my likelihood of being unable to recover all my data drop faster than if I did so with LT?

Can someone say authoritatively?

[+] jsyedidia|14 years ago|reply
The advantage is you don't need to know the channel characteristics ahead of time. In particular, if you used a Reed-Solomon code, you'd use one designed for one particular erasure rate. But that code would be too weak for a channel with a higher erasure rate (it would fail to decode) and it would be too strong for a channel with a lower erasure rate (it would be wasteful and use too many parity bits). The advantage of fountain codes is that they work efficiently whatever the channel's erasure rate.

By the way, I've been talking about channels with an erasure rate (the Binary Erasure Channel) because fountain codes were originally invented in that context. But you can find in the literature papers on "Rateless codes for noisy channels" that show that the fountain codes also work fine for bit-flipping channels like the binary symmetric channel and the additive white Gaussian noise channel.

[+] jacquesgt|14 years ago|reply
Reed-Solomon and similar codes correct for bit errors in a particular block by transmitting extra bits in each block. Fountain codes correct for missing blocks by transmitting extra blocks.

If you just used RS, the only way to overcome really long bursts of errors (thousands or millions of bits long) would be either to request acknowledgements from the receivers and retransmit unacknowledged blocks, or, if acknowledgements weren't possible, to transmit everything multiple times. And if you had enough receivers, each missing random packets, you would end up retransmitting everything even with acknowledgements. By using fountain codes, you can transmit an extra few percent of blocks and still succeed in the face of long dropouts.

[+] wmf|14 years ago|reply
Parity codes (of which fountains are a subset) are based on XOR, making them much faster than RS's Galois field arithmetic. Parity codes are generally less efficient than RS (you have to receive 5-10% extra data to recover), but the net performance is better due to faster decoding.

jsyedidia explained the benefits of ratelessness, so I won't repeat it.

[+] phzbOx|14 years ago|reply
Didn't know about that algorithm; pretty cool. Also, the way the author explained it made it a joy to read: clear, simple examples, meaningful graphics. Can't wait to start reading the previous blog posts (about other cool algorithms).
[+] kcl|14 years ago|reply
It wasn't immediately clear to me that the ideal soliton distribution always sums to one. There is a simple proof by induction; I wrote it up here: http://kevinlawler.com/soliton
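For reference, the ideal soliton distribution is rho(1) = 1/K and rho(d) = 1/(d(d-1)) for d = 2..K, and the sum telescopes to 1 since 1/(d(d-1)) = 1/(d-1) - 1/d. A quick exact check (the values of K are arbitrary):

```python
# Verify that the ideal soliton distribution sums to exactly 1,
# using exact rational arithmetic to avoid floating-point doubt.
from fractions import Fraction

def ideal_soliton(K):
    # rho(1) = 1/K; rho(d) = 1/(d*(d-1)) for d = 2..K
    return [Fraction(1, K)] + [Fraction(1, d * (d - 1)) for d in range(2, K + 1)]

for K in (2, 10, 1000):
    # sum_{d=2}^{K} 1/(d(d-1)) telescopes to 1 - 1/K; adding rho(1) gives 1
    assert sum(ideal_soliton(K)) == 1
```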
[+] baruch|14 years ago|reply
It is a damn cool algorithm. I was introduced to it at a startup I worked at, and we actually tried to make use of it. The algorithm as described has trouble converging fast enough, so we had to build some extra structure on top that made it converge very quickly and very consistently. We used it for multicast communications, where feedback to the server is impractical; such codes make multicast a very effective way to distribute data.
[+] sethbuzz|14 years ago|reply
Fountain codes look very interesting. There was a project at OLPC, I think written in Forth for OpenFirmware, that used something like fountain codes to reflash massive numbers of computers. It worked like this: turn one XO-1 on in OFW mode, insert a USB stick with the disk image you wanted on all machines, and turn it on as the source server. Then boot the other XO-1s you wanted to reflash into OFW and tell them to receive a new image (I think there was a key shortcut of some kind for that). The server XO would continue to send packets until all the other machines had received enough to reconstruct a complete disk image. If I remember correctly, it was used to reflash 3000 machines in the field that I heard about; there may have been more elsewhere or since.
[+] karpathy|14 years ago|reply
It's somewhat intuitive, but I feel like there is some cool math or attempt at explanation behind this that was left out.

Why is this better than just broadcasting messages that always contain a single block? What do you gain by forming combinations? You still have an expectation of eventually getting all the information. I assume it just turns out to take longer on average, because your expected information gain from single chunks is lower once you already have many of them?

[+] JoshTriplett|14 years ago|reply
If you only broadcast messages that contain a single block, anyone who already has that block gets no value out of the message. On the other hand, if you broadcast messages generated via fountain codes, each message gives you almost a full block's worth of information on average.

If you assume the same network environment that the article does, namely one where no feedback occurs from the recipient to the sender, then if you miss a single block in the message, you'll have to wait for the sender to send that one block again. Since the sender has no information about what block you missed, you'll have to wait on average for half the blocks to get retransmitted, resulting in 50% overhead or more. It gets worse if you miss enough blocks that you have a non-trivial chance of missing those blocks again. (On top of that, some transmission mechanisms have properties that make particular data patterns particularly prone to loss, which makes it significantly more likely that you'll lose particular blocks repeatedly.)

With fountain codes and similar mechanisms, on the other hand, every message provides several chances to obtain the missing blocks, resulting in much less overhead.
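The whole encode/peel loop can be simulated end to end. This is a toy sketch under simplifying assumptions: a crude hand-picked degree distribution stands in for a proper robust soliton, the RNG is seeded for determinism, and the receiver is assumed to know each symbol's neighbor set (in practice the sender transmits the degree and seed with each symbol).

```python
# Toy LT-style simulation: the encoder emits random XOR combinations of
# source blocks; the receiver runs a peeling decoder, repeatedly resolving
# any received symbol whose set of unknown source blocks shrinks to one.
import random

def lt_simulate(source, seed=1):
    """Send random XOR symbols until the peeling decoder recovers every
    source block; return (decoded dict, number of symbols sent)."""
    rng = random.Random(seed)
    K = len(source)
    decoded = {}    # source index -> recovered value
    pending = []    # list of (set of still-unknown indices, xor value)
    sent = 0

    def peel():
        # Keep sweeping until a pass makes no progress.
        progress = True
        while progress:
            progress = False
            new_pending = []
            for idxs, val in pending:
                # Subtract out anything already decoded.
                for i in idxs & decoded.keys():
                    val ^= decoded[i]
                idxs = idxs - decoded.keys()
                if len(idxs) == 1:
                    decoded[next(iter(idxs))] = val
                    progress = True
                elif len(idxs) > 1:
                    new_pending.append((idxs, val))
                # len(idxs) == 0: symbol fully consumed, drop it
            pending[:] = new_pending

    while len(decoded) < K:
        # Crude degree distribution: mostly small degrees (stand-in for
        # a robust soliton distribution).
        d = min(rng.choice([1, 1, 2, 2, 2, 3, 3, 4]), K)
        idxs = set(rng.sample(range(K), d))
        val = 0
        for i in idxs:
            val ^= source[i]
        sent += 1
        pending.append((idxs, val))
        peel()
    return decoded, sent
```

Running this on a small file shows the fountain property directly: the receiver needs no feedback and simply collects symbols until slightly more than K have arrived, regardless of which particular symbols were lost.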

[+] jws|14 years ago|reply
One wonders how it does rate limiting, to avoid pointless resource consumption on faster, nearer links when the packets are going to be dropped by a slower, farther link.

In particular, if you intend to coexist fairly with TCP/IP on an internet, you might find your rate limiting code carries most of the work and timing issues of TCP/IP anyway.

[+] baruch|14 years ago|reply
The fountain code itself is not concerned with that; it's merely the coding layer. There needs to be a communications layer that takes care of it, and the way it does so will need to consider coexisting with TCP.

In the application I worked on, we used multicast and implemented a form of multicast congestion control that was orthogonal to the coding but exploited the fact that we could send as much data as we wanted, with the only thing changing being the time it would take to complete the transfer. It worked beautifully with the coding and coexisted nicely with TCP.

[+] loeg|14 years ago|reply
This is not relevant to fountain codes.
[+] aidenn0|14 years ago|reply
This seems most useful for broadcast. You could just broadcast the fountain code of a file for as long as you have listening receivers; the only feedback needed would be "send to me" and "done".
[+] cma|14 years ago|reply
heavily patented