
Neuralink Compression Challenge

18 points | crakenzak | 1 year ago | content.neuralink.com

27 comments


Terrk_|1 year ago

Apparently, someone solved it and achieved an 1187:1 compression ratio. These are the results:

All recordings were successfully compressed.
Original size (bytes): 146,800,526
Compressed size (bytes): 123,624
Compression ratio: 1187.47

The eval.sh script was downloaded, and the files were encoded and decoded without loss, as verified using diff.

What do you think? Is this true?

https://www.linkedin.com/pulse/neuralink-compression-challen... context: https://www.youtube.com/watch?v=X5hsQ6zbKIo

djdyyz|1 year ago

Bogus. But a nice spoof.

djdyyz|1 year ago

Analyzing the data, it becomes clear that the A/D used by Neuralink is defective, i.e. has very poor accuracy. The A/D introduces a huge amount of distortion, which in practice manifests as noise.

Until this A/D linearity problem is fixed, there is no point pursuing compression schemes. The data is so badly mangled it makes it pretty near impossible to find patterns.

djdyyz|1 year ago

It's actually amazing that Neuralink can use this badly distorted data. I imagine that fixing the A/D would improve their results dramatically: lower latency and higher precision. Why Neuralink has continued working with such an obvious hardware defect is a serious question. Do they actually analyze the A/D to make sure it's working properly?

palaiologos|1 year ago

they're looking for a compressor that can do more than 200 Mb/s on a 10mW machine (that's including radio, so it has to run on a CPU clocked like the original 8086) and yield a 200x size improvement. speaking from the perspective of a data compression person, this is completely unrealistic. the best statistical models that i have on hand yield ~7x compression ratio after some tweaking, but they won't run under these constraints.
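for scale, a back-of-the-envelope sketch in python (the ~8 MHz clock is my assumption for "an original-8086-class CPU"; the channel/rate/bit figures are the ones quoted elsewhere in the thread):

    # Rough throughput budget (all figures approximate).
    channels, sample_rate_hz, bits_per_sample = 1024, 20_000, 10
    samples_per_second = channels * sample_rate_hz               # ~20.5 M samples/s
    raw_bits_per_second = samples_per_second * bits_per_sample   # ~205 Mb/s of raw data
    cpu_hz = 8_000_000                                            # assumed 8086-class clock
    cycles_per_sample = cpu_hz / samples_per_second               # well under 1 cycle/sample
    print(f"{raw_bits_per_second / 1e6:.0f} Mb/s raw, {cycles_per_sample:.2f} CPU cycles per sample")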

iamcreasy|1 year ago

I thought 200x was too extreme as well. In the compression literature, is there a way to estimate the upper limit on lossless compressibility of a given data set?
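For example, would an empirical entropy estimate like the sketch below give a meaningful bound, or do higher-order structure and cross-channel correlation make it useless? (Python; the samples array is just a stand-in for one channel of the released data.)

    import numpy as np

    def entropy_bits(p):
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    def order0_entropy(samples):
        """Bits/sample under a memoryless (order-0) model of the sample values."""
        _, counts = np.unique(samples, return_counts=True)
        return entropy_bits(counts / counts.sum())

    def order1_entropy(samples, alphabet=1024):
        """Conditional entropy H(x_t | x_{t-1}) from bigram counts: a tighter,
        though still optimistic, bound than order-0."""
        joint = np.zeros((alphabet, alphabet))
        np.add.at(joint, (samples[:-1], samples[1:]), 1)
        pj = joint / joint.sum()
        px = pj.sum(axis=1, keepdims=True)
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(pj > 0, pj * np.log2(pj / px), 0.0)
        return float(-terms.sum())

    # Stand-in for 10-bit samples from one channel; real data would replace this.
    samples = np.random.randint(0, 1024, size=200_000)
    print(order0_entropy(samples), order1_entropy(samples))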

ClassyJacket|1 year ago

So, they're asking skilled engineers to do work for them for free, and just email it in?

Why didn't every other company think of this?

rl3|1 year ago

>So, they're asking skilled engineers to do work for them for free, and just email it in?

Yup:

"Submit with source code and build script."

But hey, the reward is a job. Maybe.

I mean, not everyone can be privileged enough to experience Ultra Hardcore™ toxic work culture.

djdyyz|1 year ago

200X is possible.

The sample data compresses poorly: very simple first-order difference encoding plus a decent Huffman coder easily gets it down to about 4.5 bits per sample.
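Roughly this pipeline, as a sketch (the samples array is a placeholder for one channel, and the entropy of the residuals stands in for what a Huffman coder over them would achieve):

    import numpy as np

    def delta_bits_per_sample(samples):
        """First-order differences, then order-0 entropy of the residuals
        (roughly what a Huffman coder over the residuals achieves)."""
        residuals = np.diff(samples.astype(np.int64), prepend=samples[0])
        _, counts = np.unique(residuals, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    # Placeholder for one 10-bit channel of the sample recordings.
    samples = np.random.randint(0, 1024, size=100_000)
    print(f"~{delta_bits_per_sample(samples):.2f} bits/sample after delta coding")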

However, let's assume there is massive cross-correlation between the 1024 channels. For example, in the extreme they are all identical, meaning that if we encode 1 channel we get the other 1023 for free. That means a lower limit of 4.5/1024 = about 0.0044 bits per sample, or a compression ratio of about 2275. Voila!

If data patterns exist and can be found, then more complicated coding algorithms could achieve better compression, or tolerate more variations (i.e. less cross-correlation) between channels.
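For instance, a toy sketch of exploiting cross-channel redundancy: fully encode one reference channel and code the rest as differences from it. (The channels array below is a random stand-in, so it won't show the gain that real, correlated recordings would.)

    import numpy as np

    def bits_per_sample(x):
        """Order-0 entropy of the values in x."""
        _, counts = np.unique(x, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    # Random stand-in for a (channels, samples) block of 10-bit data; with real,
    # correlated recordings the residuals would concentrate near zero.
    channels = np.random.randint(0, 1024, size=(1024, 20_000)).astype(np.int64)

    ref = channels[0]                 # encode this channel in full (e.g. delta + Huffman)
    residuals = channels[1:] - ref    # remaining channels coded as differences from it
    print("within-channel delta entropy:", bits_per_sample(np.diff(channels[1])))
    print("cross-channel residual entropy:", bits_per_sample(residuals[0]))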

We may never know unless Neuralink releases a full data set, i.e. 1024 channels at 20 kHz and 10 bits for 1 hour. That's roughly 90 GB, which is a lot of data, but if they want serious analysis they should release serious data.

Finally, there is no apparent reason to require lossless compression. What matters is the end result: data good enough to control the cursor and so on. Neuralink should let challengers submit DATA to a test engine that compares cursor output for the original data against cursor output for the submitted data, reports a match score, and maybe a graph or something. That sort of feedback might allow participants to create a satisfactory lossy compression scheme.
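As a purely illustrative placeholder for such a match score (the actual cursor decoder isn't public, so this just compares raw waveforms via a signal-to-distortion ratio):

    import numpy as np

    def match_score_db(original, reconstructed):
        """Toy match score: signal-to-distortion ratio in dB between the original
        data and a lossy reconstruction. A real test engine would compare cursor
        decoder output instead of raw waveforms."""
        o = original.astype(np.float64)
        err = o - reconstructed.astype(np.float64)
        return float(10 * np.log10(np.sum(o ** 2) / (np.sum(err ** 2) + 1e-12)))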

djdyyz|1 year ago

Sorry, corrected an error.

It's 2275X.

That's the compression ratio for complete cross-correlation: (10 bits uncompressed / 4.5 bits compressed on one channel) * 1024 channels.

crakenzak|1 year ago

This reminds me a lot of the Hutter Prize[1]. Funnily enough, the Hutter Prize shifted my thinking 180 degrees towards intelligence ~= compression, because to truly compress information well you must understand its nuances.

[1]http://prize.hutter1.net/

codingdave|1 year ago

And in exchange for solving their problem for them, you get... ???

I'm all for challenges, but it is fairly standard to have prizes.

occamschainsaw|1 year ago

Probably the Turing award for discovering a breakthrough compression scheme.

davikr|1 year ago

> apparently the best submissions get fast tracked to an onsite if you want a job

iamcreasy|1 year ago

< 10mW, including radio

Does that mean the radio uses a portion of this 10mW? If so, how much?

jappgar|1 year ago

why should it be lossless when presumably there is a lot of noise you don't really need to preserve?

p0nce|1 year ago

exactly. When you look at the data it looks entirely like noise, without any signal. Why transmit that in the first place? And why losslessly?