top | item 41777963

(no title)

nowayno583 | 1 year ago

Does anyone understand why they are taking the difference between transformers instead of the sum? It seems to me that in a noise reducing solution we would be more interested in the sum, as random noise would cancel out and signal would be constructive.

Of course, even if I'm right proper training would account to that by inverting signs where appropriate. Still, it seems weird to present it as the difference, especially seeing as they compare this directly to noise cancelling headphones, where we sum both microphones inputs.

discuss

order

aDyslecticCrow|1 year ago

The noise isn't truly random; it's just a matrix of small values that shouldn't be taken into account. Subtracting them cancels them out.

As pointed out by a different comment, it's actually the attention we are interested in that is cancelled out *if they are both equal*. This is what the paper mentions in its abstract;

> promoting the emergence of sparse attention patterns

In theory, it is quite clever, and their results seem to back it up.

thegeomaster|1 year ago

I suspect that plus vs minus is arbitrary in this case (as you said, due to being able to learn a simple negation during training), but they are presenting it in this way because it is more intuitive. Indeed, adding two sources that are noisy in the same way just doubles the noise, whereas subtracting cancels it out. It's how balanced audio cables work, for example.

But with noise cancelling headphones, we don't sum anything directly---we emit an inverted sound, and to the human ear, this sounds like a subtraction of the two signals. (Audio from the audio source, and noise from the microphone.)

nowayno583|1 year ago

Oh! It's been a good while since I've worked in noise cancelling. I didn't know current tech was at the point where we could do direct reproduction of the outside noise, instead of just using mic arrays! That's very cool, it used to be considered totally sci fi to do it fast enough in a small headset.