Ask HN: How do browsers isolate internal audio from microphone input?
238 points | dumbest | 1 year ago
Does anyone know how Chrome and Chromium achieve this audio isolation?
Given that Chromium is open source, it would be helpful if someone could point me to the specific part of the codebase that handles this. Any insights or technical details would be greatly appreciated!
[+] [-] padenot|1 year ago|reply
Within a single process, or a tree of processes that can cooperate, this is straightforward to do (modulo the actual audio signal processing, which isn't): keep what you're playing around for a few hundred milliseconds, compare it to what you're getting from the microphone, find correlations, cancel.
If the processes aren't related, there are multiple ways to do this. Sometimes the OS provides a capture API that does the cancellation itself, since the OS knows what is being output, and you can use that. This is what happens on macOS for Firefox and Safari, for example. It is often available on mobile as well.
Sometimes (Linux desktop, Windows) the OS provides a loopback stream: a way to capture the audio that is being played back, and that can similarly be used for cancellation.
If none of this is available, you mix the audio output and perform cancellation yourself, and the behaviour you observe happens.
Source: I do that, but at Mozilla and we unsurprisingly have the same problems and solutions.
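A minimal sketch of the "keep what you played, correlate, cancel" idea described above, using a normalized LMS (NLMS) adaptive filter. NLMS is a common textbook technique for echo cancellation; this is not taken from the Chromium or Firefox code, and the function name, parameter names, and default values are all assumptions made for illustration.

```typescript
// NLMS adaptive filter sketch: learns how the playback signal shows up in
// the microphone capture and subtracts that estimate.
// All names and defaults here are illustrative assumptions.
function cancelEcho(
  playback: number[], // reference: samples that were sent to the speakers
  mic: number[],      // capture: near-end sound plus an echo of the playback
  taps = 64,          // filter length; must cover the echo delay in samples
  stepSize = 0.5,     // adaptation rate, stable for 0 < stepSize < 2
): number[] {
  const weights = new Array<number>(taps).fill(0);
  const recent = new Array<number>(taps).fill(0); // newest playback sample first
  const output: number[] = [];
  const n = Math.min(playback.length, mic.length);

  for (let i = 0; i < n; i++) {
    // Keep a sliding window of what was recently played.
    recent.pop();
    recent.unshift(playback[i]);

    // Predict the echo from the window, and measure the window's energy.
    let echoEstimate = 0;
    let energy = 1e-8; // small constant avoids division by zero
    for (let k = 0; k < taps; k++) {
      echoEstimate += weights[k] * recent[k];
      energy += recent[k] * recent[k];
    }

    // What remains after removing the predicted echo is the output.
    const residual = mic[i] - echoEstimate;
    output.push(residual);

    // Nudge the weights toward whatever correlation is still left.
    const gain = (stepSize * residual) / energy;
    for (let k = 0; k < taps; k++) {
      weights[k] += gain * recent[k];
    }
  }
  return output;
}
```

Fed a noise-like playback signal and a microphone signal that is just a delayed, attenuated copy of it, the residual shrinks toward zero as the weights converge. Real cancellers also have to handle double-talk, clock drift between devices, and non-linear speaker distortion, none of which this sketch attempts.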
[+] [-] Johnie|1 year ago|reply
>The missile knows where it is at all times. It knows this because it knows where it isn't. By subtracting where it is from where it isn't, or where it isn't from where it is (whichever is greater), it obtains a difference, or deviation
https://knowyourmeme.com/memes/the-missile-knows-where-it-is
[+] [-] wormius|1 year ago|reply
Here's a short historical interview with Harold Black from AT&T on his discovery/invention of the negative feedback technique for noise reduction. It's not super explanatory but a nice historical context: https://youtu.be/iFrxyJAtJ7U?si=8ONC8N2KZwq3Jfsq
Here's a more in-depth circuit explanation: https://youtu.be/iFrxyJAtJ7U?si=8ONC8N2KZwq3Jfsq
IIRC the issue was that AT&T was trying to get cross-country calling working, but to make the signal carry further you needed a louder signal. Amplifying the signal also amplified the distortion.
So Harold came up with this method that ultimately allowed enough signal reduction to allow calls to cross the country within the power constraints available.
For some reason I recall something about transmission about Denver being a cut off point before the signal was too degraded... But I'm too old and forgetful so I could be misremembering something I read a while ago. If anyone has more specific info/context/citations that'd be great. Since this is just "hearsay" from memory, but I think it's something like this.
[+] [-] gpvos|1 year ago|reply
[+] [-] Log_out_|1 year ago|reply
[+] [-] generalizations|1 year ago|reply
[+] [-] umutisik|1 year ago|reply
[+] [-] Sponge5|1 year ago|reply
[+] [-] sojuz151|1 year ago|reply
[+] [-] meindnoch|1 year ago|reply
Can't tell you anything else due to NDAs.
[+] [-] Wowfunhappy|1 year ago|reply
(I realize this situation isn't up to you and I appreciate that you chimed in as you could!)
[+] [-] geor9e|1 year ago|reply
[+] [-] codetrotter|1 year ago|reply
https://news.ycombinator.com/item?id=39669626
> I've been working on an audio application for a little bit, and was shocked to find Chrome handles simultaneous recording & playback very poorly. Made this site to demo the issue as clearly as possible
https://chrome-please-fix-your-audio.xyz/
[+] [-] filleokus|1 year ago|reply
> <[email protected]>
> Status: Won't Fix (Intended Behavior)
> Looking at the sample in https://chrome-please-fix-your-audio.xyz, the issue seems to be that the constraints just aren't being passed correctly [...]
> If you supply the constraints within the audio block of the constraints, then it seems to work [...]
> See https://jsfiddle.net/40821ukc/4/ for an adapted version of https://chrome-please-fix-your-audio.xyz. I can repro the issue on the original page, not on that jsfiddle.
https://issues.chromium.org/issues/327472528#comment14
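A minimal sketch of what "supplying the constraints within the audio block" can look like. The property names `echoCancellation`, `noiseSuppression`, and `autoGainControl` are standard media track constraints; the helper `buildConstraints` and the local interface are hypothetical and exist only so the example type-checks outside a browser. This is not the code from the linked jsfiddle.

```typescript
// Local stand-in for the relevant subset of the browser's track constraints,
// declared here so the sketch compiles without the DOM type library.
interface AudioProcessingConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Hypothetical helper: nests the processing flags inside `audio`
// instead of placing them at the top level of the constraints object.
function buildConstraints(
  processing: boolean,
): { audio: AudioProcessingConstraints; video: false } {
  return {
    audio: {
      echoCancellation: processing,
      noiseSuppression: processing,
      autoGainControl: processing,
    },
    video: false,
  };
}

// In a browser, the object would be passed along these lines:
//   const stream = await navigator.mediaDevices.getUserMedia(buildConstraints(false));
```

Passing `false` asks the browser to leave the captured audio unprocessed; whether a given browser honours each flag is something to verify on the target platform.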
[+] [-] supriyo-biswas|1 year ago|reply
It's a fairly common problem in signal processing, and comes up in "simple" devices like telephones too.
[1] https://www.mathworks.com/help/audio/ug/acoustic-echo-cancel...
[+] [-] mananaysiempre|1 year ago|reply
[+] [-] kajecounterhack|1 year ago|reply
[+] [-] atoav|1 year ago|reply
This is needed because many people don't use headphones, and if you have more than one endpoint with mic and speakers open you will get feedback galore if you don't do something to suppress it.
[+] [-] j45|1 year ago|reply
I'd say it depends on the combination of the hardware/software/OS that does pieces of it on how audio routing comes together.
Generally you have to see what's available, how it can or can't be routed, and what software or settings could be enabled or added to introduce more flexibility in routing, and then make the audio routing work how you want.
More specifically some datapoints:
SOUND DRIVERS: Part of this can be managed by the sound drivers on the computer. Applications like web browsers can access those settings or list of devices available.
Software drivers can let you pick what's playing on a computer, and then specifically in browsers it can vary.
CHANNELS: There are often different channels for everything. Physical headphone/microphone jacks, etc. They all become devices with channels (input and output).
ROUTING: The input into a microphone can be just the voice, and/or system audio. System audio can further be broken down to be specific ones. OBS has some nice examples of this functionality.
ADVANCED ROUTING: There are some audio drivers that are virtual audio drivers that can also help you achieve the audio isolation or workflow folks are after.
[+] [-] alihesari|1 year ago|reply
[+] [-] _flux|1 year ago|reply
E.g. PulseAudio and Pipewire have a module for echo cancellation.
[+] [-] danhau|1 year ago|reply
There's a similar question on SO: https://stackoverflow.com/questions/21795944/remove-known-au...
[+] [-] exabrial|1 year ago|reply
What's really interesting is I can get the algorithm to "mess up" by using external speakers a foot or two away from my computer's mic! Just that little bit of travel time is enough to screw with the algo.
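For a sense of scale, here is a rough calculation of how much delay that travel distance adds. It assumes sound moves at about 343 m/s (dry air near 20 °C) and a 48 kHz capture rate; both numbers and the function name are assumptions for illustration, and the sketch says nothing about why a particular canceller fails.

```typescript
// Approximate speed of sound in dry air at about 20 °C (assumed value).
const SPEED_OF_SOUND_M_PER_S = 343;

// Hypothetical helper: converts a speaker-to-mic distance into a delay
// expressed in samples at the given sample rate.
function acousticDelaySamples(distanceMeters: number, sampleRateHz: number): number {
  const delaySeconds = distanceMeters / SPEED_OF_SOUND_M_PER_S;
  return delaySeconds * sampleRateHz;
}
```

One foot (0.3048 m) at 48 kHz works out to roughly 43 samples, a little under a millisecond, and two feet to roughly twice that.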
[+] [-] meatmanek|1 year ago|reply
It might be that whatever program you're using doesn't know the difference between speakers and headphones (possibly because you're using the 3.5mm jack?)
[+] [-] cbracketdash|1 year ago|reply
[+] [-] hpen|1 year ago|reply
[+] [-] glii|1 year ago|reply
[+] [-] bigbones|1 year ago|reply
[+] [-] sciencesama|1 year ago|reply
[+] [-] blharr|1 year ago|reply
[+] [-] sciencesama|1 year ago|reply
[+] [-] lowdownbutter|1 year ago|reply
[+] [-] sociorealist|1 year ago|reply
[deleted]