Improving Audio Quality in Duo with WaveNetEQ

93 points| theafh | 6 years ago |ai.googleblog.com

35 comments

[+] trishume|6 years ago|reply
I can't wait for gradual improvements in deep learning to allow video calls to extrapolate my presence for longer and longer time segments, until my connection can drop out for minutes and WebRTC will make it convincingly seem like I'm still participating in the meeting, making reasonable suggestions for possible code architectures and screen sharing hallucinated PowerPoint presentations.

More seriously I worry about cases where it hallucinates the wrong part of a word and changes the meaning of what I was trying to say. Although we kind of already live in this world with phone autocorrect and it's mostly fine. I'm weird though and disable autocorrect on my phone so that when I make a typo it's obviously a typo instead of just saying a plausible but wrong thing.

[+] jjoonathan|6 years ago|reply
It was easy to tell when autocorrect switched to machine learning approaches: instead of getting the basics correct and occasionally missing long words or proper names, it immediately and aggressively began to attack correct sentences, committing grade-school grammar mistakes on your behalf if you weren't vigilant enough to catch it in the act and revert its "helpful" changes.
[+] 77ko|6 years ago|reply
Duo is sad, and I feel Google is missing the point with all their tech improvements with Duo but no usability improvements.

I bought a Nest Hub Max for kids and grandparents for video calls using Duo with family, and for one on one calls it's great, especially for kids who move around, but with quarantine the whole world (my version!) has quickly moved on to group video calls which ppl can enter and leave as they please.

Consequently everyone has stopped using Duo as it doesn't support that, and turns out you need to be able to easily chat with your calling group, another thing duo doesn't support.

Zoom, WhatsApp, Facebook Messenger all work much better for talking to a family group with varying tech know-how.

Zoom is a dedicated video app but works much better than Duo as you can make a link and drop it in a chat group for ppl to join. Duo has no such features!

So my fancy Duo calling machine is now just a photo display.

[+] lern_too_spel|6 years ago|reply
Worse, Hangouts supported that at the time Duo launched. Duo was a downgrade.
[+] tacomonstrous|6 years ago|reply
Duo supports group calls on mobile devices and browsers FYI.
[+] srameshc|6 years ago|reply
I use Signal and Duo a lot to talk to my Dad in India and I can say even with low bandwidth, when nothing else works, Duo audio calls are crisp and clear.
[+] pthatcherg|6 years ago|reply
I work on calling at Signal. If Signal worked for calling your dad in India, would you use it instead of Duo? If so, would you mind helping us improve it by providing feedback for new builds and that sort of thing? If so, email me at [email protected]. Detailed feedback from you might help a lot.
[+] fulafel|6 years ago|reply
They lead with some very interesting numbers:

" 99% of Google Duo calls need to deal with packet losses, excessive jitter or network delays. Of those calls, 20% lose more than 3% of the total audio duration due to network issues, and 10% of calls lose more than 8%."

This sounds very puzzling. If those are packet loss numbers, those conditions shouldn't be able to carry normal TCP (eg HTTP/HTTPS) traffic properly. And these are the people ambitious enough to try Duo so probably 20% would be a lower bound for the unusably-bad-network user share.

Is there published research to correlate these numbers with?

Can there be other explanations? Eg people on mobile connections that just cut out for long periods of time in middle of calls?
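
For a sense of scale, the quoted percentages can be turned into seconds of missing audio. This is only a sketch: the 10-minute call length is a made-up illustration, and the blog post does not say how calls of different lengths are distributed.

```python
def lost_audio_seconds(call_seconds, lost_percent):
    """Seconds of audio missing when lost_percent of the call's audio duration is lost."""
    return call_seconds * lost_percent / 100

# Hypothetical 10-minute call; the 3% and 8% thresholds come from the quote above.
call_seconds = 10 * 60
for lost_percent in (3, 8):
    seconds = lost_audio_seconds(call_seconds, lost_percent)
    print(f"{lost_percent}% of a {call_seconds} s call = {seconds:.0f} s of audio lost")
```

Note that the quote talks about lost audio duration, which is not necessarily the same thing as the raw packet loss rate on the link.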

[+] nobrains|6 years ago|reply
TCP is self correcting. So in the end the receiver gets the whole data, correctly. But it happens with a delay.

For audio you need a real-time synchronous data feed, and cannot wait for the error correction round trip to happen.
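
A toy model of that trade-off, as a rough sketch rather than a description of how TCP or Duo actually behave. All the numbers (20 ms packets, 50 ms one-way delay, 100 ms round trip, 60 ms jitter buffer) are assumptions picked for illustration, and a lost packet is assumed to arrive exactly one round trip late after a single retransmission.

```python
PACKET_MS = 20         # audio carried per packet (assumed)
ONE_WAY_MS = 50        # one-way network delay (assumed)
RTT_MS = 100           # extra delay for one retransmission (assumed)
JITTER_BUFFER_MS = 60  # how long the receiver can wait before playout (assumed)

def reliable_in_order_delivery(n_packets, lost):
    """Time each packet reaches the app when losses are retransmitted
    and data is handed over strictly in order (TCP-like)."""
    delivered = []
    previous = 0
    for i in range(n_packets):
        arrival = i * PACKET_MS + ONE_WAY_MS
        if i in lost:
            arrival += RTT_MS  # simplification: one retransmission, one RTT later
        # Head-of-line blocking: a packet cannot be delivered before its predecessor.
        previous = max(previous, arrival)
        delivered.append(previous)
    return delivered

def late_packets(n_packets, lost):
    """Packets that miss their playout deadline under reliable in-order delivery."""
    delivered = reliable_in_order_delivery(n_packets, lost)
    return [
        i for i in range(n_packets)
        if delivered[i] > i * PACKET_MS + ONE_WAY_MS + JITTER_BUFFER_MS
    ]

print(late_packets(10, {3}))
```

In this model, losing packet 3 also makes packet 4 miss its deadline, because it is stuck behind the retransmission. An unreliable real-time transport would be missing only packet 3, which is the gap a concealment scheme then has to fill.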

[+] yufeng66|6 years ago|reply
I can't tell if this is an April Fools' joke or real
[+] ttul|6 years ago|reply
Google isn’t doing April Fools this year.
[+] jepcommenter|6 years ago|reply
Next step is to build speaker model on device and send it to peer during call setup
[+] josteink|6 years ago|reply
Not to sound negative, but what’s the point?

Does people really still use a Google-based communication program after all the ones Google has killed in the past? Is there an actual user base to reach using them?

Personally I’m avoiding them all and never investing in any of these again.

Also: We’re a few years down the line and Duo still has fewer features than the other Google-owned IM product it “replaced”, ffs.

Rant over. (And yes, I’m still somewhat sore about that whole Gchat thing and how that went down.)

[+] bilal4hmed|6 years ago|reply
Google Pay, YouTube and Duo are very, very popular in India. Try thinking global and not so US focussed for everything
[+] recursive|6 years ago|reply
> Does people really still use a Google-based communication program

Yes.

[+] tylerchilds|6 years ago|reply
The thing about Google is that they're pretty R&D heavy. Making Duo into a killer video app isn't the point, it's to handle all the technical edge cases in a problem space and then apply that to every system at Google.

I use Google Fi for phone calls and it seems like it's fairly lossy, but they can take their improvements, such as this one from Duo, and apply them in software to make the loss less noticeable to me, a user on a completely different product line.

[+] freepor|6 years ago|reply
I use them, because when they disappear I can quickly move to a new one. If I'm building a data "asset" then I am careful about using something that will be around for the long term but if it's just for an ephemeral video call I'll use Joe's until he goes out of business then I'll use Bob's.