Taek | 15 days ago
Gemini 3 Pro Preview has superlative audio listening comprehension. If I send it a recording of myself in a car where I'm talking in English, another passenger is talking to the driver in Spanish, and the radio is playing in Portuguese, Gemini can parse all 4 audio streams as well as other background noises and give a translation for each one. It also figures out which voice belongs to which person, and what everyone's names are (if it's possible to figure that out from the conversation).
I'm sure it would have superlative audio generation capabilities too, if such a feature were enabled.
nickpsecurity | 15 days ago