top | item 44960613

ryeguy_24 | 6 months ago

Exactly. This was my point. Televisions can upconvert from 720p to 4K. In the same sense, a machine learning model could fill in the waveform and mimic a high-powered mic. It could do this at the connection point (the iPhone or computer).

bigyabai|6 months ago

Televisions have considerably more temporal data to work with than an audio stream does. It's very easy to hack together interpolated images; it's not so easy to predict/denoise/upres time-series audio information.

Past a certain point it's probably easier/more efficient to use the AirPods as a speech-to-text mic and then infer a "high quality" text-to-speech version on your connected device.
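
To make the interpolation point concrete, here is a rough sketch (not anything a TV or a real upscaling model does) of why naive interpolation alone can't "upres" audio. Everything in it is an assumption for illustration: the 16 kHz and 8 kHz sample rates, the 1 kHz and 6 kHz test tones, and plain linear interpolation standing in for a naive upsampler. It only shows that interpolation adds samples without restoring the band the narrow link dropped; anything above that band would have to be predicted by a model.

```python
import cmath
import math

FS_HIGH = 16_000  # assumed "good mic" sample rate in Hz
FS_LOW = 8_000    # assumed narrowband link sample rate in Hz
N_LOW = 800       # 0.1 s of audio at FS_LOW


def tone_amplitude(samples, freq_hz, fs_hz):
    """Estimate the amplitude of one frequency with a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
              for i, x in enumerate(samples))
    return 2 * abs(acc) / len(samples)


def linear_upsample_2x(samples):
    """Double the rate by inserting the midpoint between neighbours."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.extend((a, (a + b) / 2))
    return out


# What a wideband mic would capture: a 1 kHz tone plus a 6 kHz tone.
original = [
    math.sin(2 * math.pi * 1000 * n / FS_HIGH)
    + 0.5 * math.sin(2 * math.pi * 6000 * n / FS_HIGH)
    for n in range(2 * N_LOW)
]

# What survives an 8 kHz link (nothing above 4 kHz): only the 1 kHz tone.
# One extra sample so the upsampler has a right-hand neighbour at the end.
narrowband = [
    math.sin(2 * math.pi * 1000 * m / FS_LOW) for m in range(N_LOW + 1)
]

# Naive "upres" back to 16 kHz.
upsampled = linear_upsample_2x(narrowband)

for name, signal in (("original", original), ("upsampled", upsampled)):
    print(name,
          "1 kHz:", round(tone_amplitude(signal, 1000, FS_HIGH), 3),
          "6 kHz:", round(tone_amplitude(signal, 6000, FS_HIGH), 3))
```

The 6 kHz component is about 0.5 in the original and effectively zero after interpolation, while the 1 kHz tone comes through nearly intact. The upsampled stream has the right number of samples but none of the missing detail, which is the gap a predictive model (or the speech-to-text then text-to-speech route) would have to fill.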