The audio signal is encoded in an AM signal by effectively multiplying a high-frequency "carrier" signal (like a 1.4 MHz sine wave) by the amplitude of the audio signal (some squiggles that match the air pressure waves we perceive as sound, around 0-20 kHz).
As a result, the peak voltage on the tower changes with the rise and fall of the audio signal. That change in peak voltage changes the spark in some way that makes it get hotter or cooler really fast (or something), which causes the air that is getting zapped to expand and contract, resulting in an air pressure wave that tracks the original audio.
There's probably a pile of distortion due to all of the physics of the arc (high frequencies seem louder), but humans are pretty good at hearing the human voice through all of that anyways.
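For anyone who wants to see the "multiply the carrier by the audio" part in numbers, here is a rough Python sketch. It only models the modulation step, not the arc physics. The function names (`am_modulate`, `peak_envelope`), the 1 kHz test tone, the 0.5 modulation depth, and the 28 MHz sample rate are all made up for illustration, and the `1 + depth * audio` form is the textbook AM formula rather than anything specific to the tower's driver.

```python
import math

def am_modulate(audio, carrier_hz, sample_rate, depth=0.5):
    """Textbook AM: scale a sine carrier by (1 + depth * audio sample)."""
    out = []
    for n, a in enumerate(audio):
        t = n / sample_rate
        carrier = math.sin(2 * math.pi * carrier_hz * t)
        out.append((1.0 + depth * a) * carrier)
    return out

def peak_envelope(signal, window):
    """Peak absolute value in each non-overlapping window of samples."""
    return [
        max(abs(s) for s in signal[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]

# Illustrative numbers (assumptions, not measurements of a real tower).
sample_rate = 28_000_000   # 20 samples per carrier cycle
carrier_hz = 1_400_000     # the 1.4 MHz carrier from the comment
audio_hz = 1_000           # a 1 kHz test tone standing in for the audio
num_samples = 56_000       # 2 ms, i.e. two cycles of the test tone

audio = [
    math.sin(2 * math.pi * audio_hz * n / sample_rate)
    for n in range(num_samples)
]
signal = am_modulate(audio, carrier_hz, sample_rate, depth=0.5)

# One envelope point per 10 carrier cycles: this is the "peak voltage"
# that rises and falls with the audio.
envelope = peak_envelope(signal, window=200)
```

If you plot `envelope`, it should look like the 1 kHz tone riding on a constant offset, swinging between roughly 0.5 and 1.5 times the unmodulated carrier peak. That slowly varying peak is the thing the spark is (somehow) turning back into sound.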
zeroping|2 years ago