(no title)
zebproj | 2 years ago
The underlying signal processing used for both is different, but both use a source-filter mechanism.
zebproj | 2 years ago
The underlying signal processing used for both is different, but both use a source-filter mechanism.
blincoln|2 years ago
But I'm still unsure why those two things sound so similar to each other, and formant/LPC chips sound so similar to each other, but the two groups of things sound so dissimilar (at least, IMO).
I have a background in electronic music, so I'm pretty familiar with additive, subtractive, and other types of synthesis.
I'm especially surprised about the physical modelling sounding more like a formant chip, because a guitar "talk box" gives a sound exactly like a vocoder, and that should be almost the same thing, just with a real human mouth instead of a model.
TheOtherHobbes|2 years ago
The bandpass filters have a steeper cutoff than usual and are flatter at the top of the passband than usual. And the centre frequencies aren't linearly spaced. But otherwise - it's just a fancy graphic EQ.
The formant approach uses dynamic filters. It's more like an automated parametric EQ. Each formant is modelled with a variable BPF with its own time-varying level, frequency, and possibly Q. You apply that to a simply buzzy waveform and get speech-like sounds out. If you vary the pitch of the buzz you can make the output "sing."
LPC uses a similar model but it applies data compression to estimate future changes for each formant band. So instead of having to control all the parameters at or near audio rate, you can drop the control rate right down and still get something that can be understood.
There are more modern systems. FOF and FOG use granular synthesis to create formant sounds directly. Controlling the frequency and envelope of the grains is equivalent to filtering a raw sound, but is more efficient.
FOF and FOG evolved into PSOLA which is basically real-time granulated formant synthesis and pitch shifting.
zebproj|2 years ago
In general, tract physical models have never sounded all that realistic. The one big thing they have going for them is control. Compared to other speech synthesis techniques, they can be quite malleable. Pink Trombone [1] uses a physical model under the hood. While it's not realistic sounding, the interface is quite compelling.
1: https://dood.al/pinktrombone/