top | item 29501382


gbh444g | 4 years ago

Hi HN!

I used the meow sounds from https://soundspunos.com/animals/10-cat-meow-sounds.html. I expected to see very little variability in the meows, maybe just 4-5 different types for basic emotions. To my surprise, each “cat meow” has an astonishingly colorful, complex and unique structure, unlike human vowels, which follow a more or less predictable pattern: https://soundshader.github.io/vowels.

The algorithm behind these images is fairly simple. It computes FFT to decompose the sound into a set of A·cos(2πwt+φ) waves and drops the phase φ to align all cos waves together. This is known as the auto-correlation function (ACF). Before merging them back, it colorizes each wave using its frequency w: the A notes (432·2ⁿ Hz) become red, C notes - green, E notes - blue, and so on. Finally, it merges the colored and aligned cos waves back, using the amplitude A for color opacity, and renders them in polar coordinates, where the radial coordinate is time.
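The phase-dropping and coloring steps above can be sketched in a few lines of Python. This is only an illustration, not the actual soundshader code: the names `dft`, `zero_phase`, `note_hue` and `note_rgb` are made up for this example, the naive DFT stands in for a real FFT, and mapping the position within the octave linearly onto an HSV hue is an assumption. Note that the textbook ACF is built from squared amplitudes; the sketch follows the description above and uses A directly.

```python
import cmath
import colorsys
import math

A_REF_HZ = 432.0  # reference pitch from the post; every 432 * 2**n Hz is an A


def dft(samples):
    """Naive O(n^2) DFT, standing in for an FFT to keep the sketch stdlib-only."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def zero_phase(samples):
    """Rebuild the signal from amplitudes only, so every cosine peaks at t = 0."""
    n = len(samples)
    amps = [abs(c) / n for c in dft(samples)]  # the phase is discarded here
    return [
        sum(a * math.cos(2 * math.pi * k * t / n) for k, a in enumerate(amps))
        for t in range(n)
    ]


def note_hue(freq_hz):
    """Position within the octave in [0, 1); 0 for every A note (assumed mapping)."""
    return math.log2(freq_hz / A_REF_HZ) % 1.0


def note_rgb(freq_hz):
    """Hypothetical color for a frequency: hue from the note, full saturation."""
    return colorsys.hsv_to_rgb(note_hue(freq_hz), 1.0, 1.0)


# A sine wave with an arbitrary phase offset of 0.7 radians.
n = 32
signal = [math.sin(2 * math.pi * 3 * t / n + 0.7) for t in range(n)]
aligned = zero_phase(signal)
```

After the phase is dropped, `aligned` peaks at t = 0 and is mirror-symmetric around it, which is what lets the waves line up in the picture. With this hue mapping, 432 Hz and 864 Hz both come out pure red.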


dsizzle|4 years ago

I think it'd be useful to have the same graphs for some non-cat sounds, like the human vowels that you mention. It's not clear if there's something particularly interesting about cats or not here.

vanusa|4 years ago

I would agree that the images are rather intriguing, but ... what does all this visual structure actually mean?

I'm guessing some kind of overtone structure in these sounds (perhaps decipherable to cats, but not to us)?

I await your insight.

nixpulvis|4 years ago

I know from personal experience that cats use a lot of inflection in their voices. I'm not at all surprised by the images (though I don't know exactly what they mean). This inflection directly affects the image, because it modulates the pitch of the meow. Another factor, completely unrepresented in these images, is the lower-frequency components connected by seconds of silence. Most cats are pretty quiet, but sometimes they get the oral equivalent of the zoomies.

I would love to see more examples across species and variants of felines, under controlled conditions. Could you figure out an appropriate color map?

Not to start a dog vs cat war, but as someone who loves both, I think I can safely say that cats put much more information in their voices than dogs, for example.

gbh444g|4 years ago

Interpreting ACF images:

1. Time progresses from the center to the edge of the circle.

2. Color means note, e.g. A4 = 432 Hz is red, but so are A1, A2 and all other A notes. B is orange, C is yellow, D is green and so on.

3. The amount of fine detail indicates frequency: the higher the frequency, the more fine detail you see. If notes of different colors and different frequencies sound simultaneously, e.g. an A2 with a G5, you’ll see a red belt with a few repetitions mixed with a blue belt with 8x more repetitions, so the result will be a purple belt with a fine structure.

For example, on one image below there is a green belt with 10 repetitions. One repetition corresponds to 13.5 Hz here (55296 Hz sample rate, 4096 FFT bins), so 10 repetitions is 135 Hz, which corresponds to C3. On another image there is a curious red cross in the center; it’s a red belt with 2 repetitions. That’s 27 Hz, or A0, almost infrasound.
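The arithmetic in that example can be checked with a short Python sketch. The sample rate, bin count and A4 = 432 Hz come from the comment above; the helper names `repetitions_to_hz` and `equal_tempered_hz`, and the use of 12-tone equal temperament for the note frequencies, are assumptions made for illustration.

```python
SAMPLE_RATE_HZ = 55296
FFT_BINS = 4096
A4_HZ = 432.0


def repetitions_to_hz(repetitions):
    """Each repetition in a belt is one FFT bin, i.e. sample_rate / bins Hz."""
    return repetitions * SAMPLE_RATE_HZ / FFT_BINS


def equal_tempered_hz(semitones_from_a4):
    """Frequency of a note a given number of semitones away from A4 (assumed 12-TET)."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


bin_width = repetitions_to_hz(1)    # 13.5 Hz per repetition
green_belt = repetitions_to_hz(10)  # 135.0 Hz
red_cross = repetitions_to_hz(2)    # 27.0 Hz

c3 = equal_tempered_hz(-21)  # C3 is 21 semitones below A4, about 128.4 Hz
a0 = equal_tempered_hz(-48)  # A0 is four octaves below A4, exactly 27.0 Hz
```

Under these assumptions the red cross lands exactly on A0, while the green belt at 135 Hz sits within one 13.5 Hz bin of C3, so at this resolution that match is approximate.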

SomeHacker44|4 years ago

I would think that the absolute notes of the sounds (which is what was colored here) are not the relevant metric. I naively imagine it is the relative tonal structure. That is, meaning does not come from particular frequencies, but from the relationship between the frequencies. I wonder, if so, how that might be represented and normalized. It is just like how we can understand two people who speak at different base pitches: the meaning comes from how they shift away from those pitches.

Either way super pretty visualization!
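One possible way to express the relative-pitch idea, as a rough Python sketch: measure every frequency as a semitone interval above a base pitch instead of as an absolute note. Using the lowest frequency as the base, and the function name `relative_semitones`, are assumptions for illustration only; the input frequencies are made-up examples, not measurements from the meow recordings.

```python
import math


def relative_semitones(freqs_hz):
    """Intervals in semitones above the lowest frequency (assumed base pitch)."""
    base = min(freqs_hz)
    return [round(12 * math.log2(f / base), 2) for f in freqs_hz]


# Two hypothetical voices with the same contour at different base pitches.
low_voice = relative_semitones([220.0, 330.0, 440.0])
high_voice = relative_semitones([300.0, 450.0, 600.0])
```

Both lists come out as `[0.0, 7.02, 12.0]`, so the two voices look identical once the base pitch is factored out. Coloring by these intervals rather than by absolute note would be one way to normalize the images.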