One big issue with gathering this kind of data over the phone is the frequency cutoff on voice-only lines: above a certain frequency (I want to say 4 kHz? maybe I'm misremembering), the information is lost. It's basically as if you took a Fourier transform, zeroed out everything above the threshold, and then transformed back.[0] For humans (and even computers) trying to interpret the sound as language, that's not a huge problem, although you might lose some of the higher formants. But for an acoustic analysis that's trying to do voiceprinting (in this case, to detect Parkinson's), this could be a big problem.
(I'm also irritated by the glib "99 percent success rate" but I just ranted about that on a different HN post so I won't go into it here.)
[0] Why do this? So the phone company can compress and send a lot more data over the same amount of internal bandwidth. Come to think of it, it's kind of related to how wavelet-based compression works.
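The "transform, zero out, transform back" picture in the comment above can be sketched in a few lines. This is only a toy illustration, not how any phone network is implemented: the 16 kHz recording rate, the 1 kHz and 6 kHz test tones, and the function names are all assumptions made for the example, and the naive DFT is used purely to keep the sketch dependency-free.

```python
import cmath
import math

def dft(samples):
    """Naive O(n^2) discrete Fourier transform; fine for a short demo signal."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(spectrum):
    """Inverse of dft(); keeps the real part because the input signal was real."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def brick_wall_lowpass(samples, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz, then transform back."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 hold the mirrored (negative) frequencies.
        freq_hz = min(k, n - k) * sample_rate / n
        if freq_hz > cutoff_hz:
            spectrum[k] = 0
    return inverse_dft(spectrum)

SAMPLE_RATE = 16000  # Hz, hypothetical wideband recording
N = 160              # 10 ms of audio, so bins are 100 Hz apart

def tone(freq_hz):
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(N)]

low = tone(1000)    # survives a 4 kHz cutoff
high = tone(6000)   # removed by a 4 kHz cutoff
mixed = [a + b for a, b in zip(low, high)]

filtered = brick_wall_lowpass(mixed, SAMPLE_RATE, cutoff_hz=4000)
max_error = max(abs(f - l) for f, l in zip(filtered, low))
print(f"largest difference from the 1 kHz tone alone: {max_error:.2e}")
```

After filtering, the 6 kHz component is gone and what remains matches the 1 kHz tone up to floating-point error, which is the kind of information loss the comment is worried about for anything that lives above the cutoff.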
The sampling rate is not the same as the highest audio frequency it can encode. Telephones use PCM encoding (meaning the data represents the graph of the sound wave) at 8 kHz. Following the Nyquist sampling theorem (halve the sampling rate), this allows frequencies up to 4 kHz (as you said). It's not a hard cutoff, though: you can still get most of the sounds above that pitch, they'll just sound pretty weird (as if you were talking on the telephone!).
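To make the Nyquist arithmetic concrete, here is a small sketch of what bare 8 kHz sampling does to a tone above 4 kHz when nothing filters it out first. The 5 kHz test tone and the helper names are assumptions for illustration; real telephone codecs band-limit the signal before sampling, so this shows the sampling math only, not the behaviour of an actual phone line.

```python
import math

SAMPLE_RATE = 8000  # Hz, the telephone PCM rate mentioned above

def sample_tone(freq_hz, count, sample_rate=SAMPLE_RATE):
    """Sample a unit-amplitude sine wave with no anti-aliasing filter."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

def alias_frequency(freq_hz, sample_rate=SAMPLE_RATE):
    """Frequency at or below Nyquist that yields the same samples (up to sign)."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

above_nyquist = sample_tone(5000, 64)  # 5 kHz is above the 4 kHz Nyquist limit
below_nyquist = sample_tone(3000, 64)  # its alias, folded around 4 kHz

# The 5 kHz samples are the 3 kHz samples with the sign flipped.
worst_mismatch = max(abs(a + b) for a, b in zip(above_nyquist, below_nyquist))

print(alias_frequency(5000))
print(f"{worst_mismatch:.2e}")
```

Once sampled, the 5 kHz tone cannot be told apart from an inverted 3 kHz tone, so content above half the sampling rate is either filtered away beforehand or folded back into the band as a different pitch.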
> So the phone company can compress and send a lot more data
Actually, the limits date back to analog phone lines with circuit switching of a century ago. Back then there was a wire going through switches from one phone to another.
The quality requirements were that those lines had to pass 300 Hz to 3.4 kHz or so. That often required "pupinization" - adding inductive coils to tune the line's frequency response.
If you look at a spectrogram of your voice, there's very little power above 1.5 or 2 kHz. However, it seems the high frequency part is important to understandability, including perception of emotional overtones.
(Just the other day, playing with modems, we found a weird case - voice being pumped through before the call was considered completed - which I suspect is the persistence in digital protocols of the analog behavior of a century ago.)
It's a very interesting field. Can insurance companies detect the early stages of Parkinson's when you call their call centers and change your rates accordingly? Here's a related dissertation on detecting mental health conditions from voice recordings: http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-55... (I haven't read it yet, but I've been meaning to for quite a while)
While 8 kHz is low if you look at the carrier of the speech signal, which is something like the pitch of the speech, it is pretty high compared to the envelope frequency of the speech signal. You do not move your jaw, tongue and lips anywhere near 8000 times per second. The algorithm looks at things like how the jaw, tongue and lips move during speech, and I guess 8 kHz is quite enough for those.
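The carrier-versus-envelope distinction can be shown with a toy signal. This is not the algorithm from the article: it assumes a 200 Hz sine as a stand-in for voice pitch, a 5 Hz amplitude modulation as a stand-in for articulator movement, and a simple rectify-and-average envelope detector chosen only for illustration.

```python
import math

SAMPLE_RATE = 8000   # Hz, telephone-style sampling
CARRIER_HZ = 200     # hypothetical voice pitch
ENVELOPE_HZ = 5      # hypothetical rate of jaw/tongue/lip movement
WINDOW = SAMPLE_RATE // CARRIER_HZ  # one carrier period = 40 samples

def true_envelope(n):
    """Slow amplitude variation imposed on the carrier."""
    return 1.0 + 0.5 * math.sin(2 * math.pi * ENVELOPE_HZ * n / SAMPLE_RATE)

# One second of an amplitude-modulated tone.
signal = [
    true_envelope(n) * math.sin(2 * math.pi * CARRIER_HZ * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]

def estimate_envelope(samples, window):
    """Rectify, then average over one carrier period.

    The mean of |sin| over a full period is 2/pi, so the average is
    scaled by pi/2 to recover the envelope amplitude.
    """
    half = window // 2
    scale = math.pi / 2
    estimate = {}
    for i in range(half, len(samples) - half):
        chunk = samples[i - half:i + half]
        estimate[i] = scale * sum(abs(v) for v in chunk) / window
    return estimate

estimate = estimate_envelope(signal, WINDOW)
worst_error = max(abs(value - true_envelope(i)) for i, value in estimate.items())
print(f"worst envelope error: {worst_error:.3f}")
```

The slow 5 Hz movement is recovered closely even though the detector ignores the fine structure of the carrier, which is the sense in which 8 kHz sampling leaves plenty of headroom for envelope-rate features.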
I've never had poor cell phone reception change the frequency of the sound, only silence, choppiness, or complete garbage being inserted - all of which should be detectable via typical DSP algos.
If this works, it would be great, but as a 28-year-old with Parkinson's, I'm skeptical.
It took approximately 12 months, numerous blood tests, MRIs and doctor visits to diagnose me as having YOPD, so I'm finding it hard to believe that this could be replaced with a telephone call.
I wonder what the false-negative rate of this algorithm is. 99% (which I assume is the true-positive rate) is certainly impressive for such a simple test, though.
It would be all kinds of awesome if this turns into something that can be done reliably and routinely!
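On the error-rate question above: if the 99% really is a true-positive rate, the false-negative rate follows directly as 1%, and the unknown that matters more is the false-positive rate. The sketch below works through that arithmetic. The 99% specificity and 1% prevalence are hypothetical numbers picked to keep the result easy to read; neither comes from the article.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person who tests positive actually has the condition."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)

sensitivity = 0.99  # the "99%" figure, read as a true-positive rate
specificity = 0.99  # hypothetical: the true-negative rate is not given
prevalence = 0.01   # hypothetical share of callers who have the condition

false_negative_rate = 1.0 - sensitivity
ppv = positive_predictive_value(sensitivity, specificity, prevalence)

print(f"false-negative rate: {false_negative_rate:.2%}")
print(f"chance a positive result is correct: {ppv:.2%}")
```

Under these assumed numbers only about half of the positive results would be correct, which is why a single headline accuracy figure says little about how useful a screening test is.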
Yeah, no offense, but all you've written is either widely known (Amphetamine-induced DA depletion inducing Parkinson's-like motor symptoms and so on), based on misunderstandings (some of your graph readings are horrifically beside the point), or simply wrong.
Grab a book on neuropharmacology, or even just a neuroscience intro ("Fundamental Neuroscience" by Squire and colleagues is a good one), and accept that armchair-bullshitting is not how progress is made.
I wonder whether it's possible for them to crowdsource this via the web instead of the phone lines, which would make it easier for more people to participate.
martingoodson | 13 years ago
They are using random forest out-of-sample error as a metric but doing feature selection before this step (see table 6).
As far as I can make out from a quick reading they are essentially making the error described here: http://www-stat.stanford.edu/%7Etibs/sta306b/cvwrong.pdf
and elegantly described in this recent blog post: http://blog.kaggle.com/2012/07/06/the-dangers-of-overfitting...
On a sample size of only 42 people, overfitting seems very likely.
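The selection-before-validation problem described above can be reproduced on pure noise. This is a sketch under stated assumptions, not a reconstruction of the paper's pipeline: it swaps the random forest for a nearest-centroid classifier, uses leave-one-out validation, and invents 500 random features for 42 subjects whose labels carry no signal at all. Every function name here is made up for the example.

```python
import random

def make_noise_data(n_samples, n_features, rng):
    """Random features and alternating labels: there is nothing real to learn."""
    labels = [i % 2 for i in range(n_samples)]
    rows = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    return rows, labels

def top_features(rows, labels, k):
    """Rank features by the gap between class means and keep the k largest."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]

    def gap(j):
        return abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))

    return sorted(range(len(rows[0])), key=gap, reverse=True)[:k]

def predict_nearest_centroid(train_rows, train_labels, test_row, features):
    """Assign the label whose class centroid is closest on the chosen features."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [r for r, y in zip(train_rows, train_labels) if y == label]
        dist = 0.0
        for j in features:
            centroid = sum(r[j] for r in members) / len(members)
            dist += (test_row[j] - centroid) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(rows, labels, k, select_inside_fold):
    """Leave-one-out accuracy, selecting features either once up front or per fold."""
    fixed = None if select_inside_fold else top_features(rows, labels, k)
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if select_inside_fold:
            features = top_features(train_rows, train_labels, k)
        else:
            features = fixed  # chosen with the held-out sample already seen
        if predict_nearest_centroid(train_rows, train_labels, rows[i], features) == labels[i]:
            correct += 1
    return correct / len(rows)

rng = random.Random(0)
rows, labels = make_noise_data(n_samples=42, n_features=500, rng=rng)

leaky_accuracy = loo_accuracy(rows, labels, k=10, select_inside_fold=False)
honest_accuracy = loo_accuracy(rows, labels, k=10, select_inside_fold=True)
print(f"selection before validation: {leaky_accuracy:.2f}")
print(f"selection inside each fold:  {honest_accuracy:.2f}")
```

Because the labels are unrelated to the features, any accuracy well above chance in the first number comes from the feature selection having already seen the held-out samples. Selecting inside each fold removes that leak.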
blahedo | 13 years ago
rachelbythebay | 13 years ago
http://en.wikipedia.org/wiki/DS0
Sure, we can do more now, but this isn't exactly new. Consider the age in which it was conceived and it may seem more reasonable.
NinetyNine | 13 years ago
ableal | 13 years ago
rouli | 13 years ago
justinph | 13 years ago
That's pretty amazing if they can detect it within the poor 8 kHz of a phone. I wonder how awful cell phone reception changes the accuracy.
tspiteri | 13 years ago
chime | 13 years ago
Jim_Neath | 13 years ago
DigitalJack | 13 years ago
azza-bazoo | 13 years ago
Edit: seems like 99% is only for later stages of Parkinson's, and the accuracy number is just off a 50-person sample. Less impressive, but still cool. http://www.forbes.com/sites/singularity/2012/07/03/new-softw...
pbhjpbhj | 13 years ago
SoftwareMaven | 13 years ago
Urgo | 13 years ago
nathan87 | 13 years ago
Michael J Fox's Parkinson's Disease may have been caused by amphetamine use and/or sleep deprivation http://www.nathanwailes.com/forum/viewtopic.php?f=3&t=22...
Sleep Deprivation May Be a Common Cause of Parkinson's Dis. http://www.nathanwailes.com/forum/viewtopic.php?f=3&t=24...
apl | 13 years ago
laktek | 13 years ago
rada | 13 years ago
timClicks | 13 years ago
copyrightip | 13 years ago
http://www.youtube.com/watch?v=VYKBPd2Kowo
cteng04 | 13 years ago