Given the Baidu work, I think we can safely say that Hinton et al.'s forecasts from 4 years ago were on the money. Deep approaches are now clearly dominant and have yielded fantastic performance.
praccu|10 years ago
The linked paper is 4 years old. DNNs have been dominant in speech since 2012. No one uses GMM systems anymore.
Baidu's approach isn't even the best (IBM's system tends to beat theirs on accuracy, and Google tends not to publish numbers on known benchmarks). It's notable mostly for its use of RNNs to do pronunciation and language modeling (although they also tack on a modified Kneser-Ney LM).