tekacs | 25 days ago
What makes it particularly misleading is that models which transcribe to lowercase and then use inverse text normalization to restore structure and grammar make a very different class of mistakes than Whisper, which goes directly to final-form text, including punctuation, quotes, and tone.
Nonetheless, they're claiming an error rate so much lower than Whisper's that it's almost not in the same bucket.
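To make the scoring issue concrete, here's a minimal sketch of how word error rate (WER) shifts depending on whether you normalize before comparing. This is a toy example under my own assumptions: the `normalize` and `wer` helpers and the sample sentences are hypothetical, and this is not any vendor's actual benchmark methodology.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and replace punctuation with spaces (apostrophes are kept)."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Standard Levenshtein distance, computed over words, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference in final form, and a lowercase, unpunctuated hypothesis.
reference = 'She said, "Call me at nine."'
hypothesis = "she said call me at nine"

raw_wer = wer(reference, hypothesis)
norm_wer = wer(normalize(reference), normalize(hypothesis))
print(f"raw WER: {raw_wer:.2f}, normalized WER: {norm_wer:.2f}")
```

The words are identical in both cases, yet the raw score counts every casing and punctuation difference as a word error. So a headline WER number doesn't mean much unless you know which normalization was applied to both systems being compared.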
tekacs | 25 days ago
There's a reason that quite a lot of good transcribers still use V2, not V3.
satvikpendem | 25 days ago