SchemaLoad | 3 days ago

I'd say speech-to-text is unsolvable for a more fundamental reason: it's hard to actually speak an entire document flawlessly in one take.

Spoken language is very different from written language, which is why, for example, you can easily tell when an article is a transcription of a spoken interview.

asdff | 2 days ago

Even today, it seems like speech-to-text works the way it did 25 years ago: it breaks sentences up into individual words and tries to match each word on its own. So you can get stupid nonsense sentences built from similar-sounding words. It isn't like an old-school human transcriber, who might miss words on the recording but can fill in the blanks using their own knowledge of the language or of how the speaker talks.

jamilton | 3 days ago

Yes, it's a UX thing. You'd still have to edit it by typing afterwards as well.

Similarly, raw LLM/chat interfaces are usually not the best option.