(no title)
adamesque | 10 months ago
But I’ve noticed/learned that I can’t dictate written content. My brain just does not work that way at all — as I write I am constantly pausing to think, to revise, etc and it feels like a completely different part of my brain is engaged. Everything I dictated with Aqua I had to throw away and rewrite.
Has anyone had similar problems, and if so, had any success retraining themselves toward dictation? There are fleeting moments where it truly feels like it would be much faster.
SCdF|10 months ago
It's very hard, and I wouldn't do it if I didn't have to.
(which is why I'm always perplexed by these apps which allow voice dictation or voice control, but not as a complete accessibility package. I wouldn't be using my voice if my hands worked!)
It's also critically important (and after 3-4 years of this I still regularly fail at this) to actually read what you've written, and edit it before send, because those chunks don't always line up into something that I'd consider acceptably coherent. Even for a one sentence slack message.
(also, I have a kiwi accent, and the dictation software I use is not always perfect at getting what I wanted to say on the page)
e12e|10 months ago
In my experience LLM can be quite forgiving when given some unfinished input and asked to expand/clean up?
noahjk|10 months ago
1. like you mentioned, the second I start talking about something, I totally forget where I'm going, have to pause, it's like my thoughts aren't coming to me. Probably some sort of mental feedback loop plus, like you mentioned, different method of thinking.
2. in the back of my mind, I'm always self-conscious that someone is listening, so it's a privacy / being judged / being overheard feeling which adds a layer of mental feedback.
There's also not great audio clues for handling on-the-fly editing. I've tried to say "parentheses word parentheses" and it just gets written out. I've tried to say "strike that" and it gets written out. These interfaces are very 'happy path' and don't do a lot of processing (on iOS, I can say "period" and get a '.' (or ?,!) but that's about the extent).
I have had some success with long-form recording sessions which are transcribed afterwards. After getting over the short initial hump, I can brain-dump to the recording, and then trust an app like Voice Notes or Superwhisper to transcribe, and then clean up after.
The main issue I run into there, though, is that I either forget to record something (ex. a conversation that I want to review later) or there is too much friction / I don't record often enough to launch it quickly or even remember to use that workflow.
I get the same feeling with smart home stuff - it was awesome for a while to turn lights on and off with voice, but lately there's the added overhead of "did it hear me? do I need to repeat myself? What's the least amount of words I can say? Why can't I just think something into existence instead? Or have a perfect contextual interface on a physical device?"
the_king|10 months ago
1. The models weren't ready.
2. The interactions were often strained. Not every edit/change is easy to articulate with your voice.
If 1 had been our only problem, we might have had a hit. In reality, I think optimizing model errors allowed us to ignore some fundamental awkwardness in the experience. We've tried to rectify this with v2 by putting less emphasis on streaming for every interaction and less emphasis on commands, replacing it with context.
Hopefully it can become a tool in the toolbox.
adamesque|10 months ago
unknown|10 months ago
[deleted]
jmcintire1|10 months ago
Voice is great for whenever the limiting factor to thought is speed of typing.
cloogshicer|10 months ago