riquito | 10 months ago

Very cool, thanks for sharing.

A couple of questions:

- Any thoughts on wake word engines, to have something that listens without consuming resources all the time? The landscape for open solutions doesn't seem good.
- Any plans to allow external services for STT/TTS, for people who don't have a 4090 ready (at the cost of privacy and dependence on SaaS providers)?

TeMPOraL|9 months ago

FWIW, wake words are a stopgap. If we want Star Trek-level voice interfaces, where the computer responds only when you actually mean to call it and not when you use the wake word as a normal word in conversation, the computer needs to be constantly listening.

A good analogy here is to think of the computer (assistant) as another person in the room, busy with their own stuff but paying attention to the conversations happening around them, in case someone suddenly requests their assistance.

This, of course, could be handled by a more lightweight LLM running locally and listening for explicit mentions/addressing the computer/assistant, as opposed to some context-free wake words.
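To make the distinction concrete, here is a toy sketch of "addressed vs. merely mentioned". It is a regex heuristic standing in for the lightweight local model, not a real one; the assistant name `computer` and the function `is_addressed` are made up for illustration, and the heuristic relies on punctuation that real transcripts may not have.

```python
import re

ASSISTANT_NAME = "computer"  # hypothetical assistant name, for illustration only


def is_addressed(utterance: str, name: str = ASSISTANT_NAME) -> bool:
    """Toy stand-in for a local model: True only when the name is used as a form of address."""
    text = utterance.strip().lower()
    n = re.escape(name)
    # Address at the start: "hey computer ..." or "computer, ..."
    if re.match(rf"^(?:(?:hey|ok|okay) {n}\b|{n}\s*[,!?:])", text):
        return True
    # Address at the end: "..., computer."
    if re.search(rf",\s*{n}[.!?]?$", text):
        return True
    # Anything else counts as a plain mention, e.g. "my computer is slow"
    return False
```

A context-free wake word would fire on "My computer is acting up again"; this check does not, which is the behavior a small always-on model would have to provide far more robustly.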

Dr4kn|9 months ago

Home Assistant is much nearer to this than other solutions.

You have a wake word, but it can also speak to you based on automations. When you come home, it could tell you that you're out of milk and that, with a holiday coming up, you should probably go shopping.

Dlemo|9 months ago

I want that for privacy reasons and for resource reasons.

And having this as a small hardware device should not add significant latency.

koljab|9 months ago

That would be quite easy to integrate. RealtimeSTT already has wake word support for both pvporcupine and openWakeWord.

justlikereddit|9 months ago

Modify it with an ultralight LLM agent that always listens and uses a wake word to agentically call the paid API?
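A rough sketch of that gating idea, assuming the always-on local part already produces text utterances. `WakeGate`, its fields, and the `backend` callable are hypothetical names standing in for the paid API; nothing here is RealtimeSTT's actual interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class WakeGate:
    """Forward utterances to an expensive backend only after a wake word is heard."""
    wake_word: str                  # hypothetical wake word, e.g. "computer"
    backend: Callable[[str], str]   # stand-in for the paid STT/LLM API call
    window: int = 1                 # utterances forwarded after a bare wake word
    _remaining: int = 0
    calls: List[str] = field(default_factory=list)  # record of what was sent out

    def _forward(self, text: str) -> str:
        self.calls.append(text)
        return self.backend(text)

    def feed(self, utterance: str) -> Optional[str]:
        words = utterance.lower().replace(",", " ").split()
        if self.wake_word in words:
            # Forward whatever follows the wake word in the same utterance.
            rest = " ".join(words[words.index(self.wake_word) + 1:])
            if rest:
                return self._forward(rest)
            # Bare wake word: open a short window for the next utterance(s).
            self._remaining = self.window
            return None
        if self._remaining > 0:
            self._remaining -= 1
            return self._forward(utterance)
        return None  # nothing leaves the machine without a wake word
```

Usage: `WakeGate("computer", backend=call_paid_api).feed("Computer, lights on")` would send only "lights on" to the backend, where `call_paid_api` is whatever client function you supply.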

Dr4kn|9 months ago

You could use openWakeWord, which Home Assistant uses for its own voice assistant.