Ask HN: Private Alternatives to Alexa?
We've had Amazon Alexa for five years, since the gen 1 device, and now find it to be an increasing invasion of the sanctity of our home.
We find it particularly annoying that nowadays, when you ask Alexa to do something, several times a week it will suggest some upsell crap you don't care about. It used to suggest things once a quarter at most, and that was fine.
HN, what private alternatives to Alexa exist?
For example, does anyone make a system that's relatively polished and operates entirely in the home, with no audio sent to the cloud? I'd be happy to run a hub/box for the system.
[+] [-] lcvw|4 years ago|reply
To be clear, I’m not concerned about cloud processing or even mining my data. Just something I can have a reasonable amount of control over and that doesn’t constantly enable features I don’t want on its own.
[+] [-] mensetmanusman|4 years ago|reply
[+] [-] BiteCode_dev|4 years ago|reply
People said again and again it's going to be abused, and you had the technical knowledge to evaluate the risk.
Yet you went with it.
If it hadn't been your scenario, it would have been another one. There will be others. Hell, with things like PRISM, it's inviting three-letter agencies into your bedroom.
Since Pandora's box is open and people are going to use these devices anyway, we should at least advocate for hardware switches on every sensor of every device.
[+] [-] jvolkman|4 years ago|reply
For what it's worth, Google Homes don't have this drop-in feature. You can make calls and make announcements, but both require the person on the other end to actively answer or respond.
[+] [-] jboy55|4 years ago|reply
[+] [-] kello|4 years ago|reply
[+] [-] follower|4 years ago|reply
While I've not used the full Rhasspy project myself (trying it out is on my long list of things to do :) ), I have used its offline text-to-speech sub-project, Larynx...
...and it is amazing!
Larynx is significantly ahead of any other FLOSS text-to-speech project I've tried in output quality and variety of voices (fifty, across multiple languages, accents & genders).
I think the relative new-ness of the project is part of the reason Larynx (https://github.com/rhasspy/larynx/) currently flies under the radar.
If the rest of Rhasspy is as good as Larynx I'd imagine it's worth trying out.
Larynx demo video: https://www.youtube.com/watch?v=hBmhDf8cl0k
Samples of pre-trained voices: https://rhasspy.github.io/larynx/#en-us
[+] [-] lukifer|4 years ago|reply
The only thing I struggled with was getting the wake-word config right: I could never find the balance point where it responded every time without also giving annoying false positives, so I ended up turning it off. It does support multiple wake-word engines; I'm gonna have another go with Picovoice Porcupine now that they've opened up custom wake-word training for free.
I'm most heavily experienced with Rhasspy's sister project, voice2json [1], which I used to build a voice-controlled car jukebox [2], and it's been working fantastically. (It triggers from a Bluetooth remote, so no wake-word issues.) The two projects share the same core engine.
For hardware, a Raspberry Pi 3/4 performs quite well, and I strongly recommend the ReSpeaker [3] for audio (either the USB version or the 4-mic hat).
[0] https://nodered.org/
[1] http://voice2json.org/
[2] https://github.com/lukifer/voicetunes
[3] https://www.seeedstudio.com/category/Speech-Recognition-c-44...
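For a sense of how voice2json and Rhasspy define commands: both use a sentences.ini template file where each section is an intent. A minimal sketch, with intent, rule, and slot names made up for illustration (see the projects' docs for the real template syntax details):

```ini
[ChangeLightState]
light_name = (living room | kitchen) {name}
turn (on | off) {state} [the] <light_name> light
```

Here `(a | b)` is an alternative, `[the]` is optional, `{state}` tags the matched word as a slot, and `<light_name>` references the rule above it. Saying "turn on the kitchen light" would yield a ChangeLightState intent with name=kitchen and state=on.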
[+] [-] 2Gkashmiri|4 years ago|reply
Check the neural voices, then British, Amy. The "smoothness" is uncanny. The Larynx samples you linked are "almost" there: good, but take Kathleen (glow_tts), and there is "still" something robotic in there. Is this something that can be improved by tweaking the training? This sounds really cool for use at home.
[+] [-] 3np|4 years ago|reply
https://news.ycombinator.com/item?id=29565983
https://homeintent.io/
[+] [-] Semaphor|4 years ago|reply
You want a case, but from my research, cases can easily interfere with those mic arrays. So you need one that's custom-made for the array you're getting, but no array I've seen actually comes with such a case.
Back when I asked (in relevant subreddits and on Tildes, before I deleted my account), no one could tell me that any of my research was wrong, but no one had a solution either. I posted those threads almost two years ago, so maybe things have changed? I'm currently still using Alexa, but besides privacy reasons, I'd also love custom software that takes the idiocy out of my assistant: mainly by using pre-configured commands that do what I want instead of sometimes guessing at what I want, and also for on-the-fly language switching (Alexa is atrocious when you request a band that's not in your primary language).
I could probably get away with the kitchen and office assistant using a normal microphone, but both bedroom and living room need to recognize voices from most directions (and in the case of the living room, also have decent recognition through music playing).
If anyone has any solutions, I’d love to hear them.
[+] [-] jffry|4 years ago|reply
It seems like they have an open source client that you can run on your own hardware but it's still dependent on the backend services that Mycroft-the-company provides. Perhaps their privacy stance is more palatable to you, though?
I think there used to be a project that was fully on-device but it got bought and consumed by Sonos
[+] [-] AlotOfReading|4 years ago|reply
* changing the wake word was a pain
* having a Home account is essentially mandatory (kind of defeating the point)
* speech synthesis was really bad
* it needed a lot of rhetorical help to get useful responses
Just getting to that point involved several hours of fairly hardcore debugging and even then basic issues like reliable mic input still existed.
I also couldn't find reasonably priced (<$100) speakers/mics in an Alexa form factor, but it wouldn't be fair to blame that on the mycroft team.
[+] [-] qchris|4 years ago|reply
On the flip side, I do not, I repeat, DO NOT recommend giving their team money for pretty much any reason. They've been struggling to put together any working hardware (while promising the opposite) for years, through a genuine comedy of errors. I think the most recent design iteration of MycroftV2 is something like a Raspberry Pi on a daughterboard in a 3D-printed case. I still check in on their blog and subreddit every few months to see what's changed, and every time I'm rewarded with additional details on what seems to be one of the most incompetent engineering organizations I've ever heard of.
[+] [-] russnes|4 years ago|reply
After you get it to work, I'm not sure whether your server would have a worse data set for speech recognition etc.; maybe, maybe not. I'm guessing it should be all right, because they use a free Mozilla data set (which I think you contribute to by default if you use their server).
[+] [-] michaelnik|4 years ago|reply
Commands that work nicely and that we use all the time: "Hi Mycroft, set timer to X minutes", "play news", "set alarm to 6 am", "what's the temperature?", "will it rain?", "what's the time in Paris?".
I wrote a METAR and TAF module for it to get more detailed weather (and to learn some Python).
[+] [-] bluGill|4 years ago|reply
I backed the Mark 2 on Kickstarter; they have been promising it's better and will ship any time now - for a few years...
[+] [-] totetsu|4 years ago|reply
[+] [-] 3np|4 years ago|reply
First off, unless you want to really DIY the glue-code, you want to use HomeAssistant (huge community) or NodeRED.
The only part I'm uncertain about and haven't explored properly myself is the voice-to-text part. Any solution should be pluggable into HA or NR.
Relevant threads in HA voice-assistant sub-forum:
* https://community.home-assistant.io/t/replacing-alexa-or-goo... (Ada, Rhasspy and other FOSS alternatives)
* https://community.home-assistant.io/t/local-voice-control/29... (you can apparently use Alexa for voice control even if it does not have any internet access)
* https://community.home-assistant.io/t/best-option-for-local-... (local TTS)
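To give a feel for the HA side of the glue-code: once a voice front-end hands HA a recognized intent, the intent_script integration maps it to an action. A minimal sketch for configuration.yaml; the intent name and entity ID here are placeholders for your own setup:

```yaml
# configuration.yaml -- TurnOnKitchenLight and light.kitchen are placeholders
intent_script:
  TurnOnKitchenLight:
    speech:
      text: "Turning on the kitchen light."
    action:
      service: light.turn_on
      target:
        entity_id: light.kitchen
```

The speech text is what gets read back by whatever TTS you plug in, which keeps the response as predictable as the command.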
[+] [-] LeoPanthera|4 years ago|reply
Although audio is sent to the cloud, you can choose whether it is stored or not, and whatever you pick, Apple's privacy policy is very strict.
HomePods are mostly advertised as music playback devices, but I mostly use mine as a HomeKit control device.
Homebridge allows you to control non-HomeKit devices via Siri: https://homebridge.io
[+] [-] colordrops|4 years ago|reply
Apple has broken trust multiple times, including the FBI backdoor affair and the CSAM fiasco. They are a public company beholden to shareholders, not your trusted friend from college. Even China has strong leverage over them, due to Apple's manufacturing and market there. They are fundamentally no different from Amazon.
The poster is asking for a private solution, and this is emphatically not it.
[+] [-] bonniemuffin|4 years ago|reply
[+] [-] nullc|4 years ago|reply
You choose whether to tell them to store it or not. Due to the third-party doctrine, you have very little legal protection for material you voluntarily provide to a third party. Even if Apple behaves honestly and has no vulnerabilities or compromises, they may be forced to hand the data over.
And the fact that you don't know this says to me that Apple is in practice unethically exaggerating the level of privacy they're able to provide.
[+] [-] danielheath|4 years ago|reply
[+] [-] tzs|4 years ago|reply
If that works, that might be a solution for people who want the convenience of voice control but do not want an always-listening microphone in their house.
[+] [-] izacus|4 years ago|reply
[deleted]
[+] [-] Jarvy|4 years ago|reply
It uses Rhasspy under the covers, and automatically imports lights, switches, fans, etc and sets up sentences and intents for you. After initial setup, it can be used without an active internet connection.
All you need is some container knowledge, an extra Pi, and a good speakerphone (like a Jabra Speak 410) to get going.
Disclosure: I am the main author of Home Intent.
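The "some container knowledge" part might look roughly like the compose sketch below. The image tag, config path, and audio passthrough are assumptions for illustration; check the Home Intent docs for the actual values:

```yaml
# docker-compose.yml sketch -- image tag and paths are placeholders,
# not the project's documented values
version: "3"
services:
  homeintent:
    image: ghcr.io/jarvy/home-intent:latest  # placeholder image tag
    devices:
      - /dev/snd:/dev/snd   # pass the speakerphone through to the container
    volumes:
      - ./config:/config    # where your Home Assistant URL/token would live
    restart: unless-stopped
```

The general shape (one long-running container with the sound device and a config volume mounted in) is common to most self-hosted voice stacks.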
[+] [-] psandersen|4 years ago|reply
I'd primarily like to be able to turn on/off a few specific switches, set a few specific scenes and ideally play a few playlists (squeezebox integrated with home assistant). Would creating scripts in HA and then calling them from Hi be the way to go for customising without building custom components?
If this works well I'll be keen to add a few cheap pis plus ps3 eye cameras around the house.
[+] [-] 3np|4 years ago|reply
My family back home is hooked on Google Home, but my sister agreed to switch it off for the kids' sake; her requirement, though, is to be able to control Spotify via voice.
Assuming I set aside enough of my time to help them out and figure out appropriate hardware, do you think it's worth a shot with HI?
[+] [-] khimaros|4 years ago|reply
[+] [-] the100rabh|4 years ago|reply
[+] [-] Havoc|4 years ago|reply
Most things seem to assume a Raspberry Pi as the base hardware, which somewhat implies cloud usage.
There is also HiveMind, which seems to get around the above issue (basically Mycroft, except with a more satellite-oriented mentality toward mic placement).
I've also been wondering whether it is possible to just use cloud TTS, since those voices are quite good. That should fall under GCP/AWS etc. terms, which seem a little more privacy-friendly than straight Alexa & friends.
I'm planning on having another go at this, but first I'm tackling presence detection, which is a minefield of note too.
[+] [-] noitpmeder|4 years ago|reply
Essentially you can program your house to coded knocks. Lights/music/door/... all by sequences of knocks.
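A knock-sequence trigger mostly comes down to comparing the gaps between knocks against a stored pattern. A toy sketch; the pattern, tolerance, and function names are invented for illustration:

```python
def match_knock(timestamps, pattern, tolerance=0.15):
    """Compare the gaps between knock timestamps (in seconds) against a
    stored pattern of gaps, allowing up to `tolerance` error per gap."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) != len(pattern):
        return False
    return all(abs(g - p) <= tolerance for g, p in zip(gaps, pattern))

# A "shave and a haircut" style secret: two short gaps, then a long one.
SECRET = [0.3, 0.3, 0.8]

print(match_knock([0.0, 0.31, 0.59, 1.42], SECRET))  # prints: True
print(match_knock([0.0, 1.0, 2.0, 3.0], SECRET))     # prints: False
```

In practice you'd feed this from an accelerometer or piezo sensor and normalize for tempo, but gap matching with a tolerance is the core idea.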
[+] [-] 0xdeadb00f|4 years ago|reply
[+] [-] thebiblelover7|4 years ago|reply
[+] [-] mattowen_uk|4 years ago|reply
Personally, I live fine without a voice-controlled home assistant. I am, however, able-bodied, and I can press buttons and flick switches without much thought. If that were to change, then yeah, I can see a need for these things, but they really need to be able to work offline with no internet connection, should you wish to configure them like that.
[+] [-] abetusk|4 years ago|reply
Mozilla has DeepSpeech [0] and, while not as advanced as the stuff from Google or Amazon, my experimentation left me feeling pretty hopeful that it could reliably recognize at least keywords.
The Raspberry Pi is quite capable, though you'll probably need a dedicated microphone to reliably capture voice data. I know of the ReSpeaker [1], but some off-the-shelf USB conference microphones might work as well.
[0] https://github.com/mozilla/DeepSpeech
[1] https://wiki.seeedstudio.com/ReSpeaker_4_Mic_Array_for_Raspb...
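To illustrate the keyword angle: once an engine like DeepSpeech hands you a transcript, mapping it to an action can be a few lines. The keywords and action names below are made up for illustration:

```python
# Hypothetical keyword-to-action table; replace with your own devices.
COMMANDS = {
    "lights": "toggle_lights",
    "music": "toggle_music",
    "temperature": "read_thermostat",
}

def match_command(transcript, commands=COMMANDS):
    """Return the action for the first keyword found in the transcript,
    or None when nothing matches."""
    for word in transcript.lower().split():
        if word in commands:
            return commands[word]
    return None

print(match_command("turn on the lights please"))  # prints: toggle_lights
```

This is exactly the regime where a not-quite-state-of-the-art engine is fine: you only need the keyword to come through, not a perfect transcript.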
[+] [-] mataug|4 years ago|reply
So if you've got a few Apple devices, such as a HomePod / Apple Watch / iPhone, consider the HomeKit ecosystem; it's been working well for me.
[+] [-] rcarmo|4 years ago|reply
The good thing about it is that it takes only a couple of minutes to install via Docker and has all the base niceties out of the box (trigger word, intent detection, speech synthesis).
I would say it is good enough to tinker with, although clearly not yet up to par (mind you, I am also trying it in English).
[+] [-] hipitihop|4 years ago|reply
[+] [-] 1vuio0pswjnm7|4 years ago|reply
[+] [-] betwixthewires|4 years ago|reply
There are open source speech-to-text engines, text-to-speech engines, and assistant software with APIs; you could probably build something with a Raspberry Pi. I looked into it a while back, but I don't really mind light switches.
"Open Assistant" should get the ball rolling for you: search for that and dive down the rabbit hole of open source home automation.
[+] [-] 3a2d29|4 years ago|reply
I have never had an Alexa, but surely there isn't that much time saved by having it turn off lights. Siri on your iPhone can set alarms and shopping lists, and presumably your phone is always around you.
[+] [-] Talanes|4 years ago|reply
[+] [-] lixtra|4 years ago|reply
You should try.
It’s a trade-off: you choose to spend more time up front setting up the assistant, in order to save time setting an alarm while you have your hands in the dough.
[+] [-] neetrain|4 years ago|reply
[+] [-] gnicholas|4 years ago|reply
I set up my phone so I can say "hey Siri ok google" and then it asks me what I want to tell google. I then say any command supported by the Google Assistant, and it passes the command through.
Technically, I think it's supposed to work in one utterance, but I've found it never does; I always have to split it up. Even so, this is a pretty handy way to ask for information (Siri's knowledge seems quite limited compared to Alexa/Google) without having any always-listening devices in my home.
[+] [-] mibollma|4 years ago|reply
[+] [-] chmac|4 years ago|reply
[+] [-] edude03|4 years ago|reply