Ask HN: Private Alternatives to Alexa?
We've had Amazon Alexa for five years, since the gen 1 device, and now find it to be an increasing invasion of the sanctity of our home.
We find it particularly annoying that nowadays, when you ask Alexa to do something, several times a week it will suggest some upsell crap you don't care about. It used to suggest things once a quarter at most, and that was fine.
HN, what private alternatives to Alexa exist?
For example, does anyone make a system that's relatively polished and operates entirely in the home, with no audio sent to the cloud? I'd be happy to run a hub/box for the system.
[+] [-] lcvw|4 years ago|reply
To be clear, I’m not concerned about cloud processing or even mining my data. Just something I can have a reasonable amount of control over and that doesn’t constantly enable features I don’t want on its own.
[+] [-] mensetmanusman|4 years ago|reply
[+] [-] BiteCode_dev|4 years ago|reply
People said again and again it's going to be abused, and you had the technical knowledge to evaluate the risk.
Yet you went with it.
If it hadn't been your scenario, it would have been another one. There will be others. Hell, with things like PRISM, it's inviting three-letter agencies into your bedroom.
Since Pandora's box is open and people are going to use these devices anyway, we should at least advocate for hardware switches on every sensor of every device.
[+] [-] jvolkman|4 years ago|reply
For what it's worth, Google Homes don't have this drop-in feature. You can make calls and make announcements, but both require the person on the other end to actively answer or respond.
[+] [-] jboy55|4 years ago|reply
[+] [-] kello|4 years ago|reply
[+] [-] follower|4 years ago|reply
While I've not used the full Rhasspy project myself (trying it out is on my long list of things to do :) ), I have used its offline text-to-speech sub-project, Larynx...
...and it is amazing!
Larynx is significantly ahead of any other FLOSS text-to-speech project I've tried in output quality and variety of voices (fifty, across multiple languages, accents & genders).
I think the relative new-ness of the project is part of the reason Larynx (https://github.com/rhasspy/larynx/) currently flies under the radar.
If the rest of Rhasspy is as good as Larynx I'd imagine it's worth trying out.
Larynx demo video: https://www.youtube.com/watch?v=hBmhDf8cl0k
Samples of pre-trained voices: https://rhasspy.github.io/larynx/#en-us
[+] [-] lukifer|4 years ago|reply
The only thing I struggled with was getting the wake-word config right: I could never find the balance point where it responded every time without also giving annoying false positives, so I ended up turning it off. It does support multiple wake-word engines; I'm gonna have another go with Picovoice Porcupine now that they've opened up custom wake-word training for free.
I'm most heavily experienced with Rhasspy's sister project, voice2json [1], which I used to build a voice-controlled car jukebox [2], and it's been working fantastically. (It triggers from a Bluetooth remote, so no wake-word issues.) The two projects share the same core engine.
For hardware, a Raspberry Pi 3/4 performs quite well, and I strongly recommend the ReSpeaker [3] for audio (either the USB version or the 4-mic hat).
[0] https://nodered.org/
[1] http://voice2json.org/
[2] https://github.com/lukifer/voicetunes
[3] https://www.seeedstudio.com/category/Speech-Recognition-c-44...
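For a sense of how voice2json and Rhasspy define commands: both use a sentences.ini template file where each section is an intent. A minimal sketch, with intent, rule, and slot names made up for illustration (see the projects' docs for the real template syntax details):

```ini
[ChangeLightState]
light_name = (living room | kitchen) {name}
turn (on | off) {state} [the] <light_name> light
```

Here `(a | b)` is an alternative, `[the]` is optional, `{state}` tags the matched word as a slot, and `<light_name>` references the rule above it. Saying "turn on the kitchen light" would yield a ChangeLightState intent with name=kitchen and state=on.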
[+] [-] 2Gkashmiri|4 years ago|reply
Check the neural voices, then British, Amy. The "smoothness" is uncanny. The Larynx samples you linked are "almost" there: good, but take Kathleen (glow_tts), and there is "still" something robotic in there. Is this something that can be improved by tweaking the training? This sounds really cool for use at home.
[+] [-] 3np|4 years ago|reply
https://news.ycombinator.com/item?id=29565983
https://homeintent.io/
[+] [-] Semaphor|4 years ago|reply
You want a case, but from my research, cases can easily interfere with those mic arrays. So you need one that's custom-made for the array you're getting, but no array I've seen actually comes with such a case.
Back when I asked (in relevant subreddits and on Tildes, before I deleted my account), no one could tell me that any of my research was wrong, but no one had a solution either. I posted those threads almost two years ago, so maybe things have changed? I'm currently still using Alexa, but besides privacy reasons, I'd also love custom software that takes the idiocy out of my assistant: mainly by using pre-configured commands that do what I want instead of sometimes guessing at what I want, and also for on-the-fly language switching (Alexa is atrocious when you request a band that's not in your primary language).
I could probably get away with the kitchen and office assistant using a normal microphone, but both bedroom and living room need to recognize voices from most directions (and in the case of the living room, also have decent recognition through music playing).
If anyone has any solutions, I’d love to hear them.
[+] [-] jffry|4 years ago|reply
It seems like they have an open source client that you can run on your own hardware but it's still dependent on the backend services that Mycroft-the-company provides. Perhaps their privacy stance is more palatable to you, though?
I think there used to be a project that was fully on-device but it got bought and consumed by Sonos
[+] [-] AlotOfReading|4 years ago|reply
* changing the wake word was a pain
* having a Home account is essentially mandatory (kind of defeating the point)
* speech synthesis was really bad
* it needed a lot of rhetorical help to get useful responses
Just getting to that point involved several hours of fairly hardcore debugging and even then basic issues like reliable mic input still existed.
I also couldn't find reasonably priced (<$100) speakers/mics in an Alexa form factor, but it wouldn't be fair to blame that on the mycroft team.
[+] [-] qchris|4 years ago|reply
On the flip side, I do not, I repeat, DO NOT recommend giving their team money for pretty much any reason. They've been struggling to put together any working hardware (while promising the opposite) for years, through a genuine comedy of errors. I think the most recent design iteration of MycroftV2 is something like a Raspberry Pi on a daughterboard in a 3D-printed case. I still check in on their blog and subreddit every few months to see what's changed, and every time I'm rewarded with additional details on what seems to be one of the most incompetent engineering organizations I've ever heard of.
[+] [-] russnes|4 years ago|reply
After you get it to work, I'm not sure whether your server would have a worse data set for speech recognition etc.; maybe, maybe not. I'm guessing it should be all right, because they use a free Mozilla data set (which I think you contribute to by default if you use their server).
[+] [-] michaelnik|4 years ago|reply
Commands that work nicely and that we use all the time: "Hi Mycroft, set timer to X minutes", "play news", "set alarm to 6 am", "what's the temperature?", "will it rain?", "what's the time in Paris?".
I wrote a METAR and TAF module for it to get more detailed weather (and to learn some Python).
[+] [-] bluGill|4 years ago|reply
I backed the Mark 2 on Kickstarter; they have been promising it's better and will ship any time now - for a few years...
[+] [-] totetsu|4 years ago|reply
[+] [-] 3np|4 years ago|reply
First off, unless you want to really DIY the glue-code, you want to use HomeAssistant (huge community) or NodeRED.
The only part I'm uncertain about and haven't explored properly myself is the voice-to-text part. Any solution should be pluggable into HA or NR.
Relevant threads in HA voice-assistant sub-forum:
* https://community.home-assistant.io/t/replacing-alexa-or-goo... (Ada, Rhasspy and other FOSS alternatives)
* https://community.home-assistant.io/t/local-voice-control/29... (you can apparently use Alexa for voice control even if it does not have any internet access)
* https://community.home-assistant.io/t/best-option-for-local-... (local TTS)
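To give a feel for the HA side of the glue-code: once a voice front-end hands HA a recognized intent, the intent_script integration maps it to an action. A minimal sketch for configuration.yaml; the intent name and entity ID here are placeholders for your own setup:

```yaml
# configuration.yaml -- TurnOnKitchenLight and light.kitchen are placeholders
intent_script:
  TurnOnKitchenLight:
    speech:
      text: "Turning on the kitchen light."
    action:
      service: light.turn_on
      target:
        entity_id: light.kitchen
```

The speech text is what gets read back by whatever TTS you plug in, which keeps the response as predictable as the command.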
[+] [-] LeoPanthera|4 years ago|reply
Although audio is sent to the cloud, you can choose whether it is stored or not, and whatever you pick, Apple's privacy policy is very strict.
HomePods are mostly advertised as music playback devices, but I mostly use mine as a HomeKit control device.
Homebridge allows you to control non-HomeKit devices via Siri: https://homebridge.io
[+] [-] colordrops|4 years ago|reply
Apple has broken trust multiple times, including the FBI backdoor affair and the CSAM fiasco. They are a public company beholden to shareholders, not your trusted friend from college. Even China has strong leverage over them, due to Apple's manufacturing and market there. They are fundamentally no different from Amazon.
The poster is asking for a private solution, and this is emphatically not it.
[+] [-] bonniemuffin|4 years ago|reply
[+] [-] nullc|4 years ago|reply
You choose whether to tell them to store it or not. Due to the third-party doctrine, you have very little legal protection for material you voluntarily provide to a third party. Even if Apple behaves honestly and has no vulnerabilities or compromises, they may be forced to hand the data over.
And the fact that you don't know this says to me that Apple is in practice unethically exaggerating the level of privacy they're able to provide.
[+] [-] danielheath|4 years ago|reply
[+] [-] tzs|4 years ago|reply
If that works, that might be a solution for people who want the convenience of voice control but do not want an always-listening microphone in their house.
[+] [-] izacus|4 years ago|reply
[deleted]
[+] [-] Jarvy|4 years ago|reply
It uses Rhasspy under the covers, and automatically imports lights, switches, fans, etc and sets up sentences and intents for you. After initial setup, it can be used without an active internet connection.
All you need is some container knowledge, an extra Pi, and a good speakerphone (like a Jabra Speak 410) to get going.
Disclosure: I am the main author of Home Intent.
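The "some container knowledge" part might look roughly like the compose sketch below. The image tag, config path, and audio passthrough are assumptions for illustration; check the Home Intent docs for the actual values:

```yaml
# docker-compose.yml sketch -- image tag and paths are placeholders,
# not the project's documented values
version: "3"
services:
  homeintent:
    image: ghcr.io/jarvy/home-intent:latest  # placeholder image tag
    devices:
      - /dev/snd:/dev/snd   # pass the speakerphone through to the container
    volumes:
      - ./config:/config    # where your Home Assistant URL/token would live
    restart: unless-stopped
```

The general shape (one long-running container with the sound device and a config volume mounted in) is common to most self-hosted voice stacks.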
[+] [-] psandersen|4 years ago|reply
I'd primarily like to be able to turn on/off a few specific switches, set a few specific scenes and ideally play a few playlists (squeezebox integrated with home assistant). Would creating scripts in HA and then calling them from Hi be the way to go for customising without building custom components?
If this works well I'll be keen to add a few cheap pis plus ps3 eye cameras around the house.
[+] [-] 3np|4 years ago|reply
My family back home is hooked on Google Home, but my sister agreed to switch it off for the kids' sake; her requirement, though, is to be able to control Spotify via voice.
Assuming I set aside enough of my time to help them out and figure out appropriate hardware, do you think it's worth a shot with HI?
[+] [-] khimaros|4 years ago|reply
[+] [-] the100rabh|4 years ago|reply
[+] [-] Havoc|4 years ago|reply
Most things seem to assume a Raspberry Pi as the base hardware, which somewhat implies cloud usage.
There is also HiveMind, which seems to get around the above issue (basically Mycroft, except with a more satellite-oriented mentality toward mic placement).
I've also been wondering whether it is possible to just use cloud TTS, since those voices are quite good. That should fall under GCP/AWS etc. terms, which seem a little more privacy-friendly than straight Alexa & friends.
I'm planning on having another go at this, but first I'm tackling presence detection, which is a minefield of note too.
[+] [-] noitpmeder|4 years ago|reply
Essentially you can program your house to coded knocks. Lights/music/door/... all by sequences of knocks.
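A knock-sequence trigger mostly comes down to comparing the gaps between knocks against a stored pattern. A toy sketch; the pattern, tolerance, and function names are invented for illustration:

```python
def match_knock(timestamps, pattern, tolerance=0.15):
    """Compare the gaps between knock timestamps (in seconds) against a
    stored pattern of gaps, allowing up to `tolerance` error per gap."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) != len(pattern):
        return False
    return all(abs(g - p) <= tolerance for g, p in zip(gaps, pattern))

# A "shave and a haircut" style secret: two short gaps, then a long one.
SECRET = [0.3, 0.3, 0.8]

print(match_knock([0.0, 0.31, 0.59, 1.42], SECRET))  # prints: True
print(match_knock([0.0, 1.0, 2.0, 3.0], SECRET))     # prints: False
```

In practice you'd feed this from an accelerometer or piezo sensor and normalize for tempo, but gap matching with a tolerance is the core idea.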
[+] [-] 0xdeadb00f|4 years ago|reply
[+] [-] thebiblelover7|4 years ago|reply
[+] [-] mattowen_uk|4 years ago|reply
Personally, I live fine without a voice-controlled home assistant. I am, however, able-bodied, and I can press buttons and flick switches without much thought. If that were to change, then yeah, I can see a need for these things, but they really need to be able to work offline with no internet connection, should you wish to configure them like that.
[+] [-] abetusk|4 years ago|reply
Mozilla has DeepSpeech [0] and, while not as advanced as the stuff from Google or Amazon, my experimentation left me feeling pretty hopeful that it could reliably recognize at least keywords.
The Raspberry Pi is quite capable, though you'll probably need a dedicated microphone to reliably capture voice data. I know of the ReSpeaker [1], but some off-the-shelf USB conference microphones might work as well.
[0] https://github.com/mozilla/DeepSpeech
[1] https://wiki.seeedstudio.com/ReSpeaker_4_Mic_Array_for_Raspb...
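To illustrate the keyword angle: once an engine like DeepSpeech hands you a transcript, mapping it to an action can be a few lines. The keywords and action names below are made up for illustration:

```python
# Hypothetical keyword-to-action table; replace with your own devices.
COMMANDS = {
    "lights": "toggle_lights",
    "music": "toggle_music",
    "temperature": "read_thermostat",
}

def match_command(transcript, commands=COMMANDS):
    """Return the action for the first keyword found in the transcript,
    or None when nothing matches."""
    for word in transcript.lower().split():
        if word in commands:
            return commands[word]
    return None

print(match_command("turn on the lights please"))  # prints: toggle_lights
```

This is exactly the regime where a not-quite-state-of-the-art engine is fine: you only need the keyword to come through, not a perfect transcript.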
[+] [-] mataug|4 years ago|reply
So if you've got a few Apple devices, such as a HomePod / Apple Watch / iPhone, consider the HomeKit ecosystem; it's been working well for me.
[+] [-] rcarmo|4 years ago|reply
The good thing about it is that it takes only a couple of minutes to install via Docker and has all the base niceties out of the box (trigger word, intent detection, speech synthesis).
I would say it is good enough to tinker with, although clearly not yet up to par (mind you, I am also trying it in English).
[+] [-] hipitihop|4 years ago|reply
[+] [-] 1vuio0pswjnm7|4 years ago|reply
[+] [-] betwixthewires|4 years ago|reply
There are open source speech-to-text engines, text-to-speech engines, and assistant software with APIs; you could probably build something with a Raspberry Pi. I looked into it a while back, but I don't really mind light switches.
"Open Assistant" should get the ball rolling for you: search for that and dive down the rabbit hole of open source home automation.
[+] [-] 3a2d29|4 years ago|reply
I have never had an Alexa, but surely there isn't that much time saved by having it turn off lights. Siri on your iPhone can set alarms and shopping lists, and presumably your phone is always around you.
[+] [-] Talanes|4 years ago|reply
[+] [-] lixtra|4 years ago|reply
You should try.
It’s a trade-off: you choose to spend more time up front setting up the assistant, in order to save time setting an alarm while you have your hands in the dough.
[+] [-] neetrain|4 years ago|reply
[+] [-] gnicholas|4 years ago|reply
I set up my phone so I can say "hey Siri ok google" and then it asks me what I want to tell google. I then say any command supported by the Google Assistant, and it passes the command through.
Technically, I think it's supposed to work in one utterance, but I've found it never does; I always have to split it up. Even so, this is a pretty handy way to ask for information (Siri's knowledge seems quite limited compared to Alexa/Google) without having any always-listening devices in my home.
[+] [-] mibollma|4 years ago|reply
[+] [-] chmac|4 years ago|reply
[+] [-] edude03|4 years ago|reply