Show HN: Pi-C.A.R.D, a Raspberry Pi Voice Assistant
344 points | nkaz123 | 1 year ago | github.com
It uses distributed models so latency is something I'm working on, but I am curious where this could go, if anywhere.
Very much a WIP. Feedback welcome :-)
rkagerer|1 year ago
Props, and thank you for this.
pyaamb|1 year ago
squarefoot|1 year ago
A question: does it run only on the Pi5 or other (also non Raspberry Pi) boards?
rob74|1 year ago
But seriously - the name got my attention, then I read the introduction and thought "hey, Alexa without uploading everything you say to Amazon? This might actually be something for me!".
> The default wake word is "hey assistant" - I would suggest "Computer" :) And of course it should have a voice that sounds like https://en.wikipedia.org/wiki/Majel_Barrett
ornornor|1 year ago
pawelduda|1 year ago
This project seems to be ticking most, if not all, of the boxes compared to anything else I've seen. Good job!
While at it, can someone drop a recommendation for an RPi-compatible mic for an Alexa-like use case?
baobun|1 year ago
You won't get anything practically useful running LLMs on the 4B, but you also don't strictly need LLM-based models.
In the Rhasspy community, a common pattern is to do (cheap and lightweight) wake-word detection locally on mic-attached satellites (here 4B should be sufficient) and then stream the actual recording (more computational resources for better results) over the local network to a central hub.
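A minimal sketch of that satellite/hub split, with stand-in strings for audio chunks and a trivially cheap wake-word check (real setups use a proper wake-word engine and stream audio over the network; the function names here are hypothetical):

```python
# Hypothetical sketch of the satellite/hub pattern described above.
# The satellite (e.g. a Pi 4B) runs only the cheap wake-word check;
# everything after the wake word is forwarded to the hub, where the
# heavy STT/LLM work happens. Audio chunks are stand-in strings here.

def detect_wake_word(chunk: str, wake_word: str = "hey assistant") -> bool:
    """Cheap local check that runs on the satellite."""
    return wake_word in chunk.lower()

def satellite_loop(chunks, send_to_hub):
    """Forward audio to the hub only once the wake word has fired."""
    streaming = False
    for chunk in chunks:
        if not streaming and detect_wake_word(chunk):
            streaming = True   # wake word heard: start streaming
            continue           # don't forward the wake word itself
        if streaming:
            send_to_hub(chunk) # hub does the expensive recognition

received = []
satellite_loop(
    ["background noise", "Hey assistant", "what's the weather", "today"],
    received.append,
)
print(received)  # ["what's the weather", "today"]
```

The point of the pattern is that the satellite never needs to run anything heavier than the wake-word check, so modest boards stay viable.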
8xeh|1 year ago
yangikan|1 year ago
jhbruhn|1 year ago
NabuCasa employed the Rhasspy main dev to work on these functionalities and they are progressing with every update.
gazelle21|1 year ago
[deleted]
eddieroger|1 year ago
Missed opportunity for LCARS - LLM Camera Audio Recognition Service, responding to the keyword "computer," naturally. I guess if this ran elsewhere from a Pi, it could be LCARS.
rkagerer|1 year ago
MisterTea|1 year ago
bdcravens|1 year ago
LOCUTUS
layer8|1 year ago
ethagnawl|1 year ago
About a year ago, my family was really keen on getting an Alexa. I don't want Bezos spy devices in our home, so I convinced them to let me try making our own. I went with Mycroft on a Pi 4 and it did not go well. The wake word detection was inconsistent, the integrations were lacking and I think it'd been effectively abandoned by that point. I'd intended to contribute to the project and some of the integrations I was struggling with but life intervened and I never got back to it. Also, thankfully, my family forgot about the Alexa.
genewitch|1 year ago
It used some Google thing on the backend, and it was really frustrating to get set up and keep working - but it did work.
I have two of those devices, so I've been waiting for something to come along that would let me self-host something similar.
nkaz123|1 year ago
knodi123|1 year ago
I've googled it before, but the space is crowded and the caveats are subtle.
CaptainOfCoit|1 year ago
But overall, Pi-C.A.R.D seems to be using Python and C++, so there shouldn't be any issues whatsoever running this on whatever Python and C++ can be run/compiled on.
MH15|1 year ago
robbyiq999|1 year ago
piltdownman|1 year ago
genewitch|1 year ago
anyhow the GPU would sit on a small PCB that would connect to an even smaller PCB in the PCIe slot on the motherboard, via a USB3 cable. My point here is merely that whatever PCIe is, it can be carried to a GPU to do work over USB3 cables.
harwoodr|1 year ago
nkaz123|1 year ago
Additionally, since I'm streaming the LLM response, it won't take long to get a reply. Because it speaks a chunk at a time, occasionally only parts of words get said momentarily. How long you need to wait also depends, of course, on which model you use and what the context size is.
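One common fix for the partial-word issue (a sketch of the general technique, not the project's actual code) is to buffer the streamed chunks and only hand text to the TTS engine at word boundaries:

```python
# Hedged sketch: buffer an arbitrarily-chunked LLM token stream and
# only flush complete words, so the TTS never speaks half a word.

def flush_complete_words(stream):
    """Yield only whole words from an arbitrarily-chunked text stream."""
    buffer = ""
    for chunk in stream:
        buffer += chunk
        # Everything up to the last space is safe to speak now.
        cut = buffer.rfind(" ")
        if cut != -1:
            yield buffer[:cut + 1]
            buffer = buffer[cut + 1:]
    if buffer:  # flush whatever remains when the stream ends
        yield buffer

# Without buffering, "Hel" + "lo wor" + "ld!" would be spoken as-is:
pieces = list(flush_complete_words(["Hel", "lo wor", "ld!"]))
print(pieces)  # ['Hello ', 'world!']
```

Buffering to sentence boundaries instead trades a little extra latency for more natural prosody.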
stereosteve|1 year ago
dragonwriter|1 year ago
Completely OT, but, most likely, he doesn't. Lots of people in the show interact with the replicators more fluidly; "Tea, Earl Grey, Hot" seems like a Picard quirk, possibly developed with more primitive food/beverage units than the replicators on the Enterprise-D.
lttlrck|1 year ago
Will there still be lawsuits in the post-scarcity world? Probably.
TeMPOraL|1 year ago
Most of Starfleet folks seem to not know how to use their replicators well anyway. For all the smarts they have, they use it like a mundane appliance they never bothered to read the manual for, miss 90% of its functionality, and then complain that replicated food tastes bad.
cheeseomlit|1 year ago
unknown|1 year ago
[deleted]
pkaye|1 year ago
aci_12|1 year ago
knodi123|1 year ago
dasl|1 year ago
nkaz123|1 year ago
I also tried using Vulkan, which is supposedly faster, but the times were a bit slower than plain CPU with llama.cpp.
zenkalia|1 year ago
The readme mentions a memory that lasts only as long as each conversation, which seems like such a hard limitation to live with.
cmcconomy|1 year ago
knodi123|1 year ago
8mobile|1 year ago
unknown|1 year ago
[deleted]
ghnws|1 year ago
nkaz123|1 year ago
Whisper tiny is multilingual (though I am using the English-specific variant), and I believe Llama 3 is technically capable of multilingual output, but I'm not sure of any benchmarks.
I think it could be made better, but for now the focus is English. I'll add this to the readme though. Thanks!
timendum|1 year ago
alexalx666|1 year ago
nl|1 year ago
I think it uses local models, right?
kazinator|1 year ago
ddingus|1 year ago
nickthegreek|1 year ago
yumong|1 year ago
[deleted]
wwryuryrf|1 year ago
[deleted]