sagarkava | 6 months ago

Hi HN — I’m launching an open-source WhatsApp AI Voice Agent for phone calls.

Tech stack: VideoSDK provides the SIP gateway and bridges WebRTC ↔ SIP under the hood. On the AI side you can plug in whatever LLM, STT, and TTS stack you prefer; the repo includes example configs.
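To make the plug-in idea concrete, here is a minimal sketch of how swappable STT, LLM, and TTS backends could sit behind small interfaces. This is not the repo's actual code: every class and method name below (`VoiceAgent`, `handle_turn`, `EchoSTT`, and so on) is hypothetical, and the toy backends only stand in for real speech and language services.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoiceAgent:
    """Hypothetical agent: one call turn is STT -> LLM -> TTS."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def handle_turn(self, caller_audio: bytes) -> bytes:
        text = self.stt.transcribe(caller_audio)   # caller speech -> text
        answer = self.llm.reply(text)              # text -> agent response
        return self.tts.synthesize(answer)         # response -> audio to play back


# Toy stand-ins so the sketch runs without any external service.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UppercaseLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


agent = VoiceAgent(stt=EchoSTT(), llm=UppercaseLLM(), tts=BytesTTS())
```

Swapping a vendor then means passing a different object that has the same method, without touching the call-handling logic.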

Why open-source? Most WhatsApp/voice AI projects out there are closed or tied to a single vendor. I wanted something people can actually hack on, fork, and extend — whether that’s experimenting with different voices, building domain-specific agents, or integrating with CRMs.

Performance: End-to-end round-trip latency is roughly 400–600 ms in typical setups, and faster STT/TTS backends leave headroom to bring that down.
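If you want to see where that round-trip time goes in your own setup, one rough approach is to time each stage separately. The helper below is a hypothetical sketch (`time_stages` is not part of the repo), and it only measures in-process stage time, so network and SIP/WebRTC transport delay would come on top of whatever it reports.

```python
import time
from typing import Callable


def time_stages(
    stages: list[tuple[str, Callable[[object], object]]],
    payload: object,
) -> tuple[object, dict[str, float]]:
    """Run stages in order, returning the final output and per-stage milliseconds."""
    timings: dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)
        timings[name] = (time.perf_counter() - start) * 1000.0
    # The total is computed before the "total" key exists, so it sums stages only.
    timings["total"] = sum(timings.values())
    return payload, timings
```

Pass it a list such as `[("stt", stt.transcribe), ("llm", llm.reply), ("tts", tts.synthesize)]` with your own backends to see which stage dominates.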

I’d love feedback on use cases you’d actually want to build with this: customer support lines, personal AI assistants, language tutors, appointment scheduling, etc. Curious what directions the HN crowd would push this in.

GitHub Repo: https://github.com/videosdk-community/videosdk-whatsapp-ai-c...

Video demo: https://youtu.be/KWfCWE8S_4U?si=yb5WWr4J4n2dgBm8
