sagarkava | 6 months ago
Tech stack: It runs on VideoSDK for the SIP gateway, bridging WebRTC ↔ SIP under the hood. For the AI side you can plug in whatever stack you prefer (LLM + STT + TTS). The repo includes example configs.
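To illustrate the "plug in whatever stack you prefer" idea, here is a minimal sketch of a swappable STT → LLM → TTS turn handler. It is not taken from the repo and does not use the VideoSDK API: every name in it (`SpeechToText`, `LanguageModel`, `TextToSpeech`, `VoicePipeline`, and the stub backends) is hypothetical, and the SIP/WebRTC audio transport is left out entirely.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces: any backend with a matching method can be plugged in.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Runs one conversational turn: caller audio in, agent audio out."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def handle_turn(self, caller_audio: bytes) -> bytes:
        text = self.stt.transcribe(caller_audio)   # speech -> text
        answer = self.llm.reply(text)              # text -> response text
        return self.tts.synthesize(answer)         # response text -> speech


# Stub backends for illustration only; real ones would call actual services.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")  # pretends the audio bytes are UTF-8 text


class UppercaseLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(stt=EchoSTT(), llm=UppercaseLLM(), tts=BytesTTS())
agent_audio = pipeline.handle_turn(b"hello")
```

Swapping a vendor then means replacing one constructor argument; the turn logic itself stays the same.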
Why open-source? Most WhatsApp/voice AI projects out there are closed or tied to a single vendor. I wanted something people can actually hack on, fork, and extend, whether that’s experimenting with different voices, building domain-specific agents, or integrating with CRMs.
Performance: End-to-end round-trip latency is ~400–600 ms in typical setups. With faster STT/TTS backends there’s headroom to improve this.
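If you want to see where your own round-trip time goes, one rough approach is to time each stage of a turn separately. The sketch below is an assumption-laden illustration, not how the project measures latency: `run_timed` and `slowest_stage` are hypothetical helpers, and the `time.sleep` calls are placeholder delays rather than measured numbers.

```python
import time
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[object], object]]


def run_timed(stages: List[Stage], payload: object) -> Tuple[object, Dict[str, float]]:
    """Run stages in order, returning the final payload and per-stage time in ms."""
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        payload = fn(payload)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return payload, timings


def slowest_stage(timings: Dict[str, float]) -> str:
    """Name of the stage that took the longest."""
    return max(timings, key=timings.get)


# Placeholder stages: the sleeps stand in for real backend calls.
def fake_stt(audio):
    time.sleep(0.02)
    return "hello"


def fake_llm(text):
    time.sleep(0.05)
    return text + " there"


def fake_tts(text):
    time.sleep(0.01)
    return text.encode("utf-8")


result, timings = run_timed(
    [("stt", fake_stt), ("llm", fake_llm), ("tts", fake_tts)],
    b"raw-audio",
)
total_ms = sum(timings.values())
print(f"total: {total_ms:.0f} ms, slowest stage: {slowest_stage(timings)}")
```

This only covers processing time inside the AI stack; network and telephony hops would need to be measured separately.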
I’d love feedback on use cases you’d actually want to build with this: customer support lines, personal AI assistants, language tutors, appointment scheduling, etc. Curious what directions the HN crowd would push this in.
GitHub Repo: https://github.com/videosdk-community/videosdk-whatsapp-ai-c...
Video demo: https://youtu.be/KWfCWE8S_4U?si=yb5WWr4J4n2dgBm8