kwindla|2 months ago
(If you do need SIP, this Asterisk project looks really great.)
Pipecat has 90 or so integrations with all the models/services people use for voice AI these days. NVIDIA, AWS, all the foundation labs, all the voice AI labs, most of the video AI labs, and lots of other people use/contribute to Pipecat. And there's lots of interesting stuff in the ecosystem, like the open source, open data, open training code Smart Turn audio turn detection model [2], and the Pipecat Flows state machine library [3].
[1] - https://docs.pipecat.ai/guides/telephony/twilio-websockets
[2] - https://github.com/pipecat-ai/smart-turn
[3] - https://github.com/pipecat-ai/pipecat-flows/
Disclaimer: I spend a lot of my time working on Pipecat. I also write about both voice AI in general and Pipecat in particular. For example: https://voiceaiandvoiceagents.com/
ldenoue|2 months ago
That’s why I created a stack that runs entirely on Cloudflare Workers and Durable Objects, in JavaScript.
Providers like AssemblyAI and Deepgram now integrate VAD into their realtime APIs, so our voice AI only needs networking (no CPU anymore).
nextworddev|2 months ago
e.g. Deepgram (STT) via websocket -> DO -> LLM API -> TTS?
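For anyone trying to picture that flow, here is a rough sketch of the per-session logic, assuming the STT provider does the VAD/endpointing and just sends transcript events. The `SttEvent` shape, `TurnAssembler`, and the `Llm`/`Tts` function types are all hypothetical stand-ins, not Deepgram's, AssemblyAI's, or Cloudflare's actual APIs; in a real Durable Object the events would arrive over a WebSocket and the two functions would be network calls.

```typescript
// Hypothetical shape of a realtime STT message; real providers use their own schemas.
interface SttEvent {
  text: string;
  isFinal: boolean;   // this segment will not be revised further
  endOfTurn: boolean; // provider-side VAD/endpointing says the speaker stopped
}

// Stand-ins for the LLM and TTS network calls (assumed, not real client APIs).
type Llm = (userText: string) => Promise<string>;
type Tts = (replyText: string) => Promise<Uint8Array>;

// Collects finalized segments and emits the full utterance once the
// provider signals end of turn. Interim (non-final) text is ignored.
class TurnAssembler {
  private segments: string[] = [];

  push(event: SttEvent): string | null {
    if (event.isFinal && event.text.trim() !== "") {
      this.segments.push(event.text.trim());
    }
    if (!event.endOfTurn) return null;
    const utterance = this.segments.join(" ");
    this.segments = [];
    return utterance === "" ? null : utterance;
  }
}

// One turn: transcript text -> LLM reply -> synthesized audio.
async function handleTurn(utterance: string, llm: Llm, tts: Tts): Promise<Uint8Array> {
  const reply = await llm(utterance);
  return tts(reply);
}

// Feeds STT events through the assembler and runs a turn whenever one completes.
async function runSession(
  events: Iterable<SttEvent>,
  llm: Llm,
  tts: Tts,
): Promise<Uint8Array[]> {
  const assembler = new TurnAssembler();
  const audioChunks: Uint8Array[] = [];
  for (const event of events) {
    const utterance = assembler.push(event);
    if (utterance !== null) {
      audioChunks.push(await handleTurn(utterance, llm, tts));
    }
  }
  return audioChunks;
}
```

Nothing here touches raw audio, which is the point being made upthread: if the provider handles endpointing, the worker only routes text and audio bytes between services.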
nextworddev|2 months ago
In your opinion, how close is Pipecat + OSS to replacing proprietary infra from Vapi, Retell, Sierra, etc.?
kwindla|2 months ago
The integrated developer experience is much better on Vapi, etc.
The goal of the Pipecat project is to provide state-of-the-art building blocks if you want to control every part of the multimodal, realtime agent processing flow and tech stack. There are thousands of companies with Pipecat voice agents deployed at scale in production, including some of the world's largest e-commerce, financial services, and healthtech companies. The Smart Turn model benchmarks better than any of the proprietary turn detection models. Companies like Modal have great info about how to build agents with sub-second voice-to-voice latency.[1] Most of the next-generation video avatar companies are building on Pipecat.[2] NVIDIA built the ACE Controller robot operating system on Pipecat.[3]
[1] https://modal.com/blog/low-latency-voice-bot
[2] https://lemonslice.com/
[3] https://github.com/NVIDIA/ace-controller/