(no title)
huac
|
1 year ago
real-time full duplex like OpenAI GPT-4o is pretty expensive. cascaded approaches (usually about 800ms - 1 second delay) are slower and worse, but very very cheap. when I built this a year ago, I estimated the LLM + TTS + other serving costs to be less than the Twilio costs.
jackphilson|1 year ago