top | item 46223630

AndreSlavescu | 2 months ago

We actually deployed working speech-to-speech inference built on top of vLLM as the backbone. The main thing was supporting the "Talker" module, which is currently not supported on the qwen3-omni branch of vLLM.

Check it out here: https://models.hathora.dev/model/qwen3-omni
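For a rough idea of what a speech-to-speech request against such a hosted endpoint might look like, here is a minimal sketch. The endpoint path, field names, model id, and response format are all assumptions for illustration; they are not the actual Hathora or vLLM API.

```python
import base64
import json

def build_speech_request(audio_bytes: bytes, model: str = "qwen3-omni") -> str:
    """Encode raw audio and wrap it in a JSON body for a hypothetical
    speech-to-speech endpoint (field names are illustrative only)."""
    payload = {
        "model": model,
        # audio is commonly base64-encoded for transport in JSON bodies
        "input_audio": base64.b64encode(audio_bytes).decode("ascii"),
        # hypothetical flag asking the Talker module for speech output
        "response_format": "audio",
    }
    return json.dumps(payload)

# Fake PCM bytes stand in for a real recording here.
body = build_speech_request(b"\x00\x01fake-pcm-bytes")
print(json.loads(body)["model"])
```

The actual platform may accept multipart uploads or a streaming connection instead; this only illustrates the general request shape.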


sosodev | 2 months ago

Is your work open source?

AndreSlavescu | 2 months ago

At the moment, unfortunately not. However, as for open-source alternatives, the vLLM team has now published a separate repository for omni models:

https://github.com/vllm-project/vllm-omni

I have not yet tested whether it does full speech-to-speech, but it seems like a promising workspace for omni-modal models.

red2awn | 2 months ago

Nice work. Are you working on streaming input/output?

AndreSlavescu | 2 months ago

Yeah, that's something we currently support. Feel free to try the platform out! There's no cost to you for now; you just need a valid email to sign up.