(no title)
itissid | 17 days ago
I have just a list of chat sessions on the web app on all my projects. The webapp is modified to launch claude code daemons (borrowed from humanlayer/codelayer) and exposes the outbound STT from the WebRTC into a chat session.
- MCP Auth is via auth0
- Webapp itself is gated by a Bearer token.
This itself gets me pretty far. I am not sure what more this is offering?
My TTS/STT models are local by Kyutai and the voice agent's LLM between STT and TTS is used to determine some basic context: e.g. what project directories, mcp servers to select and what skills to use for launching the daemons.
kmansm27|17 days ago
itissid|17 days ago
I spend my time trying to tuning the voice+webapp experience: i.e. how it can explain things, can it surface thinking tokens from claude tools properly etc. The sweat, blood, voice go into `/create_research -> /create_plan` loop before the `/implement_plan`. Sometimes I copy the research and paste it into chatGPT for review or comments as well.
I generally use the MCP to get it to follow commands and explain things to me to make progress in this cycle, and I often pause it and ask for drawing me a mermaid a sequence diagram for events or a block diagram showing how pieces go together.