cr4zy | 1 year ago

Wow, this is awesome! Thanks for building. I didn't realize there was a protocol for streaming while rendering, though I noticed sumo.ai doing something similar for audio. Gemini with grounding is new to me also, very nice!

zan2434 | 1 year ago

Thanks! Streaming was actually pretty hard to get working, but the pipeline goes roughly like this:

- The LLM is prompted to generate an explainer video as a sequence of small Manim scene segments with corresponding voiceovers

- The LLM streams its response token by token as Server-Sent Events

- Whenever a complete Manim segment is finished, send it to Modal to start rendering

- Start streaming the rendered partial video files from Manim via HLS as they are generated
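
To make the middle steps a bit more concrete, here is a minimal sketch of the "detect a finished segment, then kick off a render" part. It is not the actual implementation: the `<segment>` / `</segment>` delimiters, `iter_complete_segments`, `run_pipeline`, and the `submit_render` callback are all hypothetical names made up for illustration, and the callback just stands in for whatever would hand the code to a remote renderer.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical delimiters: the post above does not say how segment
# boundaries are marked in the LLM output.
SEGMENT_OPEN = "<segment>"
SEGMENT_CLOSE = "</segment>"


def iter_complete_segments(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each segment body as soon as its closing tag has arrived."""
    buffer = ""
    for token in tokens:
        buffer += token
        # One token may complete several segments, so drain in a loop.
        while True:
            start = buffer.find(SEGMENT_OPEN)
            if start == -1:
                break
            body_start = start + len(SEGMENT_OPEN)
            end = buffer.find(SEGMENT_CLOSE, body_start)
            if end == -1:
                break  # segment still incomplete; wait for more tokens
            yield buffer[body_start:end].strip()
            # Drop everything up to and including the closing tag.
            buffer = buffer[end + len(SEGMENT_CLOSE):]


def run_pipeline(
    tokens: Iterable[str],
    submit_render: Callable[[int, str], None],
) -> List[str]:
    """Submit every finished segment for rendering while tokens still stream."""
    submitted: List[str] = []
    for index, segment in enumerate(iter_complete_segments(tokens)):
        submit_render(index, segment)  # placeholder for the real render call
        submitted.append(segment)
    return submitted
```

Because the splitter is a generator, the first segment can be handed off for rendering while the LLM is still producing the second one, which is the point of the whole setup. A trailing segment with no closing tag is simply never submitted.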