Show HN: DeskSlice – controlling a VS Code agent from my phone
3 points| frudas24 | 1 month ago |github.com
The problem I wanted to solve was very practical: comfortably interacting with a local VS Code agent (reading outputs, scrolling, and typing prompts) from my phone, without reimplementing the UI or relying on editor internals or private APIs.
Instead of building a full remote desktop, DeskSlice streams only a calibrated slice of the desktop where the agent UI lives, and maps touch gestures back to mouse and keyboard input on the host.
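To make the gesture mapping concrete, here is a minimal sketch of how a touch on the phone could be translated into host desktop coordinates. It is not taken from DeskSlice: the `Slice` type, the `touch_to_host` function, and the assumption that the phone view shows the slice stretched edge to edge are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """Calibrated rectangle on the host desktop, in host pixels (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def touch_to_host(slice_: Slice, touch_x: float, touch_y: float,
                  view_width: float, view_height: float) -> tuple[int, int]:
    """Map a touch point in the phone's video view to host desktop pixels."""
    # Normalize to 0..1 and clamp, so touches outside the view stay inside the slice.
    nx = min(max(touch_x / view_width, 0.0), 1.0)
    ny = min(max(touch_y / view_height, 0.0), 1.0)
    # Scale into the slice, then offset by the slice origin on the desktop.
    host_x = slice_.x + round(nx * (slice_.width - 1))
    host_y = slice_.y + round(ny * (slice_.height - 1))
    return host_x, host_y
```

The returned pair would then be handed to whatever input-injection layer the host uses. Clamping is a design choice here: it keeps a stray touch from clicking outside the calibrated area.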
I originally implemented this using WebRTC, but after hitting reliability and complexity issues (signaling, renegotiation, RTP quirks), I pivoted to MJPEG over HTTP. For LAN use, MJPEG turned out to be much simpler, easier to debug, and reliable enough for UI-driven workflows.
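Part of why MJPEG over HTTP is easy to debug is that the wire format is just JPEG images separated by a multipart boundary. The sketch below shows one common way to frame a single image; the boundary name `frame` and the `mjpeg_part` helper are assumptions for illustration, not DeskSlice's actual code.

```python
BOUNDARY = "frame"  # hypothetical boundary name

# Value for the HTTP response's Content-Type header.
STREAM_CONTENT_TYPE = f"multipart/x-mixed-replace; boundary={BOUNDARY}"


def mjpeg_part(jpeg: bytes) -> bytes:
    """Wrap one JPEG image as a part of a multipart/x-mixed-replace body."""
    header = (
        f"--{BOUNDARY}\r\n"
        "Content-Type: image/jpeg\r\n"
        f"Content-Length: {len(jpeg)}\r\n"
        "\r\n"
    ).encode("ascii")
    # Each part ends with CRLF before the next boundary line.
    return header + jpeg + b"\r\n"
```

A server would send `STREAM_CONTENT_TYPE` once in the response headers and then write one `mjpeg_part(...)` per captured frame on the same connection; the browser replaces the displayed image as each part arrives.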
Key ideas:
- Manual fullscreen calibration to select the exact agent panel, input area, and scroll area
- Cropped video stream (not the full desktop)
- Touch-first interaction model (tap, drag-scroll, typing)
- No UI scraping, no state persistence; it operates the real VS Code agent UI
- Simple password gate for LAN use
This is intentionally not a general-purpose remote desktop. It’s a focused control surface for interacting with a local AI agent through its existing UI.
Sean-Der|1 month ago
thanks
frudas24|1 month ago
What we still need to debug to make WebRTC solid:
- Capture-side: full ffmpeg stderr logs + exact args when it goes black.
- RTP ingest: log SSRC/PT/seq gaps and verify SPS/PPS are regularly re-sent (e.g., with every keyframe).
- WebRTC states: log signaling/ICE/connection state transitions to catch races and “remote description not set” timing.
- Confirm whether the black screen is a capture issue vs a decode/packetization issue (capture works via MJPEG, so likely the latter).
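For the RTP ingest item, a rough sketch of the kind of logging helper that could surface SSRC/PT/seq gaps. It parses only the 12-byte RTP fixed header (RFC 3550) and does not inspect the H.264 payload for SPS/PPS. The names `RtpHeader`, `parse_rtp_header`, and `missing_packets` are hypothetical, and where the raw packets come from is left open.

```python
import struct
from typing import NamedTuple, Optional


class RtpHeader(NamedTuple):
    payload_type: int
    marker: bool
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte RTP fixed header (version, PT, seq, timestamp, SSRC)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return RtpHeader(b1 & 0x7F, bool(b1 & 0x80), seq, ts, ssrc)


def missing_packets(prev_seq: Optional[int], seq: int) -> int:
    """Count packets skipped between prev_seq and seq, allowing 16-bit wraparound."""
    if prev_seq is None:
        return 0
    delta = (seq - prev_seq) & 0xFFFF
    # delta == 1 is the normal case; a zero or very large delta is treated as a
    # duplicate or reordered packet rather than loss.
    if delta == 0 or delta >= 0x8000:
        return 0
    return delta - 1
```

Logging `ssrc`, `payload_type`, and any nonzero `missing_packets` result per packet would show whether the black screen lines up with loss on the ingest side, or whether the stream looks intact and the problem sits further along in packetization or decode.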