Show HN: Webctl – Browser automation for agents based on CLI instead of MCP
134 points | cosinusalpha | 1 month ago | github.com
I initially built this to solve a personal headache: I wanted an AI agent to handle project management tasks on my company’s intranet. I needed it to persist cookies across sessions (to handle SSO) and then scrape a Kanban board.
Existing AI browser tools (like current MCP implementations) often force unsolicited data into the context window—dumping the full accessibility tree, console logs, and network errors whether you asked for them or not.
webctl is an attempt to solve this with a Unix-style CLI:
- Filter before context: You pipe the output to standard tools. webctl snapshot --interactive-only | head -n 20 means the LLM only sees exactly what I want it to see.
- Daemon Architecture: It runs a persistent background process. The goal is to keep the browser state (cookies/session) alive while you run discrete, stateless CLI commands.
- Semantic targeting: It uses ARIA roles (e.g., role=button name~="Submit") rather than fragile CSS selectors.
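The filtering and targeting bullets above can be sketched in a few lines of Python. This is not webctl's implementation: the `Node` type, the snapshot contents, and the helper names are all hypothetical, and treating `~=` as a case-insensitive substring match (with `=` as an exact match) is an assumption made only for illustration. The `limit` argument plays the role of `head -n 20`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One entry in a hypothetical accessibility snapshot."""
    role: str
    name: str


# Matches key=value or key~=value; the value is double-quoted or a bare word.
_TERM = re.compile(r'(\w+)(~?=)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split 'role=button name~="Submit"' into (key, operator, value) terms."""
    return [(key, op, quoted if quoted else bare)
            for key, op, quoted, bare in _TERM.findall(query)]


def matches(node, terms):
    """Return True if the node satisfies every term of the query."""
    fields = {"role": node.role, "name": node.name}
    for key, op, value in terms:
        actual = fields.get(key)
        if actual is None:
            return False  # unknown key: nothing can match
        if op == "=" and actual != value:
            return False  # assumed semantics: exact match
        if op == "~=" and value.lower() not in actual.lower():
            return False  # assumed semantics: case-insensitive substring
    return True


def select(nodes, query, limit=20):
    """Filter the snapshot first, then cap how much reaches the LLM context."""
    terms = parse_query(query)
    return [n for n in nodes if matches(n, terms)][:limit]


# Made-up snapshot, for illustration only.
snapshot = [
    Node("heading", "Sprint board"),
    Node("button", "Submit ticket"),
    Node("button", "Cancel"),
    Node("link", "Submit feedback"),
]

for node in select(snapshot, 'role=button name~="Submit"'):
    print(node.role, node.name)
```

Only the nodes that survive the query and the cap would ever be handed to the model, which is the point of filtering before context.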
Disclaimer: The daemon logic for state persistence is still a bit experimental, but the architecture feels like the right direction for building local, token-efficient agents.
It’s basically "Playwright for the terminal."
binalpatel|1 month ago
(my contribution, one of many: https://github.com/caesarnine/binsmith)
cosinusalpha|1 month ago
Nevertheless, I prefer the CLI for other reasons: it is built for humans and is much easier to debug.
fudged71|1 month ago
0x696C6961|1 month ago
desireco42|1 month ago
the_mitsuhiko|1 month ago
dtkav|1 month ago
Plus, now it is personal software... just keep asking it to improve the skill based on your usage. Bake in domain knowledge or business logic or whatever you want.
I'm using this for e2e testing and debugging Obsidian plugins and it is starting to understand Obsidian inside and out.
cosinusalpha|1 month ago
kinduff|1 month ago
gregpr07|1 month ago
cosinusalpha|1 month ago
I actually tried a raw HTML approach when I was exploring solutions. It worked for "one-off" tasks, but I ran into major issues with replayability on modern SPAs.
In React apps, the raw DOM structure and auto-generated IDs shift so frequently that a script generated from "Raw HTML" often breaks 10 minutes later. I found ARIA/semantics to be the only stable contract that persists across re-renders.
You mentioned the raw HTML approach is "expensive". Did you feed the full HTML into the context, or did you create a BS4 "tool" for the LLM to query the raw HTML dynamically?
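A toy illustration of the replayability point, assuming two made-up renders of the same page in which only the auto-generated `id` changes. The node dictionaries and helper names are hypothetical and not taken from webctl or any real app.

```python
# Two hypothetical renders of the same view. The framework regenerated the
# element ids between renders; role and accessible name stayed the same.
render_a = [
    {"id": "btn-7f3a", "role": "button", "name": "Save"},
    {"id": "inp-02bd", "role": "textbox", "name": "Title"},
]
render_b = [
    {"id": "btn-c91e", "role": "button", "name": "Save"},
    {"id": "inp-88a0", "role": "textbox", "name": "Title"},
]


def find_by_id(nodes, element_id):
    """Selector recorded from raw HTML: depends on the generated id."""
    return next((n for n in nodes if n["id"] == element_id), None)


def find_by_role(nodes, role, name):
    """Semantic selector: depends only on ARIA role and accessible name."""
    return next(
        (n for n in nodes if n["role"] == role and n["name"] == name), None
    )


# A script recorded against the first render remembers the id it saw.
recorded_id = find_by_role(render_a, "button", "Save")["id"]

# Replayed against the second render, the id lookup finds nothing,
# while the role/name lookup still resolves to the Save button.
stale = find_by_id(render_b, recorded_id)
stable = find_by_role(render_b, "button", "Save")
print("id lookup:", stale)
print("role lookup:", stable)
```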
TheTaytay|1 month ago
I’d like to see this other browser plugin’s API exposed through the same CLI, so that I’m not limited to controlling a separate browser instance: https://github.com/remorses/playwriter (I haven’t investigated enough to know how feasible it is, but as I was reading about your tool, I immediately wanted to control existing tabs in my main browser, rather than “just” a separate, debug-driven browser instance.)
cosinusalpha|1 month ago
But I agree, attaching to the OS "daily driver" instance specifically would be a nice addition.
randito|1 month ago
Video: https://youtu.be/ojL_VHc4gLk?t=2132
More discussion: https://simonwillison.net/2025/Jun/23/phoenix-new/
renegat0x0|1 month ago
https://github.com/rumca-js/crawler-buddy
More like a framework for other mechanisms
philipbjorge|1 month ago
How is it different?
cosinusalpha|1 month ago
The main difference is likely the targeting philosophy. webctl relies heavily on ARIA roles/semantics (e.g. role=button name="Save") rather than injected IDs or CSS selectors. I find this makes the automation much more robust to UI changes.
Also, I went with Python for V1 simply for iteration speed and ecosystem integration. I'd love to rewrite in Rust eventually, but Python was the most efficient way to get a stable tool working for my specific use case.
hugs|1 month ago
"browser automation for ai agents" is a popular idea these days.
desireco42|1 month ago
cosinusalpha|1 month ago
grigio|1 month ago
cosinusalpha|1 month ago
An objective benchmark is a great idea, especially to compare webctl against other similar CLI-based tools. I'll definitely look into how to set that up.
Agent_Builder|1 month ago
[deleted]
cosinusalpha|1 month ago
AI-love|1 month ago
[deleted]