fazlerocks | 8 months ago
Like just telling it "navigate to settings and enable dark mode" instead of writing fragile selectors… that's the dream :D
But the current implementation has some issues that make it tough for real use:
2-5 second latency per action is brutal. A simple login flow would take forever vs traditional automation.
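Back-of-envelope, to put a number on "forever". This is only a rough sketch: the 2-5 second range is the figure quoted above, while the five-step login flow and the ~50 ms cost of a scripted action are assumptions for illustration, not measurements.

```python
# Rough latency comparison for a login flow.
# LOGIN_STEPS and SCRIPTED_SECONDS_PER_ACTION are assumed values, not measurements.
LOGIN_STEPS = 5  # assumed: open page, fill username, fill password, click submit, check landing page
LLM_SECONDS_PER_ACTION = (2.0, 5.0)  # the 2-5 second range quoted above
SCRIPTED_SECONDS_PER_ACTION = 0.05  # assumed ballpark for selector-based automation


def flow_seconds(steps: int, per_action: float) -> float:
    """Total wall-clock time if every action costs the same."""
    return steps * per_action


llm_low = flow_seconds(LOGIN_STEPS, LLM_SECONDS_PER_ACTION[0])
llm_high = flow_seconds(LOGIN_STEPS, LLM_SECONDS_PER_ACTION[1])
scripted = flow_seconds(LOGIN_STEPS, SCRIPTED_SECONDS_PER_ACTION)

print(f"LLM-driven: {llm_low:.0f}-{llm_high:.0f} s, scripted: {scripted:.2f} s")
# prints "LLM-driven: 10-25 s, scripted: 0.25 s"
```

Under those assumptions that's a 40-100x slowdown for a single login, and it multiplies across every test in a suite.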
The bigger thing is reliability… how do you actually verify the LLM did what you asked vs what it thinks it did? With normal automation you get assertions and can inspect elements. Here you're kinda flying blind.
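One way to be less blind, as a minimal sketch: ignore the agent's own report and follow every natural-language step with an independent, deterministic check of the page state. `FakePage`, `fake_agent`, and `run_verified` below are hypothetical stand-ins made up for illustration, not the API of this tool or any real library.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakePage:
    """Hypothetical stand-in for a browser page; real code would query the DOM."""
    state: dict = field(default_factory=lambda: {"theme": "light"})


def fake_agent(page: FakePage, instruction: str, buggy: bool = False) -> str:
    """Hypothetical LLM agent: returns what it *claims* it did."""
    if not buggy:
        page.state["theme"] = "dark"
    return "Enabled dark mode"  # the claim is identical whether or not it worked


def run_verified(page: FakePage, instruction: str,
                 check: Callable[[FakePage], bool], **agent_kwargs) -> str:
    """Run one agent step, then assert on real state instead of the claim."""
    claim = fake_agent(page, instruction, **agent_kwargs)
    if not check(page):
        raise AssertionError(f"agent claimed {claim!r} but the check failed")
    return claim


def is_dark(page: FakePage) -> bool:
    """Deterministic post-condition for the dark mode instruction."""
    return page.state["theme"] == "dark"
```

Usage would look like `run_verified(FakePage(), "enable dark mode", is_dark)`. The catch is that the check itself still needs a selector or some state query, so this gives back part of the maintenance cost the approach was supposed to remove.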
Also "vision optional" makes me think it's not great at understanding complex UIs yet… which defeats the main selling point.
That said, this feels like where things are headed long term. As LLMs get faster and better at visual stuff, this approach could eventually beat traditional automation on maintainability. It's just not quite ready for production yet.
jeomon27 | 8 months ago