arjunchint | 13 days ago

Thanks for taking a look!

Our core technical moat is an agentic harness that can represent and take actions on any webpage without any screenshots. With this approach we even beat custom-trained models like OpenAI Operator and Anthropic CUA: https://www.rtrvr.ai/blog/web-bench-results

Everyone else in the space just takes a screenshot and asks a model which coordinates to click. Our core thesis is that LLMs understand semantic representations fundamentally better than vision. The DOM approach does come with a long tail of HTML/DOM edge cases, though, and we have built out coverage for them as our 20k+ users kept bringing them to us.
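To make the contrast with screenshot-based agents concrete, here is a minimal sketch of what a text-only page representation could look like. It is not the rtrvr.ai harness, whose internals are not described here: the class `InteractiveElementExtractor`, the function `describe_page`, the `[ref] <tag> label` output format, and the sample HTML are all hypothetical, and the sketch ignores the long tail of edge cases mentioned above (shadow DOM, iframes, custom widgets, hidden elements). It uses only Python's standard `html.parser` to list the elements a model could act on by reference instead of by pixel coordinates.

```python
from html.parser import HTMLParser

# Assumption: these tags are "interactive" for the purposes of this sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
VOID_TAGS = {"input"}  # interactive tags with no closing tag and no inner text


class InteractiveElementExtractor(HTMLParser):
    """Collects interactive elements with a short human-readable label."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # element whose inner text is currently being read

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        element = {
            "ref": len(self.elements),  # stable index the model can refer to
            "tag": tag,
            # Prefer explicit labels; fall back to inner text in handle_data.
            "label": attr_map.get("aria-label")
            or attr_map.get("placeholder")
            or attr_map.get("name")
            or "",
        }
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self._open = element

    def handle_endtag(self, tag):
        if self._open is not None and self._open["tag"] == tag:
            self._open = None

    def handle_data(self, data):
        text = data.strip()
        # Use the first non-empty inner text only if no attribute label exists.
        if self._open is not None and text and not self._open["label"]:
            self._open["label"] = text


def describe_page(html):
    """Return one line per interactive element: '[ref] <tag> label'."""
    parser = InteractiveElementExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(
        f'[{e["ref"]}] <{e["tag"]}> {e["label"]}'.rstrip() for e in parser.elements
    )


SAMPLE_HTML = """
<div class="hero"><h1>Welcome</h1>
  <input type="text" placeholder="Search products">
  <button id="go">Search</button>
  <a href="/cart" aria-label="View cart"><img src="cart.png"></a>
</div>
"""

print(describe_page(SAMPLE_HTML))
```

With a representation like this, the model can answer with something like "click ref 1" rather than guessing x/y coordinates from an image, and the harness resolves the reference back to a concrete element.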

Soon you will be able to record demonstration tasks via our partner Chrome Extension, as well as set up knowledge bases scraped by our cloud browsers to give the agent additional context. So there is a platform moat as well.

The audience is website owners who want to increase visitor engagement and conversion via a conversational interface for users.

This is more for our cloud browser platform, where we launch cloud browsers for vibe scraping that are controlled via a custom extension instead of CDP (Chrome DevTools Protocol). You can try it out at rtrvr.ai/cloud, where we can get results back from some sites with strong anti-bot detection, like google.com.
