top | item 44948273

(no title)

So what I meant is this: When you run our Cyberdesk agent the first time, it runs with the computer use agent. But then once that’s complete, we cache every exact step it took to successfully complete that task (every click, type, scroll) and then simply replay that the next time.

But during that replayed action, we do bring in smaller LLMs to just keep in check to see if anything unexpected happened (like a popup). If so, we fall back to computer use to take it home.

Does that make sense? At the end of the day, our agent compiles down to Pyautogui, with smart fallback to the agent if needed.

discuss

mattfrommars|6 months ago

Hi,yes. It makes sense now. To cache the steps and reply. Very efficient strategy than to run the step each time using LLM