Show HN: Cerebellum – Open-Source Browser Control with Claude 3.5 Computer Use

42 points| theredsix | 1 year ago |github.com

Hi HN! I was mesmerized by the Claude Computer Use reveal last week and was specifically impressed by how well it navigated websites. This motivated me to create Cerebellum, a library that lets an LLM take control of a browser.

Here is a demo of Cerebellum in action, performing the goal “Find a USB C to C cable that is 10 feet long and add it to cart” on amazon.com:

https://youtu.be/xaZbuaWtVkA?si=Tq9lE6BXv9wjZ-qC

Currently, it uses Claude 3.5 Sonnet’s newly released computer use ability, but the ultimate goal is to crowdsource a high quality set of browser sessions to train an open source local model.

Check out the MIT-licensed repo on GitHub (https://github.com/theredsix/cerebellum) or install the library from npm (https://www.npmjs.com/package/cerebellum-ai).

Looking for feedback from the HN community, especially on: What browser tasks would you use an LLM to complete? Thanks again for taking a look!

17 comments

its_down_again|1 year ago

> but the ultimate goal is to crowdsource a high quality set of browser sessions to train an open source local model.

Could you say more on this? I see that it's an open-source implementation of PLAN with Selenium and Claude's Cursor, but where will the "successes" of browser sessions be stored? Also, will it include an anonymization feature to remove PII from authenticated use cases?

theredsix|1 year ago

The next step will be adding functionality to convert and save a BrowserStep[] into a portable file format, plus additional conversion functions to turn those files into .jsonl that can be fed into the transformers library, etc. For the PII piece, there are no current plans to introduce anonymization features, but I'm open to suggestions.
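
As a rough illustration of the .jsonl idea above, here is a minimal sketch that writes one recorded session per line and reads it back. Everything named here is an assumption for illustration: `BrowserStepLike`, `SessionRecord`, `sessionsToJsonl`, and `jsonlToSessions` are hypothetical, and the real `BrowserStep` type in Cerebellum may have different fields.

```typescript
// Hypothetical step shape; NOT Cerebellum's actual BrowserStep definition.
interface BrowserStepLike {
  action: string;                 // e.g. "click", "type", "scroll"
  coordinate?: [number, number];  // (x, y) target on the page, if any
  text?: string;                  // typed text, if any
  screenshotRef?: string;         // reference to a screenshot, not the image bytes
}

// Hypothetical wrapper pairing a goal with the steps taken to reach it.
interface SessionRecord {
  goal: string;
  steps: BrowserStepLike[];
}

// JSON Lines: one JSON object per line. JSON.stringify escapes newlines
// inside strings, so each record stays on a single line.
function sessionsToJsonl(sessions: SessionRecord[]): string {
  return sessions.map((s) => JSON.stringify(s)).join("\n");
}

// Parse JSONL text back into records, skipping blank lines.
function jsonlToSessions(text: string): SessionRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as SessionRecord);
}

const exampleSessions: SessionRecord[] = [
  {
    goal: "Find a USB C to C cable that is 10 feet long and add it to cart",
    steps: [
      { action: "click", coordinate: [640, 120] },
      { action: "type", text: "usb c to c cable 10 ft" },
    ],
  },
  {
    goal: "Multi-line text example",
    steps: [{ action: "type", text: "line one\nline two" }],
  },
];

const jsonl = sessionsToJsonl(exampleSessions);
console.log(jsonl);
```

A PII-scrubbing pass, if one were added, could run over each `SessionRecord` before `sessionsToJsonl` is called; that step is not shown here.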

imvetri|1 year ago

You don't need LLM.

Build interface to build knowledge graph.

Nodes containing words, verbs are action, nouns are past verb. Action is movement on space.

Jayakumark|1 year ago

Can this work with local models?

theredsix|1 year ago

Not at the moment, since you need a local model with strong segmentation capabilities (x, y) and none exist ATM. We hope to train one in the future, and one of Cerebellum's roadmap items is to create the ability to save your sessions as a training dataset.

theredsix|1 year ago

OP here, happy to answer any questions you may have!

philonoist|1 year ago

What do you think about this tool changing the landscape of software testing?

I think you could change the roles of SDETs and other quality assurance jobs dominated by Selenium and Playwright. I mean, think about it. It would halve the number of testers needed to do the same work.

david_shi|1 year ago

Any plans for a python version?

hugs|1 year ago

Thanks for using Selenium!

0x3331|1 year ago

Very cool!