arcb | 8 months ago
We're evaluating Cua (https://www.ycombinator.com/companies/cua) to containerize our agents; I'm a fan so far. We're also putting Computer Use agents from OAI and Anthropic to the test. Many legacy ERPs don't run in the browser, and we have to meet them there. I think we're a few months away from things working reliably and efficiently.
We're evaluating several of the top models (both open and closed) for browser navigation (Claude's winning atm) and PDF extraction. Since we're performing repetitive tasks, the goal is to make our workflows RL-able. Being able to rely on OSS models will help a lot here.
We're building our own datasets and evaluations for many of the subtasks. We're using OpenAI's evals (https://github.com/openai/evals) as a framework to guide our own tooling.
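To make the per-subtask eval idea concrete, here's a rough sketch of what a tiny exact-match eval loop could look like. This is not the openai/evals API and not our actual tooling; `Sample`, `normalize`, `run_eval`, and `stub_model` are made-up names for illustration, and the stub stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Sample:
    input: str  # e.g. text pulled from a PDF page or a screen
    ideal: str  # the answer we expect for this subtask


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting
    # differences don't count as failures.
    return " ".join(text.lower().split())


def run_eval(samples: Iterable[Sample], model: Callable[[str], str]) -> float:
    # Returns exact-match accuracy in [0, 1]; 0.0 for an empty dataset.
    samples = list(samples)
    if not samples:
        return 0.0
    correct = sum(
        normalize(model(s.input)) == normalize(s.ideal) for s in samples
    )
    return correct / len(samples)


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call: returns whatever
    # follows the last colon in the prompt.
    return prompt.split(":")[-1].strip()


samples = [
    Sample("Patient name: Jane Doe", "jane doe"),
    Sample("Dose: 5 mg", "10 mg"),  # deliberately wrong, to show a miss
]
print(run_eval(samples, stub_model))
```

Swapping `stub_model` for a function that calls an open or closed model would let the same dataset score both, which is the comparison that matters for a subtask like PDF extraction.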
Apart from that, we write in TypeScript, Python, and Go. We use Postgres for persistence (nothing fancy here). We host on AWS, and might go on-premises for some customers. We plan on investing a lot into our workflow system as the backbone of our product.
I prefer open source when possible. Everything's new and early, and many things require source changes that others might not be able to prioritize.
Edit - one thing I'd love to find a good solution for is reliably extracting handwriting from PDF documents. Clinicians have to do this a ton to keep the trains running on time, and being able to digitize that knowledge on the go will be huge.
Very open to ideas here. We're seeing great tools and products come up by the day, including from our own YC batch.