top | item 41567387

Show HN: Finic – Open source platform for building browser automations

143 points | jasonwcfan | 1 year ago | github.com

Last year we launched a project called Psychic that did moderately well on Hacker News, but was a commercial failure. We were able to find customers, but none with compelling and overlapping use cases. Everyone who was interested was too early to be a real customer.

This was our launch: https://news.ycombinator.com/item?id=36032081

We recently decided to revive and rebrand the project after seeing a sudden spike in interest from people who wanted to connect LLMs to data - but specifically through browsers. It's also a problem we've experienced firsthand, having built scraping features into Psychic and previously working on bot detection at Robinhood.

If you haven’t built a web scraper or browser automation before, you might assume it’s very straightforward. People have been building scrapers for as long as the internet has existed, so there must be many tools for the job.

The truth is that web scraping strategies need to constantly adapt as web standards change, and as companies that don’t want to be scraped adopt new technologies to try to block it. The old standards never completely go away, so the longer the internet exists, the more edge cases you’ll need to account for. This adds up to a LOT of infrastructure that needs to be set up, and a lot of schlep developers have to go through to get up and running.

Scraping is no easier today than it was 10 years ago - the problems are just different.

Finic is an open source platform for building and deploying browser agents. Browser agents are bots deployed to the cloud that mimic the behaviour of humans, like web scrapers or remote process automation (RPA) jobs. Simple examples include scripts that scrape static websites like the SEC's EDGAR database. More complex use cases include integrating with legacy applications that don’t have public APIs, where the best way to automate data entry is to just manipulate HTML selectors (EHRs for example).

Our goal is to make Finic the easiest way to deploy a Playwright-based browser automation. With this launch, you can already do so in just 4 steps. Check out our docs for more info: https://docs.finic.io/quickstart
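For illustration, a minimal Playwright automation of the kind described above might look like the sketch below. The EDGAR base URL and the `documentsbutton` link id are real parts of the SEC site, but the CIK value and overall flow are just an example, not Finic's actual implementation:

```python
def filing_index_url(cik: str, filing_type: str = "10-K") -> str:
    # Pure helper: build an EDGAR company-browse URL.
    return (
        "https://www.sec.gov/cgi-bin/browse-edgar"
        f"?action=getcompany&CIK={cik}&type={filing_type}"
    )

def fetch_filing_links(cik: str) -> list[str]:
    # Imported lazily so the URL helper above is usable without the browser stack.
    from playwright.sync_api import sync_playwright  # pip install playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)  # needs `playwright install chromium`
        page = browser.new_page()
        page.goto(filing_index_url(cik))
        # "documentsbutton" is the id EDGAR uses for each filing's "Documents" link.
        links = page.locator("a#documentsbutton").all_inner_texts()
        browser.close()
        return links

if __name__ == "__main__":
    print(fetch_filing_links("0000320193"))  # 0000320193 = Apple's CIK
```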

77 comments

ghxst|1 year ago

Cool service, but how do you plan to deal with anti-scraping and anti-bot services like Akamai, Arkose, Cloudflare, DataDome, etc.? Automation of the web isn't solved by another Playwright or Puppeteer abstraction; you need to solve more fundamental problems in order to mitigate the issues you run into at scale.

jasonwcfan|1 year ago

I mentioned this in another comment, but I know from experience that it's impossible to reliably differentiate bots from humans over a network. And since the right to automate browsers has survived repeated legal challenges, all vendors can do is make automation incrementally harder, which weeds out only the low-sophistication actors.

This actually creates an evergreen problem that companies need to overcome, and our paid version will probably involve helping companies overcome these barriers.

Also I should clarify that we're explicitly not trying to build a playwright abstraction - we're trying to remain as unopinionated as possible about how developers code the bot, and just help with the network-level infrastructure they'll need to make it reliable and make it scale.

It's good feedback for us, we'll make that point more clear!

suriya-ganesh|1 year ago

I've been working on a browser agent for the last week[1], so this is very exciting. There are also browser agent implementations like Skyvern[2] (also YC-backed) or Tarsier[3]. It seems like Finic is providing a way to scale/schedule these agents? If that's the case, what's the advantage over something like Airflow or Windmill?

If I remember correctly, Skyvern also has an implementation of scaling these browser tasks built in.

P.S. Is it not called Robotic Process Automation? This is the first time I'm hearing it as "Remote Process Automation."

[1]https://github.com/ProductLoft/arachne

[2]https://www.skyvern.com/

[3]https://github.com/reworkd/tarsier

mdaniel|1 year ago

https://github.com/reworkd/tarsier/pull/115/files represents someone who does not know what git is used for

  Cloning into 'tarsier'...
  remote: Enumerating objects: 15238, done.
  remote: Counting objects: 100% (1613/1613), done.
  remote: Compressing objects: 100% (929/929), done.
  Receiving objects: 100% (15238/15238), 3.01 GiB | 14.82 MiB/s, done.

ayanb9440|1 year ago

Yup, that's right, it's Robotic Process Automation.

Based on the feedback in this thread, we're going to be releasing an updated version that focuses more on tooling for the browser agents themselves, as opposed to scaling/scheduling, so stay tuned for that!

dataviz1000|1 year ago

I build browser automation systems with either Playwright or Chrome extensions. The biggest issue with automating third-party websites is knowing when the third-party developer pushes changes that break the automation. The way I deal with that is to run a headless browser in the cloud that periodically checks the behavior of the automated site, sending email and SMS messages when it breaks.

If you don't already have this feature for your system, I would recommend it.
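A minimal sketch of that kind of check, under the assumption that it's enough to hash the markup and verify the selectors the automation depends on still exist (the URL, hash, and selectors are placeholders, and the email/SMS dispatch is not shown):

```python
import hashlib
from urllib.request import urlopen

def page_fingerprint(html: str) -> str:
    # Stable hash of the page markup; a changed hash means *something*
    # changed, not necessarily something that breaks the automation.
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

def missing_selectors(html: str, selectors: list[str]) -> list[str]:
    # Naive substring check for the markup the automation relies on;
    # returns the pieces that no longer appear in the page.
    return [s for s in selectors if s not in html]

def check(url: str, known_hash: str, selectors: list[str]) -> bool:
    # Returns True if the page still looks automatable.
    html = urlopen(url).read().decode("utf-8", errors="replace")
    if missing_selectors(html, selectors):
        return False  # a selector we click/fill has disappeared: alert
    return page_fingerprint(html) == known_hash

if __name__ == "__main__":
    # Hypothetical cron entry point; wire the False case to email/SMS.
    print(check("https://example.com", "expected-hash-here", ['id="login"']))
```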

ghxst|1 year ago

IO between humans and websites can be broken down into only a few fundamental pieces (or elements, I should say). This is actually where AI has a lot of opportunity to add value, as it has the capability of significantly reducing the possibility of breakage between changes.

ayanb9440|1 year ago

That's a great suggestion! Essentially a cron job to check for website changes before your automation runs and possibly breaks.

What does this check look like for you? Do you just diff the HTML to see if there are any changes?

Oras|1 year ago

Don't take this as a negative thing, but I'm confused. Is it a Playwright wrapper? Is it a residential proxy? It's not clear from your video.

jasonwcfan|1 year ago

Proxies are definitely on our roadmap, but for now it just supports stock Playwright.

Thanks for the feedback! I just updated the repo to make it more clear that it's Playwright based. Once my cofounder wakes up I'll see if he can re-record the video as well.

mdaniel|1 year ago

> Finic uses Playwright to interact with DOM elements, and recommends BeautifulSoup for HTML parsing.

I have never, ever understood anyone who goes to the trouble of booting up a browser and then uses a Python library to do static HTML parsing.
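For what it's worth, once a real browser is running, Playwright's locator API can query the rendered DOM directly, so handing the HTML off to BeautifulSoup is usually redundant. A sketch, where the table selector and the two-column layout are made-up examples:

```python
def normalize_row(cells: list[str]) -> dict:
    # Pure helper: turn a row of cell texts into a record.
    date, title = (c.strip() for c in cells[:2])
    return {"date": date, "title": title}

def extract_rows(page) -> list[dict]:
    # `page` is a Playwright Page. Locators query the *live, rendered*
    # DOM, so there is nothing left for a static HTML parser to do.
    return [
        normalize_row(row.locator("td").all_inner_texts())
        for row in page.locator("table.filings tr").all()
        if row.locator("td").count() >= 2
    ]
```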

Anyway, I was surfing around the repo trying to find what, exactly "Safely store and access credentials using Finic’s built-in secret manager" means

ayanb9440|1 year ago

We're in the middle of putting this together right now, but it's going to be a wrapper around Google Secret Manager for those who don't want to set up a secrets manager themselves.

0x3444ac53|1 year ago

Oftentimes websites won't load the HTML without executing JavaScript, or use JavaScript running client-side to generate the entire page.

msp26|1 year ago

What would you recommend for parsing instead?

krick|1 year ago

Does anyone know a solid (not SaaS, obviously) solution for scraping these days? It's getting pretty hard to get around some pretty harmless cases (like bulk-downloading MY OWN GPX tracks from some fucking fitness-watch servers), with all these JS tricks, countless redirects, Cloudflare, and so on. Even if you already have the cookies, getting a non-403 response to any request is very much not trivial. I feel like it's time to upgrade my usual approach of Python requests+libxml, but I don't know if there is a library/tool that solves some of the problems for you.

_boffin_|1 year ago

- launch chrome with loading of specified data dir.

- connect to it remotely

- ghost cursor and friends

- save cookies and friends to data dir

- run from residential ip

- if served a captcha or Cloudflare challenge, direct to a solver and then route back.

- mobile ip if possible

…can’t go into any more specifics than that

…I forget the site right now, but there's a guy who gives a good rundown of this stuff. I'll see if I can find it.
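The first two bullets (a persistent data dir plus remote control) can be sketched with Playwright's CDP support; the port, profile path, and binary name below are placeholders:

```python
def chrome_command(port: int, profile_dir: str) -> list[str]:
    # Pure helper: the command that starts Chrome with a persistent
    # profile and an open DevTools port for remote control.
    return [
        "google-chrome",
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
    ]

def attach(port: int = 9222):
    # Attach to the already-running Chrome instead of launching a fresh,
    # obviously-automated one; this reuses its cookies and sessions.
    from playwright.sync_api import sync_playwright  # pip install playwright

    p = sync_playwright().start()
    browser = p.chromium.connect_over_cdp(f"http://localhost:{port}")
    context = browser.contexts[0]  # the profile's existing context
    return context.pages[0] if context.pages else context.new_page()

if __name__ == "__main__":
    print(" ".join(chrome_command(9222, "~/chrome-profile")))
```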

djbusby|1 year ago

I use a few things. First, I scrape from my home IP at very low rates. I drive either FF or Chrome using an extension. Sometimes I have to start the session manually (not a robot) and then engage the crawler. Sometimes, site dependent, I can run headless or use Puppeteer. But the extension in a "normal" browser that goes slow has been working great for me.

It seems that some sites can detect when you're using a headless or WebDriver-enabled profile.

Sometimes I'm through a VPN.

The automation is the easy part.

_boffin_|1 year ago

Heads up, requests adds some extra headers on send.

One thing I've also been doing recently, when I find a site that I just want an API for, is to use Python to execute a curl command. I populate the curl from Chrome's network tab. I also have a purpose-built extension in my browser that saves cookies to a LAN Postgres DB, and the script then uses those values.

You could probably do even more by automating the browser to navigate there on failure.

bobbylarrybobby|1 year ago

On a Mac, I use Keyboard Maestro, which can interact with the UI (which is usually stable enough to form an interface of sorts): wait for a graphic to appear on screen, then click it, then simulate keystrokes, run JavaScript on the current page and get a result back... It looks very human to a website in a browser, and is nearly as easy to write as Python.

iansinnott|1 year ago

In short: Don't use HTML endpoints, use APIs.

This is not always possible, but if the product in question has a mobile app or a wearable talking to a server, you might be able to utilize the same API it's using:

- intercept requests from the device
- find relevant auth headers/cookies/params
- use that auth to access the API

whilenot-dev|1 year ago

If requests solves any 403 headaches for you, just pass the session cookies to a Playwright instance, and you should be good to go. I just did that for scraping the SAP Software Download Center.
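That handoff can be sketched like this. `BrowserContext.add_cookies` is the real Playwright call; the converter function and URLs are illustrative, assuming the cookies come from any cookiejar-style iterable such as `requests.Session().cookies`:

```python
def to_playwright_cookies(jar) -> list[dict]:
    # Convert cookiejar-style cookies (objects with name/value/domain/path
    # attributes, e.g. a requests.Session().cookies jar) into the dict
    # shape that Playwright's BrowserContext.add_cookies expects.
    return [
        {
            "name": c.name,
            "value": c.value,
            "domain": c.domain,
            "path": c.path or "/",
        }
        for c in jar
    ]

if __name__ == "__main__":
    # Hypothetical usage: authenticate with requests, then hand the
    # session to a real browser for the JS-heavy part.
    import requests
    from playwright.sync_api import sync_playwright

    sess = requests.Session()
    sess.get("https://example.com/login")  # obtain session cookies
    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context()
        context.add_cookies(to_playwright_cookies(sess.cookies))
        page = context.new_page()
        page.goto("https://example.com/downloads")
```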

lambdaba|1 year ago

I've found selenium with undetected-chromedriver to work best.

whatnotests2|1 year ago

With agents like Finic, soon the web will be built for agents, rather than humans.

I can see that a few years from now, almost all web traffic will be agents.

jasonwcfan|1 year ago

Yep. I used to be the guy responsible for bot detection at Robinhood so I can tell you firsthand it's impossible to reliably differentiate between humans and machines over a network. So either you accept being automated, or you overcorrect and block legitimate users.

I don't think the dead internet theory is true today, but I think it will be true soon. IMO that's actually a good thing, more agents representing us online = more time spent in the real world.

j0r0b0|1 year ago

Thank you for sharing!

Your sign up flow might be broken. I tried creating an account (with my own email), received the confirmation email, but couldn't get my account to be verified. I get "Email not confirmed" when I try to log in.

Also, the verification email was sent from accounts@godealwise.com, which is a bit confusing.

jasonwcfan|1 year ago

Oops! We tested the OAuth flow but forgot to update the email one. Thanks for the heads up, fixing this now.

ayanb9440|1 year ago

This should be fixed now

skeptrune|1 year ago

I wonder if there are hidden observability problems with scraping, with ideal solutions of a different shape than a dashboard. It feels like a Sentry connection or other common alert-monitoring solutions would combine well with the LLM-proposed changes and help teams react more quickly to pipeline problems.

ayanb9440|1 year ago

We do support Sentry. Finic projects are Poetry scripts, so you can `poetry add` any observability library you need.

computershit|1 year ago

First, nice work. I'm certainly glad to see such a tool in this space right now. Besides a UI, what does this provide that something like Browserless doesn't?

jasonwcfan|1 year ago

Thanks! Wasn't familiar with Browserless but took a quick look. It seems they're very focused on the scraping use case. We're more focused on the agent use case. One of our first customers turned us on to this - they wanted to build an RPA automation to push data to a cloud EHR. The problem was it ran as a single page application with no URL routing, and had an extremely complex API for their backend that was difficult to reverse engineer. So automating the browser was the best way to integrate.

If you're trying to build an agent for a long-running job like that, you run into different problems:

- Failures are magnified, as a workflow has multiple upstream dependencies and most scraping jobs don't.
- You have to account for different auth schemes (OAuth, password, magic link, etc.)
- You have to implement token refresh logic for when sessions expire, unless you want to manually log in several times per day.

We don't have most of these features yet, but it's where we plan to focus.

And finally, we've licensed Finic under Apache 2.0 whereas Browserless is only available under a commercial license.

ushakov|1 year ago

I do not understand what this actually is. Any difference between Browserbase and what you’re building?

Also, curious why your unstructured idea did not pan out?

ayanb9440|1 year ago

Looking at their docs, it seems that with Browserbase you would still have to deploy your Playwright script to a long-running job and manage the infra around that yourself.

Our approach is a bit different. With Finic, you just write the script; we handle the entire job deployment and scaling on our end.

ilrwbwrkhv|1 year ago

Backed by YC = Not open source. Eventually pressure to exit and hyper scale will take over.

ayanb9440|1 year ago

There are quite a few open source YC startups at this point. Our understanding is that:

1. Developer tooling should be open source by default.
2. Open source doesn't meaningfully affect revenue/scaling, because developers who would use your self-hosted version would build in-house anyway.

yard2010|1 year ago

I'm curious, can't do both?

slewis|1 year ago

Is it stateful? Like can I do a run, read the results, and then do another run from that point?

ayanb9440|1 year ago

We currently don't save the browser state after the run has completed but that's something we can definitely add as a feature. Could you elaborate on your use case? In which scenarios would it be better to split a run into multiple steps?