top | item 42975966

jelambs | 1 year ago

Author of the post here, and yeah, this is a really good point. I think we're going to see more people investing in building OAuth-compatible apps and more thorough APIs to support agent use cases. But of course, not every site is going to do so, so agents will in many cases effectively just be doing screen scraping. Over time, though, I think users will prefer applications that make it easier and more secure for agents to interact with them.

I was an early engineer at Plaid, and I think it's an interesting parallel: financial data aggregators used to rely on a screen-scraping model of integration, but over the past 5+ years it's moved almost fully to OAuth integrations. I'd expect the adoption curve here to be much steeper than that; banks are notoriously slow, so tech companies should move even more quickly toward OAuth and APIs for agents.
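To make the OAuth-versus-scraping contrast concrete, here's a minimal sketch of the first step of an authorization-code flow. Instead of collecting a user's credentials and scraping their account, the aggregator (or agent platform) redirects the user to the provider to grant scoped access. The endpoint, client ID, and scope below are hypothetical placeholders, not any real provider's values.

```python
from urllib.parse import urlencode

# Hypothetical provider endpoint -- an assumption for illustration.
AUTHORIZE_ENDPOINT = "https://bank.example.com/oauth/authorize"

def build_authorization_url(client_id: str, redirect_uri: str,
                            scope: str, state: str) -> str:
    """Build the URL the user is sent to so they can grant access
    directly with the provider, rather than handing over credentials."""
    params = {
        "response_type": "code",   # authorization-code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,            # scoped access, e.g. read-only
        "state": state,            # CSRF protection token
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"
```

The provider then redirects back with a short-lived code that gets exchanged for an access token, which is what makes revocation and scoping possible in a way scraping never was.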

Another dimension of this is that it's quite easy to block AI agents from screen scraping. We're able to identify OpenAI's Operator, Anthropic's computer use API, Browserbase, etc. with almost 100% accuracy, so some sites might choose to block agents from screen scraping and require the API path.
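The crudest version of the blocking described above is a server-side user-agent check. This is a minimal sketch only; the marker substrings below are assumptions, not the actual user-agent strings these products send, and real detection combines many stronger signals.

```python
# Assumed marker substrings for illustration -- NOT the real UA strings.
AGENT_UA_MARKERS = ["operator", "computer-use", "browserbase", "headless"]

def looks_like_agent(user_agent: str) -> bool:
    """Return True if the user-agent string matches a known agent marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in AGENT_UA_MARKERS)

def handle_request(user_agent: str) -> int:
    """Return an HTTP status: block suspected agents, serve everyone else."""
    return 403 if looks_like_agent(user_agent) else 200
```

User agents are trivially spoofable, which is why real-world detectors lean on behavioral signals rather than this header alone.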

All of this is still early, too, so I'm excited to see how things develop!

bboygravity|1 year ago

If websites haven't been able to make even consistent logins and forms for humans to use, what makes you think they will be able to make usable APIs for agents?

I've tried making a Firefox extension that fills web forms using an LLM, and the things website makers come up with that break their own forms for both humans and agents are just insane.

There are probably over a thousand different ways to ask for someone's address that an agent (and/or human) would struggle to understand, just to name one example.

I think agents will be able to get through them easily, but NOT because website makers are going to do a better job of being easier to use.

danielbln|1 year ago

Interesting, what are the heuristics for blocking? User agent? Something Playwright does? Metadata like resolution, or actual behavior?

sethhochberg|1 year ago

The user agent is pretty low-hanging fruit, but these days even your most standard CAPTCHAs / bot detection algorithms look at things like mouse movement patterns. A simple bot controlling a mouse might be coded to move the cursor from wherever it is to the destination in the shortest path possible; a human might aim for the shortest path but only approximate it, depending on their dexterity, where the cursor began, the mouse they're using, etc.

Tools in this space rely a lot on human use of a computer being much slower, less precise, and more variable than machine use of a computer.
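One toy version of the mouse-path heuristic described above is a straightness score: the ratio of the straight-line distance between a trace's endpoints to the total distance actually traveled. The threshold below is an assumption for illustration; a real detector combines many such signals.

```python
import math

def path_straightness(points: list[tuple[float, float]]) -> float:
    """Ratio of endpoint distance to total path length.

    1.0 means a perfectly direct move (typical of a naive bot);
    human cursor traces usually score noticeably below 1.0."""
    if len(points) < 2:
        return 1.0
    path_len = sum(
        math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)
    )
    if path_len == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / path_len

def suspiciously_direct(points, threshold=0.999):
    """Hypothetical threshold -- flag traces that are near-perfectly straight."""
    return path_straightness(points) >= threshold
```

A naive bot that interpolates a straight line scores exactly 1.0, while even a slightly wobbly human trace drops below the cutoff.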