top | item 46706653

tonyww | 1 month ago

One clarification since a few comments from coworkers/friends are circling this: Amazon isn’t the point here.

We used it because it’s a dynamic, hostile UI, but the design goal is a site-agnostic control plane. That’s why the runtime avoids selectors and screenshots and instead operates on pruned semantic snapshots + verification gates.

If the layout changes, the system doesn’t “half-work” — it fails deterministically with artifacts. That’s the behavior we’re optimizing for.
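A minimal sketch of what "pruned semantic snapshot" could mean in practice, assuming a dict-based accessibility-style tree (the project's actual snapshot format isn't shown in this thread, so the node shape and role set here are illustrative):

```python
# Hypothetical: prune a page snapshot down to its semantic/interactive core.
# The node shape ({"role", "label", "children"}) and the INTERACTIVE role set
# are assumptions for illustration, not the project's real schema.

INTERACTIVE = {"button", "link", "input", "select"}

def prune(node):
    """Keep only interactive nodes, plus any ancestor that contains one."""
    kept_children = [c for c in (prune(ch) for ch in node.get("children", [])) if c]
    if node.get("role") in INTERACTIVE or kept_children:
        return {"role": node.get("role"),
                "label": node.get("label"),
                "children": kept_children}
    return None  # purely decorative subtree: dropped from the snapshot
```

Selectors and screenshots never enter the picture; the agent only ever sees the pruned tree, so a cosmetic layout change that doesn't alter semantics produces an identical snapshot.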

tomhow | 1 month ago

Can you please clarify: is this project something that "people can play with"? I.e., can users download the code and sample data and try it out for themselves, or play with it some other way?

That's a prerequisite for Show HN.

I'm removing the Show HN prefix for now, until we get clarity. Then we can consider re-upping the post once we know exactly how to present it.

ares623 | 1 month ago

> If the layout changes, the system doesn’t “half-work” — it fails deterministically with artifacts. That’s the behavior we’re optimizing for.

how is this different than building a scraper script that does it traditionally?

tonyww | 1 month ago

Good question. On the surface it looks very similar to a traditional scraper, but there's a subtle difference in where the logic lives and how failures are handled.

A traditional scraper/script hard-codes selectors and control flow up front. When the layout changes, it usually breaks at an arbitrary line and you debug it manually.

In this setup, the agent chooses actions at *runtime* from a bounded action space, and the system uses built-in predicates (e.g. url_changes, drawer_appeared) to verify the outcomes. When it fails, it fails at a specific semantic assertion with artifacts, not at a missing selector.

So it’s less “replace scripts” and more “apply test-style verification and recovery to AI-driven decisions instead of static code.”
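A minimal sketch of that verification-gate pattern. The predicate names `url_changes` and `drawer_appeared` come from the comment above; everything else (the action shape, the state dict, `VerificationError`) is a hypothetical stand-in, not the project's real API:

```python
# Hypothetical sketch: execute an agent-chosen action, then verify its
# declared outcome with a predicate. On failure, raise a single semantic
# assertion carrying artifacts, instead of breaking at an arbitrary line.

class VerificationError(Exception):
    """A named post-action assertion failed; carries debugging artifacts."""
    def __init__(self, assertion, artifacts):
        super().__init__(f"assertion failed: {assertion}")
        self.assertion = assertion
        self.artifacts = artifacts

def url_changes(before, after):
    return before["url"] != after["url"]

def drawer_appeared(before, after):
    return "drawer" in after["elements"] and "drawer" not in before["elements"]

PREDICATES = {"url_changes": url_changes, "drawer_appeared": drawer_appeared}

def execute(action, state):
    """Apply one action from the bounded space, then gate on its predicate."""
    after = action["apply"](state)
    check = PREDICATES[action["expect"]]  # every action declares its outcome
    if not check(state, after):
        raise VerificationError(action["expect"],
                                {"before": state, "after": after})
    return after
```

The point of the gate is that a layout change can't silently corrupt a run: either the declared predicate holds and execution continues, or the run stops at one named assertion with the before/after snapshots attached.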

blibble | 1 month ago

it costs a lot more