
dave1010uk | 3 months ago

Two years ago I wrote an agent in 25 lines of PHP [0]. It was surprisingly effective, even back then before tool calling was a thing and you had to coax the LLM into returning structured output. I think it even worked with GPT-3.5 for trivial things.
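That pre-tool-calling pattern boils down to prompt-and-parse: tell the model to reply in a fixed structure, then parse that structure in code. A minimal sketch of the idea (the LLM call is a hypothetical stub, not hubcap's actual code):

```python
import json

# Coax structured output via the system prompt, then parse it.
# `call_llm` is a made-up stub; a real version would hit a chat API
# with `system` prepended to the conversation.
SYSTEM = 'Reply ONLY with JSON of the form {"command": "..."}'

def call_llm(system: str, user: str) -> str:
    # Stub standing in for a real completion request.
    return json.dumps({"command": f"echo {user}"})

reply = call_llm(SYSTEM, "hello world")
command = json.loads(reply)["command"]
print(command)
```

If the model cooperates, the agent can then execute `command` and feed the result back in on the next turn.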

In my mind LLMs are just UNIX string manipulation tools like `sed` or `awk`: you give them an input and a command and they give you an output. This is especially true if you use something like `llm` [1].

It then seems logical that you can compose calls to LLMs, loop and branch and combine them with other functions.
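A runnable sketch of that composition idea, with the model call stubbed out. `call_llm` is hypothetical; in practice it might shell out to simonw's `llm` CLI or use an API client.

```python
def call_llm(prompt: str, text: str) -> str:
    # Stub: tag the text so the pipeline shape is visible.
    return f"[{prompt}] {text}"

def pipeline(text: str, *prompts: str) -> str:
    # Pipe each output into the next call, like a chain of UNIX filters.
    for prompt in prompts:
        text = call_llm(prompt, text)
    return text

result = pipeline("raw notes", "summarise", "translate to French")
print(result)
```

Because each stage is just a function from string to string, loops, branches, and ordinary functions slot in between LLM calls with no ceremony.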

[0] https://github.com/dave1010/hubcap

[1] https://github.com/simonw/llm


simonw|3 months ago

I love hubcap so much. It was a real eye-opener for me at the time, really impressive result for so little code. https://simonwillison.net/2023/Sep/6/hubcap/

dave1010uk|3 months ago

Thanks Simon!

It only worked because of your LLM tool. Standing on the shoulders of giants.

dingnuts|3 months ago

You're posting too fast please slow down

saghm|3 months ago

The obvious difference between UNIX tools and LLMs is the non-determinism. You can't necessarily reason about what the output will be, and then continue to pipe into another LLM, etc., and eventually `eval` the result. From a technical perspective you can do this, but the hard part seems like it would be making sure it doesn't do something you really don't want it to do. I'd imagine that any potential deviations from your expectations in a given stage would be compounded as you continue to pipe along into additional stages that might have similar deviations.
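The compounding worry has simple arithmetic behind it: if each stage independently matches expectations with probability p, an n-stage pipeline matches end-to-end with probability p**n. A toy illustration (the numbers are made up, not measured):

```python
def chain_reliability(p: float, n: int) -> float:
    # Probability that every one of n independent stages behaves as expected.
    return p ** n

print(round(chain_reliability(0.95, 1), 3))  # one stage
print(round(chain_reliability(0.95, 5), 3))  # five stages, noticeably worse
```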

I'm not saying it's not worth doing, given that the software development process we've been using as an industry already ends up with a lot of bugs in our code. (When talking about this with people who aren't technical, I sometimes like to say that software has bugs because we don't really have a good process for writing software without bugs at any significant scale, and software turns out to be useful for enough stuff that we still write it knowing this.) I do think I'd be pretty concerned about how I could model constraints in this type of workflow, though. Right now, my fairly naive sense is that we've already moved the needle so far on how much easier it is to create new code than to review it and notice bugs (despite starting from a place where it was already tilted in favor of creation over review) that I'm not convinced being able to create it even more efficiently and powerfully is something I'd find useful.

keyle|3 months ago

> a small Autobot that you can't trust

That gave me a hearty chuckle!

nativeit|3 months ago

I let it watch my kids. Was that a mistake?

/s

pjmlp|3 months ago

And that is how we end up with iPaaS products powered by agentic runtimes, slowly dragging us away from programming language wars.

Only a select few get to argue about which programming language is best for XYZ.

singularity2001|3 months ago

what's the point of specialized agents when you can just have one universal agent that can do anything, e.g. Claude?

baq|3 months ago

If you can get a specialized agent to work in its domain at 10% of the parameters of a foundation model, you can feasibly run it locally, which opens up e.g. offline use cases.

Personally I’d absolutely buy an LLM in a box which I could connect to my home assistant via usb.

ljm|3 months ago

Composing multiple smaller agents allows you to build more complex pipelines, which is a lot easier than getting a single monolithic agent to switch between contexts for different tasks. I also get some insight into how each agent performs (e.g. via Langfuse) because it's less of a black box.

To use an example: I could write an elaborate prompt to fetch requirements, browse a website, generate E2E test cases, and compile a report, and Claude could run it all to some degree of success. But I could also break it down into four specialised agents, with their own context windows, and make them good at their individual tasks.
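A hypothetical sketch of that four-way split: each "agent" is a function with its own narrow job (and, in a real system, its own context window and prompt). All the names and stub bodies below are made up for illustration.

```python
def fetch_requirements(ticket: str) -> str:
    # Stub: a real agent would pull requirements from a tracker.
    return f"requirements for {ticket}"

def browse_site(requirements: str) -> str:
    # Stub: a real agent would drive a browser.
    return f"pages relevant to: {requirements}"

def generate_e2e_tests(pages: str) -> list[str]:
    # Stub: a real agent would draft test cases from what it saw.
    return [f"test derived from {pages}"]

def compile_report(tests: list[str]) -> str:
    # Stub: a real agent would summarise results.
    return f"report covering {len(tests)} test case(s)"

report = compile_report(generate_e2e_tests(browse_site(fetch_requirements("TICKET-1"))))
print(report)
```

The design win is that each stage can be prompted, tested, and observed in isolation, instead of hoping one giant prompt juggles all four contexts at once.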

andy99|3 months ago

LLMs are good at fuzzy pattern matching and data manipulation. The upstream comment comparing them to awk is very apt. Instead of having to write a regex to match some condition, you instruct an LLM and get more flexibility. This includes deciding what action to take next in the agent loop.

But there is no reason (and lots of downside) to leave anything to the LLM that’s not “fuzzy” and you could just write deterministically, thus the agent model.
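A sketch of that split: the loop and the actions are deterministic code; only the fuzzy "what next?" decision is delegated to the LLM, stubbed here with a trivial rule so the control flow is runnable.

```python
def choose_next_action(state: str) -> str:
    # Stand-in for an LLM call, e.g. "given this state, pick the next action".
    return "summarise" if "raw" in state else "done"

def run_agent(state: str) -> str:
    # Deterministic loop: only the decision is fuzzy, never the actions.
    while (action := choose_next_action(state)) != "done":
        if action == "summarise":
            state = state.replace("raw", "summarised")  # deterministic step
    return state

print(run_agent("raw data"))
```

Everything that can be written as plain code stays plain code; the model is consulted only where a regex or rule would be too brittle.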