simonw|1 year ago

You posted on X a while back asking for a crowdsourced definition of what an "agent" was, and I regularly cite that thread as an example of how blurry this word is right now.

> "The two main categories I see are people who think AI agents are obviously things that go and act on your behalf—the travel agent model—and people who think in terms of LLMs that have been given access to tools which they can run in a loop as part of solving a problem."
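For concreteness, the "tools in a loop" model in the quote can be sketched in a few lines of Python. This is an illustration, not code from the thread: `fake_llm` and the one-entry tool registry are made-up stand-ins for a real model and real tools; only the loop shape (model proposes a tool call, harness executes it, result goes back into context) is the point.

```python
# Toy sketch of "an LLM with tools, run in a loop". fake_llm stands in
# for a real model call; TOOLS is a hypothetical one-entry tool registry.

def fake_llm(context: list) -> dict:
    # Stand-in for an LLM: requests a tool until a tool result is in context.
    if any(msg["role"] == "tool" for msg in context):
        return {"type": "answer", "text": "42 files"}
    return {"type": "tool_call", "tool": "count_files", "args": {"path": "."}}

TOOLS = {"count_files": lambda path: "42"}  # toy tool registry

def run_agent(task: str, max_steps: int = 5) -> str:
    context = [{"role": "user", "content": task}]
    for _ in range(max_steps):  # the "loop" part of the definition
        action = fake_llm(context)
        if action["type"] == "answer":
            return action["text"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[action["tool"]](**action["args"])
        context.append({"role": "tool", "content": result})
    return "gave up"

print(run_agent("How many files are here?"))  # → 42 files
```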
DebtDeflation|1 year ago

This is exactly the problem, and these two categories nicely sum up the source of the confusion.

I consider myself in the former camp. The AI needs to determine my intent (book a flight), which is a classification problem; extract the relevant information (travel date, return date, origin city, destination city, preferred airline), which is a Named Entity Recognition problem; and then call the appropriate API, passing this information as the parameters (tool usage). I'm asking the agent to perform an action on my behalf, and it takes my natural language and goes from there. The overall workflow is deterministic, but there are elements within it that require some probabilistic reasoning.

Unfortunately, the second camp seems to be winning the day, creating unrealistic expectations of what can be accomplished by current-day LLMs running in a loop while simultaneously providing only toy examples of it.
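The "former camp" workflow described above, a deterministic pipeline with probabilistic steps inside it, might be sketched like this. Illustrative only: the classifier, extractor, and booking API below are toy stand-ins (a keyword check, a regex, a format string), not real model calls or services.

```python
# Hypothetical sketch of the deterministic pipeline: intent classification
# -> entity extraction -> API call. The first two steps would be model
# calls in a real system; here rule-based stand-ins keep the example runnable.
import re

def classify_intent(utterance: str) -> str:
    # Stand-in for a probabilistic intent classifier.
    return "book_flight" if "flight" in utterance.lower() else "unknown"

def extract_entities(utterance: str) -> dict:
    # Stand-in for a Named Entity Recognition model.
    m = re.search(r"from (\w+) to (\w+)", utterance)
    return {"origin": m.group(1), "destination": m.group(2)} if m else {}

def book_flight_api(origin: str, destination: str) -> str:
    # Stand-in for the real booking API (the "tool usage" step).
    return f"booked {origin}->{destination}"

def handle(utterance: str) -> str:
    # The overall workflow is fixed and deterministic; only the
    # classification and extraction steps are probabilistic.
    intent = classify_intent(utterance)
    if intent != "book_flight":
        return "sorry, I can only book flights"
    entities = extract_entities(utterance)
    return book_flight_api(entities["origin"], entities["destination"])

print(handle("Book me a flight from Boston to Denver"))  # → booked Boston->Denver
```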
swyx|1 year ago

The language problem around agents is that most companies are describing them solely from a human/UX perspective.
ethbr1|1 year ago

'You ask it to do something, and it does it.'

That makes it difficult to differentiate the more critical 'how' options in the execution process. From that perspective, deterministic integrations, LLM+tools, LAM, etc. are more descriptive categories, each with its own capabilities, strengths, and weaknesses.

Or to put it a different way: if the term doesn't tell you what something is good and bad at, it's probably an underspecified term.
mindcrime|1 year ago

It's been blurry for a long time, FWIW. I have books on "agents" dating back to the late '90s or early 2000s in which the "Intro" chapter usually has a section that tries to define what an "agent" is, and laments that there is no universally accepted definition.

To illustrate, here's a paper from 1996 that tries to lay out a taxonomy of the different kinds of agents and provide some definitions:

https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...

And another from the same time-frame, which makes a similar effort:

https://www.researchgate.net/profile/Stan-Franklin/publicati...
ethbr1|1 year ago

The technical difference between agents then and agents now is the fuzzy parameter-mapping capability of LLMs, if used.

Scaling agent capability requires agents that are able to auto-map various tools.

If every different tool is a new, custom integration that must be written by a person, then we end up where we are today -- specialized agents where there exists enough demand and stability to write and maintain those integrations, but no general-purpose agents.

Ultimately, parameter mapping in a sane, consistent, globally applicable way is either the key that unlocks an agentic future or the failure that leads to its demise.
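The "auto-mapping" idea above can be made concrete with a small sketch. This is an assumption-laden illustration, not anyone's actual system: real implementations would use the LLM itself to do the fuzzy mapping, while here a name-normalization heuristic stands in. The tool schema format and field names below are made up purely for the example.

```python
# Hypothetical sketch of fuzzy parameter mapping: given a tool's declared
# parameter schema (with synonyms), map loosely-named extracted fields onto
# the parameters the tool actually expects, instead of hand-writing one
# custom integration per tool.

def normalize(name: str) -> str:
    # Collapse case and separator differences: "Departure-City" -> "departurecity".
    return name.lower().replace("_", "").replace("-", "")

def map_parameters(extracted: dict, tool_schema: dict) -> dict:
    # tool_schema: {param_name: [synonyms]} declared by the tool itself,
    # so no per-tool glue code is needed. Unknown fields are dropped;
    # unmapped required parameters surface as an error.
    mapped = {}
    for param, synonyms in tool_schema.items():
        candidates = {normalize(s) for s in [param, *synonyms]}
        for key, value in extracted.items():
            if normalize(key) in candidates:
                mapped[param] = value
    missing = set(tool_schema) - set(mapped)
    if missing:
        raise ValueError(f"unmapped parameters: {sorted(missing)}")
    return mapped

schema = {"origin": ["from", "departure_city"], "dest": ["to", "destination"]}
print(map_parameters({"departure-city": "BOS", "destination": "DEN"}, schema))
# → {'origin': 'BOS', 'dest': 'DEN'}
```

The design point is that the mapping logic is generic: adding a new tool means declaring a schema, not writing a new integration.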
sanjin|1 year ago
adpirz|1 year ago
jvans|1 year ago
yyyyz|1 year ago
[deleted]