ayende|6 months ago
For example, in the book-a-ticket scenario - I want it to be able to check a few websites to compare prices, and I want it to be able to pay for me.
I don't want it to decide to book me on a 37-hour trip with three stops because it is $3 cheaper.
Alternatively, I want to be able to lookup my benefits status, but the LLM should physically not be able to provide me any details about the benefits status of my coworkers.
That is the _same_ tool call, but in a different scope.
For that matter, if I'm in HR - I _should_ be able to look at the benefits status of employees that I am responsible for, of course, but that creates an audit log, etc.
In other words, it isn't the action that matters, but what the intent is.
The LLM should be placed in the same box as the user it is acting on behalf of.
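That "same tool, different scope" idea can be sketched as a tool that always resolves against the acting user's permissions and leaves an audit trail; all names here are illustrative, not from any real framework:

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    # Employee IDs this user may view benefits for (self, plus reports if HR).
    benefits_scope: set[str] = field(default_factory=set)

def get_benefits_status(acting_user: User, employee_id: str,
                        audit_log: list[str]) -> str:
    """Same tool for everyone; the scope comes from the user the LLM acts for."""
    if employee_id not in acting_user.benefits_scope:
        raise PermissionError(f"{acting_user.name} may not view {employee_id}")
    audit_log.append(f"{acting_user.name} viewed benefits of {employee_id}")
    return f"benefits status for {employee_id}"

log: list[str] = []
alice = User("alice", benefits_scope={"alice"})
hr_bob = User("hr-bob", benefits_scope={"hr-bob", "alice", "carol"})

get_benefits_status(hr_bob, "carol", log)   # allowed, and audited
try:
    get_benefits_status(alice, "carol", log)  # denied: outside her scope
except PermissionError:
    pass
```

The LLM never holds permissions of its own here; it can only invoke the tool with the box it was placed in.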
nostrademons|6 months ago
Unfortunately, no mainstream OS actually implements the capability model [1], despite some prominent research attempts [2], some half-hearted attempts at commercializing the concept that have largely failed in the marketplace [3], and some attempts to bolt capability-based security on top of other OSes that have also largely failed in the marketplace [4]. So the closest thing to capability-based security that is actually widely available in the computing world is a virtual machine, where you place only the tools that provide the specific capabilities you want to offer in the VM. This is quite imperfect - many of these tools are a lot more general than true capabilities should be - but again, modern software is not built on the principle of least privilege, because software that is tends to fail in the marketplace.
[1] https://en.wikipedia.org/wiki/Capability-based_security
[2] https://en.wikipedia.org/wiki/EROS_(microkernel)
[3] https://fuchsia.dev/
[4] https://sandstorm.io/
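A toy illustration of the capability idea, using nothing beyond plain Python: the booking code receives an unforgeable handle to exactly one resource, rather than ambient authority to reach anything on the system (the class and limits are invented for illustration):

```python
class PaymentCapability:
    """Grants the right to charge one account, up to a limit.
    Holding the object *is* the permission; there is no global
    account registry for the holder to abuse."""
    def __init__(self, account: str, limit: float):
        self._account = account
        self._remaining = limit

    def charge(self, amount: float) -> None:
        if amount > self._remaining:
            raise PermissionError("charge exceeds granted limit")
        self._remaining -= amount

def book_flight(price: float, pay: PaymentCapability) -> str:
    pay.charge(price)   # can spend, but only within the granted capability
    return "booked"

cap = PaymentCapability("acct-1", limit=500.0)
book_flight(320.0, cap)
```

The contrast with ACL-style security is that `book_flight` cannot name any account it wasn't handed; the VM approach approximates this by only putting the "handles" you want inside the box.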
codethief|6 months ago
Fingers crossed that this is going to change now that there is increased demand due to AI workflows.
dbmikus|6 months ago
And totally agree that instead of reinventing the wheel here, we should just lift from how operating systems work, for two reasons:
1. there's a bunch of work and proven systems there already
2. it uses tools that exist in training data, instead of net new tools
spankalee|6 months ago
In this example, I might want an LLM instance to be able to talk to booking websites, but not send them my SSN and bank account info.
So there's a data provenance and privilege problem here. The more sensitive data a task has access to, the more restricted its actions need to be, and vice versa. So data needs to carry permission information with it, and a mediator needs to restrict either the data or the actions that tasks have as they are spawned.
There's a whole set of things that need to be done at the mediator level to allow for parent tasks to safely spawn different-privileged child tasks - eg, the trip planner task spawns a child task to find tickets (higher network access) but the mediator ensures the child only has access to low-sensitive data like a portion of the itinerary, and not PII.
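One way to sketch that mediator, with sensitivity labels carried on the data itself (the labels, names, and thresholds are invented for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Labeled:
    value: str
    sensitivity: int   # 0 = public, 1 = internal, 2 = PII

class Mediator:
    def spawn_child(self, context: dict[str, Labeled],
                    max_sensitivity: int) -> dict[str, Labeled]:
        """Child tasks only receive data at or below their clearance."""
        return {k: v for k, v in context.items()
                if v.sensitivity <= max_sensitivity}

parent_ctx = {
    "itinerary": Labeled("NYC -> LIS, June 3-10", 0),
    "ssn": Labeled("123-45-6789", 2),
}
# The ticket-finder child gets broad network access, so it is only
# cleared for public data: it sees the itinerary, never the SSN.
child_ctx = Mediator().spawn_child(parent_ctx, max_sensitivity=0)
```

The inverse trade-off would run the other way: a child cleared for PII would get a correspondingly locked-down action set.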
daxfohl|6 months ago
In that light, it's kind of hard to imagine any of this ever working. Given the choice between figuring out exactly how to set up permissions so that I can hire a malicious individual to book my trip, and just booking it myself, I know which one I'd choose.
BoiledCabbage|6 months ago
The model is simple and the LLM agent is a user. Another user on the machine. And given the context it is working in, it is given permissions. E.g. it has read/write permissions under this folder of source code, but read-only permissions for this other.
Those permissions vary by context. The LLM Agent working on one coding project would be given different permissions than if it were working on a different project on the same machine.
The permissions are an intersection or subset of the permissions of the user it is running on behalf of. Permissions fall into 3 categories: Allow, Deny and Ask - where it will ask an accountable user if it is allowed to do something (i.e. ask the user on whose behalf it is running if it can perform action x).
The problem is that OSes (and apps and data) generally aren't fine grained enough in their permissions, and will need to become so. It's not that an LLM can or can't use git, it should only be allowed to use specific git commands. Git needs to be designed this way, along with many more things.
As a result we get apps trying to re-create this model in user land and using a hodge-podge of regexes and things to do so.
The workflow is: similar to sudo, I launch an app as my LLM Agent user. It inherits its default permissions. I give it a context to work in, and it is granted and/or denied permissions due to being in that context.
I make requests and it works on my behalf doing what I permit it to do, and it never can do more than what I'm allowed to do.
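A minimal sketch of that Allow/Deny/Ask workflow (all names invented): the agent's effective permission is its policy intersected with the human user's own rights, and Ask escalates to the accountable human.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"

def effective(agent: Decision, user_allowed: bool) -> Decision:
    """The agent can never exceed the user it runs on behalf of."""
    return agent if user_allowed else Decision.DENY

def authorize(action: str, policy: dict[str, Decision],
              user_allowed: bool, ask_user) -> bool:
    decision = effective(policy.get(action, Decision.ASK), user_allowed)
    if decision is Decision.ALLOW:
        return True
    if decision is Decision.DENY:
        return False
    return ask_user(action)   # escalate to the accountable human

policy = {"git diff": Decision.ALLOW, "git push --force": Decision.DENY}
authorize("git diff", policy, user_allowed=True, ask_user=lambda a: False)
```

Note that unknown actions default to Ask rather than Allow, which is the fail-safe choice for a rogue-agent scenario.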
Instead now every agentic app needs to rebuild this workflow or risk rogue agents. It needs to be an OS service.
The hacky stepping stone in between is to create a temporary user per agent context/usage. Grant that user permissions and communicate only over IPC / network to the local LLM running as that user. Though you'll be spinning up and deleting a lot of user accounts in the process.
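That temporary-user lifecycle might look roughly like this; the project path, agent binary, and user name are all illustrative, and the script prints the commands instead of running them unless you clear DRY_RUN:

```shell
# Sketch: one throwaway OS user per agent context (names are illustrative).
DRY_RUN=${DRY_RUN:-1}
CTX_USER="agent-ctx-$$"
run() { if [ -n "$DRY_RUN" ]; then echo "+ $*"; else "$@"; fi; }

run sudo useradd --system --no-create-home "$CTX_USER"
run sudo setfacl -R -m "u:$CTX_USER:rwX" "$HOME/projects/myapp"  # allow only this project
run sudo -u "$CTX_USER" agent-runtime --ipc "/tmp/$CTX_USER.sock"  # talk over IPC only
run sudo userdel "$CTX_USER"   # tear down once the task is done
```

Everything outside the granted ACLs is denied by the OS itself, with no per-app regex policy layer involved.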
b112|6 months ago
Even if the LLM is capable of it, websites will find some method to detect an LLM, and up the pricing. Or mess with its decision tree.
Come to think of it, with all the stuff on the cusp, there's going to be an LLM API. After all, it's beyond dumb to spend time making websites for humans to view, then making an LLM spend power, time, and so on decoding that back into a simple DB lookup.
I'm astonished there isn't an 'rss + json' API anyone can use, without all the crap. Hell, BBS text interfaces from the 70s/80s, or SMS menu systems from early phone era are far superior to a webpage for an LLM.
Just data, and choice.
And why even serve an ad to an LLM. The only ad to serve to an LLM, is one to try to trick it, mess with it. Ads are bad enough, but to be of use when an LLM hits a site, you need to make it far more malign. Trick the LLM into thinking the ad is what it is looking for.
EG, search for a flight, the ad tricks the LLM into thinking it got the best deal.
Otherwise of what use is an ad? The LLM is just going to ignore ads, and perform a simple task.
If all websites had RSS, and all transactional websites had a standard API, we'd already be able to use existing models to do things. It'd just be dealing with raw data.
edit: actually, hilarious. Why not? AI is super simple to trick, at least at this stage. An ad company specifically tailoring AI would be awesome. You could divert them to your website, trick them into picking your deal, have them report to their owner that your company was the best, and more.
Super simple to do, too. Hmm.
procaryote|6 months ago
This sounds hard; as in: if you can define and enforce what a good enough response from an LLM looks like, you don't really need the LLM
> what is the intent.
For the HR person you have a human with intents you can ask; for an LLM it's harder, as they don't have intents
tomjen3|6 months ago
For the other example, I think a nice compromise is to have the AI be able to do things only with your express permission. In your example it finds flights that it thinks are appropriate, sends you a notification with the list and you can then press a simple yes/no/more information button. It would still save you a ton of money, but it would be substantially less likely to do something dangerous/damaging.
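That yes/no/more-information gate is simple to sketch (the function and flight fields are hypothetical):

```python
def confirm_booking(flight: dict, ask) -> str:
    """Surface the agent's pick and act only on an explicit 'yes'."""
    answer = ask(f"Book {flight['route']} for ${flight['price']}? (yes/no/more)")
    if answer == "yes":
        return "booked"            # the real purchase would happen here
    if answer == "more":
        return f"details: {flight}"
    return "skipped"

flight = {"route": "SFO->JFK", "price": 198, "stops": 0}
confirm_booking(flight, ask=lambda prompt: "yes")
```

The important property is that the default path does nothing irreversible: every purchase requires an explicit human "yes".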
martin-t|6 months ago
If the knowledge is one-sided, then so is the ability to negotiate. This benefits nobody except the company which already had an advantageous position in negotiations.
rogerrogerr|6 months ago
What benefits an employee is _eligible_ for - sure, no problem with that being public. What they chose and how they’re using them should be protected.
(Imagine finding out a coworker you thought was single is on the spouse+benefits plan!)