Removing the risk for most jobs should be possible. Just build the same cages other apps already have. Also add a bit more transparency, so people know better what the machine is doing, maybe even with a mandatory user-acknowledge for potential problematic stuff, similar to how we have root-access-dialogues now. I mean, you don't really need access to all data, when you are just setting a clock, or playing music.
I'd be super interested in more information on this! Do you mean abandoning unsupervised learning completely?
Prompt Injection seems to me to be a fundamental problem in the sense that data and instructions are in the same stream and there's no clear/simple way to differentiate between the two at runtime.
PurpleRamen|25 days ago
larodi|25 days ago
Ono-Sendai|25 days ago
arw0n|24 days ago
Prompt Injection seems to me to be a fundamental problem in the sense that data and instructions are in the same stream and there's no clear/simple way to differentiate between the two at runtime.