top | item 35936723

mattalex | 2 years ago

It's actually a lot worse than that: redesigning LLMs to have separate input channels for prompts and data doesn't solve the problem either, because such a model would be effectively impossible to train.

Effectively, you would need to filter all incoming text into "data" and "prompt" parts before training, because otherwise the model would learn to follow instructions placed in the "data" channel as well. But this split between data and prompt does not exist in natural language. Many sentences can act as either, depending on context: "Translate this to French" is an instruction when a user types it, but plain data when it appears quoted inside a document the model is asked to summarize. So producing this split without tainting the data channel is intractable.
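To make the ambiguity concrete, here is a minimal sketch of a hypothetical surface-level router that tries to split incoming sentences into the two channels (the `route` function and its keyword list are illustrative assumptions, not anything from a real system):

```python
# Hypothetical sketch: a naive router that tries to separate incoming
# text into a "prompt" channel and a "data" channel before training.
IMPERATIVE_CUES = ("ignore", "please", "translate", "summarize", "delete")

def route(sentence: str) -> str:
    """Classify a sentence as 'prompt' (instruction) or 'data' (content)
    based purely on its surface form."""
    first_word = sentence.strip().lower().split()[0]
    return "prompt" if first_word in IMPERATIVE_CUES else "data"

print(route("Summarize the attached report."))  # prompt
print(route("The report covers Q3 revenue."))   # data

# The failure mode: the same sentence quoted *inside* a document
# ('the email said: "Translate this to French"') should be data the
# model must not obey, yet any surface-level filter routes it the
# same way as a genuine user instruction.
print(route("Translate this to French."))       # prompt
```

The point of the sketch is that the classification depends on context the router cannot see, so whatever heuristic you pick, instruction-shaped text will leak into the data channel (or vice versa), and the trained model learns to obey it there.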
