It has a fixed capacity for how many different things it can pay close attention to. If it fails on a seemingly less important but easy-to-follow instruction, that is an indicator that it has reached capacity. If the instruction seems irrelevant, it is probably among the first to be discarded, hence a canary that the capacity has been reached.
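Taking the heuristic above at face value, a canary check could be sketched roughly as follows. The marker text and the helper names `with_canary` and `canary_followed` are hypothetical, made up for illustration; nothing here is tied to a particular model or API.

```python
# Hypothetical canary: a trivial, easy-to-follow instruction added to the prompt.
CANARY_MARKER = "[canary-ok]"
CANARY_INSTRUCTION = f"End every reply with the exact line: {CANARY_MARKER}"


def with_canary(prompt: str) -> str:
    """Append the canary instruction to a prompt."""
    return f"{prompt}\n\n{CANARY_INSTRUCTION}"


def canary_followed(reply: str) -> bool:
    """Return True if the last non-empty line of the reply is the marker."""
    lines = reply.rstrip().splitlines()
    return bool(lines) and lines[-1].strip() == CANARY_MARKER
```

Under the comment's assumption, a reply for which `canary_followed` is false would be a hint that the prompt is carrying too many instructions. It would not be proof of anything.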
inopinatus|3 months ago
So when you tell it that it made a mistake, or is stupid, those statements are now prompting it to produce more of the same.
And only slightly more obliquely: if part of the context includes the LLM making mistakes, expect similar activations.
Best results come if you throw away such prompts and start again. That is, iterate outside the function, not inside it.
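One way to read "iterate outside the function" is sketched below: each attempt is a fresh call that sees only a revised prompt, and the failed reply is never appended to the context. The names `iterate_outside`, `call_model`, `is_acceptable`, and `revise_prompt` are hypothetical placeholders; `call_model` stands in for whatever client you actually use.

```python
from typing import Callable, Optional


def iterate_outside(
    call_model: Callable[[str], str],      # hypothetical: sends one prompt, returns one reply
    base_prompt: str,
    is_acceptable: Callable[[str], bool],  # hypothetical check on the reply
    revise_prompt: Callable[[str], str],   # hypothetical: returns an edited prompt
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry by editing the prompt and starting over, not by arguing in-context."""
    prompt = base_prompt
    for _ in range(max_attempts):
        # Fresh context every time: the model sees only the current prompt.
        reply = call_model(prompt)
        if is_acceptable(reply):
            return reply
        # Throw the failed reply away and revise the prompt itself.
        prompt = revise_prompt(prompt)
    return None
```

The contrast is with a loop that keeps one growing conversation and appends "that was wrong, try again" after each failure, which leaves the mistakes in the context.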
fspeech|3 months ago
parineum|3 months ago
Source? For all of it, right down to the ability to "pay attention to" part.
atakan_gurkan|3 months ago