spyder | 17 days ago
I remember some papers reporting around 15% output variability across prompt phrasings for earlier models, and with different tool use there are sometimes even bigger jumps. If I remember correctly, the reasoning models improve on some of this because a lot of the early prompting tricks are baked into them, like "think step-by-step", "think carefully", and some other "magic" methods. Another trick is to ask the model to rephrase the prompt in its own words, because that may produce a prompt that better aligns with its training data. For sure the big model developers are aware of these and are constantly improving on them; I just don't see much discussion or numbers about it.
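(A minimal sketch of that rephrase trick, for anyone curious. `call_model` is a hypothetical stand-in for whatever chat-completion API you use; here it just echoes its input so the snippet runs standalone.)

```python
def call_model(prompt: str) -> str:
    # Placeholder: swap in a real API call (OpenAI, Anthropic, etc.).
    # Echoing the prompt lets the sketch run without any API key.
    return prompt

def rephrase_then_answer(question: str) -> str:
    # Step 1: ask the model to restate the question in its own words,
    # hopefully landing closer to phrasings it saw during training.
    rephrase_prompt = (
        "Rephrase the following question in your own words, "
        "keeping its meaning exactly:\n\n" + question
    )
    rephrased = call_model(rephrase_prompt)
    # Step 2: answer the restated question instead of the original.
    answer_prompt = "Answer the following question:\n\n" + rephrased
    return call_model(answer_prompt)

print(rephrase_then_answer("What is the capital of France?"))
```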
andai | 17 days ago