ragibson|2 months ago
"It is worth noting that the instruction to "ignore internal knowledge" played a role here. In cases like the shutters puzzle, the model did seem to suppress its training data. I verified this by chatting with the model separately on AI Studio; when asked directly multiple times, it gave the correct solution significantly more often than not. This suggests that the system prompt can indeed mask pre-trained knowledge to facilitate genuine discovery."
hypron|2 months ago
jdiff|2 months ago
It's a big part of why search overview summaries are so awful. Many times the answers are not grounded in the material.
stavros|2 months ago
Instead, what can happen is that the model, like a human, (hopefully) disregards the instruction, so that it carries (close to) zero weight.
brianwawok|2 months ago