item 42834921

kozikow | 1 year ago

> the harder it is for me to use these tools in a way that doesn’t feel like too much blind faith (even if it works!)

I tend to ask multiple models, and if they all give me roughly the same answer, it's probably right.
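The cross-checking habit described above can be sketched as a simple majority vote over answers. This is only an illustration: the `models` dict of stub lambdas is hypothetical and stands in for real API calls, which are not shown.

```python
from collections import Counter

def consensus(answers, threshold=0.5):
    """Return the most common answer if more than `threshold`
    of the responses agree on it, otherwise None."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count / len(answers) > threshold else None

# Hypothetical stand-ins for real model API calls, for illustration only.
models = {
    "model_a": lambda q: "Paris",
    "model_b": lambda q: "Paris",
    "model_c": lambda q: "Lyon",
}

question = "What is the capital of France?"
answers = [ask(question) for ask in models.values()]
print(consensus(answers))  # two of three agree -> "Paris"
```

Note this only guards against independent errors; as the replies below point out, it does nothing when the models share the same flawed training data.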

energy123 | 1 year ago

Also keeping context short. Virtually all my cases of bad hallucinations with o1 have been when I've provided too much context or the conversation has been going on for too long. Starting a new chat fixes it.
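A minimal sketch of the keep-context-short habit, assuming the conversation is a list of message strings and approximating token count by word count (a real implementation would use the model's tokenizer; `trim_history` and its budget are hypothetical names):

```python
def trim_history(messages, max_tokens=2000):
    """Keep only the most recent messages whose combined (approximate)
    token count fits under max_tokens; the latest message is always kept."""
    kept, total = [], 0
    for msg in reversed(messages):
        tokens = len(msg.split())  # crude proxy for real token counting
        if kept and total + tokens > max_tokens:
            break
        kept.append(msg)
        total += tokens
    return list(reversed(kept))
```

Starting a new chat is the degenerate case: a budget small enough that only the latest message survives.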

You can see this effect in the ARC-AGI evals too: too much context degrades even o3 (high).

aquafox | 1 year ago

> if they all give me roughly the same answer, then it's probably right.

... or they had a lot of overlapping training data in that area.

otabdeveloper4 | 1 year ago

Or maybe they were just trained on the same (incorrect) dataset.