They both make stuff up and make very obvious misinterpretations of evidence. If you take the output of an LLM and ask another LLM to check it, this dramatically reduces the problem. It works even if you use the same LLM, as long as it doesn't have the existing context. I was able to write a detailed analysis of a rule system by doing this in 3 steps: claude -> chatgpt -> gemini3. It caught all the mistakes, including overstatements and vague statements. It wasn't perfect, but even after one review the number of mistakes or stupid statements was almost zero.
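A minimal sketch of that review chain. The `ask()` function here is a hypothetical stub standing in for whatever API client you actually use (it is not a real library call); the point is just the structure: each reviewer sees the draft fresh, with no shared context.

```python
def ask(model, prompt):
    # Hypothetical stub: in practice, replace this with a call to the
    # model's real API client. Here it just returns a placeholder string.
    return f"[{model} critique of draft]"

def review_chain(draft, reviewers=("claude", "chatgpt", "gemini")):
    """Pass a draft through several reviewers, each starting from a
    fresh context, and collect their critiques independently."""
    critiques = []
    for model in reviewers:
        prompt = ("Check the following analysis for factual errors, "
                  "overstatements, and vague claims:\n\n" + draft)
        critiques.append((model, ask(model, prompt)))
    return critiques
```

You could also feed each critique back into the draft before the next reviewer sees it; the fresh-context property is what matters, not the exact wiring.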
I'd save a lot of time by not smugly telling the AI how wrong it was, just to reassure myself that, at least for now, I'm still more useful than it is.