wbogusz | 1 year ago
The issue is not the verification system, but putting quantifiable bounds on your answer set. If I ask an LLM to multiply large numbers together, I can very easily verify the generated answer by checking it with a deterministic function.
I.e. rather than hoping that an LLM can accurately multiply two 10-digit numbers, there's a much easier (and verified) solution: ask it to perform the calculation using Python and read back the output.
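A minimal sketch of the idea (all names here are hypothetical): instead of trusting the model's in-head arithmetic, have it emit an arithmetic expression and evaluate that expression deterministically on your side. The `safe_eval` helper below is an illustrative stand-in, not any particular library's API.

```python
import ast
import operator

# Whitelist of operators allowed in the model-produced expression,
# so evaluation stays deterministic and safe (no arbitrary code).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}

def safe_eval(expr: str) -> int:
    """Deterministically evaluate a simple integer arithmetic expression."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

# Pretend the model answered with this expression rather than
# attempting the 10-digit multiplication itself:
llm_answer = "1234567890 * 9876543210"
print(safe_eval(llm_answer))  # exact result, trivially re-verifiable
```

The point is that verification collapses to running a deterministic function over a small, well-bounded answer format, rather than judging free-form model output.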