(no title)
yatz|1 year ago
1) Send the same prompt twice, adding "Can you double check?" to the second prompt to force GPT to verify the answer.
2) If both answers are the same, you got the correct answer.
3) If not, ask it to verify a third time, and then use the answer it repeats.
Including "Always double check the result" in the first prompt reduces the number of false answers but does not eliminate them; hence, repeating the prompt works much better. It does significantly increase API calls and token usage, so only use it if data accuracy is worth the additional cost.
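For anyone who wants to try this, the repeat-and-compare steps above could be sketched roughly as follows. This is only a minimal sketch: `ask` is a hypothetical callable standing in for whatever API client you use (it takes a prompt string and returns the model's text), `normalize` is an assumed, deliberately crude way of comparing answers, and each call is assumed to be an independent request. Agreement between two answers is a heuristic, not proof of correctness.

```python
from typing import Callable, Optional

DOUBLE_CHECK = "\nCan you double check?"


def normalize(answer: str) -> str:
    """Crude comparison key (an assumption): ignore case and surrounding whitespace."""
    return answer.strip().lower()


def ask_with_double_check(ask: Callable[[str], str], prompt: str) -> Optional[str]:
    """Return an answer that shows up twice within at most three calls, else None.

    `ask` is a hypothetical stand-in for a real API call: prompt in, text out.
    """
    first = ask(prompt)
    second = ask(prompt + DOUBLE_CHECK)
    if normalize(first) == normalize(second):
        return second  # both attempts agree, stop after two calls

    # The first two disagree: ask once more and keep whichever answer repeats.
    third = ask(prompt + DOUBLE_CHECK)
    if normalize(third) in (normalize(first), normalize(second)):
        return third
    return None  # three different answers, nothing to trust
```

In the best case this costs two calls per question and in the worst case three, which is where the extra API and token usage comes from. A `None` result means no answer repeated, so the caller has to decide whether to retry or fall back to a manual check.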
groby_b|1 year ago
That is only true if you stay within the same chat. It is not true across chats. Context caching is something that a lot of folks would really really like to see.
And jumping to a new chat is one of the core points of the OP: "I restarted with a slightly modified prompt:"
The iterations before that were mostly to figure out why the initial prompt went wrong. And AFAICT there's a good insight in the modified prompt: "Make no assumptions". Probably also "ensure you fully understand how it's labelled".
And no, asking repeatedly doesn't necessarily give different answers, not even with "can you double check". There are quite a few examples where LLMs are consistently and proudly wrong. Don't use LLMs if 100% accuracy matters.
yatz|1 year ago
Here are a few examples where it does not consistently give you the same answer, and where asking it to retry or double-check helps:
1) When asking GPT to find something, e.g., the HSCode for a product, it returns a false positive after x number of products. Asking it to double-check almost always makes it correct itself.
2) Quite a few times, asking it to write code results in incorrect syntax or code that does not do what you asked. Simply asking "Are you sure?" or "Can you double check?" should make it revisit its answer.
3) When asking it to find something in an attachment, e.g., to separate all expenses and group them by type, it will often misidentify certain entries. However, asking it to double-check fixes it.
sharemywin|1 year ago
wahnfrieden|1 year ago
BOOSTERHIDROGEN|1 year ago
kbenson|1 year ago
mckirk|1 year ago
falcor84|1 year ago
https://qntm.org/vhitaos