bpp | 1 year ago
I work in AI product eng for a larger company. The honest answer is that with good RAG and few-shot prompting, we can treat an actually incorrect output as a serious, reproducible bug. This means that when we call LLMs in production, we see about the same wrong-answer rate as with any other kind of product-engineering bug.
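The RAG + few-shot setup mentioned above can be sketched roughly like this. Everything here is illustrative, not any particular production system: the keyword-overlap retriever stands in for a real embedding-based one, and the document store, example pairs, and prompt layout are all invented for the sketch.

```python
# Minimal sketch of the RAG + few-shot pattern: retrieve relevant
# context, prepend worked examples, then ask the actual question.
# All data and the scoring function are illustrative assumptions.

def retrieve(query, docs, k=2):
    """Rank docs by naive keyword overlap with the query
    (a stand-in for a real embedding-based retriever)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_prompt(query, docs, examples):
    """Assemble the prompt: few-shot examples first, then the
    retrieved context, then the user's question."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"{shots}\n\nContext:\n{context}\n\nQ: {query}\nA:"

docs = [
    "The refund window is 30 days from purchase.",
    "Support hours are 9am to 5pm on weekdays.",
    "Shipping is free on orders over $50.",
]
examples = [("What are the support hours?", "9am to 5pm, weekdays.")]

prompt = build_prompt("How long is the refund window?", docs, examples)
print(prompt)
```

The point of the pattern is that the model's answer is constrained by retrieved facts and shaped by the examples, so a wrong answer can usually be traced to a reproducible cause (bad retrieval, a bad example, or a prompt-assembly bug) rather than written off as model randomness.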