bpp | 1 year ago
I work in AI product eng for a larger company. The honest answer is that with good RAG and few-shot prompting, we can treat an actually incorrect output as a serious, reproducible bug. This means that when we call LLMs in production, we see about the same wrong-answer rate as with any other kind of product-engineering bug.
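The RAG + few-shot setup mentioned above can be sketched roughly like this. Everything here is illustrative, not any particular production system: the keyword-overlap retriever stands in for a real embedding-based one, and the document store, example pairs, and prompt layout are all invented for the sketch.

```python
# Minimal sketch of the RAG + few-shot pattern: retrieve relevant
# context, prepend worked examples, then ask the actual question.
# All data and the scoring function are illustrative assumptions.

def retrieve(query, docs, k=2):
    """Rank docs by naive keyword overlap with the query
    (a stand-in for a real embedding-based retriever)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def build_prompt(query, docs, examples):
    """Assemble the prompt: few-shot examples first, then the
    retrieved context, then the user's question."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"{shots}\n\nContext:\n{context}\n\nQ: {query}\nA:"

docs = [
    "The refund window is 30 days from purchase.",
    "Support hours are 9am to 5pm on weekdays.",
    "Shipping is free on orders over $50.",
]
examples = [("What are the support hours?", "9am to 5pm, weekdays.")]

prompt = build_prompt("How long is the refund window?", docs, examples)
print(prompt)
```

The point of the pattern is that the model's answer is constrained by retrieved facts and shaped by the examples, so a wrong answer can usually be traced to a reproducible cause (bad retrieval, a bad example, or a prompt-assembly bug) rather than written off as model randomness.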