(no title)
itrummer | 4 months ago
In general, when using LLMs, there are no formal guarantees on output quality anymore (but the same applies when using, e.g., human crowd workers for comparable tasks like image classification etc.).
Having said that, we did run experiments evaluating output accuracy for a prior version of ThalamusDB; the results are here: https://dl.acm.org/doi/pdf/10.1145/3654989. We will publish more results with the new version within the next few months as well. But, again, no formal guarantees.
satisfice | 4 months ago
But LLMs routinely make errors that, if made by a human, would cause us to believe that human is utterly incompetent, acting in bad faith, or dangerously delusional. So we should never just shrug and say nobody's perfect. I have to be responsible for what my product does.
Thanks for the link!