top | item 41727417

OpenAI Eval

3 points| fzaninotto | 1 year ago |platform.openai.com

1 comment

Evaluating the quality of the responses of AI agents used to be tricky. It required knowledge of eval criteria as well as third-party tools like promptfoo, ragas or prometheus. Now openAI makes it ridiculously easy with a new API endpoint. It can grade a completion against a reference response, assess its format and tone, and you can even promt the eval to add your own criteria.