(no title)

fzaninotto | 1 year ago

Evaluating the quality of AI agent responses used to be tricky. It required knowledge of evaluation criteria as well as third-party tools like promptfoo, ragas, or prometheus. Now OpenAI makes it ridiculously easy with a new API endpoint. It can grade a completion against a reference response, assess its format and tone, and even take a prompt that adds your own criteria.
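
To make the three kinds of checks concrete, here is a minimal local sketch in Python. It does not call the OpenAI endpoint and is not how that endpoint works internally. Every name in it (`similarity_to_reference`, `is_valid_json`, `grade`, the 0.8 threshold) is hypothetical, and plain string similarity from the standard library stands in for a model-based grader.

```python
import json
from difflib import SequenceMatcher
from typing import Callable, Dict


def similarity_to_reference(completion: str, reference: str) -> float:
    """Return a 0..1 character-level similarity ratio (a crude stand-in for a model grader)."""
    a = completion.strip().lower()
    b = reference.strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def is_valid_json(completion: str) -> bool:
    """Format check: does the completion parse as JSON?"""
    try:
        json.loads(completion)
        return True
    except json.JSONDecodeError:
        return False


def grade(
    completion: str,
    reference: str,
    custom_checks: Dict[str, Callable[[str], bool]],
    threshold: float = 0.8,  # hypothetical cutoff, chosen only for illustration
) -> Dict[str, bool]:
    """Run the reference check, the format check, and any user-supplied criteria."""
    results = {
        "reference_match": similarity_to_reference(completion, reference) >= threshold,
        "valid_json": is_valid_json(completion),
    }
    # Custom criteria are plain predicates here; a hosted eval would take a prompt instead.
    for name, check in custom_checks.items():
        results[name] = check(completion)
    return results


if __name__ == "__main__":
    report = grade(
        completion='{"answer": "Paris"}',
        reference='{"answer": "Paris"}',
        custom_checks={"is_short": lambda text: len(text) < 100},
    )
    print(report)
```

The design point is that each criterion yields its own pass/fail result, so a failing format check stays visible even when the answer matches the reference.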
