alexhans | 4 days ago

I wrote https://ai-evals.io (a community site) to make the concept of evals approachable no matter which tools you use.

You can learn about evals by looking at how that site itself is evaluated: https://github.com/Alexhans/eval-ception. From there, the pattern should be easy to try on your own project.

skybrian|4 days ago

Doing an eval on itself is clever but confusing for the reader. How about a tutorial explaining how to do an eval on something more normal?

alexhans|4 days ago

I'd be happy to. The tough part is knowing what will resonate with the audience without being too simple or too complex.

What do you think would resonate with you or with the audience you're thinking about?

That repo also has an illustrative eval for an Agent Skill in Airflow for localization:

https://github.com/Alexhans/eval-ception/tree/main/exams/air...