

smithclay | 1 month ago

We need more rigorous benchmarks for SRE tasks, which is much easier said than done.

The only other benchmark I've come across is https://sreben.ch/ ... certainly there must be others by now?

nyellin | 1 month ago

We publish the benchmarks for HolmesGPT (CNCF sandbox project) at https://holmesgpt.dev/development/evaluations/