top | item 45404634

(no title)

drc500free | 5 months ago

That's not 50% success rate at completing the task, that's the win rate of a head-to-head comparison of an algorithm and an expert. 50% means the expert and the algorithm each "win" half the time.

discuss

order

viernullvier|5 months ago

For the METR rating (first half of the article), it is indeed 50% success rate at completing the task. The win rate only applies to the GDPval rating (second half of the article).