AI Agent Reliability Tracker

1 point | smartmic | 4 days ago | hal.cs.princeton.edu

1 comment

chrisjj|4 days ago

> recent capability gains have yielded only small improvements in reliability.

Have I missed something? Why would one expect capability gains to yield any such improvement?