item 35694776 | amrb | 2 years ago
I'd like to see a yearly benchmark for models. It could be logic puzzles or a suite of tasks, but as it stands there is no good way to measure the ability of models.