fchollet | 1 year ago

I believe the MindsAI solution does feature novel ideas (test-time fine-tuning) that do indeed lead to better generalization. So it's definitely the kind of research that ARC was supposed to incentivize -- things are working as intended. It's not a "hack" of the benchmark.

And yes, they do use a lot of synthetic pretraining data, which is much less interesting research-wise (no progress on generalization that way...) but ultimately it's on us to make a robust benchmark. MindsAI is playing by the rules.
