
rosstaylor90 | 1 year ago

What's your AIME 2025 score? https://gr.inc/RJT1990/AIME2025/


nyrikki | 1 year ago

This is the point of the AIME: it is a 3-hour, closed-book examination in which each answer is an integer from 0 to 999 and should only depend on pre-calculus material... for a human with no calculator, notes, or internet access.

The concepts are heavily covered in the training corpus, and if people were allowed to take it more than once, with even a book, let alone access to the internet, it wouldn't be very hard.

Examples:

1) Find the sum of all integer bases $b>9$ for which $17_b$ is a divisor of $97_b.$

In the corpus: https://www.quora.com/In-what-bases-b-does-b-7-divide-into-9...
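For reference, a sketch of the standard argument (my own working, not taken from the linked page): $17_b = b + 7$ and $97_b = 9b + 7$, so we need $(b+7) \mid (9b+7)$. Since $9(b+7) - (9b+7) = 56$, that is equivalent to $(b+7) \mid 56$; for $b > 9$ the only qualifying divisors are $b+7 = 28$ and $b+7 = 56$, i.e. $b = 21$ and $b = 49$, which sum to $70$. A brute-force check in Python (illustrative only):

```python
# Sum of all bases b > 9 for which (b + 7) divides (9*b + 7).
# Since (b + 7) must divide 56, any b that works is at most 49,
# so 1000 is a generous upper bound for the search.
print(sum(b for b in range(10, 1000) if (9 * b + 7) % (b + 7) == 0))  # -> 70
```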

And one more:

3) https://artofproblemsolving.com/wiki/index.php/2025_AIME_I_P...

It is just the number of ways to distribute k indistinguishable balls (players) into n distinguishable boxes (flavors), without exclusion, such that no box is empty.
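(For concreteness: the count being described is the classic stars-and-bars result, $\binom{k-1}{n-1}$ ways to put $k$ indistinguishable balls into $n$ distinguishable boxes with no box empty. A minimal check, with illustrative values of k and n not taken from the problem:)

```python
from itertools import product
from math import comb

def positive_compositions(k, n):
    """Count n-tuples of positive integers summing to k, i.e. ways to put k
    indistinguishable balls into n distinguishable boxes with no box empty."""
    return sum(1 for boxes in product(range(1, k + 1), repeat=n) if sum(boxes) == k)

k, n = 9, 3  # example values only
print(positive_compositions(k, n), comb(k - 1, n - 1))  # both print 28
```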

Thus it shows up in the corpus for any course that needs to cover combinatorial problems, including physics, discrete math, logistics, etc.

IMHO these concept classes from a typical AIME are so common that the scores you gave demonstrate those models are doing no "general reasoning" at all and are actually failing at approximate retrieval.

rosstaylor90 | 1 year ago

I disagree. Ten years ago, AIs nailing these types of competitions would have been seen as very impressive. The fact that the goalposts can move on this now shows how much AI has progressed.

(Also the term “approximate retrieval” is a bad one - reasoning is inherently a process of chaining together associations. What matters is whether the reasoning reaches the right conclusions. Still some way to go, but already very impressive in tasks traditionally considered harbours of human reasoning!)