MyFirstSass | 1 month ago
Seems exactly like the tests at my company, where even frontier models are revealed to be very expensive rubber ducks that completely fail with non-experts or on anything novel or math heavy.
I.e., they mirror the intellect of the user but give you big dopamine hits that'll lead you astray.
markusde|1 month ago
But speaking as a specialist in theorem proving, this result is pretty impressive! It would have likely taken me a lot longer to formalize this result even if it was in my area of specialty.
falcor84|1 month ago
How did you arrive at "ridiculous"? What we're seeing here is incredible progress over what we had a year ago. Even ARC-AGI-2 is now at over 50%. Given that this sort of process is also being applied to AI development itself, it's really not clear to me that humans would be a valuable component in knowledge work for much longer.
jacquesm|1 month ago
Recent case:
I have a bar with a number of weights supported on either end:
|---+-+-//-+-+---|
What order and/or arrangement of removing the weights would cause the least shift in center-of-mass? There is a non-obvious trick that you can pull here to reduce the shift considerably, and I was curious whether the AI would spot it or not, but even after lots of prompting it just circled around the obvious solutions rather than making a leap outside of that box and coming up with a solution that is better in every case.
I wonder what the cause of that kind of blindness is.
ogogmad|1 month ago
There is a starting node (L_0, R_0, {}) and an ending node ({}, {}, W), with the latter having L=R={}.
I think you're trying to find the path (L_n, R_n, S_n) from the starting node to the ending node that minimises the maximum absolute value of c(L_n, R_n, S_n).
I won't post a solution, as requested.
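For concreteness, the search described above can be brute-forced for small cases. This is only a rough sketch of the search itself, not the non-obvious trick. It rests on several assumptions that are not in the thread: the bar is centered at x=0 and has its own mass (`bar_mass`), each weight is a hypothetical `(position, mass)` pair with a distinct position, c is the center of mass of the bar plus the remaining weights, and any remaining weight may be removed at any step.

```python
from itertools import permutations

def center_of_mass(weights, bar_mass=1.0):
    """Center of mass of the bar (assumed centered at x=0) plus the remaining weights."""
    total = bar_mass + sum(m for _, m in weights)
    return sum(x * m for x, m in weights) / total

def worst_shift(order, bar_mass=1.0):
    """Largest |c| seen along one path, removing weights one at a time in `order`."""
    remaining = list(order)
    worst = abs(center_of_mass(remaining, bar_mass))
    for w in order:
        remaining.remove(w)  # assumes distinct (position, mass) pairs
        worst = max(worst, abs(center_of_mass(remaining, bar_mass)))
    return worst

def best_order(weights, bar_mass=1.0):
    """Try every removal order and keep the one with the smallest worst-case |c|."""
    return min(permutations(weights), key=lambda o: worst_shift(o, bar_mass))

# Hypothetical symmetric layout: two unit weights on each side.
weights = [(-2, 1), (-1, 1), (1, 1), (2, 1)]
print(best_order(weights), worst_shift(best_order(weights)))
```

Trying every permutation is factorial in the number of weights, so this is only usable for a handful of them; a memoized search over sets of remaining weights would scale better.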
jiggawatts|1 month ago
My guess is: first move the weights to the middle, and only then remove them.
However, “weights” and “bar” might confuse both machines and people into thinking that this is related to weight lifting, where there are two stops on the bar preventing the weights from being moved to the middle.
SecretDreams|1 month ago
This hits so close to home. Just today in my field a manager without expertise in a topic gave me an AI solution to something I am an expert in. The AI was very plainly and painfully wrong, but it comes down to the user prompting really poorly. When I gave a well-formulated prompt on the same topic, I got the correct answer on the first go.
MyFirstSass|1 month ago
"Aristotle integrates three main components: a Lean proof search system, an informal reasoning system that generates and formalizes lemmas, and a dedicated geometry solver"
Not saying it's not an amazing setup, I just don't understand the word "AI" being used like this when it's the setup / system that's brilliant in conjunction with absolute experts.