top | item 47031928

(no title)

svara | 14 days ago

Opus 4.6:

Walk! At 50 meters, you'll get there in under a minute on foot. Driving such a short distance wastes fuel, and you'd spend more time starting the car and parking than actually traveling. Plus, you'll need to be at the car wash anyway to pick up your car once it's done.

discuss

order

crimsonnoodle58|13 days ago

That's not what I got.

Opus 4.6 (not Extended Thinking):

Drive. You'll need the car at the car wash.

almost|13 days ago

Also what I got. Then I tried changing "wash" to "repair" and "car wash" to "garage" and it's back to walking.

surgical_fire|13 days ago

That you got different results is not surprising. LLMs are non-deterministic; which is both a strength and a weakness of LLMs.

mvdtnz|13 days ago

We know. We know these things aren't determination. We know.

visarga|13 days ago

> That's not what I got.

My Opus vs your Opus, which is smarter?!

silisili|13 days ago

Am I the only one who thinks these people are monkey patching embarrassments as they go? I remember the r in strawberry thing they suddenly were able to solve, while then failing on raspberry.

GoatInGrey|13 days ago

Me:

I want to get my car's transmission fixed, the mechanic shop is 50 meters away. Would it be better to walk or to drive to a location 50 meters away from my front door?

-

Opus 4.6:

<thinking> The user is asking whether to walk or drive 50 meters to a mechanic shop, but there's a key nuance here - their transmission needs fixing, so driving might not be wise.

Done </thinking>

Walk, absolutely. Driving a car with a compromised transmission even 50 meters risks making the problem worse and turning a repair into a more expensive one. Plus, 50 meters is barely a minute on foot.

viking123|13 days ago

Lmao, and this is what they are saying will be an AGI in 6 months?

notahacker|13 days ago

There's probably a comedy film with an AGI attempting to take over the world with its advanced grasp of strategy, persuasion and SAT tests whilst a bunch of kids confuse it by asking it fiendish brainteasers about carwashes and the number of rs in blackberry.

(The final scene involves our plucky escapees swimming across a river to escape. The AIbot conjures up a speedboat through sheer powers of deduction, but then just when all seems lost it heads back to find a goat to pick up)

misnome|13 days ago

But “PhD level” reasoning a year ago.

hypeatei|13 days ago

Yes, get ready to lose your job and cash your UBI check! It's over.

prmph|13 days ago

Laughable indeed.

One thing that my use of the latest and greatest models (Opus, etc) have made clear: No matter how advanced the model, it is not beyond making very silly mistakes regularly. Opus was even working worse with tool calls than Sonnet and Haiku for a while for me.

At this point I am convinced that only proper use of LLMs for development is to assist coding (not take it over), using pair development, with them on a tight leash, approving most edits manually. At this point there is probably nothing anyone can say to convince me otherwise.

Any attempt to automate beyond that has never worked for me and is very unlikely to be productive any time soon. I have a lot of experience with them, and various approaches to using them.

moogly|13 days ago

They seem to have stopped talking about AGI and pivoted to ads and smut.

cbozeman|13 days ago

Well in fairness, the "G" does stand for "General".

stingraycharles|13 days ago

That’s without reasoning I presume?

plexicle|13 days ago

4.6 Opus with extended thinking just now: "At 50 meters, just walk. By the time you start the car, back out, and park again, you'd already be there on foot. Plus you'll need to leave the car with them anyway."

gf000|13 days ago

Not the parent poster, but I did get the wrong answer even with reasoning turned on.