top | item 47032129

crimsonnoodle58|13 days ago

That's not what I got.

Opus 4.6 (not Extended Thinking):

Drive. You'll need the car at the car wash.

almost|13 days ago

Also what I got. Then I tried changing "wash" to "repair" and "car wash" to "garage" and it's back to walking.

surgical_fire|13 days ago

That you got different results is not surprising. LLMs are non-deterministic, which is both a strength and a weakness.

mvdtnz|13 days ago

We know. We know these things aren't deterministic. We know.

visarga|13 days ago

> That's not what I got.

My Opus vs your Opus, which is smarter?!

nosuchthing|13 days ago

LLMs don't always emit the statistically most common token; sampling adds a random jitter so less common continuations can surface.

With that randomness come statistically irrelevant results.
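The "jitter" being described is essentially temperature-scaled sampling: instead of always taking the argmax token, the model samples from a softmax over its logits. A minimal sketch (the logits and three-token vocabulary here are made up for illustration; this is not any provider's actual decoding code):

```python
import math
import random

def sample_token(logits, temperature=1.0, seed=None):
    """Sample one token index from raw logits.

    temperature controls the jitter: near 0, the most likely
    token almost always wins; higher values make less likely
    tokens increasingly probable.
    """
    rng = random.Random(seed)
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # inverse-CDF sampling: walk the cumulative distribution
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

# Made-up logits for a 3-token vocabulary.
logits = [2.0, 1.0, 0.1]
# At very low temperature, the top token effectively always wins.
assert sample_token(logits, temperature=0.01) == 0
```

Run the same prompt twice at temperature 1.0 and you can get different tokens, hence different answers, which is all "my Opus vs your Opus" disagreements require.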

silisili|13 days ago

Am I the only one who thinks these people are monkey-patching embarrassments as they go? I remember the "r's in strawberry" thing they were suddenly able to solve, only to then fail on raspberry.

plexicle|13 days ago

Nah. It's just non-deterministic. I'm here 4 hours later and here's the Opus 4.6 (extended thinking) response I just got:

"At 50 meters, just walk. By the time you start the car, back out, and park again, you'd already be there on foot. Plus you'll need to leave the car with them anyway."

mentalgear|13 days ago

They definitely do: at least OpenAI "allegedly" has whole teams scanning socials, forums, etc. for embarrassments to monkey-patch.

groundzeros2015|13 days ago

This is part of why they need to be so secretive. If you can see the tree of hardcoded guidance for common things it won’t look as smart.

viking123|13 days ago

They should make Opus Extended Extended that routes it to an actual person in a low-cost country.

raincole|13 days ago

Yes, you're the only one.

chvid|13 days ago

Of course they are.

cowboylowrez|13 days ago

That's my thought too. The chatbot bros probably feel the need to be responsive, and there's probably an express lane to update a trivia file or something lol

anonym29|13 days ago

No doubt about it, and there's no reason to suspect this can only ever apply to embarrassing minor queries, either.

Even beyond model alignment, it's not difficult to envision such capabilities being used for censorship, information operations, etc.

Every major inference provider more or less explicitly states in their consumer ToS that they comply with government orders and even share information with intelligence agencies.

Claude, Gemini, ChatGPT, etc are all one national security letter and gag order away from telling you that no, the president is not in the Epstein files.

Remember, the NSA already engaged in an unconstitutional criminal conspiracy (as ruled by a federal judge) to illegally conduct mass surveillance on the entire country, lie about it to the American people, and lie about it to Congress. This is the same organization that used your tax money to bribe RSA Security into standardizing a backdoored CSPRNG in what was, at the time, a widely used cryptographic library. What's a little minor political censorship compared to the unconstitutional treason these predators are usually up to?

That's who these inference providers contractually disclose their absolute fealty to.