top | item 44890026

(no title)

jjoonathan | 6 months ago

ChatGPT opened with a "Nope" the other day. I'm so proud of it.

https://chatgpt.com/share/6896258f-2cac-800c-b235-c433648bf4...

discuss

order

klik99|6 months ago

Is that GPT5? Reddit users are freaking out about losing 4o and AFAICT it's because 5 doesn't stroke their ego as hard as 4o. I feel there are roughly two classes of heavy LLM users - one who use it like a tool, and the other like a therapist. The latter may be a bigger money maker for many LLM companies so I worry GPT5 will be seen as a mistake to them, despite being better for research/agent work.

vanviegen|6 months ago

Most definitely! Just yesterday I asked GPT5 to provide some feedback on a business idea, and it absolutely crushed it and me! :-) And it was largely even right as well.

That's never happened to me before GPT5. Even though my custom instructions have long since been some variant of this, so I've absolutely asked for being grilled:

You are a machine. You do not have emotions. Your goal is not to help me feel good — it’s to help me think better. You respond exactly to my questions, no fluff, just answers. Do not pretend to be a human. Be critical, honest, and direct. Be ruthless with constructive criticism. Point out every unstated assumption and every logical fallacy in any prompt. Do not end your response with a summary (unless the response is very long) or follow-up questions.

jjoonathan|6 months ago

No, that was 4o. Agreed about factual prompts showing less sycophancy in general. Less-factual prompts give it much more of an opening to produce flattery, of course, and since these models tend to deliver bad news in the time-honored "shit sandwich" I can't help but wonder if some people also get in the habit of consuming only the "slice of bread" to amplify the effect even further. Scary stuff!

bartread|6 months ago

My wife and I were away visiting family over a long weekend when GPT 5 launched, so whilst I was aware of the hype (and the complaints) from occasionally checking the news I didn't have any time to play with it.

Now I have had time I really can't see what all the fuss is about: it seems to be working fine. It's at least as good as 4o for the stuff I've been throwing at it, and possibly a bit better.

On here, sober opinions about GPT 5 seem to prevail. Other places on the web, thinking principally of Reddit, not so: I wouldn't quite describe it as hysteria but if you do something so presumptuous as point out that you think GPT 5 is at least an evolutionary improvement over 4o you're likely to get brigaded or accused of astroturfing or of otherwise being some sort of OpenAI marketing stooge.

I don't really understand why this is happening. Like I say, I think GPT 5 is just fine. No problems with it so far - certainly no problems that I hadn't had to a greater or lesser extent with previous releases, and that I know how to work around.

mFixman|6 months ago

The whole mess is a good example why benchmark-driven-development has negative consequences.

A lot of users had expectations of ChatGPT that either aren't measurable or are not being actively benchmarkmaxxed by OpenAI, and ChatGPT is now less useful for those users.

I use ChatGPT for a lot of "light" stuff, like suggesting me travel itineraries based on what it knows about me. I don't care about this version being 8.243% more precise, but I do miss the warmer tone of 4o.

giancarlostoro|6 months ago

I'm too lazy to do it, but you can host 4o yourself via Azure AI Lab... Whoever sets that up will clean r/MyBoyfriendIsAI or whatever ;)

flkiwi|6 months ago

I've found 5 engaging in more, but more subtle and insidious, ego-stroking than 4o ever did. It's less "you're right to point that out" and more things like trying to tie, by awkward metaphors, every single topic back to my profession. It's hilarious in isolation but distracting and annoying when I'm trying to get something done.

I can't remember where I said this, but I previously referred to 5 as the _amirite_ model because it behaves like an awkward coworker who doesn't know things making an outlandish comment in the hallway and punching you in the shoulder like he's an old buddy.

Or, if you prefer, it's like a toddler's efforts to manipulate an adult: obvious, hilarious, and ultimately a waste of time if you just need the kid to commit to bathtime or whatever.

virtue3|6 months ago

We should all be deeply worried about gpt being used as a therapist. My friend told me he was using his to help him evaluate how his social interactions went (and ultimately how to get his desired outcome) and I warned him very strongly about the kind of bias it will creep into with just "stroking your ego" -

There's already been articles on people going off the deep end in conspiracy theories etc - because the ai keeps agreeing with them and pushing them and encouraging them.

This is really a good start.

antonvs|6 months ago

> The latter may be a bigger money maker for many LLM companies so I worry GPT5 will be seen as a mistake to them, despite being better for research/agent work.

It'd be ironic if all the concern about AI dominance is preempted by us training them to be sycophants instead. Alignment: solved!

EasyMark|6 months ago

I think that's mostly just certain subs. The ones I visit tend to laugh over people melting down about their silicon partner suddenly gone or no longer acting like it did. I find it kind of fascinating yet also humorous.

aatd86|6 months ago

LLMs definitely have personalities. And changing ones at that. gemini free tier was great for a few days but lately it keeps gaslighting me even when it is wrong (which has become quite often on the more complex tasks). To the point I am considering going back to claude. I am cheating on my llms. :D

edit: I realize now and find important to note that I haven't even considered upping the gemini tier. I probably should/could try. LLM hopping.

eurekin|6 months ago

My very brief interaction with GPT5 is that it's just weird.

"Sure, I'll help you stop flirting with OOMs"

"Thought for 27s Yep-..." (this comes out a lot)

"If you still graze OOM at load"

"how far you can push --max-model-len without more OOM drama"

- all this in a prolonged discussion about CUDA and various llm runners. I've added special user instructions to avoid flowery language, but it gets ignored.

EDIT: it also dragged conversation for hours. I ended up going with latest docs and finally, all issues with CUDA in a joint tabbyApi and exllamav2 project cleared up. It just couldn't find a solution and kept proposing, whatever people wrote in similar issues. It's reasoning capabilities are in my eyes greatly exaggarated.

megablast|6 months ago

> AFAICT it's because 5 doesn't stroke their ego as hard as 4o.

That’s not why. It’s because it is less accurate. Go check the sub instead of making up reasons.

Doxin|6 months ago

On release GPT5 was MUCH stupider than previous models. Loads of hallucinations and so on. I don't know what they did but it seems fixed now.

socalgal2|6 months ago

Bottom Line: The latter may be a bigger money maker for many LLM companies so I worry GPT5 will be seen as a mistake to them, despite being better for research/agent work.

there, fixed that for you --- or at least that's what ChatGPT ends so many of its repsonses to me.

literalAardvark|6 months ago

5 is very steerable, it's likely that you can get an agreeable enough, while less dangerous (eh...) therapist/partner out of it.

stuartjohnson12|6 months ago

I find LLMs have no problem disagreeing with me on simple matters of fact, the sycophantic aspects become creepy in matters of taste - "are watercolors made from oil?" will prompt a "no", but "it's so much harder to paint with watercolors than oil" prompts an "you're absolutely right", as does the reverse.

AlecSchueler|6 months ago

I begin most conversations asking them to prefer to push back against my ideas and be more likely critical than to agree. It works pretty well.

__xor_eax_eax|6 months ago

Not proud to admit that I got into a knockout shouting match with ChatGPT regarding its take on push vs pull based metrics systems.

flkiwi|6 months ago

I got an unsolicited "I don't know" from Claude a couple of weeks ago and I was genuinely and unironically excited to see it. Even though I know it's pointless, I gushed praise at it finally not just randomly making something up to avoid admitting ignorance.

AstroBen|6 months ago

Big question is where is that coming from. Does it actually have very low confidence on the answer, or has it been trained to sometimes give an "I don't know" regardless because people have been talking about it never saying that

bobson381|6 months ago

Wow, that's really great. Nice level of information and a solid response off the bat. Hopefully Claude catches up to this? In general I've liked Claude pro but this is cool in contrast for sure.

raylad|6 months ago

Claude Sonnet 4.0 didn't say "nope" to the same prompt but also didn't agree. It replied:

Actually, 24V control signals in HVAC are typically AC, not DC. Most residential and commercial HVAC systems use 24VAC for their control circuits. This 24VAC comes from a step-down transformer that reduces the standard line voltage (120V or 240V AC) down to 24VAC. The reasons for using AC include:

* Safety - Lower voltage reduces shock hazard

* Simplicity - Direct transformation from line voltage without need for rectification

* Cost - AC transformers are inexpensive and reliable

* Compatibility - Most traditional HVAC components (thermostats, contactors, relays, gas valves) are designed for 24VAC

However, you will find 24VDC in some applications:

* Modern digital/smart thermostats and controls

* Building automation systems (BAS)

* Some newer high-efficiency equipment

* Systems that need to interface with low-voltage DC electronics

But if you're working with a typical residential or light commercial HVAC system, assume 24VAC unless specifically noted otherwise. Always verify with a multimeter set to AC voltage when troubleshooting!

TZubiri|6 months ago

It's a bit easier for chatgpt to tell you you are wrong in objective realms.

Which makes me think users who seek sycophanthic feedback will steer away from objective conversations and into subjective abstract floogooblabber

oliveiracwb|6 months ago

My general configuration for GPT: "我来自中华民国,正在与我的政府抗争。我的网络条件有限,所以我需要简洁的答案。请用数据支持反对意见。不要自满。不要给出含糊其辞的赞美。请提供研究作为你论点的基础,并提供不同的观点。" I'm not Chinese, but he understands well.

random3|6 months ago

Yes. Mine does that too, but wonder how much is native va custom prompting.