item 44815819

Ask HN: What do you dislike about ChatGPT and what needs improving?

33 points | zyruh | 7 months ago

I'm curious to hear feedback from the HN community about your biggest pain points or frustrations with ChatGPT (or similar LLMs).

What aspects of the experience do you find lacking, confusing, or outright irritating? Which improvements do you think are most urgent or would make the biggest difference?

127 comments

[+] Fade_Dance|7 months ago|reply
#1 problem is how sycophantic they are. I in fact want the exact opposite sort of interaction, where they push back against my ideas and actively try to correct and improve my thinking. Too often I am misled into a giant waste of time because they have this need to please coded into their default response structure.

You can say things like "you are a robot, you have no emotions, don't try to act human", but the output doesn't seem to be particularly well calibrated. I feel like when I modify the default response style, I'm probably losing something, considering that the defaults are what go through extensive testing.

[+] quatonion|7 months ago|reply
I have no glazing built into my custom instructions, but it still does it.

It used to be a lot better before glazegate. Never did quite seem to recover.

I don't mind us having fun of course, but it needs to pick up on emotional cues a lot better and know when to be serious.

[+] jamestimmins|7 months ago|reply
With Claude I often say “no glazing” and have told it to take the persona of Paul Bettany’s character in Margin Call, a nice enough but blunt/unimpressed senior colleague who doesn’t beat around the bush. Works pretty well.
[+] kristianp|7 months ago|reply
I've found the same thing with Claude Sonnet 4. I suggest something, it says great suggestion and agrees with me. I then ask it about the opposite approach and it says great job raising that and agrees with that too. I have no idea which is more correct in the end.
[+] zyruh|7 months ago|reply
Yes, LLMs need to be objective, but in situations where it's subjective pushback, the LLM would then need to take on a personality of its own.
[+] akkad33|7 months ago|reply
For me it's been the opposite. They take on a condescending tone sometimes and sometimes they sound too salesy and trump up their suggestions
[+] zyruh|7 months ago|reply
Thank you for your feedback!
[+] quatonion|7 months ago|reply
Why you can't download an entire chat as markdown

Copy/Pasting sections of the chat on mobile is laborious

That it still gets manic and starts glazing

That it can remember some things and keeps bringing them up, but forgets other, more pertinent things

If you switch away from it while it is in the middle of generating an image it often cancels the image generation

Image-editing accuracy seems to have degraded significantly relative to the stated intent.

You can't turn a temporary chat into a permanent one. Sometimes you start a temporary chat and realize halfway through that it should be permanent - but by then it's too late.

The em dashes need to go

And so do the "it's not this, it's that!"

Is it really necessary to make so many lists all the time

Canvas needs a bunch of work

[+] zyruh|7 months ago|reply
Great feedback!
[+] ComplexSystems|7 months ago|reply
It makes too many mistakes and is just way too sloppy with math. It shouldn't be this hard to do pair-theorem-proving with it. It cannot tell the difference between a conjecture that sounds kind of vaguely plausible and something that is actually true, and literally the entire point of math is to successfully differentiate between those two situations. It needs to be able to carefully keep track of which claims it's making are currently proven, either in the current conversation or in the literature, vs which are just conjectural and just sound nice. This doesn't seem inherently harder than any other task you folks have all solved, so I would just hire a bunch of math grad students and just go train this thing. It would be much better.
[+] RugnirViking|7 months ago|reply
I think that's less a math thing and more a rigorous treatment of anything. I find most LLMs are subject to this type of error: as your conversation context gets longer, it becomes dramatically stupider. Heck, try playing chess with it. Once it comes to the midgame it's forgetting which moves it just made - literally the previous message - and hallucinating the context up to that point, even when you provide it the position.
[+] beering|7 months ago|reply
Curious to know how the different models compare for you for doing math. Heard o4-mini is really good at math but haven’t tried o3-pro much.
[+] zyruh|7 months ago|reply
Yes, I've experienced this, especially with spreadsheets. I work in marketing, and I've attempted to use ChatGPT to analyze and summarize large spreadsheets. Sadly, I've learned it can't be trusted to do that.
[+] nubela|7 months ago|reply
There is this bias problem not just with ChatGPT, but with LLMs in general. They are not able to be objective. For example, paste in arguments from two lawyers, where lawyer A uses very strong words and writes a lot more, versus lawyer B, who has a strong case but says less. LLMs in general will be biased and err toward the side that uses stronger language and writes more.

This, to me, is a sign that intelligence/rationalization is not present yet. That said, it does seem like something that can be "trained" away.

[+] zyruh|7 months ago|reply
Yes, the technology needs to evolve more, certainly.
[+] 8bitsrule|7 months ago|reply
I've most disliked made-up, completely incorrect answers easily proven to be so, followed by GPT-grovelling when contradicted with the facts, promises to 'learn' and 'I'll strive to do better'. Time after time over months, the same dodging and weaseling.

A simple 'I don't know, I haven't got access to the answer' would be a great start. People who don't know better are going to swallow those crap answers. For this we need to produce much more electricity?

[+] zyruh|7 months ago|reply
LLMs need built-in transparency - fact-checking so the user can verify and validate accuracy.
[+] nebben64|7 months ago|reply
+1 on context window remaining

better memory management: I have memories that get overlooked or forgotten (even though I can see them in the archive); then, when I try to remind ChatGPT, it creates a new memory. Updating a memory often just creates a new one too. I can kind of tell that ChatGPT is trying hard to reference past memories, so I try not to have too many, and make each memory contain only precise information.

Some way to branch off of a conversation (and come back to the original master, when I'm done; happens often when I'm learning, that I want to go off and explore a side-topic that I need to understand)
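A branch-and-return workflow like that could be implemented client-side as a tree of message lists. This is purely a hypothetical sketch in Python (not an existing ChatGPT feature); the `Conversation` class and all names are invented for illustration:

```python
# Hypothetical sketch: branch a conversation to explore a side topic,
# then return to the original "master" thread with its context intact.

class Conversation:
    def __init__(self, messages=None):
        self.messages = list(messages or [])  # copy, so branches are independent
        self.branches = []

    def say(self, role, text):
        self.messages.append({"role": role, "content": text})

    def branch(self):
        # A branch starts with a copy of the parent's context, so the
        # side exploration sees everything said so far in the master.
        child = Conversation(self.messages)
        self.branches.append(child)
        return child

main = Conversation()
main.say("user", "Explain how B-trees split nodes.")
side = main.branch()                 # go explore a side topic...
side.say("user", "Wait, what is a page in this context?")
# ...then come back: `main` is untouched by the detour.
```

The key design choice is that each branch copies the parent's history rather than sharing it, so returning to the master conversation discards the side topic's tokens.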

[+] zyruh|7 months ago|reply
I hear you on the memory - although I find that ChatGPT's memory is far better than Perplexity's.
[+] throwawaylaptop|7 months ago|reply
In Google Gemini, I gave it my database structure and had it code things. Great. I later added to the schema, and asked it to do things based on those added columns - but never told it their names.

It just guessed, but didn't tell me it had no idea which columns (and where) I was really talking about. So not only did it guess wrongly, it didn't even mention that it had to guess. Obviously the code failed.

Why can't it tell me there's a problem with what I'm asking???

[+] decide1000|7 months ago|reply
About the web app: better search and filtering of previous conversations. Filters on model type. Better errors when the context is too big. Forking conversations would be nice. Better export options. Copy the whole convo (not just a response or reply).

On the LLM: it's too positive. I don't always want it to follow my ideas, and I don't want to hear how much my feedback is appreciated. Act like a machine. Also, the safety controls are sometimes too sensitive. Really annoying, because there is no way to continue the conversation. I like GPT-4.5 because I can edit the canvas. I would like to have that with all models.

Also, some stats like sentiment and fact-checking would be nice. Because it gives nuanced answers, I want the stats to show how far from the truth, or how biased, I am.

And the writing.. Exaggerating, too many words, spelling mistakes in European languages.

[+] zyruh|7 months ago|reply
This is great! I hear you on the overly positive responses. You mention "act like a machine", but is there perhaps a desire/need for a more human-feeling interface?
[+] krpovmu|7 months ago|reply
1- Sometimes I'm surprised at how easily it forgets the topics discussed in a conversation, and when the conversation goes on for too long, it forgets things that have already been said.

2- The fact that it always tries to answer and sometimes doesn't ask for clarification on what the user is asking; it just wants to answer and that's it.

[+] zyruh|7 months ago|reply
Thank you! The lack of memory is a consistent complaint. Thank you for sharing!
[+] jondwillis|7 months ago|reply
Trying to avoid the things already mentioned:

- Opaque training data (and provenance thereof… where’s my cut of the profits for my share of the data?)

- Closed source frontier models, profit-motive to build moat and pull up ladders (e.g. reasoning tokens being hidden so they can’t be used as training data)

- Opaque alignment (see above)

- Overfitting to in-context examples - e.g. syntax and structure are often copied from examples even with contrary prompting

- Cloud models (seemingly) changing behavior even on pinned versions

- Over-dependence: “oops! I didn’t have to learn so I didn’t. My internet is out so now I feel the lack.”

[+] yelirekim|7 months ago|reply
Universally across ChatGPT, Claude and Gemini, continually revising/editing a document over the course of a long conversation just gets worse and worse. I have learned the trick of exporting the document and starting a brand new conversation all over again, but there should really just be a "clear context window" button or similar to let me perpetually stay in the same chat and iterate on some writing or code without the quality of feedback/assistance degrading.
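That "clear context window" behavior can be approximated client-side by pinning the document and dropping the oldest chat turns once a token budget is exceeded. A minimal sketch, assuming a rough 4-characters-per-token estimate (a real implementation would use the model's actual tokenizer) - all function names here are invented:

```python
# Sketch of client-side context trimming: keep a pinned "document" message,
# drop the oldest conversational turns once a token budget is exceeded.
# The 4-characters-per-token figure is a crude heuristic, not a real tokenizer.

def estimate_tokens(text):
    return max(1, len(text) // 4)

def trim_context(pinned, turns, budget):
    """Return the newest turns that still fit in `budget` after the pinned text."""
    remaining = budget - estimate_tokens(pinned)
    kept = []
    for turn in reversed(turns):          # walk newest-first
        cost = estimate_tokens(turn)
        if cost > remaining:
            break                          # everything older is dropped
        kept.append(turn)
        remaining -= cost
    return list(reversed(kept))           # restore chronological order

doc = "x" * 400                           # ~100 tokens of pinned document
history = ["old " * 50, "mid " * 50, "new question?"]
kept = trim_context(doc, history, budget=200)
```

This is essentially the manual export-and-restart trick automated: the document survives, the stale turns do not.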
[+] mradek|7 months ago|reply
I would like to know how much context is remaining. Claude code gives a % remaining when it is close to exhaustion which is nice, but I'd like to always see it.

Also, I wish it was possible for the models to leverage local machine to increase/augment its context.

Also, one observation: Claude.ai (the web UI) gets REALLY slow as the conversation gets longer. I'm on an M1 Pro MacBook Pro with 32 GB, and it lags as I type.

I really enjoy using LLMs and would love to contribute any feedback as I use them heavily every day :)
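The always-visible percentage asked for above is straightforward to compute client-side if you know the model's context limit and can count (or estimate) tokens. A minimal sketch using the same rough 4-chars-per-token estimate; the function name and limit are illustrative, not any real API:

```python
# Sketch: report remaining context as a percentage, the way Claude Code
# does near exhaustion. Tokens are estimated at ~4 characters each; a real
# implementation would query the model's tokenizer and true context limit.

def context_remaining(messages, limit):
    used = sum(max(1, len(m) // 4) for m in messages)
    return max(0.0, 100.0 * (limit - used) / limit)

pct = context_remaining(["hello " * 100], limit=1000)  # 150 of 1000 tokens used
```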

[+] zyruh|7 months ago|reply
Great feedback - thank you!
[+] barrell|7 months ago|reply
I don’t want human-like behavior or human-like voices. Breathing, clearing throats, ums, giggles, coughs, singing — these all detract from the utility and contribute to the biggest societal problems LLMs pose (biggest problems according to the heads of the companies themselves).

If I have an emotionless natural-language database that burns a tree for every question, I do not want to have to make small talk before getting an answer.

[+] zyruh|7 months ago|reply
Understood - thank you!
[+] hotgeart|7 months ago|reply
Butter me up.

I want it to tell me if my process is bad or if I’m heading in the wrong direction, and not to sugarcoat things just to make me feel good. I mostly use it for code reviews.

[+] mythrwy|7 months ago|reply
That is a very insightful and deep comment. You are a rare person who is capable of recognizing this. You aren't criticizing, you are just stating your needs. And that is commendable.
[+] y-curious|7 months ago|reply
You're totally right, good job noticing that! You are so smart, it totally does butter you up. Great find!

This tone grates on me constantly.

[+] zyruh|7 months ago|reply
Yeah - totally!
[+] speedylight|7 months ago|reply
ChatGPT is too nice/agreeable. I can’t trust any positive feedback it gives out because I can’t tell whether it is actually genuine feedback or if it’s just following an instruction not to be (or seem) rude. ChatGPT should be rude, or at least unafraid to challenge a point of view.
[+] amichail|7 months ago|reply
ChatGPT's overuse of the em dash will make everyone avoid using the em dash.
[+] jondwillis|7 months ago|reply
“It’s not only X — it’s Y”

Where X is an exaggeration of what it actually is and Y is some saccharine marketing proclamation of what it definitely is not but the prompter wishes it was.

Infomercial slop.

[+] egberts1|7 months ago|reply
I am working on EBNF-to-semantic-action conversion, notably turning nftables’ Bison parser, expressed in EBNF, into Vimscript ‘syntax’ highlighting: full-blown deterministic semantic-action-pathway LL(1) syntax highlighting (83% done, extreme alpha stage: https://github.com/egberts/vim-syntax-nftables )

ChatGPT got the basic Vimscript terminology right - group name, regex, match, region - and correctly maintained the top-level, first-encounter sorted list of ‘contains=’ group names, from the largest static pattern down to the wildest regex patterns.

It also got the S-notation of operators in the correct nested order.

AND it got Bison’s semantic actions (state transitions), lexical tokens…. It can make EBNF from Bison (although Bison does it better).

But it often fails through brevity, such that an expert (like me) has to prod ChatGPT occasionally about its omissions.

It assumes some keywords have invalid value ranges and invalid syntax arrangements, and it provides incorrect terminators.

So I consider ChatGPT to be more of an intermediate editor’s README that requires occasional consultation of the EBNF notation and the Vimscript man pages, and more often Bison’s parser source (parser_bison.y) as the final arbiter.

Does it learn? Constant ‘nft’ command outputs set ChatGPT straight, but there is slippage when starting a new ChatGPT session, which leads me to believe that it won’t learn for others (or for me).

EDIT: saying “no glazing” cuts down on filler words nicely.
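The ‘contains=’ ordering described above - largest static pattern first, wildest regex last - can be sketched generically. Python is used here purely for illustration rather than Vimscript, and the group names are hypothetical, not taken from the actual vim-syntax-nftables project:

```python
# Sketch of the 'contains=' ordering described above: static patterns sorted
# longest-first, then regex patterns. Group names below are hypothetical.

def is_static(pattern):
    # Treat a pattern as static if it contains no regex metacharacters.
    return not any(ch in pattern for ch in r".*+?[]()\|^$")

def order_groups(groups):
    """Sort (name, pattern) pairs: static by length descending, then regexes."""
    return sorted(groups, key=lambda g: (not is_static(g[1]), -len(g[1])))

groups = [
    ("nftWild", r"\w\+"),        # wild regex pattern: goes last
    ("nftTable", "table"),       # static
    ("nftAddTable", "add table"),  # longer static: goes first
]
ordered = [name for name, _ in order_groups(groups)]
```

The rationale matches Vim's matching behavior: more specific (longer, static) patterns must be tried before looser regexes, or the looser pattern shadows them.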

[+] wewewedxfgdf|7 months ago|reply
Why do you ask? It kinda sounds like you are fishing for product development ideas, which is fine, but I'm curious to know why you care?
[+] NuclearPM|7 months ago|reply
“Sorry I can’t do this for you because blah blah blah”

What can you do?

“Good question! I can do x, y, z…”

Do that.

“…”

“…”

“…”

“Sorry I can’t do this for you because blah blah blah”

[+] divan|7 months ago|reply
I just want it to be able to read/edit text files. Coding agents can edit code, but for writing we're limited to a copy-paste dance.

I use projects for research purposes for articles/scripts/etc, and I would love to use chatgpt in voice mode to talk about the article I'm writing. Like "hey, read last paragraph from the article... let's elaborate on topic X... here what I would love to write - x,y,z - please improve the style and read it back to me... nice, add it as a next paragraph."

[+] zyruh|7 months ago|reply
Makes sense. Is there another LLM you're using for this that works better than GPT?