I did some testing of configuring Claude CLI some time ago via .claude JSON config files - in particular I tested:
- defining MCP servers manually in config (instead of having the CLI auto add them)
- playing with various combinations of `permissions` arrays (example config sketched below)
What I discovered was that Claude is not only vibe coded, but basic local logic around config reading seems to also work on the basis of "vibes".
- it seemed like different parts of the CLI codebase did or didn't adhere to the permissions arrays.
- at one point it told me it didn't have permission to read the .claude directory & as a result ran bash commands to search my entire filesystem looking for MCP server URLs for it to provide me with a list of available MCP servers
- when restricted to only be able to read from a working directory, at various points it told me I had denied it read permissions to that same working directory & also freely read from other directories on my system without prompting
- restricting webfetch permissions is extremely hit & miss (tested with Little Snitch in alert mode)
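For concreteness, a minimal sketch of the kind of config under test - the rule strings and paths here are illustrative, not a verified schema, so check them against your CLI version's docs:

# Overwrites any existing settings.json - back up first
cat > .claude/settings.json <<'EOF'
{
  "permissions": {
    "allow": ["Read(./**)", "Bash(git status:*)"],
    "deny": ["WebFetch", "Read(~/.ssh/**)"]
  }
}
EOF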
---
I have not reported any of the above as GitHub issues, nor do I intend to. I had a think about why I won't & it struck me that there's a funny dichotomy with AI tools:
1. all of the above are things that the typical vibe-coder stereotypes I've encountered simply don't care deeply about
2. people that care about the above things are less likely to care enough about AI tools to commit their personal time to reporting & debugging these issues
There are bound to be exceptions to these stereotypes out there, but I doubt there are sufficient numbers to make AI tooling good.
The permission thing is old and unresolved. At some point during a vibe-coding session, Claude can become able to execute commands that are in the Deny list (e.g. rm) without any confirmation.
I highly suspect no one on the Claude team is concerned about or working on this.
Those stereotypes look more like misconceptions (to put it charitably). Vibe coding doesn't mean one doesn't care about software working correctly; it only means not caring about how the code looks.
So unless you're also happy about not reporting bugs to project managers and people using low-code tools, I urge you to reconsider the basis for your perspective.
This is why I run claude inside a thin jail. If I need it to work on some code, I make a nullfs mount to it in there.
Because indeed, one of the first times I played around with claude, I asked it to make a change to my emacs config, which is in a non-standard location. It then wanted to search my entire home directory for it (it did ask permission, though).
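For reference, a minimal sketch of that jail setup on FreeBSD - the jail name and paths are illustrative, and it assumes a jail named claude is already configured in jail.conf:

# Expose only the project tree to the jail via a nullfs mount
mount -t nullfs /home/me/projects/myapp /jails/claude/work
# Run the CLI inside the jail, confined to the mounted tree
jexec claude sh -c 'cd /work && claude'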
I'd urge you to report it anyway. As someone who does use these tools, I'm always on the lookout for other people pointing this type of stuff out. Like, the .claude directory usage does irk me. Also, the concise telegraphing of how some of the bash commands work bugs me. Like, why can it run some commands without asking me? I know why, I've seen the code, but that crap should be clearer in the UI. The first time it executed a bash command without asking me I was confused and somewhat livid, because it defied my expectations. I actually read the crap it puts out, because it couldn't code its way out of a paper bag without supervision.
Not sure whether the comments are debating the semantics of vibe coding or confusing ourselves by generalizing anecdotal experiences (or both). So here's my two cents.
I use LLMs on a daily basis. With the rules/commands/skills in place, the code generated works, the app is functional, and the business is happy it shipped today and not 6 months from now. Now, as a super senior SWE, I have learned through my professional experiences (now an expert?) to double-check my work (and that of my team) to make sure the 'logical' flows are implemented to (my personal) standard of what quality software should 'look' like. I say personal standard since my colleagues have their own preferred standards, which we like to bikeshed over during company time (a company standard is, after all, made of the aggregate agreed-upon standards of the personal experiences of the experts in the room).
Today, from my own personal (expert) anecdotal experiences, ALL SOTA LLMs generate functional/working code. But the quality of the 'slop' varies on the model, prompts, tooling, rules, skills, and commands. Which boils down to "the tool is only as good as the dev that wields it". Assuming the right tool for the right job. Assuming you have the experiences to determine the right tool for the right job. Assuming you have taken the opportunities to experience multiple jobs to pair the right tool.
Which leads me to: "vibe coding" was initially coined (IMO) to describe those without any 'expertise' producing working/functional code/apps using an LLM. Nowadays, it seems like vibe coding means ANYONE using LLMs to generate code, including the SWE experts (like myself, of course). We've been chasing quality software pre-LLM, and now we adamantly yell and scream and kick and shout about quality software from the comment sections because of LLMs. I'm beginning to think quality software is a mirage we all chase, and like all mirages it's just a little bit further.
All roads that lead to 'shipping' are made with slop. Some roads have slop corners, slop holes, misspelled slop, slop nouns, slop verbs, slop flows and slop data. It's just with LLMs we build the roads to 'shipping' faster.
I have to chuckle that a bug like this happens after reading that other thread about the Claude Code creator running like 5 terminal agents and another 5-10 in the web UI.
I think it's 25 agents now; they keep increasing. One of the agents has started posting on Twitter. His productivity is up 200x, and Anthropic has started making trillions in profit.
Yeah after that other thread, I feel a lot less comfortable giving Claude code access to anything that can't be immediately nuked and reloaded from a fresh copy.
We're trying to make billions of dollars here, we don't have time to do crazy things like test basic functionality before shipping changes to all live users at once
Ironically that might have passed, because this didn't break the new version; it broke all versions once the globally referenced changelog was published. It wasn't the new version itself that was broken.
But testing the new version would have meant downloading the not-yet-updated, still-working changelog.
There are ways to deal with this of course, and I'm not defending the very vibey way that claude-code is itself developed.
I just set this up last week for the project I'm working on, and felt dirty because it took me a couple of months to get to it. There are like 5 or 6 users.
There's something so unnerving about the people pushing the AI frontier being sloppy about testing. I know, it's just a CLI wrapped around the AI itself, but it suggests to me that the culture around testing there isn't as tight and thorough as I'd like it to be.
What's funny to me is that "same here", "+1" comments are still prominent even though GitHub introduced an emoji reaction system. It's like most people intentionally don't want to use it.
(Just kidding.) Some of it is unawareness of the 'subscribe' button, I believe; occasionally you'll see someone tell people to cut it out, and someone else will reply to the effect of wanting to know when it's fixed, etc. But it's also just lazy participation - the kind you see anywhere, echoing an IRL conversation I suppose - replies instead of upvotes on Reddit, and to a slightly lesser extent here, for example.
Problem: Claude Code 2.1.0 crashes with "Invalid Version: 2.1.0 (2026-01-07)" because the CHANGELOG.md format changed to include dates in version headers (e.g., "## 2.1.0 (2026-01-07)"). The code parses these headers as object keys and tries to sort them using semver's .gt() function, which can't parse version strings with date suffixes.
Affected functions: W37, gw0, and an unnamed function around line 3091 that fetches recent release notes.
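The failure is easy to reproduce outside the bundle (a minimal sketch; assumes node with the semver package installed - ve2 and Wt in the patches below are just minified aliases for it):

node -e '
const semver = require("semver");
try {
  // What the unpatched sort does: parse the raw changelog header as a version
  semver.gt("2.1.0 (2026-01-07)", "2.0.9", { loose: true });
} catch (e) {
  console.log(e.message); // Invalid Version: 2.1.0 (2026-01-07)
}
// coerce() extracts the leading semver, so the comparison succeeds
console.log(semver.gt(semver.coerce("2.1.0 (2026-01-07)"), semver.coerce("2.0.9"))); // true
'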
Fix: Wrap version strings with semver.coerce() before comparison. Run these 4 sed commands on cli.js:
CLI_JS="$HOME/.nvm/versions/node/$(node -v)/lib/node_modules/@anthropic-ai/claude-code/cli.js"
# Backup first
cp "$CLI_JS" "$CLI_JS.backup"
# Patch 1: Fix ve2.gt sort (recent release notes)
sed -i 's/Object\.keys(B)\.sort((Y,J)=>ve2\.gt(Y,J,{loose:!0})?-1:1)/Object.keys(B).sort((Y,J)=>ve2.gt(ve2.coerce(Y),ve2.coerce(J),{loose:!0})?-1:1)/g' "$CLI_JS"
# Patch 2: Fix gw0 sort
sed -i 's/sort((G,Z)=>Wt\.gt(G,Z,{loose:!0})?1:-1)/sort((G,Z)=>Wt.gt(Wt.coerce(G),Wt.coerce(Z),{loose:!0})?1:-1)/g' "$CLI_JS"
# Patch 3: Fix W37 filter
sed -i 's/filter((\[J\])=>!Y||Wt\.gt(J,Y,{loose:!0}))/filter(([J])=>!Y||Wt.gt(Wt.coerce(J),Y,{loose:!0}))/g' "$CLI_JS"
# Patch 4: Fix W37 sort
sed -i 's/sort((\[J\],\[X\])=>Wt\.gt(J,X,{loose:!0})?-1:1)/sort(([J],[X])=>Wt.gt(Wt.coerce(J),Wt.coerce(X),{loose:!0})?-1:1)/g' "$CLI_JS"
Note: If installed via a different method, adjust the CLI_JS path accordingly (e.g., /usr/lib/node_modules/@anthropic-ai/claude-code/cli.js). On macOS/BSD sed, use sed -i '' (empty backup suffix) instead of sed -i.
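If the patches applied cleanly, a quick sanity check (assuming the patched install is the one on your PATH):

# Should print the version banner instead of crashing with Invalid Version
claude --version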
I'm not usually one to pile on to a developer for releasing a bug but this is pretty special. The nature of the bug (a change in format for a changelog markdown file causes the entire app to break) and the testing it would have taken to uncover it (literally any) makes this one especially embarrassing for Anthropic.
In the specific commit (https://github.com/anthropics/claude-code/commit/870624fc158...), what seems like a bot or automated script added changelog entries for 3 new versions in a single commit, which is odd for an automated script to do. And only the latest version had the date added.
That actions-user seems to be mostly maintaining the changelog, but the commits do not seem consistent with an automated script. I see a few cases of rewriting previous changelog entries or moving entries from one version to another, which no automation would be doing. Seems like human error and poor testing.
They really have “anthropics”, not “anthropic”, on GitHub? That's a shame; it looks like typosquatting. If people are taught to trust that, it's easier to get them to download my evil OpenA1 package.
Claude may write all the code, but this is an oversight from the dev. Do people think these agents are acting independently? If they had wanted, or had thought of, tests that would catch this, then they would have them! The use or non-use of LLMs is irrelevant. I find the discourse around this all so strange.
On the other hand, people ask "where is all the amazing software that has been vibe coded? I haven't seen it." So Claude Code is two things at once: (1) incredibly popular and innovative software that's loved by a huge number of devs, (2) vibe-coded buggy crap. If you think this bug is the result of vibe coding, frankly you should look at Claude Code as a whole and be impressed with vibe coding. If Claude CLI has been "vibe coded", then vibe coding must be fine, because I've been using Claude Code for probably 8 months and it's been a pretty smooth experience, and an incredibly valuable tool.
As I commented [1] on the earlier Claude Code post, there's an issue [2] that has the following comment:
> While we are always monitoring instances of this error and looking to fix them, it's unlikely we will ever completely eliminate it due to how tricky concurrency problems are in general.
This is an extraordinary admission. It is perfectly possible (easy, even, relative to many programming challenges) to write a tool like this without getting the design so wrong that the same bug keeps happening in so many different ways that you have to publicly admit you're powerless to fix them all.
[1] https://news.ycombinator.com/item?id=46523740
[2] https://github.com/anthropics/claude-code/issues/6836
Even if it broke after some sort of vibe-coding session, the fact that we're now pushing these tools to their limits is what's allowing Anthropic and Boris to get a lot of useful insights to improve the models and experience further! So yeah, buckle up, bumps expected.
With the issues since November - where one has to add environment variables, block statsig hosts, modify ~/.claude.json, etc. - does anyone have experience with managed setups where versions are centrally set and bumped at the company level? Is this worth the hassle?
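The low-tech sketch of what I mean, in case it helps - the version number is illustrative, and DISABLE_AUTOUPDATER is an environment variable I believe the CLI honors, but verify against your version's docs:

# Pin a known-good version org-wide instead of tracking latest
npm install -g @anthropic-ai/claude-code@2.0.76
# Keep the CLI from updating itself past the pin (assumption: verify the variable name)
export DISABLE_AUTOUPDATER=1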
I created a workspace local extension in VS Code that uses the VS Code API to let Claude Code open files in VS Code as tabs and save them (to apply save participants like Prettier in case it is not used via the CLI) and to get diagnostics (like for TypeScript where there is no option to get workspace-wide diagnostics and you have to go file by file). I taught Claude Code to use this extension via a skill file and it works perfectly, much more reliably than its own IDE LSP integration.
It is frustrating how often things break in CC. Luckily issues are quickly fixed, but it worries me that the QA / automated testing is brittle. Hope they get out of this start-up mode and deliver enterprise-grade software.
I read your comment as a joke, but in case it was a defense, or is taken as a defense by others, let me punch up your writing for you:
"[Person who is financially incentivized to make unverifiable claims about the utility of the tool they helped build] said [tool] [did an unverified and unverifiable thing] last month"
Meta comment, but the pace of this is so exciting. Feels like a new AAA MMO release or something, having such a confluence of attention and a unified front.
(that it's a big pile of spaghetti that can't be improved without breaking uncountable dependencies)
I’ve noticed the same thing and it frustrates me almost every day.
All the AI websites feel extremely clunky and slow.
We vibing out here.
edit: it seems changelog.md is assumed to be structured data and parsed at startup, and there are no tests to enforce the changelog structure: https://github.com/anthropics/claude-code/issues/16671
So what should one pick? The rocket, the thumbs up?
Also, the emoji won't turn into a notification to steal the dev's attention and make them fix the thing, lol.
@jayeshk29 is our hero
Finally I can finish my fizzbuzz for the interview.
Is anyone with or without AI approaching anywhere near that speed of delivery?
I don’t think my whole company matches that amount. It sounds super unreasonable, just doing a sanity check.