Anthropic Cowork feature creates 10GB VM bundle on macOS without warning

363 points| mystcb | 18 hours ago |github.com

182 comments

felixrieseberg|16 hours ago

Hi, Felix from Anthropic here. I work on Claude Cowork and Claude Code.

Claude Cowork uses the Claude Code agent harness running inside a Linux VM (with additional sandboxing, network controls, and filesystem mounts). We run that through Apple's virtualization framework or Microsoft's Host Compute System. This buys us three things we like a lot:

(1) A computer for Claude to write software in, because so many user problems can be solved really well by first writing custom-tailored scripts against whatever task you throw at it. We'd like that computer to not be _your_ computer so that Claude is free to configure it in the moment.

(2) Hard guarantees at the boundary: Other sandboxing solutions exist, but for a few reasons, none of them are as satisfactory, nor do they let us make similarly sound guarantees about what Claude will and won't be able to do.

(3) As a product of 1+2, more safety for non-technical users. If you're reading this, you're probably equipped to evaluate whether or not a particular script or command is safe to run - but most humans aren't, and even the ones who are so often experience "approval fatigue". Not having to ask for approval is valuable.

It's a real trade-off though and I'm thankful for any feedback, including this one. We're reading all the comments and have some ideas on how to maybe make this better - for people who don't want to use Cowork at all, who don't want it inside a VM, or who just want a little bit more control. Thank you!

deaux|5 hours ago

> It's a real trade-off though and I'm thankful for any feedback, including this one.

Feedback: If your app is going to use 10GB of storage, tell the user in advance and give them a one-click way to remove it. Just basic manners. Don't pick your nose at the dinner table. It's not hard, just common decency.

> even the ones who are so often experience "approval fatigue". Not having to ask for approval is valuable.

This is by and large a short-term pro for Anthropic. It's often not one for the user, and in the long term often barely even one for the company. In any case, it's a great example of putting Anthropic's priorities above the users'. Which is fine and happens all the time, but in this case just isn't necessary. Similar to the AGENTS.md case. We're on the cusp of a pattern being established here, and that's something you'll want to stop before it's ossified.

baconner|16 hours ago

FWIW I think many of us would actually very much love to have an official (or semi official) Claude sandboxing container image base / vm base. I wonder if you all have considered making something like the cowork vm available for that?

beej71|16 hours ago

I think these are excellent points, but the complaint talks about significant performance and power issues.

quinncom|15 hours ago

I accidentally clicked the Claude Cowork button inside the Claude desktop app. I never used it. I didn't notice anything at the time, but a week later I discovered the huge VM file on my disk.

It would be really nice to ask the user: "Are you sure you want to use Cowork? It will download and install a huge VM on your disk."

blcknight|13 hours ago

I would look at how podman for Mac manages this; it is more transparent about what's happening and why it needs a VM. It also lets you control more about how the VM is executed.

aberoham|13 hours ago

Claude Cowork grabs local DNS resolution on macOS which conflicts with secure web gateway aka ZTNA aka SASE products such as Cloudflare Warp which do similar. The work-around is to close Cowork, let Warp grab mDNSResponder's attention first, then restart Claude Desktop, or some similar special ordering sequence. It's annoying, but you could say that about everything having to do with MITM middleboxes.

radicality|16 hours ago

I tried to use it right after launch from within Claude Desktop, on a Mac VM running within UTM, and got cryptic messages about the Apple virtualization framework.

That made me realize it wants to run an Apple virtualization VM of its own, but can't, since it's already inside one. IMO the error messaging here could be better, or, given that it's already in a VM, it could perhaps skip the VM altogether. Right now I still haven't been able to try Cowork because of this error.

ukuina|16 hours ago

Can you allow placing the VM on an external disk?

Also, please allow Cowork to work on directories outside the homedir!

bachittle|16 hours ago

Do you think it would be possible in the future to add developer settings to enable or disable certain features, or to switch to more lightweight sandboxing methods, like Apple Seatbelt, for example?

flatline|16 hours ago

There's a lot that's not being said in (2). It warrants more extensive justification, especially given the issues presented in the parent post.

tyfon|15 hours ago

It would be really nice to have an option to not do this since a ton of companies deny VMs in their group policies.

Terretta|14 hours ago

> real trade-off … thankful for any feedback

Speaking as a tiny but regulated SMB that's dabbling in skill plugins with Cowork: we strongly appreciate and support this stance. We hope you don't relax your standards, and need you not to. We strongly agree with (1), (2), and (3).

If working outside the sandbox becomes available, Cowork becomes a more interesting exfil vector. It should also be possible to make the sandbox VM non-optional, even if MDM allows users to elevate privileges.

We've noticed you're making other interesting infosec tradeoffs too. Your M365 connector aggressively avoids enumeration, which we figured was intentional as a seatbelt for keeping looky-loos in their lane.* Caring about foot-guns goes a long way in giving a sense of you being responsible. Makes it feel less irresponsible to wade in.

In the 'thankful for feedback' spirit, here's a concrete UX gap: we agree approval fatigue matters, and we appreciate your team working to minimize prompts.

But the converse is: when a user rejects a prompt, or it ends up behind a window, there's no clear way to re-trigger it. The Claude app can silently fail or run forever when it can't spin up the workspace, wasn't allowed to install Python, or was told it can't read M365 data.

Employees who've paid attention to their cyber training (reasonably!) click "No" and then they're stuck without diagnostics or breadcrumbs.

For a CLI example of this done well, see `m365-cli`'s `auth` and `doctor` commands. The tool supports both interactive and script modes through config (backed by a setup wizard):

https://pnp.github.io/cli-microsoft365/cmd/cli/cli-doctor/

Similarly, first party MCPs may run but be invisible to Cowork. Show it its own logs and it says "OK, yes, that works but I still can't see it, maybe just copy and paste your context for now." A doctor tool could send the user to a help page or tell them how to reinstall.

Minimal diagnostics for managed machines — running without local admin but able to be elevated if needed — would go a long way for the SMBs that want to deploy this responsibly.

Maybe a resync-perms button, or a Settings or Help menu item that calls Cowork's own doctor CLI when invoked?
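A minimal sketch of what such a doctor command could look like in shell; the check names and commands here are illustrative assumptions, not Cowork's actual prerequisites:

```shell
#!/bin/sh
# Minimal "doctor" sketch: run each prerequisite check and report pass/fail
# instead of failing silently. The checks below are illustrative examples.
check() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "ok   $label"
  else
    echo "FAIL $label (see help page or retry with elevated privileges)"
  fi
}

check "python3 on PATH"    command -v python3
check "workspace writable" test -w "$HOME"
```

The point is the shape, not the specific checks: every failure leaves a breadcrumb the user can act on instead of a silent hang.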

---

* When given IDs, the connector can read anything the user can anyway. We're able to do everything we need, just had to ship ID signposts in our skill plugin that taps your connector. Preferred that hack over a third party MCP or CLI, thanks to the responsibility you look to be iteratively improving.

rvz|15 hours ago

> (2) Hard guarantees at the boundary: Other sandboxing solutions exist, but for a few reasons, none of them are as satisfactory, nor do they let us make similarly sound guarantees about what Claude will and won't be able to do.

This is the most interesting requirement.

So all the sandbox solutions recently developed all over GitHub fell short of your expectations?

This is half surprising, since many of the people using AI to solve the sandboxing problem have claimed to have done so over several months, and the best we have is Apple containers.

What were the few reasons? Surely there has to be some strict requirement that everyone else is missing.

But still, having a 10 GB claude.vmbundle doesn't make any sense.

jccx70|16 hours ago

[deleted]

xvector|16 hours ago

Cowork has been an insane productivity boost, it is actually amazing. Thank you!

consumer451|15 hours ago

Any chance you guys could get the Claude Desktop installer fixed on Windows? It currently requires users to turn on "developer mode."

Sorry for the ask here, but I'm unaware of other avenues of support, as tickets on the Claude Code repo keep getting closed since it's not a CC issue.

https://github.com/anthropics/claude-code/issues/26457 https:...

MarleTangible|18 hours ago

It's incredible how many applications abuse disk access.

In a similar fashion, the Apple Podcasts app decided to download 120GB of podcasts for no apparent reason and never deleted them. It even showed up as "System Data" and made me look for external drive solutions.

kace91|17 hours ago

The system data issue on macOS is awful.

I use my MacBook for a mix of dev work and music production, and between Docker, music libraries, update caches and the like, it's not unusual for me to need a fresh install every year or two.

Once that gets filled up, it's pretty much impossible to understand where the giant block of storage went.
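For hunting that down from a shell, a rough sketch; the starting point and depth are just starting assumptions, adjust to taste:

```shell
#!/bin/sh
# Walk two levels deep from a starting point and list the largest
# directories in KB. -x stays on one filesystem; errors from unreadable
# directories are discarded.
biggest() {
  du -xk -d 2 "$1" 2>/dev/null | sort -n | tail -15
}

biggest "$HOME/Library"
```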

dewey|17 hours ago

Don't run `du -h ~/Library/Messages` then. I've mentioned this many times before, and it's crazy to me that Apple is just using up 100GB on my machine because I enable iMessage syncing and don't want to delete old conversations.

One would think that's an extremely common use case, and it will only grow the longer iMessage exists. Just offload them to the cloud, charge me for it if you want; every other free messaging service has no problem doing that.

AndroTux|17 hours ago

This one drives me nuts. Not just on Mac, also on iPhone/iPad. It's 2026, and 5G is the killer feature advertised everywhere. There's no reason to default to downloading gigabytes of audio files when they could be streamed with no issue whatsoever.

geooff_|14 hours ago

I had the same problem, but with a bad Time Machine backup: ~300GB of my 512GB disk, just labeled with the generic "System Data". I lost a day of work over it because I couldn't do Xcode builds and had to do a deep dive into what was going on.

coldtrait|17 hours ago

This seems to be a popular recent tool for handling this: https://github.com/tw93/Mole

I also prompt Warp/Gemini CLI to identify unnecessary caches and similar data and delete them.

jacquesm|16 hours ago

> Apple Podcasts app decided to download 120GB

That's one way to drive sales of higher-priced SSDs in Apple products. I'm pretty sure that sort of move shows up as a real blip on Apple's books.

jvidalv|16 hours ago

Surprisingly, Claude is amazing at cleaning up your MacBook. Tried it, works like a charm.

chuckadams|18 hours ago

Someone actually still uses the built-in podcasts app?

blitzar|18 hours ago

The vibe coding giveth and the vibe coding taketh away, blessed be the vibe coding.

zhyder|16 hours ago

I guess it could warn about it, but the VM sandbox is the best part of Cowork. The sandbox itself is necessary to balance the power you get from generated code (that's hidden from the user) with the security you need for non-technical users. I'd go even further and make the user grant host filesystem access only to specific folders, and warn about anything with write access: I can think of lots of easy-to-use UIs for this.

Terretta|17 hours ago

Arguably, even without LLM, you too should be dev-ing inside a VM...

https://developer.hashicorp.com/vagrant is still a thing.

The market for Cowork is normals getting to tap into an executive assistant who can code. Pros are running their consumer "claws" on a separate Mac Mini. Normals aren't going to do that, and offices aren't going to provision two machines for everyone.

The VM is an obvious answer for this early stage of scaled-up research into collaborative computing.

mihaelm|16 hours ago

I prefer devcontainers for more involved project setups as they keep it lighter than introducing a VM. It’s also pretty easy to work with Docker (on your host) with the docker-outside-of-docker feature.

However, I’m also curious about using NixOS for dev environments. I think there’s untapped potential there.

hirvi74|15 hours ago

I concur. I don't want to install libraries on my host machine that I won't use for anything other than development, e.g., Node.js.

On macOS, Lima has been a godsend. I have Claude Code in an image, and I just mount the directory I want the VM to have access to. It works flawlessly and has been a replacement for Vagrant for me for some time. Though, I owe a lot to Vagrant. It was a lifesaver for me back in the day.

sherburt3|12 hours ago

Do you wear a condom while you’re programming too for maximum protection?

informal007|17 hours ago

I believe that employees at Anthropic use CC to develop CC now.

AI really gives many users the ability to develop a complete product, but the quality is decreasing. Professional developers will be in demand when the products/features become popular.

The first batch of users of new products need to take more responsibility and test the product, like lab rats.

danny_codes|8 hours ago

I can’t see how these 1st-party products can compete against open source. Why would anyone choose a shit proprietary solution when the free one is better?

rvz|16 hours ago

> AI really gives many users the ability to develop a complete product, but the quality is decreasing. Professional developers will be in demand when the products/features become popular.

Looking at the amount of issues, outages and rookie mistakes the employees are making leads me to believe that most of them are below junior level.

If anyone were to re-interview everyone at Anthropic for their own roles with their own interview questions, I would guess that >75% of them would not pass their own interviews.

The only teams that would pass them are the Bun team and some of the other recently acquired startups.

bachittle|18 hours ago

Yup, it uses the Apple Virtualization framework. That meant I couldn't use Claude Cowork within my own VMs, and that's how I found out it was running a VM, because it caused a nested-VM error. All it does is limit functionality, consume extra space, and cause lag. A better sandbox environment would be Apple Seatbelt, which is what OpenAI uses, but even that isn't perfect: https://news.ycombinator.com/item?id=44283454

ctmnt|15 hours ago

I don’t have an opinion on how they should handle the nested-VM problem, but I very much disagree that Seatbelt is better. Claude Code (aka `claude`) uses it, and it’s barely good for anything.

Out of curiosity, why are you running Cowork inside a VM in the first place? What does that get you that letting Cowork use its own VM wouldn’t?

j16sdiz|17 hours ago

seatbelt is largely undocumented.

atonse|17 hours ago

I literally spent the last 30 mins with DaisyDisk cleaning up stuff in my laptop, I feel HN is reading my mind :)

I also noticed this 10GB VM from Cowork. I was also surprised at just how much space various things seem to use for no particular reason. Judging by all the cruft, most apps don't seem to have any cleanup process that actually slims down their storage.

Even Xcode: the command line tools install keeps around SDKs for a bunch of different OSes, even though I haven't launched Xcode in months. And it keeps a copy of the iOS simulator even though I haven't launched one in over a year.
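For the Xcode side specifically, a cautious cleanup sketch. The paths below are the usual macOS cache locations and the `simctl` call is Xcode's own cleanup command; check sizes before deleting anything:

```shell
#!/bin/sh
# Report the size of common Xcode cache locations before deleting anything.
# Only prints paths that actually exist.
report_size() {
  for p in "$@"; do
    [ -e "$p" ] && du -sh "$p"
  done
  return 0
}

report_size \
  "$HOME/Library/Developer/Xcode/DerivedData" \
  "$HOME/Library/Developer/Xcode/iOS DeviceSupport" \
  "$HOME/Library/Developer/CoreSimulator/Caches"

# Xcode's built-in cleanup for simulators whose runtimes are gone:
#   xcrun simctl delete unavailable
```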

hulitu|17 hours ago

Is there no crond and find on MacOSX ?

creatonez|16 hours ago

As much of an inconvenience as this may be, this is exactly what "agents" should be doing. If your tool doesn't have a built-in sandbox that is intended to be used at all times, you're using something downright hazardous and WILL end up suffering data loss.

quanwinn|18 hours ago

I imagined someone at Anthropic prompted "improve app performance", and this was the result.

pncnmnp|15 hours ago

On a similar tangent, but on the opposite end of the spectrum, check out this month-old discussion on HN: https://news.ycombinator.com/item?id=46772003

ChatGPT's code execution container has 56 vCPUs!! Back then, simonw mentioned:

> It appears to have 4GB of RAM and 56 (!?) CPU cores https://chatgpt.com/share/6977e1f8-0f94-8006-9973-e9fab6d244...

I'm seeing something similar on a free account too: https://chatgpt.com/share/69a5bbc8-7110-8005-8622-682d5943dc...

On my paid account, I was able to verify this. I was also able to get a CPU-bound workload running on all cores. Interestingly, it was not able to fully saturate them, though - despite trying for 20-odd minutes. I asked it to test with stress-ng, but it looks like it had no outbound connectivity to install the tool: https://chatgpt.com/share/69a5c698-28bc-8005-96b6-9c089b0cc5...

Anyway, that's a lot of compute. Not quite sure why it's necessary for a Plus account. Would love to get some thoughts on this.
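For anyone wanting to reproduce the check without stress-ng, a rough shell sketch that counts logical CPUs and briefly pegs each one (the `sysctl` fallback is for macOS):

```shell
#!/bin/sh
# Count logical CPUs (nproc on Linux, sysctl on macOS), then spin one
# busy loop per CPU for a couple of seconds and clean up.
cpus=$( { nproc || sysctl -n hw.ncpu; } 2>/dev/null )
echo "logical CPUs: $cpus"

pids=""
i=0
while [ "$i" -lt "$cpus" ]; do
  ( while :; do :; done ) &   # one busy worker per CPU
  pids="$pids $!"
  i=$((i + 1))
done
sleep 2              # watch top/Activity Monitor here: all cores loaded
kill $pids 2>/dev/null
wait 2>/dev/null
echo "done"
```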

tbrownaw|18 hours ago

Sure it uses a few GB just like everything else these days, but some of the comments also mention it being slow?

Aurornis|18 hours ago

The GitHub issue is AI-generated. In my experience triaging these in other projects, you can’t really trust anything in them without verifying. Users will make claims and then the AI will embellish to make them sound more important and accurate.

cogman10|16 hours ago

OK, so a lot of this boils down to the fact that this sort of software really wants to be running on Linux. For both Windows and Mac, the only way to (really) do that is to create a VM.

It seems to me that the main issue here is the painful disconnect between the VM and the host system. The kernel in the VM wants to manage memory and disk usage, and that management ultimately means the host needs to grant the guest OS large blocks of disk and memory.

Is anyone thinking about or working on narrowing that requirement? I may want 99% of what a VM does, but I really want my host system to ultimately manage both memory and disk. I'd love it if the Linux VM had a bridge for file IO that interacted directly with the host filesystem, and a bridge in the memory management system that called the host's memory allocation API directly and disabled the kernel's own memory management.

Containers and cgroups are basically how Linux does this. But that's a pretty big surface area that I doubt any non-Linux system could adopt.

lxgr|16 hours ago

Given that Claude Code runs without issues on macOS, I'd guess that it's more about sandboxing shell sessions (i.e. not macOS applications or single processes, for which solutions exist).

Unfortunately, unlike Linux, macOS doesn't have a great out-of-the-box story there; even Apple's first-party OCI runtime is based on per-container Linux VMs.

kccqzy|16 hours ago

It’s a solved problem in the VM world too. Memory ballooning is a technique where a driver inside the VM kernel cooperates with the hypervisor to return memory to the host by appearing to consume it inside the VM. And disk access is even easier: just present a network filesystem to the VM.

puppymaster|17 hours ago

macbook pro m4, bought last year. worked on so much code and so many projects. never hot after closing the lid. installed electron claude. closed the lid, went to sleep, and woke up to a macbook that had been hot all night. uninstalled claude. problem went away.

i keep telling myself this, BUT NEVER ELECTRON AGAIN.

lxgr|15 hours ago

To be fair, ChatGPT seems to be a native app and still somehow managed to continuously burn some 30-40% of CPU on my Mac; that ended up being attributable to a shimmer animation for two never-loading icons.

DauntingPear7|17 hours ago

It’s not electron

bigyabai|15 hours ago

I don't know if Electron is the issue here, my Wintel machine has Claude Code running 24/7 and doesn't ever heat up.

Might be virtualization woes or something adjacent.

hulitu|17 hours ago

> woke up to macbook that has been hot all night

this is the usual reason for divorce /s

exabrial|16 hours ago

I see this as a feature. The cost of isolation.

brunooliv|16 hours ago

I really love Anthropic's models, but every single product/feature I've used other than the Claude Code CLI has been terrible... The CLI just "stuck" for me and I've never needed (or, arguably, looked in depth at) any other features. This is for my professional dayjob.

For personal use, where I have a Pro subscription and venture into exploring all the other features/products they have... I mean, the experience outside of Claude Code and the terminal has been... bad.

msp26|16 hours ago

> every single product/feature I've used other than the Claude Code CLI has been terrible

yeah they're shipping too fast and everything is buggy as shit

- fork conversation button doesn't even work anymore in vscode extension

- sometimes when I reconnect to my remote SSH in VSCode, previously loaded chats become inaccessible. The chats are still there in the .jsonl files but for some reason the CC extension becomes incapable of reading them.

yuppiepuppie|16 hours ago

I tend to agree here. Today I tried to get the Claude chat to give me a list of Jira tickets from one board (link provided) and then upload it to Notion with some additional context. It glitched out after retrying the prompt 4x. I eventually gave up and went back to the terminal.

perbu|16 hours ago

Yes. This is my experience as well. The software quality is generally horrible. It surely has improved a lot over the last couple of months, but it is still pretty horrible.

It is quite normal for me to have to force-close Claude Desktop.

pama|16 hours ago

Aren't most of the people recommending random tools in the GitHub thread for this issue just attempting to exploit naive users? Why would anyone in this day and age follow the advice of new users to download new repos or click random websites when they're already attempting to use Claude Code or Cowork?

nhubbard|16 hours ago

While I generally agree with your sentiment, these tools aren't bad ones:

- Santa is a very common tool used by macOS admins to lock down binary and file access privileges for apps, usually on managed machines

- Disk Inventory X and GrandPerspective are well-known disk space usage tools for macOS (I personally use DaisyDisk but that requires a license)

- WizTree and WinDirStat are very common tools from Windows admin toolkits

The only one here I can say is potentially suspect is ClearDisk. I haven't used it before, but it does appear to be useful for specifically tracking down developer caches that eat up disk space.

_orcaman_|13 hours ago

A better UX would be to prompt the user, asking "Would you like to use the app in a sandbox for enhanced safety?", and only then download the Ubuntu Linux image used in the VM.

bichonnages|12 hours ago

In the meantime, I deleted the virtual machine and the Claude application. I simply created a web app through Safari. It works very well.

Aurornis|18 hours ago

This GitHub issue itself is clearly AI slop. If you’ve been dealing with GitHub issues in the past months it will be obvious, but it’s confirmed at the end:

> Filed via Claude Code

I assume part of it is true, but determining which part is true is the hard part. I’ve lost a lot of time chasing AI-written bug reports that were actually something else wrong with the user’s computer. I’m assuming the claims of “75% faster” and other numbers are just AI junk, but at least someone could verify if the 10GB VM exists.

16bitvoid|16 hours ago

If your codebase is entirely vibe coded, I feel it only appropriate to permit issues being vibed as well. It's hypocritical otherwise.

chuckadams|18 hours ago

I wouldn't think it's inappropriate for an AI agent to file an issue against another AI agent, which itself is largely written by AI.

andresquez|17 hours ago

Way slower, but way better than chat mode. Nothing beats Claude Code CLI imo.

game_the0ry|17 hours ago

Yeah, that's why I do not install these tools on my personal devices anymore and instead play with them on a VPS.

Try this if you have Claude Code: `ls -a` your home dir and see all the garbage Claude creates.

anotheryou|16 hours ago

Mac problems...

So crazy; on a Windows desktop, at most I complain if something is hardcoded to the system drive (looking at you, Ollama).

kordlessagain|17 hours ago

The amount of bad things this company's software does is staggering. The models are amazing; the code sucks.

AlexeyBrin|17 hours ago

Their code is written by their amazing models (this is what they claim anyway).

sometimez|13 hours ago

Same thing on Windows. The VM bundle is at %AppData%\Claude\vm_bundles

elzbardico|16 hours ago

This is exactly the kind of issue we will see more and more frequently with vibe coding.

fooker|16 hours ago

That seems somewhat reasonable.

Storage should be cheaper, complain about Apple making you pay a premium.

Robdel12|15 hours ago

Hey, they did admit that they vibed this in a week and released it to everyone.

fragmede|18 hours ago

What's funny is interacting with it in claude code. Claude-desktop-cowork can't do anything about the VM. It creates this 10 GiB VM, but the disk image starts off with something like 6-7 GiB full already, which means any of the cowork stuff you try to do has to fit into the remaining couple of gigs. It's possible to fill it up, and then claude cowork stops working. Because the disk is full. Claude cowork isn't able to fix this problem. It can't even run basic shell commands in the VM, and Opus4.6 is able to tell the user that, but isn't smart enough/empowered to do anything about it.

So contrary to the github issue, my problem is that it's not enough space. So the fix is to navigate to ~/Library/Application\ Support/Claude/vm_bundles, and then ask Claude Code to upsize the disk to a sparse 60 GiB file, giving cowork much more space to work in while not immediately taking up 60 GiB.
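The resize trick can be done with plain `dd`, which extends a file sparsely; no new blocks are allocated until written. The bundle's image filename is a guess here, so inspect `vm_bundles` yourself, and note the guest filesystem still has to be grown inside the VM before the extra space is usable:

```shell
#!/bin/sh
# Extend FILE to SIZE_BYTES without writing data: count=0 with seek just
# moves the end-of-file marker, so the file stays sparse on disk.
grow_sparse() {
  dd of="$1" bs=1 count=0 seek="$2" 2>/dev/null
}

# Example (path and image name are assumptions -- inspect your own bundle):
# grow_sparse "$HOME/Library/Application Support/Claude/vm_bundles/<image>" \
#             $((60 * 1024 * 1024 * 1024))
```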

Bigger picture, what this teaches me though, is that my knowledge is still useful in guiding the AI to be able to do things, so I'm not obsolete yet!

pixl97|17 hours ago

So it's using its bundled disk image as the cache/work disk too?

Yeah, that's a recipe for problems.

jFriedensreich|17 hours ago

It's just another example, and just a detail, in the broader story: we cannot trust any model provider with any tooling or other non-model layer on our machines or our servers. No browsers, no CLIs, no apps, no whatever. There may not be alternatives to frontier models yet, but everything else we need to own as a true open-source, trustable layer that works in our interest. This is the battle we can win.

prmph|17 hours ago

Why don't people form cooperatives, pool money to buy serious hardware, colocate it in local data centers, and run good local models like GLM on it to share?

wutwutwat|15 hours ago

Are we sure this isn't a sparse image? It will report the full size in Finder, but it won't actually consume that much space if it's sparse.
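Easy to check: compare the apparent size with the blocks actually allocated. A throwaway demonstration follows; the same `ls` vs `du` comparison applies to the real bundle under `~/Library/Application Support/Claude/vm_bundles`:

```shell
#!/bin/sh
# A sparse file reports its full apparent size via ls, but du shows only
# the blocks actually allocated. Demonstrate with a throwaway 1 GiB file.
f=$(mktemp)
dd of="$f" bs=1 count=0 seek=$((1024 * 1024 * 1024)) 2>/dev/null

ls -lh "$f"   # apparent size: ~1.0G
du -h "$f"    # allocated: close to zero
rm -f "$f"
```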

mixdup|17 hours ago

All code in Claude™ is written by Claude™

jug|17 hours ago

Also, it apparently eats 2 GB of RAM or so to run an entire virtual machine even if you've disabled Cowork. Not sure which part of this is worse. Absolute garbage.

daemonk|15 hours ago

Just write a Claude OS already.

crumpled|17 hours ago

The software seems to do more and more and communicate less and less about what it's doing. That's the crux.

Pondering... Noodling... Some other nonsense...

bear3r|17 hours ago

[deleted]

TheRealPomax|17 hours ago

Labelled "high priority" a month ago. No actual activity from Anthropic despite it being their repo. I'm starting to get the feeling they're not actually very good at this?