top | item 8896186

Moved ~/.local/share/steam. Ran steam. It deleted everything owned by user

603 points | stkhlm | 11 years ago | github.com | reply

268 comments

[+] Chris_Newton|11 years ago|reply
This seems like yet another good example of why robust application-level access control would be a helpful thing to build into modern operating systems, in addition to the typical user-based controls. This may have been both a rookie mistake and a regrettable failure of code review processes, but in any case it simply shouldn’t be possible for an application running on a modern system to wipe out all user data without warning in such a sweeping way.

I have often made this argument in the context of sandboxing communications software like browsers and e-mail clients, where it is relatively unusual to need access to local files except for their own data. In that context, restricting access to other parts of the filesystem unless explicitly approved would be a useful defence against security vulnerabilities being exploited by data from remote sources. It’s hard to encrypt someone’s data and hold it for ransom or to upload sensitive documents if your malware-infected process gets killed the moment it starts poking around where it has no business being.

More generally, I see no reason that we shouldn’t limit applications’ access to any system by default, following the basic security principle of least privilege. We have useful access control lists based on concepts of ownership by users and groups and reserving different parts of the filesystem for different people. Why can’t we also have something analogous where different files or other system resources are only accessible to applications that have been approved for that access?

[+] andrewfong|11 years ago|reply
Isn't this the basic idea behind the sandbox in OS X?

I think OS X (and mobile app development in general) shows both that this is great in theory and a net improvement over not having it, but that there are some common pitfalls to address.

First, there are a handful of apps where this model doesn't work so well -- e.g. text editors, FTP clients, etc. So you're inconveniencing quite a few legit apps which need broader access.

Second, as a corollary of the first, you're going to have a lot of apps that legitimately need to ask users to approve broader access. And as the number of apps asking for approval goes up, the more likely users are to simply ignore the warning and approve everything. This is especially problematic since we can't assume the average user is a good judge of which apps need which access.

Edit: One way of reducing user acceptance fatigue might be to introduce greater granularity into the requested permissions and then tier them -- e.g. commonly requested vs. uncommon. An app may legitimately need permission to write to any file in your home directory, but it's highly unlikely it will need to write to more than X files per second -- or at least it shouldn't be able to do so without the OS throwing up lots of warnings outside of the app.

[+] falcolas|11 years ago|reply
Like AppArmor, or SELinux, or any of the other applications which have their hooks in the LSM? They do a fantastic job of this, if you can figure out how to use them.

The truth is that they are too hard for even your average sysadmin to configure and manage, let alone your average desktop user.

setenforce 1 (yeah, right).

[+] pjc50|11 years ago|reply
Broadly I agree; the original work on access control assumed that it was users who might be untrustworthy and programs were safe, in a "classified documents" context.

However, applying program-level access control is very un-UNIX. How do you compose multiple programs with different security regimes? This bug happened because the "steam" program called the "rm" program via the "shell" program. Inheriting capabilities mostly solves this, but we're familiar with how hard SELinux is to use as a result, and it still doesn't save the user from command line typos.

I think it's time to make a stronger case for time-reversible filesystems. Accidental deletion matters less if you can just get in your time machine.

[+] woodman|11 years ago|reply
You are describing a problem that has been solved several times over. Blame Steam, the distro devs, or the user for not implementing one of the many long-existing solutions: chroot [0].

[0] http://en.wikipedia.org/wiki/Chroot

[+] vezzy-fnord|11 years ago|reply
Role-based and other access control mechanisms unfortunately come off all too frequently as bolted on and arcane hacks.

The real issue lies in the fact that the file system resides in a global namespace, when it shouldn't. Much like each process has its own environment variables, so should it have its own namespace. Linux does support so-called "mount namespaces" now, but once again they're not inherent parts of the system, but have to be tacked on through explicit unshares, and thus lose the cohesiveness of platforms such as Plan 9. [1]

[1] http://doc.cat-v.org/plan_9/4th_edition/papers/names

[+] rwallace|11 years ago|reply
As a more immediate fix with less collateral damage, since Unix programmers refuse to stop putting `rm -rf` commands in shell scripts (they seem to think the suggestion is an insult to their manhood), change the behavior of rm so that by default it either disregards -rf or moves the target files to a trash directory where they can be retrieved in the event of an error.
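
A sketch of the trash-directory variant of this idea. "saferm" and the trash layout are hypothetical, not an existing tool: instead of unlinking, targets get moved into a per-user trash directory so mistakes are recoverable.

```shell
TRASH="${HOME}/.trash"

saferm() {
    mkdir -p "$TRASH"
    for target in "$@"; do
        # Timestamp suffix so repeated deletions of same-named files
        # don't collide in the trash directory.
        mv -- "$target" "$TRASH/$(basename -- "$target").$(date +%s%N)"
    done
}
```

A user who notices the mistake can then restore from the trash instead of reaching for backups, and a cron job can expire old entries.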
[+] vmarsy|11 years ago|reply
> robust application-level access control would be a helpful thing to build into modern operating systems

Something like Windows Store apps, but which ideally wouldn't require the use of a specific store? (Only Enterprise apps can bypass the store, from what I know)

[+] SixSigma|11 years ago|reply
Or use a checkpointed file system with off-site storage. My PC can catch fire any day and I would lose nothing.

Sure, this Steam thing sucks, but your disk could die any moment; be prepared. RAID is not backup.

[+] akamaka|11 years ago|reply
Here's the offending shell script code:

  # figure out the absolute path to the script being run a bit
  # non-obvious, the ${0%/*} pulls the path out of $0, cd's into the
  # specified directory, then uses $PWD to figure out where that
  # directory lives - and all this in a subshell, so we don't affect
  # $PWD
  STEAMROOT="$(cd "${0%/*}" && echo $PWD)"
  [...]
  # Scary!
  rm -rf "$STEAMROOT/"*
The programmer knew the danger and did nothing but write the "Scary!" comment. Sad, but all-too-familiar.
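
A hedged sketch of how the same logic could fail safe (illustrative, not Valve's actual fix): refuse to recurse unless the computed root is a non-empty, existing directory. "wipe_root" is a hypothetical helper.

```shell
wipe_root() {
    root="$1"
    # Bail out before anything recursive runs if the root is empty or
    # doesn't exist -- the exact hole the Steam script fell into.
    if [ -z "$root" ] || [ ! -d "$root" ]; then
        echo "refusing to continue: bad install dir '$root'" >&2
        return 1
    fi
    rm -rf -- "$root"/*
}
```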
[+] thaumaturgy|11 years ago|reply
Yikes. That should never have passed a code review. I know mistakes happen, I often defend screw-ups, but anything in a shell script that has an "rm" command should automatically get the squinty eyes.
[+] EC1|11 years ago|reply
I thought I was a bad programmer until I saw this thread. How did this make it to prod?
[+] deeviant|11 years ago|reply
I worked at a solar SCADA company that rolled their own APT packages.

In the pre- and post-install deb package scripts there was all kinds of crazy shit, like upgrading GRUB to GRUB 2 and manually messing with boot sectors. All this in packages innocuously named modbus_driver.deb or what have you, and all in absolutely the most archaic bash syntax possible. I did strongly suggest jailing all the application binaries we twiddled around with, with something like chroot, but was rebuffed.

Eventually somebody turned an rm -rf /bin/* into rm -rf / bin/*, and the rest is history. They bricked about 100 embedded PCs, all in remote locations, all powering power plant SCADA systems that did things like connect to CalISO for grid management or collect billing information. It cost hundreds of thousands of dollars to fix.

[+] userbinator|11 years ago|reply
Would this have stopped things getting deleted?

    if [ -z "$STEAMROOT" ]; then
        # something isn't right...
        exit 1
    fi
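
Or, terser: the shell's ":?" expansion aborts the command instead of expanding an unset or empty variable to nothing (a generic illustration of the mechanism, not Valve's actual fix):

```shell
# The subshell dies at the expansion; rm never runs.
( STEAMROOT=""; rm -rf "${STEAMROOT:?}/"* ) 2>/dev/null
echo "subshell exit: $?"    # non-zero
```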
[+] Qantourisc|11 years ago|reply
Whenever I write something scary like that, I usually wrap it in a function, double-checking that it's really the folder you wish to delete... often even adding a user-verification step if possible.
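
Something like this, say (the "steam.sh" marker check is just an example of verifying the target is what you think it is; the wrapper is hypothetical):

```shell
confirm_rm() {
    dir="$1"
    # Sanity check: only proceed if the directory contains the expected
    # marker file.
    if [ ! -f "$dir/steam.sh" ]; then
        echo "'$dir' does not look like a Steam install; aborting" >&2
        return 1
    fi
    # User verification before anything recursive happens.
    printf 'Really delete everything under %s? [y/N] ' "$dir"
    read -r answer
    [ "$answer" = "y" ] || return 1
    rm -rf -- "$dir"
}
```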
[+] iamtew|11 years ago|reply
Since the script is written in bash they could also have used $BASH_SOURCE, which will always point to the correct path of the script being executed, so you can do something like this:

    SCRIPT="$BASH_SOURCE"
    SCRIPT_DIR="$(dirname "$BASH_SOURCE")"
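
One caveat: dirname of a relative invocation like ./steam.sh is itself relative, so it's common to combine this with the same cd-and-pwd trick to get an absolute path (a generic pattern, not Valve's code):

```shell
# ${BASH_SOURCE:-$0} falls back to $0 under plain sh, where BASH_SOURCE
# doesn't exist; the cd/pwd in a subshell canonicalizes the result.
SCRIPT_DIR="$(cd "$(dirname -- "${BASH_SOURCE:-$0}")" && pwd)"
echo "$SCRIPT_DIR"
```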
[+] enneff|11 years ago|reply
Gotta question why they used -f.
[+] preconsider|11 years ago|reply
To those name-calling the author of the script:

The product/update is hyped and the release date is set in stone. Tensions are high and your boss has already let you know that you're on thin ice and not delivering on the project goals.

A last-minute showstopper bug comes in, caused by file leaks. Everyone is scrambling, and the file belongs to you, so it's on you to fix it alone. There is no time for code review, and delaying isn't an option (so says management). "I'm afraid if we keep seeing these delays in your components, we might have to consider rehiring your position."

The rm -rf works -- it's a little bit scary, but it works. You write a test case, and everything passes. Still, you add the "Scary!" comment for good measure. You have two more bugs to fix today, and you'll be lucky if you're home by midnight to see your wife and kids. You've been stuck in the office and haven't seen them in days.

Are you an "idiot", "talentless" engineer that "deserves to have his software engineering license permanently revoked"? How do you know this wasn't the genesis of this line of code?

[+] wpietri|11 years ago|reply
Yes, that person is still an unprofessional idiot.

If a doctor accidentally removes the wrong organ because administrators have overscheduled him, "whoopsie, not my fault" is not the appropriate answer. The same applies to engineers working on bridges. Professionals take responsibility for their working conditions.

There is an enormous shortage of programmers right now. Anybody shipping stuff that is bad or dangerous is choosing that. If we drop our professional standards the moment a boss makes a sadface, then we're not professionals.

[+] Gigablah|11 years ago|reply
So I guess it's a choice between getting fired for not meeting a deadline, and getting fired for destroying customer data.

I'd rather take the first option. At least my reputation will still be somewhat intact.

And as a bonus, I can get out of that hostile environment earlier.

[+] click170|11 years ago|reply
This.

This is the reality of corporate development. I think we should be demanding more before buying software from vendors, especially when you can't just whip out the source code to audit or fix yourself.

Stuff like this doesn't have to happen, but we let it.

[+] dingaling|11 years ago|reply
> Tensions are high and your boss has already let you know that you're on thin ice and not delivering on the project goals.

Valve doesn't have bosses, remember?

[+] fr0styMatt2|11 years ago|reply
Totally agreed but isn't Valve a company famous for not having all that corporate deadline stuff?
[+] cechner|11 years ago|reply
But the same wouldn't be acceptable in other lines of work, right? Try a civil engineer or surgeon... what would you expect someone to do in similar situations? Probably escalate the issue and perhaps delay the release.

Pushing shitty broken code means you're not doing your job. If your company forces you to do this, then it is not doing its job.

[+] aaronem|11 years ago|reply
How do you know it was?
[+] BoppreH|11 years ago|reply
A while ago I made a small program to cache Steam's grid images and search missing ones (https://github.com/boppreh/steamgrid). More of an experiment in programming in Go, really, but it works and makes Steam a little better.

When I tried on Linux, it threw a permission error. Turns out Steam installs the folder "~/.local/share/Steam/userdata/[userid]/config/grid" without the executable permission bit. Without this bit no one can create files in there, Steam included, and the custom images feature gets broken.

I reported the problem, saying they should fix their installer, and got the "have you tried reinstalling it?" spiel. When I said I had, and that manually changing the permission fixes the problem, so that must really be the cause, they closed the ticket with "I am glad that the issue is resolved".

This was a ticket at support.steampowered.com, because I didn't know Valve had a GitHub account. I would open an issue there, but I don't have a Linux installation to test with again, and this sort of misunderstanding burns a lot of goodwill.

[+] NamTaf|11 years ago|reply
Valve's customer support is known to be pretty awful. This is all too familiar an issue.

To be fair to them, when the six-sigma of your customer service is 15-year-olds asking to be unbanned because they totally didn't use hacks, your CS probably gets a bit desensitised.

[+] m_mueller|11 years ago|reply
Too many IT supporters either have no idea what they're doing or have no interest in helping improve their product. I'm experiencing the same with a highly specific and expensive commercial product, so it doesn't surprise me with Steam at all.
[+] Xylemon|11 years ago|reply
Something like this with Steam happened to a friend of mine not too long ago. It was very saddening, because he literally lost years of files (including personal projects) and salvaged what he could. That was with the Steam beta, and I caught Steam doing this myself (after he told me what happened); I was lucky enough to stop the script, and I switched out of the beta. At the time he reported this to Valve, who said they were "investigating the issue and knew of it". Seems to still be here, sigh.

I know the moral here is to keep your files backed up, but come on, this is a ridiculous issue that Valve still hasn't fixed.

[+] kazinator|11 years ago|reply

   rm -rf "$STEAMROOT/"*
This is why serious bash scripts use

   set -u # trap uses of unset variables
Won't help with deliberately blank ones, of course.

Scripting languages in which all variables are defined if you so much as breathe their names are such a scourge ...

I did this once in a build script. It wiped out all of /usr/lib. Of course, it was running as root! That machine was saved by a sysadmin who had a similar installation; we copied some libs from his machine and everything was cool.
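
To make the failure mode concrete (env -u just guarantees STEAMROOT really is unset for the demonstration):

```shell
# With set -u, expanding the unset STEAMROOT is a fatal error, so the
# shell dies before rm ever runs.
env -u STEAMROOT bash -c 'set -u; rm -rf "$STEAMROOT/"*' 2>/dev/null
echo "exit status: $?"    # non-zero
```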

[+] tomaskafka|11 years ago|reply
Another beautiful example of developer/user priority inversion.

All system architects ever:

1) System data are sacred, we must build a secure system of privileges disallowing anyone to touch them

2) User data are completely disposable, any user's program can delete anything.

All users ever:

1) What? I can reinstall that thing in 20 minutes, there's like 100 million copies worldwide of these files.

2) These are my unique and precious files worth years of work, no one can touch them without my permission!

[+] stevewilhelm|11 years ago|reply
Many of the comments mention this should have been caught in code review. I suspect they don't perform code reviews.

Makes me wonder: is there a tool, system, or service for auditing how many pairs of eyes have reviewed a given line of code? This would be hard to determine, but could be useful. I'm envisioning a heatmap bar or overlay that indicates the number of reviews a line of code has received.

[+] pjc50|11 years ago|reply
For those of you worried about important files, chattr +i is a useful defence, though there's no easy way of applying it automatically.

Long ago I had a kernel hack that would kill any process that attempted to delete a canary file. Worked OK but no chance of it ever going mainstream.

[+] jtokoph|11 years ago|reply
The biggest lesson here is that backing up your files is extremely important. Both local backups and remote backups.

I like the 3-2-1 rule:

  At least three copies,
  In two different formats,
  with one of those copies off-site.
Software is written by humans, who will undoubtedly miss a corner case and not think of every possible environment.
[+] Bahamut|11 years ago|reply
It should be noted, as listed in that issue thread, this is apparently also present on Windows (same bug in two different shell scripts!).
[+] lifeisstillgood|11 years ago|reply
This is a danger of me-ware being used before it becomes software. Software assumes and defends against others using and running it. Me-ware makes no assumptions because me is the only one running it.

The transition from meware to software is a hard one - and usually it's how we get terrible reputations as an industry. Basically it's a prototype till it's burned enough beta users.

Edit: OMG -- that is actually Steam from Valve. I take it back -- this is supposed to be software.

[+] chadzawistowski|11 years ago|reply
Though it turned out to be irrelevant, I really enjoyed your spiel about "me-ware". The distinction between it and finished software is too easy to forget.
[+] colanderman|11 years ago|reply

    # Scary!
    rm -rf "$STEAMROOT/"*
Anybody who writes a line like this deserves their software engineer license revoked. This isn't the first time I've seen shit like this (I've seen it in scripts expected to be run as root, no less); it makes my blood boil.

Seriously. "xargs rm -df -- < MANIFEST" is not that hard.

EDIT: I wouldn't be so harsh if it weren't for the comment admitting the author knew how poor an idea this line was.
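
I.e., the installer records every path it creates, and the uninstaller removes exactly those paths and nothing else. A runnable sketch (the layout is illustrative; note that xargs splits on whitespace, so paths with spaces need a null-delimited manifest and xargs -0):

```shell
tmp="$(mktemp -d)"
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/keep.txt"
printf '%s\n' "$tmp/a.txt" "$tmp/b.txt" > "$tmp/MANIFEST"

# Remove only what the manifest lists; "--" guards against names that
# begin with a dash.
xargs rm -f -- < "$tmp/MANIFEST"

ls "$tmp"    # keep.txt and MANIFEST survive
```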

[+] leni536|11 years ago|reply
Well, I'm on the guy's side, but this is worth noting:

> Including my 3tb external drive I back everything up to that was mounted under /media.

Well, maybe it was just unfortunate and the drive happened to be mounted, or the "backup" is always online. If it's the latter, that's a really bad idea: if your computer is compromised, you risk all of your backups. A proper backup should protect your data from exactly these occasions.

[+] bcantrill|11 years ago|reply
Wow, an awful bug -- and brings back memories of a very similar bug that we had back in the late 1990s at Sun. Operating system patches on Solaris were added with a program called patchadd(1M), which, as it turns out, was actually a horrific shell script, and had a line that did this:

  rm -rf $1/$2
Under certain kinds of bad input, the function that had this line would be called without any arguments -- and this (like the bug here) would turn into "rm -rf /".

This horrible, horrible bug lay in wait, until one day the compiler group shipped a patch that looked, felt and smelled like an OS patch that one would add with patchadd(1M) -- but it was in fact a tarball that needed to be applied with tar(1). One of the first systems administrators to download this patch (naturally) tried to apply it with patchadd(1M), and fell into the error case above. She had applied this on her local workstation before attempting it anywhere else, and as her machine started to rumble, she naturally assumed that the patch was busily being applied, and stepped away for a cup of coffee. You can only imagine the feeling that she must have had when she returned to a system to find that patchadd(1M) was complaining about not being able to remove certain device nodes and, most peculiarly, not being able to remove remote filesystems (!). Yes, "rm -rf /" will destroy your entire network if you let it -- and you can only imagine the administrator's reaction as it dawned on her that this was blowing away her system.

Back at Sun, we were obviously horrified to hear of this. We fixed the bug (though the engineer who introduced it did try for about a second and a half to defend it), and then had a broader discussion: why the hell does the system allow itself to be blown away with "rm -rf /"?! A self-destruct button really doesn't make sense, especially when it could so easily be mistakenly pressed by a shell script.

So we resolved to make "rm -rf /" error out, and we were getting the wheels turning on this when our representative to the standards bodies got wind of our effort. He pointed out that we couldn't simply do this -- that if the user asked for a recursive remove of the root directory, that's what we had to do. It's a tribute to the engineer who picked this up that he refused to be daunted by this, and he read the standard very closely. The standard says a few key things:

1. If an rm(1) implies the removal of multiple files, the order of that removal is undefined

2. If an rm(1) implies the removal of multiple files, and a removal of one of those files fails, the behavior with respect to the other files is undefined (that is, maybe they're removed, maybe they're not -- the whole command fails).

3. It's always illegal to remove the current directory.

You might be able to imagine where we went with this: because "rm -rf /" always implies a removal of the current directory which will always fail, we "defined" our implementation to attempt this removal "first" and fail the entire operation if (when) it "failed".

The net of it is that "rm -rf /" fails explicitly on Solaris and its modern derivatives (illumos, SmartOS, OmniOS, etc.):

  # uname -a
  SunOS headnode 5.11 joyent_20150113T200918Z i86pc i386 i86pc
  # rm -rf /
  rm of / is not allowed
May every OS everywhere make the same improvement!
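
GNU coreutils did eventually adopt a similar failsafe: modern GNU rm refuses to operate recursively on "/" unless --no-preserve-root is passed explicitly. Rather than demonstrating against a live root, one can at least confirm the failsafe is documented:

```shell
# On a GNU system this prints the preserve-root lines from rm's usage
# text, showing the "/" failsafe is the default.
rm --help | grep preserve-root
```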
[+] fsaintjacques|11 years ago|reply
Note to all, prepend all your bash scripts with

"set -o errexit -o nounset -o pipefail"

It'll save you headaches.
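
A minimal demonstration of what that buys you: with nounset, an unset variable is fatal, so the script dies before reaching "after" (the variable name is an obviously-unset placeholder):

```shell
out="$(bash -c '
set -o errexit -o nounset -o pipefail
echo "before"
echo "$DEFINITELY_UNSET_VAR"
echo "after"
' 2>/dev/null)" || status=$?

echo "$out"                     # prints only: before
echo "status: ${status:-0}"     # non-zero
```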

[+] nodesocket|11 years ago|reply
What is this magic doing?

   $(cd "${0%/*}")
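
"${0%/*}" is shell parameter expansion: it strips the shortest trailing "/..." pattern from $0 (the path the script was invoked as), leaving its directory; cd-ing there in a subshell and echoing $PWD then yields an absolute path without changing the caller's working directory. A standalone demo of the expansion itself (the path is a made-up example):

```shell
path="/home/user/.local/share/steam/steam.sh"
echo "${path%/*}"    # -> /home/user/.local/share/steam
```

The failure mode behind the bug: if $0 contains no slash, or the cd fails, the command substitution yields an empty string, and "$STEAMROOT/"* then expands as "/"*.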
[+] rndn|11 years ago|reply
I don't want to blame anyone, but maybe there should be foolproof default security measures that prevent something like this from happening. For example, rm -rf called on a home, documents, music, or photos directory could require an additional confirmation, perhaps through a GUI.