
Undeleting a file overwritten with mv

268 points | todsacerdoti | 5 years ago | behind.pretix.eu

85 comments

[+] pdkl95|5 years ago|reply
> alias mv='mv -i'

While --interactive mode is certainly useful, a better option for this problem that I don't see mentioned often is the --no-clobber option:

    -n, --no-clobber
              do not overwrite an existing file
I often add --no-clobber to scripts to make automated file movement safer. An alternate "safe mv" alias can be useful, e.g.:

    alias smv='mv --no-clobber'
(I don't recommend changing the behavior of normal 'mv' directly; getting used to assuming 'mv' means the safer 'mv --no-clobber' can be dangerous when you use a different computer without your custom environment)
[+] __henil|5 years ago|reply
Is there a convenient way to make this type of alias work when using it with _sudo_ too?

I guess one can add the alias for the root user too, but it's annoying.

[+] z92|5 years ago|reply
Here's a summary of how he did it:

- Search and list all FLV files on the disk.

- Search for all FLV file signatures on raw disk.

- For each known file on the disk, compute md5 of the first 512 bytes.

- For each FLV file signature found on the raw disk, exclude those that match any of the known files using those md5 values.

- That leads to only 5 files remaining.

- The original file was known to be 1.6 GB. Read 1.8 GB serially from the raw disk starting at each remaining file signature and save those.

- One of these is your file.

That will work if your file isn't fragmented on the disk, I guess.
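
For the "md5 of the first 512 bytes" step above, something like this could work (a minimal sketch; GNU coreutils assumed, and the recordings path is made up):

    # fingerprint every still-existing FLV file by its first 512 bytes
    find /mnt/recordings -type f -name '*.flv' | while read -r f; do
        head -c 512 "$f" | md5sum | awk -v file="$f" '{print $1, file}'
    done > known-flv-hashes.txt

Any signature hit on the raw device whose first 512 bytes hash to a value in that list can then be discarded as belonging to a file that still exists.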

[+] AnIdiotOnTheNet|5 years ago|reply
If anyone would like to read more, this technique is called "file carving".

> That will work if your file isn't fragmented on the disk, I guess.

Even 15 years ago, when I was doing digital forensics, it was actually pretty uncommon to have to deal with fragmentation. After about 2000, filesystem allocation implementations started avoiding it like the plague.

[+] molticrystal|5 years ago|reply
Any reason the data couldn't be carved out with photorec[0]? It even has flv built in, but even if it didn't, creating new types isn't hard. This type of data carving seems like exactly what that software was designed for, and it does it quickly with a wizard.

Taking the first 512 bytes for the hash does seem like a smart way to differentiate the still-existing files from the deleted ones; I'll keep that in mind if I'm ever in such a crazy situation where it would be helpful.

[0] https://www.cgsecurity.org/wiki/PhotoRec#How_PhotoRec_works

[+] wheybags|5 years ago|reply
Given that they mentioned there was a small bit of corruption at one point in the video, I think photorec might even do a better job.

My guess as to the cause of that corruption is that the file was slightly fragmented, but the fragments were pretty close together, and streaming video formats are resilient enough to tolerate some garbage in the middle of the stream.

IIRC photorec should be able to handle this.

[+] PKop|5 years ago|reply
It would be good to have that installed on the disk in the future, but he mentioned he immediately put the disk in read-only mode, so he couldn't install new software.
[+] wizzwizz4|5 years ago|reply
It probably could have, and PhotoRec is IIRC designed to ignore non-deleted files. (This also means that PhotoRec is useless if you formatted your drive, but scalpel still works.)
[+] metafunctor|5 years ago|reply
I'm surprised that the lost file was a single contiguous area on the raw device; I would've expected the pieces to be intermingled with data from other files.

Maybe if the disk is being filled for the first time, space for the file is allocated in one go (as opposed to gradually appending to the end of the file), and there's a large enough unfragmented area of free space, and the disk isn't very busy, the file would be allocated like this on the device. But I sure wouldn't count on it.

Not sure about the flv format, and how it recovers from errors, but in the worst case it might look like the correct file but might have pieces from other videos mixed in.

[+] marcan_42|5 years ago|reply
Modern file systems try fairly hard to keep files contiguous for performance, and if recordings are always written and deleted semi-sequentially, you'd expect the disk to end up largely unfragmented. That said, two or three big fragments would be pretty common in this situation.
[+] herpderperator|5 years ago|reply
I too was surprised that the article mentioned nothing about blocks or fragmentation. The deleted file has an inode, yes, but that inode also describes all the blocks that contain the data, and there is no guarantee that those blocks are contiguous. It seems that was completely overlooked, but in this case it turns out everything apart from one part in the middle of the video was recovered, leading me to believe that a few of the blocks were in fact fragmented.
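
On ext4 you can actually see which blocks/extents an inode maps to; a rough example (debugfs is part of e2fsprogs, the inode number here is just a placeholder, and both commands are read-only):

    # dump the extent tree / block list of a given inode on the array from the post
    debugfs -R 'stat <1234567>' /dev/md2
    # or, for a file that still has a path, count its extents
    filefrag -v /path/to/recording.flv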
[+] marcan_42|5 years ago|reply
pv is much smarter than you think. If instead of `cat|pv` you just use `pv filename` or even `pv <file`, it'll work out the file or block device size on its own.

If you forget to use pv and want a nice progress bar for an already executing process, use `pv -d <pid>`. That'll display progress bars for every open file. Works even for things like installers and servers, where you wouldn't be able to use a pv pipeline anyway.
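
For illustration (the device is the one from the post, the PID is made up):

    # pv can size a file or block device by itself, no cat needed:
    pv /dev/md2 | grep -P --byte-offset --text 'FLV\x01\x05'

    # attach a progress display to an already running process:
    pv -d 12345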

[+] Jenda_|5 years ago|reply
By the way, there is a tool called "progress" which will scan all running processes of (supported) tools like gzip, cat, grep, etc., and report the progress of their file operations.

If you want to emulate this by hand, first get the fd number of the file of interest by `ls -l /proc/PID/fd/` and then `cat /proc/PID/fdinfo/NUMBER`. There is a line called "pos", which is the position in the file.
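
Roughly, by hand (the PID and fd number are placeholders):

    PID=12345; FD=3
    ls -l /proc/$PID/fd/$FD             # which file the descriptor points at
    grep '^pos:' /proc/$PID/fdinfo/$FD  # current byte offset within that file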

[+] jeroenhd|5 years ago|reply
I did not know about `pv -d`, that's a great feature!
[+] bestboy|5 years ago|reply
Great article!

I acknowledge that this is an honest mistake that might happen to me anytime.

I guess human error is the likeliest cause of data loss nowadays... :(

Anyways for my ext4 ubuntu desktop I immediately made some changes in response to this post:

* installed ext4magic and extundelete so I don't have to do it after the accident, potentially overwriting the deleted file

* changed my 'll' alias to 'ls -lisah' to include the inode. I guess it's very likely that one does a file listing before moving files around, and this can be a life saver.

[+] sam_goody|5 years ago|reply
Of course. Obvious.

Actually, er, what?!

One of the nice things about HN is how it reminds me that I know nothing and others know a lot. ;)

Also, kudos for being brave enough to write this up. Even had I figured out the technique, I would have been so afraid that I had missed some simple trick or tool, or that I would look like an imbecile for letting this mistake happen in the first place, that I would have passed on a public blog post.

[+] tutfbhuf|5 years ago|reply
What irritates me is that he was willing to put so much effort into restoring the video file. Usually you would only do that if the files are important. But if the files are important, then you would definitely have a second copy (a backup). Always having a backup is the real lesson here.

It is also recommended to automate data-related tasks as much as possible. If you have a human doing mv over ssh regularly or semi-regularly, then there is always the risk of a typo or some other kind of human error. Over a long enough time period, I would rather expect such an error to happen than not.

[+] daxelrod|5 years ago|reply
In this case, they explicitly called out that they decided not to make backups of these files, and maybe you’re right that they chose the wrong trade offs and the amount of engineering time they spent recovering cost more than just keeping a backup.

But let’s say they were taking backups. “Always have a backup” turns out not to always be enough.

Perhaps the overwritten file was new enough that it hadn’t been backed up yet.

Perhaps they didn’t realize their mistake until after the backup process had run, and the backup no longer contained the file they had overwritten.

Perhaps they attempted to restore the overwritten file from backup and discovered that the backup process had actually been failing but they had insufficient testing or notifications.

Point is, backups have an engineering, hardware, and complexity cost, too. I don’t know enough about their tradeoffs to judge them for making the wrong decision here.

That said, I do agree that in general, the default choice should be automated backups, with multiple sets for different time intervals, in a mixture of on- and off-site storage, with regular automated restoration tests.

[+] AnIdiotOnTheNet|5 years ago|reply
I have observed that basically no one thinks about backups until they've had at least one incident where they've lost something important to them and were unable to recover it. The actual number varies from person to person, but I've never seen less than 1, though I have seen more than 5 on several occasions.
[+] Triv888|5 years ago|reply
I'm pretty sure anyone who has used a computer for a while has deleted a file by accident before having made a backup...
[+] saalg|5 years ago|reply
You could have done this much faster with The Sleuth Kit. Since you already knew the inode number of the deleted file, I think you could just run `icat -f ext4 /dev/md2 <inode#> > recovered_file.flv`
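
If I remember the TSK tools correctly, you can also sanity-check the inode first, and icat has a -r flag intended for deleted content (keeping the inode placeholder from above):

    istat -f ext4 /dev/md2 <inode#>                          # show size, link count and blocks
    icat -f ext4 -r /dev/md2 <inode#> > recovered_file.flv   # -r attempts recovery of deleted content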
[+] cranekam|5 years ago|reply
> In the long term, we’ll of course work on preventing this from possibly happening again. Leaving very specific solutions like alias mv='mv -i' aside..

A bare minimum first step would be to stop using mv directly and wrap it in a shell script with appropriate error checking/environment setup/etc. This will take 5 minutes to develop and immediately prevents a whole class of operator errors from happening again. It also makes for just one place to put all the logic needed, so when the process to expose a file becomes mv plus something else, the operator’s interface remains unchanged.
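
A minimal sketch of such a wrapper (the script name and paths are hypothetical, not from the article):

    #!/usr/bin/env bash
    # publish-recording: move a recording into the public directory,
    # refusing to clobber anything that is already there.
    set -euo pipefail
    src="/srv/recordings/$1"
    dst="/srv/public/$1"
    if [ ! -f "$src" ]; then
        echo "no such recording: $src" >&2
        exit 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite existing $dst" >&2
        exit 1
    fi
    mv -n -- "$src" "$dst"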

[+] eximius|5 years ago|reply
Simple (but possibly incomplete) answer: store files you don't want to delete on a ZFS filesystem with snapshots.
[+] marcan_42|5 years ago|reply
Better answer: have backups. Fancy filesystems may save you if you use their features properly, but are also hell to do deep data recovery on when something does go wrong. And they are also buggier simply by being more complex.
[+] matja|5 years ago|reply
Yes, ZFS makes this very easy. It's no problem to snapshot an entire filesystem with billions of files every 5 minutes from cron.

Then the OP could have done:

  zfs-restore-file recording-16679.flv
With `zfs-restore-file` as the following script (for example only, I hacked it up in a few minutes):

  #!/bin/bash
  
  FILE="$1"
  FULL_PATH=$(realpath "$FILE")
  DATASET=$(findmnt --target="${FULL_PATH}" --output=SOURCE --noheadings)
  MOUNT_POINT=$(findmnt --source="${DATASET}" --output=TARGET --noheadings | head -n1)
  CURRENT_INODE="$(stat -c %i "${FULL_PATH}")"
  RELATIVE_PATH="$(echo "$FULL_PATH" | sed "s|^${MOUNT_POINT}/||")"
  
  # iterate all snapshots of the dataset containing the file, most recent first
  for SNAPSHOT in $( \
    zfs list -t snapshot -H -p -o creation,name "${DATASET}" \
    | sort -rn | awk '{print $2}' | cut -d@ -f2 \
  ) ; do
    echo "snapshot $DATASET @ $SNAPSHOT"
    SNAPSHOT_FILE="${MOUNT_POINT}/.zfs/snapshot/${SNAPSHOT}/${RELATIVE_PATH}"
    SNAPSHOT_FILE_INODE="$(stat -c %i "${SNAPSHOT_FILE}" 2>/dev/null)"  # empty if the file isn't in this snapshot
    if [ "${SNAPSHOT_FILE_INODE}" == "" ] || [ "${SNAPSHOT_FILE_INODE}" == "${CURRENT_INODE}" ] ; then
      continue
    fi
    echo "found the same named file with a different inode:"
    ls -l "${SNAPSHOT_FILE}"
    cp -i "${SNAPSHOT_FILE}" "${FILE}"
    break
  done
If the OP's mistake didn't change the inode (i.e. the file was overwritten in place with new content), then you could make another script that compares the size/hash of the file, or manually specify the time of a snapshot to restore.
[+] noisy_boy|5 years ago|reply
Question: would creating a symlink via ln -s also work instead of doing mv? That would be less risky and more performant compared to moving across filesystems.
[+] gvb|5 years ago|reply
Actually, creating a hard link "ln" (no -s) would be the best choice. With the hard link, there are two directory entries pointing to the same data on the disk. At that point, removing the original with a "rm" unlinks the original but the new directory entry for the file remains.

As a bonus, ln will not overwrite the destination file if you mistakenly try to "ln" it to an existing file.
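
A quick illustration (the file name is from the post; the directory layout and the second file name are made up):

    # both paths must be on the same filesystem for a hard link to work
    ln recordings/recording-16679.flv public/recording-16679.flv
    # removing the original later only drops one of the two directory entries;
    # the data stays reachable through public/
    rm recordings/recording-16679.flv
    # and ln refuses to clobber an existing destination:
    ln recordings/other.flv public/recording-16679.flv   # fails with "File exists"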

[+] tetha|5 years ago|reply
I'd say that depends on the retention requirements of the recording folder and the public folder.

If you symlink "./public/recording" to "./recording", the public file only exists as long as the original file exists. Some automated cleanup could result in unexpected file deletions from "public/" (or, more specifically, the creation of dangling symlinks). However, this might be a use case for a hard link if you need the file in both places and both directories are on the same filesystem. Though I haven't thought about the implications of a hard link in this case enough so far.

[+] angry_octet|5 years ago|reply
The httpd should not follow the symlink across filesystems (or arguably, follow symlinks at all) and hence the file would be inaccessible.
[+] gorgoiler|5 years ago|reply
Brilliant write up. Thank you for posting this. It makes me want to do a fire drill to see if I can use the same tools and techniques.

The remote-control magic sysrq trick is also fantastic.

[+] amelius|5 years ago|reply
All modern software has an undo option. Why doesn't the filesystem?
[+] ben509|5 years ago|reply
All modern software applications have an undo option. Windows Explorer has had a recycle bin since the '90s, the MacOS Finder has had a trash can since the '80s, and various Unix equivalents have the same. Those are interesting because they've had to put a lot of work into solving this seemingly simple problem.

I think file systems run into technical and psychological issues.

The major obvious technical issue is simply running out of space, and how you deal with that informs many other aspects of such a system. The other issue is how the user figures out what action they need to undo. High level applications have an integrated interface, so the user is directly issuing commands into the application's event loop, and the undo feature is also integrated into that event loop.

But you don't directly make calls to the filesystem; deletions or updates are always issued by a process acting on your behalf. Many processes are generating a ton of temporary data, so while these actions shouldn't be undoable, there's no general way for the filesystem to know this. A user attempting to undo an action would have to sift through a flood of irrelevant history.

The first psychological issue is the user's intent. Software applications tend to make this a two-step process: you make revocable changes, then when you hit "save" your actions become irrevocable. You move a file to the trash, and you can take it back out, but when you empty the trash it's irrevocable.

For a filesystem, again, because applications are acting on the user's behalf, this connection is largely lost. There's no clear "save point" past which changes should become irrevocable, so maybe you could add a "force" flag to make changes permanent. But if you start adding a "force" flag to actions, you'll change user behavior; if they have to force actions to make them permanent, they may start to do that routinely.

And there's a moral hazard produced by ensuring that actions are revocable; if users get used to having an "undo," they will naturally begin to rely on it. If the system has to automatically make changes irrevocable (running low on disk space), then you'll get situations where users are screwed because they assumed they'd have undo to fall back on.

And, of course, users can be screwed when they thought they had deleted (or changed) something and it wasn't. This is already something forensics experts can do, but a generic undo feature lowers that bar to the nosey middle manager.

[+] firethief|5 years ago|reply
It does. If you care about your data, you're running ZFS.
[+] vram22|5 years ago|reply
They may have been able to make this command they used:

    cat /dev/md2 \
      | pv -s 1888127576000 \
      | grep -P --byte-offset --text 'FLV\x01\x05' \
      | tee -a /mnt/storagebox/grep-log.txt

somewhat (or a lot) faster by adding a strings command (maybe with an appropriate length arg matching the grep pattern length) to the pipeline after the cat, and removing the pv call, because then only the printable ASCII characters would be passed to the grep, potentially heavily reducing the work it has to do, depending on the breakup between binary and text data (bytes) on disk.

https://man7.org/linux/man-pages/man1/strings.1.html

[+] xphx|5 years ago|reply
That would invalidate the byte offset. The intention here isn't so much to print the data as to find its location on disk.
[+] usr1106|5 years ago|reply
I have used automatic daily and weekly LVM snapshots. They slow down your write speed (especially the second one, IIRC), but in software development use I haven't found it to be an issue. If you write huge videos all day long, that might be different.
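
For reference, roughly how such a snapshot is created and used (VG/LV names, sizes and paths are made up; the file name is from the thread):

    # create a snapshot of the logical volume holding the data
    lvcreate --snapshot --size 10G --name data-daily /dev/vg0/data
    # mount it read-only to pull an accidentally overwritten file back out
    mount -o ro /dev/vg0/data-daily /mnt/snap
    cp /mnt/snap/recordings/recording-16679.flv /srv/recordings/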
[+] sh1mmer|5 years ago|reply
This is why writing scripts/tools for ops work is helpful.

You can do `alias mv='mv -n'` or similar, but then you have to hope everyone is using the same shell setup, aliases, etc.

Even if your tiny script is just:

    #!/usr/bin/env bash
    mv -n "recordings/$1" "public/$1"

you’ve removed some of the human tendency towards occasional typos and written something more defensive.

In my experience as an SRE, people set way too high a bar for what should be tooling or “automation”. As soon as you make something software, you can iterate on it, or not, as it makes sense.

If you keep on typing one off commands then the humans need to be correct every time.

[+] AnIdiotOnTheNet|5 years ago|reply
> I used the big hammer to remount everything read-only immediately:

> # echo u > /proc/sysrq-trigger

> Uhm, okay, this worked, but how do I install any data recovery tools now?

Yesterday there was a discussion around an article that talked about how desktop OSs were simpler (read: better) in the 90s. One of the things mentioned was that applications in many of them were single files (or folders) that could be located anywhere, requiring no special installation or uninstallation steps. This scenario highlights one of the many reasons that is useful.

[+] t0astbread|5 years ago|reply
If the recovery system allows you to write to the filesystem and has network access, what prevents one from using the package manager? If not, then you couldn't place a single executable anywhere on the system either (if I'm not mistaken?) so what difference does it make in this case?
[+] cmurf|5 years ago|reply
On Btrfs, there's no overwrite of metadata. The superblock keeps a record of the current plus three backup sets of tree roots. You want to stop this filesystem quickly though; the backup roots don't stick around very long.

Those root tree addresses can be plugged into 'btrfs restore' (an offline scrape tool) to search for and extract the files you want.
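
Roughly what that looks like, if I have the btrfs-progs invocations right (the device name is reused from the post as a placeholder, and the bytenr has to come from the superblock dump):

    # dump the superblock, including the backup root entries
    btrfs inspect-internal dump-super -f /dev/md2
    # scrape files out via one of the backup tree roots, without mounting
    btrfs restore -t <bytenr> /dev/md2 /mnt/recovery/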