
Recovering deleted files using grep

168 points| atomicobject | 15 years ago |spin.atomicobject.com | reply

47 comments

[+] sp332|15 years ago|reply
Make sure your output file is on a different filesystem! Otherwise, it might be saved in the newly-freed blocks of the file you're trying to recover.
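The article's technique can be sketched against a scratch file standing in for the raw device (the file names and context sizes below are illustrative assumptions, not from the article; on a real system you would grep the partition's device file, e.g. /dev/sda1, and redirect output to a different filesystem as described above):

```shell
# Demo of grep-based recovery against a scratch file standing in for a
# raw block device. -a forces grep to treat binary data as text; -B/-A
# pull in context lines before/after the match so you capture the whole
# file, not just the line containing the search string.
printf 'junk\nfirst line of my notes\nunique needle phrase\nlast line\njunk\n' > /tmp/fake_disk.img

# On a real recovery, redirect to a DIFFERENT filesystem than the one
# being searched, so the output can't land in the freed blocks.
grep -a -B 1 -A 1 'unique needle phrase' /tmp/fake_disk.img > /tmp/recovered.txt
cat /tmp/recovered.txt
```

In a real recovery you would use much larger -B/-A windows and then trim the surrounding junk by hand.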
[+] johnswamps|15 years ago|reply
Better yet, mount the hard drive read-only (on a different computer if necessary).
[+] Wilfred|15 years ago|reply
The author's intent is to write enough of the surrounding context that you recover it first time.

Still, it raises two questions:

What about fragmentation?

Why don't we have a GNU safe-rm yet that moves files to the (freedesktop.org-specified) trash location to avoid this?
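The trash behaviour in question can be sketched in a few lines of shell. This is a hypothetical toy, not how trash-cli or GNOME actually implement it; real tools also write .trashinfo metadata recording the original path and deletion date so files can be restored:

```shell
# Toy safe-rm: move files into the freedesktop.org trash directory
# instead of unlinking them. (Caveats: no .trashinfo metadata, and
# name collisions silently overwrite earlier trashed files.)
trash() {
    local dir="${XDG_DATA_HOME:-$HOME/.local/share}/Trash/files"
    mkdir -p "$dir"
    mv -- "$@" "$dir/"
}

# Demo: the file ends up in the trash directory, not gone for good.
touch /tmp/trash_demo.txt
trash /tmp/trash_demo.txt
ls "${XDG_DATA_HOME:-$HOME/.local/share}/Trash/files/"
```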

[+] cryptoz|15 years ago|reply
> To help prevent this problem from happening in the first place, many people elect to alias the rm command to a script which will move files to a temporary location, like a trash bin, instead of actually deleting them.

Whatever happened to backups?

To help prevent this problem...

KEEP A BACKUP.

[+] derefr|15 years ago|reply
The kinds of files I most often regret rm-ing are the temporary files I have created myself as a step in a process, then deleted after I had moved on to the next step, not realizing an error had crept into the processing and that I would have to run it again on the source files (which are now, conveniently, gone). Backups don't solve this problem, because the files themselves are never more than an hour old. A "trash" folder, however, fixes this perfectly: the semantics are that the file no longer has any place it "belongs," and may be purged if you successfully complete the project, but may be needed again if the project must be "rewound" to that step.

However, you're right that making rm(1) express move semantics isn't the right solution. Maybe if the filesystem had a "BEGIN TRANSACTION" command that you could ROLLBACK...

[+] thaumaturgy|15 years ago|reply
My backups don't run on a minute-to-minute basis (dunno about yours), so it's totally plausible that I can spend all day working on a particular file, mistakenly nuke it somehow, and find it isn't retrievable under most backup schemes.
[+] csummers|15 years ago|reply
Been there, done that. I rm -rf'd a bunch of important files once, and at the time grep was giving me "memory exhausted" errors. I was able to use strings to grab all of the text of the disk, and then wade through the results with vim.

I guess this is a pretty common problem. The blog post I wrote about it in 2005 continues to be the most searched-for entry point on my site: http://csummers.com/2005/12/20/undelete-text-files-on-linux-...
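The strings fallback can be sketched the same way, against a scratch file standing in for the device (file names are illustrative): strings throws away the binary noise first, so grep only has to search the printable runs.

```shell
# Pull printable runs out of raw (binary) data with strings, then
# search only those with grep -- much lighter on memory than grepping
# the whole device. /tmp/fake_disk.img stands in for a real device file.
printf 'binary\0garbage\0my lost TODO list\0more junk\0' > /tmp/fake_disk.img
strings /tmp/fake_disk.img | grep -i 'todo' > /tmp/candidates.txt
cat /tmp/candidates.txt
```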

[+] tbrownaw|15 years ago|reply

    cat /dev/mem | strings | grep -i llama
[+] ramidarigaz|15 years ago|reply
Hmm... I'm getting an error on that one.

    cat: /dev/mem: Operation not permitted
Edit: even as root
[+] auxbuss|15 years ago|reply
Where the author says conservative, he means liberal.

(From afar, I understand my Colonial cousins' struggle with these two words.)

[+] omrisiri|15 years ago|reply
I've been using this method since I first learned about raw disk access (device files) and grep.

I think it should be mentioned that this will work properly only if the file was not fragmented, which will usually be the case on ext3 unless you are using almost all of the space on the drive, but may happen frequently if you are using a FAT filesystem (common on USB disks).

Also, if you just deleted a binary file this method will be problematic as well. In that case you can use a tool like PhotoRec to scan the disk, and even limit it to the free space on the drive, which reduces the time it takes to go over the disk. It can detect all kinds of binary file types (it uses the file's magic number to detect the type).

Like other people mentioned here before, you should recover all the data to a different partition/disk than the one you are trying to recover a file from.

With that said, recovering data is a tedious and error-prone process, so if the data is worth enough (and for some silly reason you don't have a backup) you should:

A. Turn off the computer immediately after you've discovered the loss of data (to reduce the chances of overwriting anything important).

B. Give the computer/disk to a professional to recover (because you obviously aren't one, since you don't keep backups).

[+] moell|15 years ago|reply
Fortunately, on Linux point A can be substituted with:

    mount -o remount,ro /
[+] naturalized|15 years ago|reply
Or, if you want to really delete a file, use the shred command:

    # man shred
    SHRED(1)                     User Commands                    SHRED(1)

    NAME
           shred - overwrite a file to hide its contents, and optionally
           delete it

I especially like the -n option!
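A minimal invocation, with flags per GNU coreutils (-n sets the number of overwrite passes, -z adds a final zeroing pass, -u removes the file afterwards; the file name is illustrative):

```shell
# Overwrite the file 5 times with random data, add a final pass of
# zeros to hide that it was shredded, then truncate and unlink it.
echo 'secret data' > /tmp/shred_demo.txt
shred -n 5 -z -u /tmp/shred_demo.txt
```

Per the caveat in the reply below, this only reliably destroys data on filesystems that overwrite blocks in place.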

[+] moobot|15 years ago|reply
Except that shred is not guaranteed to work on many (most?) modern filesystems. From `man shred`:

       CAUTION: Note that shred relies on a very important assumption: that
       the file system overwrites data in place. This is the traditional way
       to do things, but many modern file system designs do not satisfy this
       assumption. The following are examples of file systems on which shred
       is not effective, or is not guaranteed to be effective in all file
       system modes:

       * log-structured or journaled file systems, such as those supplied
         with AIX and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)

       * file systems that write redundant data and carry on even if some
         writes fail, such as RAID-based file systems

       * file systems that make snapshots, such as Network Appliance's NFS
         server

       * file systems that cache in temporary locations, such as NFS
         version 3 clients
[+] albertzeyer|15 years ago|reply
You can also do this for ReiserFS partitions via `reiserfsck --rebuild-tree`. It has worked very reliably for me. The only problem is that it doesn't always recover the filename and/or the directory structure (depending on how long ago you deleted the file).
[+] kentnl|15 years ago|reply
Just don't do this if you've at some stage backed up another reiserfs filesystem inside your reiserfs filesystem with 'dd'.

The rebuild-tree trick mistakenly sees entries in the dd'd copy as files in the parent filesystem, and then sprays them all over your drive.

[+] datums|15 years ago|reply
One of my most memorable clusterfucks was recovering a database using strings on the disk. The customer ran REPAIR TABLE and ended up with a very small table :) . It was tedious, but it felt awesome actually getting a large part of the data back.
[+] kajecounterhack|15 years ago|reply
I used this method once... the file created gets pretty huge, but you can even manually sift through it for lost code if you know roughly where it ended up!
[+] retroafroman|15 years ago|reply
Excellent Linux hack. I hadn't ever heard this before.
[+] albertzeyer|15 years ago|reply
It works on all systems where you have raw access to the disk. And it isn't really that fancy if you think about how it works and how file systems work.
[+] koevet|15 years ago|reply
Actually, I think the really great hack here is aliasing the rm command to a trash-bin script (as suggested at the end of the article).
[+] telemachos|15 years ago|reply
The danger of aliasing the command itself (the bare 'rm') is that you come to count on the safety of the alias. Then you work one day on a friend's or coworker's machine and...BOOM.

What I do instead is make a nearby (and simple) alias. For example:

    alias rmi='rm -i'
[+] abhiomkar|15 years ago|reply
This can be achieved using trash-cli
[+] freerobby|15 years ago|reply
Clever stuff, thanks for sharing.
[+] mkramlich|15 years ago|reply
frequent automatic backups and version control are your friend