README

http://www.ccl.net/cca/software/UNIX/recover-files-after-rm/README.shtml
From chemistry-request@ccl.net Tue Aug 23 19:43:19 1994
Date: Wed, 24 Aug 1994 08:50:42 +1000 (EST)
From: John Meehan 
Subject: CCL:UNIX - rm
To: Shu-Chuan Jao 
Cc: chemistry@ccl.net

On Tue, 23 Aug 1994, Shu-Chuan Jao wrote:

> Hi!    I am really in big trouble. I accidentally
> typed "rm *" and, as you can imagine, all my files in
> that directory are gone. We did not do backups. So, I
> cannot restore those files. I know that there are some
> packages that can do "undelete" on DOS to recover files
> which are deleted. Is there a package doing the same thing
> for unix?
>     I appreciate any responses.           08/23/94

This was posted to comp.archives a little while ago. It may help....

>X-Url: ftp://gatekeeper.dec.com/pub/sysadm/recover.tar.Z

Archive-Name: auto/comp.unix.admin/File-recovery-program

	Below is a "Dr. File System" article I wrote as commentary on
	possible solutions to the problem of recovering lost files on 
	UNIX systems, especially those using the Berkeley Fast File 
	System.

	The article observes that it should be possible to write a program 
	which examines the cylinder group free fragment bit maps.  Once
	you have the bit maps, it is fairly easy to read the free blocks
	of the file system and examine the old data in them.

	A few months ago I removed some program sources that hadn't been
	backed up yet and were long enough that I didn't want to type
	them in again.  So, I wrote the program loosely described by the
	article.  Having been careful not to disturb the free list of 
	the disk while writing this program, I got back the previous 
	sources.

	More recently, I dusted off the program, updated it to run
	on DEC OSF/1 and fixed the stupidly slow parts to go faster.
	A compressed archive of the program has been placed on:

		gatekeeper.dec.com:/pub/sysadm/recover.tar.Z

	Enjoy.  If you have any questions about it, send me mail
	at "alan@nabeth.cxo.dec.com".

					Alan Rollow
					alan@nabeth.cxo.dec.com

				* * *

Dear Dr. File System,

   I just typed "rm -r" by accident in the wrong place and all my
files are gone.  How do I get them back?  Oh, by the way.  I don't
have any backups...

				signed, Clueless

Dear Clueless,

   You stupid twit!  [ and off Dr. File System goes into his usual
rant about not keeping good backups... ]

				* * *

While Dr. File System is ranting, perhaps we can help Clueless with
his problem (and make a bundle of money off software consulting
services in the process).  First examine what rm(1) really does:

	Process the argument list, making note of which options
	are used.

	For all remaining arguments, if the argument is ".."
	continue.  Otherwise "remove" it.

	"Removing" consists of taking care of details like
	not being able to remove directories, except as part
	of a recursive remove, making sure the file is really
	there, etc.

	In the end, it comes down to doing an unlink(2) on the
	filename.
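
The steps above can be sketched in a few lines.  This is only an
illustration of the outline, not the source of any real rm(1); the
function names rm() and remove() and the handling of just the -r and
-f options are assumptions made for the example.

```python
import errno
import os


def remove(path, recursive):
    """Remove one name; directories only as part of a recursive remove."""
    if os.path.isdir(path) and not os.path.islink(path):
        if not recursive:
            raise IsADirectoryError(errno.EISDIR, os.strerror(errno.EISDIR), path)
        for entry in os.listdir(path):
            remove(os.path.join(path, entry), recursive)
        os.rmdir(path)
    else:
        # In the end it comes down to an unlink(2) on the filename.
        os.unlink(path)


def rm(argv):
    """Toy rm: returns a list of error messages instead of printing them."""
    recursive = force = False
    names = []
    # Process the argument list, making note of which options are used.
    for arg in argv:
        if arg.startswith("-") and len(arg) > 1 and not names:
            recursive = recursive or "r" in arg
            force = force or "f" in arg
        else:
            names.append(arg)
    errors = []
    for name in names:
        # Skip "..", as described above.
        if os.path.basename(name.rstrip("/")) == "..":
            continue
        try:
            remove(name, recursive)
        except OSError as e:
            # With -f, a file that is not there is silently ignored.
            if not (force and isinstance(e, FileNotFoundError)):
                errors.append(f"{name}: {e.strerror}")
    return errors
```

Calling rm(["-r", "somedir"]) on a scratch directory walks the tree and
unlinks each name; without -r the directory is reported as an error and
left alone.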

The unlink(2) system call removes the file's name and inode number
from the directory and decrements the link count.  It is worth
pointing out for those that didn't know it already that a file can
have many names, but only one inode.  The inode includes a reference
count of how many names it has.  These references are also called
hard links.
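
A small experiment makes the link count visible.  The sketch below
assumes a POSIX system and a file system that supports hard links; the
file names "first" and "second" are made up for the example.

```python
import os
import tempfile


def link_count_demo():
    """Return the link counts seen after create, link and unlink."""
    counts = []
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first")
        second = os.path.join(d, "second")
        with open(first, "w") as f:
            f.write("data\n")
        counts.append(os.stat(first).st_nlink)      # one name so far

        os.link(first, second)                      # second name, same inode
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        counts.append(os.stat(first).st_nlink)

        os.unlink(first)                            # drop the first name
        counts.append(os.stat(second).st_nlink)
        with open(second) as f:
            content = f.read()                      # data is still reachable
    return counts, same_inode, content
```

On such a system the counts go from 1 to 2 and back to 1, and the data
stays readable through the remaining name.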

When the link count goes to zero AND the file is no longer open,
then the file is removed.  Thus it is possible to create a file,
unlink it (removing the name) and let the processes that have it
open continue to use it.  When the last of them closes the file or
exits, the file goes away.
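
The create, unlink, keep-using sequence looks roughly like this.  It
is a sketch that assumes POSIX semantics (it will not behave this way
on systems that refuse to unlink an open file).

```python
import os
import tempfile


def unlink_while_open():
    """Unlink a file that is still open and keep using the descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here\n")
        os.unlink(path)                      # the name is removed...
        name_gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)              # ...but the open file still works
    finally:
        os.close(fd)                         # now the blocks are really freed
    return name_gone, data
```

Only the final close releases the blocks, which is the point at which
the data becomes a candidate for the recovery described later.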

Actually getting rid of the file consists of putting the blocks
it was using back on the "free list" and clearing the inode.  For
the fast file system this "free list" is actually a bit map for
each cylinder group.  A simple system macro is used to calculate
the cylinder group number from the block number.
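
The calculation itself is just integer division.  The sketch below is
modeled on the dtog and dtogd macros found in BSD's fs.h, which divide
a fragment address by the number of fragments per cylinder group; the
value 32768 used for fragments per group is a made-up example, not
taken from any particular file system.

```python
def dtog(frag_no, frags_per_group):
    """Cylinder group number that holds the given fragment address."""
    return frag_no // frags_per_group


def dtogd(frag_no, frags_per_group):
    """Offset of the fragment inside its cylinder group."""
    return frag_no % frags_per_group


# Hypothetical file system with 32768 fragments per cylinder group.
FRAGS_PER_GROUP = 32768
group = dtog(100000, FRAGS_PER_GROUP)
offset = dtogd(100000, FRAGS_PER_GROUP)
```

The offset within the group is what indexes that group's free bit map.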

It's also worth noting that neither rm(1) nor the file system
does anything with the contents of a block when the file is
removed or the block is put on the "free list".

Using this collection of information it may be possible to
help Clueless recover some of his data.  The first, absolute
most important thing to do is make sure the file system changes
as little as possible.  This may require rather extreme measures
(like shutting down or crashing the system).  The point of this
is to prevent the blocks that were part of Clueless' files from
getting allocated to other users.  Making a PHYSICAL backup of
the partition holding the file system is a good idea.  This
will let the recovery operate on a copy of the data, while
returning the disk to the service of the people that might be
using it.  Now to the recovery.

First, recall that the inodes were cleared when the files
were removed.  If they hadn't been, it would be a fairly
simple matter to examine all the free inodes to determine
which had been owned by Clueless and look at the block
lists to determine where the data was.  This avenue isn't
available to us, though.

What remains is the examination of the data on the disk in
the hopes of picking out the pieces that were part of Clueless'
files and putting them back together like many jigsaw puzzles.
The methods for this examination can vary from a brute-force
approach to a highly optimized one.  Consider the choices.

1.  Theme - Examine every logical block number (LBN) of the disk.
    Pick out the LBNs that look interesting and ignore the rest.
    This has the disadvantage of not throwing out blocks allocated
    to other people and those that are part of the file system's
    data structures.

2.  Variation #1 - Observe that the Fast File System is organized
    in pieces of data having two sizes; the fragment and the file
    block size.  Depending on the allocation scheme only the block
    size may be interesting.  The only advantage this has over the
    Theme is that there are fewer blocks to examine.

3.  Variation #2 - The skilled mechanic can easily identify which
    file system blocks belong to the file system overhead and which
    blocks are data.  This is bound to help some.

4.  Variation #3 - Observe that only free blocks are likely to
    contain Clueless' data.  Each cylinder group has a bit map
    of the free space for that cylinder group.  If the skilled
    mechanic is able to filter out the file system overhead,
    then limiting the search to the free list shouldn't be that
    much harder.

5.  Variation #4 - The value of this optimization depends on exactly
    what Clueless removed and how the files were arranged.  If the
    files were in a directory and only the files in that directory
    were removed (no subdirectories), then we can make a good guess
    where to start the search.

    Recall from previous discussions that when a file is created it
    prefers to end up in the same cylinder group as the directory
    in which it resides.  If you can determine what cylinder group
    the directory of interest is in, then you can start the search
    in that cylinder group.  While you may still have to search all
    the cylinder groups to find all the data, there is a chance that
    you can recover an interesting amount of the data quickly.
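
Variations #3 and #4 can be sketched as follows.  This is a simplified
model, not the on-disk layout of any real file system: it assumes the
free map has already been extracted as a byte string, that a set bit
marks a free fragment, that bits are numbered from the low-order end
of each byte, and that the image is any seekable file object.  All
function and parameter names are made up for the example.

```python
def free_fragments(bitmap, nfrags):
    """Yield the numbers of fragments whose bit is set (assumed: set = free)."""
    for i in range(nfrags):
        if bitmap[i // 8] & (1 << (i % 8)):
            yield i


def read_free_fragments(image, bitmap, nfrags, frag_size, data_offset=0):
    """Yield (fragment number, old contents) for every free fragment."""
    for i in free_fragments(bitmap, nfrags):
        image.seek(data_offset + i * frag_size)
        yield i, image.read(frag_size)


def search_order(start_cg, ncg):
    """Visit the directory's cylinder group first, then all the others."""
    return [(start_cg + i) % ncg for i in range(ncg)]
```

With a physical copy of the partition opened in binary mode as the
image, a caller would loop over search_order(), pull each group's
bit map, and hand every fragment from read_free_fragments() to
whatever examines the data.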

A program to do such a search is not exceedingly difficult with
enough study of the file system include files and a good
starting example.  A good version of the program would perform
some data analysis of each block in the hopes of identifying
it.  Sort of like file(1).  A further enhancement would allow
formatting the data in various ways, like od(1).  All of the
interesting recovery work is the examination of the data, not
getting the data to examine.
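
As a rough idea of what that examination might look like, here is a
minimal sketch of a file(1)-like guess and an od(1)-like dump for one
block.  The 95% threshold, the three category names and the output
layout are arbitrary choices for the example, not how file(1) or od(1)
actually work.

```python
import string

# Byte values treated as plain text (printable ASCII plus whitespace).
_TEXT_BYTES = set(string.printable.encode("ascii"))


def classify_block(block):
    """Very rough guess at what a recovered block holds."""
    if not block.strip(b"\0"):
        return "empty"                      # nothing but NUL bytes
    printable = sum(1 for b in block if b in _TEXT_BYTES)
    return "text" if printable / len(block) >= 0.95 else "binary"


def dump(block, width=16):
    """Return hex-plus-ASCII lines for a block, one per `width` bytes."""
    pad = width * 3 - 1
    lines = []
    for off in range(0, len(block), width):
        chunk = block[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{pad}}  {text}")
    return lines
```

Blocks classified as "text" are the ones worth showing to the owner
first, since program sources and similar files are the easiest to
recognize by eye.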

--
Alan Rollow				alan@nabeth.cxo.dec.com





----------------------------------------------
John Meehan                   O     CH2-COOH      
Department of Biochemistry    "    /
University of Tasmania,    HO-P-O-C-COOH
Australia                     |    \
                             HO     CH2-COOH
PHOSPHOCITRIC ACID ----    A powerful, natural 
   inhibitor of Pathological Biomineralization
----------------------------------------------


Modified: Tue Aug 23 16:00:00 1994 GMT