Data Recovery
Introduction
Don't assume that your data is lost just because a hard drive makes horrendous noises, fails to boot, or is reported as failed by diagnostic tools. With a combination of a live CD and some specialist tools, all may not be lost.
Caveats
Do consider using a professional data recovery expert for recovery of important data. This article is for Debian / Ubuntu; you need to make appropriate changes for other distributions and package managers.
Step 1: Boot into a live CD
There are many live CDs to choose from, some of them dedicated to this purpose, but to keep the process as simple as possible I used Kubuntu, as I know it well. The following instructions assume that you are using an Ubuntu live CD.
- Boot into the live CD
- Configure networking
- Uncomment repositories in /etc/apt/sources.list - see AptRepositories for guidance
- Run aptitude update (don't bother upgrading though!)
Step 2: Mount a destination drive
Make sure that your destination drive is larger than the source drive: a 500 GB drive may not be large enough to hold a 500 GB image of that drive.
Please also note that USB is notoriously slow; in a recent example it took 13 days to copy 500 GB. Try to use a SATA or eSATA connection, or perhaps USB3 if you have such a thing.
You need access to a hard drive that is able to support large files. If you use an external USB drive, make sure that it is not formatted as FAT (usually the default). You can also use a mounted NFS share. Mount it ready for action; I will assume that the destination drive has been mounted at /mnt/destination. Ensure that the source (broken) drive is not mounted (it shouldn't be unless you mounted it).
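The checks described above can be scripted before committing to a multi-day copy. The following is a minimal sketch, not a definitive tool: the function names are made up for illustration, it assumes GNU df (for --output) and blockdev are available on the live CD, and /dev/sda and /mnt/destination are only the placeholders used in this article.

```shell
#!/bin/sh
# Sketch only: sanity-check the destination before imaging.

# has_room SRC_BYTES FREE_BYTES
# Succeeds when the free space is strictly larger than the source size.
has_room() {
    [ "$2" -gt "$1" ]
}

# is_fat FSTYPE
# Succeeds for FAT variants, which cannot store a single file of 4 GiB or more.
is_fat() {
    case "$1" in
        vfat|msdos|fat*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_destination SOURCE_DEVICE DEST_MOUNTPOINT
# Needs root for blockdev; relies on GNU df for --output.
check_destination() {
    src_bytes=$(blockdev --getsize64 "$1") || return 1
    free_bytes=$(df --output=avail -B1 "$2" | tail -n 1 | tr -d ' ') || return 1
    fstype=$(df --output=fstype "$2" | tail -n 1 | tr -d ' ')
    if is_fat "$fstype"; then
        echo "destination is $fstype: it cannot hold a large image file" >&2
        return 1
    fi
    if ! has_room "$src_bytes" "$free_bytes"; then
        echo "destination has $free_bytes bytes free, source is $src_bytes" >&2
        return 1
    fi
    echo "destination looks usable"
}

# Example (as root): check_destination /dev/sda /mnt/destination
```

The comparison is deliberately strict, since the image will need at least as much room as the whole source drive, plus space for the log file.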
Step 3: Determine source drive id
You need to find out the id of the source drive. This will be listed under /dev and if it's your primary drive will probably be /dev/sda or /dev/hda.
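One way to see which device is which is to list the whole disks with their sizes and models. This is a small sketch that assumes the live CD is recent enough to ship lsblk (part of util-linux); the wrapper name list_drives is only for illustration.

```shell
# list_drives: show whole disks only (no partitions) with size, type and model.
list_drives() {
    lsblk -d -n -o NAME,SIZE,TYPE,MODEL
}

# Example: list_drives
# The NAME column gives the part after /dev/, e.g. "sda" for /dev/sda.
```

Matching the reported size and model against the label on the broken drive is usually enough to tell the source from the destination.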
Step 4: Install GNU ddrescue
Note for historical* reasons, the package is named gddrescue in Debian and Ubuntu.
$ sudo aptitude install gddrescue
$ man ddrescue
* Debian (and Ubuntu) package names are screwed up: ddrescue has a package name gddrescue, whilst ddrescue does exist in the Debian/Ubuntu repos, but is actually a package for dd_rescue, which is an older and less effective program to do the same thing. Plenty of potential for disaster there.
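Given the naming confusion, it may be worth confirming which program actually got installed. The sketch below assumes that the first line of `ddrescue --version` begins with "GNU ddrescue", which is how the GNU tool identifies itself as far as I know; the function name is made up for illustration.

```shell
# is_gnu_ddrescue VERSION_LINE
# Succeeds if the given version line names GNU ddrescue.
is_gnu_ddrescue() {
    case "$1" in
        "GNU ddrescue "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if is_gnu_ddrescue "$(ddrescue --version | head -n 1)"; then
#       echo "GNU ddrescue is installed"
#   else
#       echo "this does not look like GNU ddrescue" >&2
#   fi
```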
Step 5: Run GNU ddrescue
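N.B. ddrescue is very slow. I believe the speed should be dramatically better when using "-b 4k" or "--block-size=4k" on all the ddrescue commands below, perhaps cutting the time taken to a tenth, but I have not yet tested this theory, so on your head be it. I intend to use this next time, and will update this page thereafter. Check man ddrescue for your version first: in recent releases the long form of -b is --sector-size and it sets the sector size of the input device, and ddrescue reads the suffix "k" as 1000 ("Ki" is 1024), so a 4 KiB value would be written as 4096 or 4Ki.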
Replace "/dev/sda" with the actual source drive, and "/mnt/destination" with the actual destination mount point.
$ sudo ddrescue -n /dev/sda /mnt/destination/recovered.img /mnt/destination/recovered.log
The "-n" should run faster as it will skip over the errors (although it seemed no better to me). Data recovery is not a fast process, and it will probably take a few days (see N.B. at the beginning of this section). The great thing about ddrescue is that you can abort at any time and recommence from where you left off, as long as you pass the same log file. You can also skip forward by adding the switch "-i" followed by the number of bytes into the disk, e.g. to start from 10 GB:
$ sudo ddrescue -n -i 10000000000 /dev/sda /mnt/destination/recovered.img /mnt/destination/recovered.log
My tip is to keep aborting (Ctrl+C) and skipping forward until you pass the area of the disk which is causing problems. Then, once the bulk of the drive has been recovered, you can go back to the sections you skipped, or just move on to the second pass (see next section). ddrescue will not replace data already recovered, so you can do this safely.
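Two small helpers can take some guesswork out of the skip-ahead approach: one converts gigabytes into the byte position that "-i" expects, the other adds up how much the log file says has been rescued so far. This is a sketch under assumptions: the function names are invented here, and the parser assumes the usual GNU ddrescue log layout ('#' comment lines, one current-position line, then rows of hexadecimal position and size followed by a status character, where '+' means rescued).

```shell
# gb_to_bytes N: convert decimal gigabytes to a byte position for -i.
gb_to_bytes() {
    echo $(( $1 * 1000000000 ))
}

# rescued_bytes LOGFILE: sum the sizes of all blocks marked '+' (rescued).
rescued_bytes() {
    total=0
    seen_status_line=0
    while read -r pos size status rest; do
        # Skip blank lines and '#' comments.
        case "$pos" in ''|'#'*) continue ;; esac
        # The first data line holds the current position, not a block.
        if [ "$seen_status_line" -eq 0 ]; then
            seen_status_line=1
            continue
        fi
        if [ "$status" = "+" ]; then
            total=$(( total + $size ))   # size is hexadecimal, e.g. 0x00001000
        fi
    done < "$1"
    echo "$total"
}

# Examples:
#   sudo ddrescue -n -i "$(gb_to_bytes 10)" /dev/sda \
#       /mnt/destination/recovered.img /mnt/destination/recovered.log
#   rescued_bytes /mnt/destination/recovered.log
```

Comparing the rescued total with the size of the source drive gives a rough idea of when the bulk of the drive has been recovered.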
Step 6: Run GNU ddrescue again
You should by now have a full image, albeit with some blank (or zeroed) areas. You may decide that you've spent long enough, and skip to the next section. Now you should run again, this time replacing the -n with "-r 1" or perhaps "-r 3" to try more than once to recover the data.
$ sudo ddrescue -r 1 /dev/sda /mnt/destination/recovered.img /mnt/destination/recovered.log
Step 7: Copy the destination image
You don't want to mess up your hard-earned image, so copy it and work on the copy. The following steps assume that the copy is named copy.img.
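A minimal sketch of the copy step, with a byte-for-byte comparison afterwards so that you know the working copy is faithful. The function name copy_image is made up for illustration, and the paths in the example are the placeholders used elsewhere in this article.

```shell
# copy_image ORIGINAL COPY
# Copies the image, then compares the two files byte for byte.
# Returns non-zero if either the copy or the comparison fails.
copy_image() {
    cp -- "$1" "$2" && cmp -- "$1" "$2"
}

# Example:
#   copy_image /mnt/destination/recovered.img /mnt/destination/copy.img
```

Bear in mind that the destination needs room for both files, and that reading the image twice (once to copy, once to compare) takes a while on a large image.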
Step 8: Install sleuthkit
You need mmls to determine the partition structure of your disk image. This is part of sleuthkit. On the destination PC, install sleuthkit:
$ sudo aptitude install sleuthkit
$ man mmls
Step 9: Run mmls
Simply:
$ sudo mmls copy.img
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

     Slot    Start        End          Length       Description
00:  -----   0000000000   0000000000   0000000001   Primary Table (#0)
01:  -----   0000000001   0000000062   0000000062   Unallocated
02:  00:00   0000000063   0117195119   0117195057   NTFS (0x07)
03:  -----   0117195120   0117210239   0000015120   Unallocated
Take a note of the Start point of the partition that you wish to access.
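When the table is long, the allocated partitions can be picked out with a short filter. This sketch assumes the column layout shown in the mmls output above (index, slot, start, end, length, description) and treats rows whose slot is "-----" or "Meta" as non-partitions; the function name is invented for illustration.

```shell
# list_partitions: read mmls output on stdin and print "START DESCRIPTION"
# for each allocated partition, with leading zeros removed from START.
list_partitions() {
    while read -r idx slot start end length desc; do
        # Only table rows begin with an index such as "02:".
        case "$idx" in *:) ;; *) continue ;; esac
        # Skip table metadata and unallocated space.
        case "$slot" in -----|Meta) continue ;; esac
        # Strip leading zeros so the number is not mistaken for octal later.
        while [ "${#start}" -gt 1 ] && [ "${start#0}" != "$start" ]; do
            start=${start#0}
        done
        printf '%s %s\n' "$start" "$desc"
    done
}

# Example: sudo mmls copy.img | list_partitions
```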
Step 10: Calculate Offset
The mmls output shows the partition layout of the image. In this example, we want to mount the NTFS partition starting at sector 63. The units are 512-byte sectors, so to calculate the offset in bytes, multiply the start sector by 512:
63 x 512 = 32256
Step 11: Attempt to mount partition
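The byte offset that mount needs is just the start sector multiplied by the sector size, which is easy to script. A minimal sketch follows; the function name is made up, and the default of 512 matches the "Units are in 512-byte sectors" line in the mmls output. Leading zeros are stripped first because the shell would otherwise read a zero-padded number such as 0000000063 as octal.

```shell
# sector_offset START_SECTOR [SECTOR_SIZE]
# Prints the byte offset of a partition; SECTOR_SIZE defaults to 512.
sector_offset() {
    n=$1
    size=${2:-512}
    # Remove leading zeros, but keep a lone "0".
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    echo $(( n * size ))
}

# Example: sector_offset 0000000063    # prints 32256
```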
For a DOS partition (this example assumes one starting at sector 32, so 32 x 512 = 16384):
$ sudo mount -o loop,offset=16384 copy.img mountpoint
For an NTFS partition:
$ sudo aptitude install ntfs-3g
$ sudo mount -t ntfs-3g -o ro,force,loop,offset=32256 copy.img mountpoint
For some reason, the image won't mount with -t ntfs, and does need the full ntfs-3g functionality, even though we are only mounting read-only. I don't profess to understand the reasons for this, but ntfs-3g just works.
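If the mount fails, one quick sanity check on the offset is to look for the NTFS signature: an NTFS boot sector carries the text "NTFS" starting three bytes into the partition. The sketch below reads those four bytes from the image with dd; the function name is made up, and a missing signature only suggests a wrong offset or a damaged boot sector, it does not prove either.

```shell
# has_ntfs_signature IMAGE OFFSET
# Succeeds if the bytes at OFFSET+3 .. OFFSET+6 of IMAGE spell "NTFS".
has_ntfs_signature() {
    sig=$(dd if="$1" bs=1 skip=$(( $2 + 3 )) count=4 2>/dev/null)
    [ "$sig" = "NTFS" ]
}

# Example:
#   if has_ntfs_signature copy.img 32256; then
#       echo "offset points at an NTFS boot sector"
#   fi
```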
Step 12: Extracting files from an unmounted disk image
If the image will not mount, then the general advice seems to be to copy the image to clean hardware (i.e. a physical disk) and use a Windows recovery disk to boot. Failing that, all is not lost: there are a number of tools that will search disk images for files. I played with photorec, but whilst it recovered loads of cached images from IE, it failed to recover more than a handful of proper photos. Foremost, on the other hand, seemed to be much more successful.
Update: A number of people have reported successes with photorec, so I suspect that it was simply a buggy version of photorec.
$ foremost -i copy.img -o output-folder
With luck this will give you a folder that looks like this:
drwxr-xr-x 30 root root   4096 2009-01-08 18:04 .
drwxrwxrwx  5 root root   4096 2009-01-08 18:03 ..
-rw-r--r--  1 root root 888832 2009-01-08 18:15 audit.txt
drwxr-xr--  2 root root  12288 2009-01-08 18:15 avi
drwxr-xr--  2 root root  12288 2009-01-08 18:15 bmp
drwxr-xr--  2 root root  69632 2009-01-08 18:15 dll
drwxr-xr--  2 root root   4096 2009-01-08 18:10 doc
drwxr-xr--  2 root root  20480 2009-01-08 18:15 exe
drwxr-xr--  2 root root 139264 2009-01-08 18:15 gif
drwxr-xr--  2 root root  20480 2009-01-08 18:15 htm
drwxr-xr--  2 root root   4096 2009-01-08 18:13 jar
drwxr-xr--  2 root root 135168 2009-01-08 18:15 jpg
drwxr-xr--  2 root root   4096 2009-01-08 18:04 mbd
drwxr-xr--  2 root root   4096 2009-01-08 18:15 mov
drwxr-xr--  2 root root   4096 2009-01-08 18:04 mpg
drwxr-xr--  2 root root   4096 2009-01-08 18:14 ole
drwxr-xr--  2 root root   4096 2009-01-08 18:14 pdf
drwxr-xr--  2 root root  57344 2009-01-08 18:15 png
drwxr-xr--  2 root root   4096 2009-01-08 18:04 ppt
drwxr-xr--  2 root root   4096 2009-01-08 18:14 rar
drwxr-xr--  2 root root   4096 2009-01-08 18:04 rif
drwxr-xr--  2 root root   4096 2009-01-08 18:04 sdw
drwxr-xr--  2 root root   4096 2009-01-08 18:04 sx
drwxr-xr--  2 root root   4096 2009-01-08 18:04 sxc
drwxr-xr--  2 root root   4096 2009-01-08 18:04 sxi
drwxr-xr--  2 root root   4096 2009-01-08 18:04 sxw
drwxr-xr--  2 root root   4096 2009-01-08 18:04 vis
drwxr-xr--  2 root root  12288 2009-01-08 18:15 wav
drwxr-xr--  2 root root   4096 2009-01-08 18:15 wmv
drwxr-xr--  2 root root   4096 2009-01-08 18:13 xls
drwxr-xr--  2 root root   4096 2009-01-08 18:14 zip
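With that many folders, a per-type count gives a quick idea of what was actually recovered. This is a small sketch that assumes the layout in the listing above, i.e. one subdirectory per file type inside the output folder; the function name is made up for illustration.

```shell
# count_recovered OUTPUT_DIR
# Prints "TYPE COUNT" for each subdirectory of OUTPUT_DIR, counting regular files.
count_recovered() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue          # no subdirectories at all
        name=$(basename "$dir")
        n=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$name" "$n"
    done
}

# Example: sudo sh -c '. ./count.sh; count_recovered output-folder'
# (the listing above shows the folders owned by root and not world-readable)
```

Files directly inside the output folder, such as audit.txt, are not counted.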