First 32GB of a 4TB ext4 filesystem overwritten. How to recover?
Because I accidentally specified the wrong block device, the dd command overwrote the first 32GB of a 4TB ext4 filesystem on a SATA disk with the contents of a USB flash drive.
fdisk -l /dev/sda reports the following:
Disk /dev/sda: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0xeaad24fe
Device     Boot   Start      End  Sectors  Size Id Type
/dev/sda1          2048  8388607  8386560    4G  6 FAT16
/dev/sda2       8388608 73924607 65536000 31.3G 83 Linux
parted shows the following:
GNU Parted 3.5
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: ATA ST4000NM0165 (scsi)
Disk /dev/sda: 4001GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type     File system  Flags
 1      1049kB  4295MB  4294MB  primary  ext4
 2      4295MB  37.8GB  33.6GB  primary
Side Note: I’m not sure why parted thinks the file system is ext4 while fdisk shows it as FAT16, but it could possibly be related to the fsck that I tried to run before understanding what had happened. I attempted to run "e2fsck /dev/sda1", and answered yes to the following question:
Superblock has an invalid journal (inode 8).
Clear<y>? yes
It then came back with the statement that the partition size didn’t match the physical size, and it was at that point I stopped without proceeding further. (I apologize I don’t have the full text of my aborted attempt with fsck. I retyped the above from memory, and I only answered yes once.)
This is what used to be on the disk:
This disk was originally auto-partitioned and formatted by the Ubuntu 18.04 installer. It held a single partition, sda1, which spanned the entire drive and was formatted as ext4. A separate NVMe drive holds the system partition, and this disk was configured as a secondary data disk. The filesystem parameters will be whatever the Ubuntu 18.04 installer selects by default in this case.
I understand any data in the first 32GB of this disk is irretrievably lost. But the data on this disk is critically important. Is there any way to recover what was on the remaining 99% of the drive?
Can someone recommend steps that would allow me to recreate the original filesystem?
Edit: gdisk -l /dev/sda shows the following:
GPT fdisk (gdisk) version 1.0.5
Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!
Warning: Invalid CRC on main header data; loaded backup partition table.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.
Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!
Warning! One or more CRCs don't match. You should repair the disk!
Main header: ERROR
Backup header: OK
Main partition table: ERROR
Backup partition table: OK
Partition table scan:
MBR: MBR only
BSD: not present
APM: not present
GPT: damaged
Found valid MBR and corrupt GPT. Which do you want to use? (Using the
GPT MAY permit recovery of GPT data.)
1 - MBR
2 - GPT
3 - Create blank GPT
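Before trying anything else, you should make a full "dd" copy of the whole disk to another device, just for safekeeping in case something goes wrong. Copy the whole device rather than a partition, because the current partition table no longer describes the original filesystem.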
In general, e2fsck should be able to recover from such an issue, subject to loss of the overwritten metadata. The superblock, root directory, journal, and other metadata would be lost. However, the superblock and other critical metadata have multiple backups later in the partition, so the majority of the data should be intact.
You might need to specify a backup superblock location, like e2fsck -fy -B 4096 -b 11239424 /dev/sda2 (substitute the partition that actually spans the original filesystem). Backup superblocks are stored in groups 0 and 1 and in groups whose numbers are powers of 3, 5, and 7. With a 128 MiB group size (32768 blocks of 4 KiB), clobbering up to 32 GiB destroys the first 256 groups, so the next group that holds a backup is 7x7x7 = 343, and the backup superblock is in block 343x32768 = 11239424.
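The arithmetic above can be checked with a short script. This is only a sketch: it assumes 4 KiB blocks, 32768 blocks per group, and the sparse_super backup layout (groups 0 and 1 plus powers of 3, 5, and 7), which are the usual mke2fs defaults but are not confirmed for this particular disk. The function names are made up for illustration.

```python
BLOCK_SIZE = 4096          # assumed ext4 block size in bytes
BLOCKS_PER_GROUP = 32768   # assumed; gives 128 MiB block groups


def backup_superblock_groups(total_groups):
    """Return the group numbers below total_groups that carry a superblock copy."""
    groups = {0, 1}
    for base in (3, 5, 7):
        g = base
        while g < total_groups:
            groups.add(g)
            g *= base
    return sorted(g for g in groups if g < total_groups)


def first_intact_backup(overwritten_bytes, total_groups):
    """Return (group, block) of the first backup superblock past the damage.

    overwritten_bytes is measured from the start of the filesystem.
    Returns None if no backup group lies beyond the overwritten region.
    """
    group_bytes = BLOCK_SIZE * BLOCKS_PER_GROUP
    damaged_groups = -(-overwritten_bytes // group_bytes)  # ceiling division
    for g in backup_superblock_groups(total_groups):
        if g >= damaged_groups:
            return g, g * BLOCKS_PER_GROUP
    return None


if __name__ == "__main__":
    disk_bytes = 4000787030016  # size reported by fdisk in the question
    total = disk_bytes // (BLOCK_SIZE * BLOCKS_PER_GROUP)
    print(first_intact_backup(32 * 2**30, total))
```

With 32 GiB overwritten this yields group 343 and block 11239424, matching the figures in the answer. If the overwrite actually reached the end of the second partition in the fdisk listing (sector 73924607, about 35.25 GiB), 282 groups are damaged and the result is still group 343.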
It will put whatever it recovers into the lost+found directory, so you will have to identify files and directories by their content, age, etc.
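Sorting through lost+found by hand is tedious, so a small script can help with a first pass. The sketch below lists each entry with its modification time, size, and a rough type guess based on the first few bytes. The signature table is deliberately tiny, the function names are hypothetical, and the path in the usage line is only an example; it assumes the recovered filesystem (ideally a copy of it) is mounted read-only.

```python
import os
import time

# A few common file signatures; extend this table as needed.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
}


def guess_type(path):
    """Guess a file type from its leading bytes; 'unknown' if nothing matches."""
    with open(path, "rb") as f:
        head = f.read(16)
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return "unknown"


def triage(root):
    """Return (mtime, kind, size, name) tuples for entries in root, oldest first."""
    rows = []
    for entry in os.scandir(root):
        st = entry.stat(follow_symlinks=False)
        if entry.is_dir(follow_symlinks=False):
            kind = "dir"
        elif entry.is_file(follow_symlinks=False):
            kind = guess_type(entry.path)
        else:
            kind = "other"
        rows.append((st.st_mtime, kind, st.st_size, entry.name))
    return sorted(rows)


if __name__ == "__main__":
    # Example mount point only; adjust to wherever the filesystem is mounted.
    for mtime, kind, size, name in triage("/mnt/recovered/lost+found"):
        stamp = time.strftime("%Y-%m-%d %H:%M", time.localtime(mtime))
        print(f"{stamp}  {kind:8}  {size:>12}  {name}")
```

e2fsck names reconnected entries after their inode number (for example #12345), so the timestamp and type columns are usually the quickest way to tell which directory is which.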