ZFS pool metadata corrupt

I am an idiot. Setting up offsite backups was on my to-do list and... you guessed it, I didn't get around to it before this happened. I thought I had at least set up local backups properly, but it turns out I hadn't. Anyway:

I’m new to ZFS. I’m running Proxmox and passed nine drives on an HBA card through to a TrueNAS VM for a pool. I have two NVMe drives, though I think I only set one of them up for caching, and one SSD for Proxmox itself. For reasons that aren’t clear to me, my zpool became corrupted yesterday. My Proxmox host seems aware of the pool, which is odd to me, because I created the pool in the TrueNAS guest.

I have tried running zpool import with the -f, -F, -FX and -fFX flags. I’m not sure whether I should be running these commands on the host or on the guest. I’ve also tried importing with -o readonly=on, and on the host I’ve tried setting echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata, although I haven’t tried doing that before attempting the import from the guest, because frankly I’m a bit freaked out that both the host and the guest seem to have access to the pool, and I’m not sure that isn’t contributing to the problem.
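
The attempts on the Proxmox host looked something like this (a rough reconstruction rather than exact history; flag combinations varied, and the pool name comes from the output below):

root@proxmox:~# zpool import -f Seabreeze
root@proxmox:~# zpool import -fF Seabreeze
root@proxmox:~# zpool import -fFX Seabreeze
root@proxmox:~# zpool import -o readonly=on -fFX Seabreeze
root@proxmox:~# echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata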

The error I’m getting is that the metadata is corrupt.
I don’t know if this is related, but this happened around the time I was trying to get a GPU installed and PCIe/GPU passthrough enabled in Proxmox for that device.

Proxmox:

root@proxmox:~# zpool import
   pool: Seabreeze
     id: 821564149027342835
  state: FAULTED
status: The pool metadata is corrupted.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
 config:

        Seabreeze   FAULTED  corrupted data
          raidz2-0  FAULTED  corrupted data
            sdf2    ONLINE
            sdh2    ONLINE
            sdc2    ONLINE
            sde2    ONLINE
            sdj2    ONLINE
            sdb2    ONLINE
            sdg2    ONLINE
            sdd2    ONLINE
            sdi2    ONLINE
root@proxmox:~#

TrueNAS:

truenas% sudo zpool import
   pool: Seabreeze
     id: 821564149027342835
  state: FAULTED
status: The pool was last accessed by another system.
 action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
 config:

        Seabreeze                                       FAULTED  corrupted data
          raidz2-0                                      FAULTED  corrupted data
            gptid/bb911e9d-c067-11ec-b393-734570047b00  ONLINE
            gptid/bbb5c9f6-c067-11ec-b393-734570047b00  ONLINE
            gptid/bba92ac5-c067-11ec-b393-734570047b00  ONLINE
            gptid/bbbf0f87-c067-11ec-b393-734570047b00  ONLINE
            gptid/bbda0fa2-c067-11ec-b393-734570047b00  ONLINE
            gptid/bc03effa-c067-11ec-b393-734570047b00  ONLINE
            gptid/bc114e59-c067-11ec-b393-734570047b00  ONLINE
            gptid/bbd0f901-c067-11ec-b393-734570047b00  ONLINE
            gptid/bc18eaf4-c067-11ec-b393-734570047b00  ONLINE
truenas%

Is my data recoverable?

Asked By: Harv


I used zdb -u -l to dump the uberblocks, set vfs.zfs.spa.load_verify_metadata and vfs.zfs.spa.load_verify_data to 0, and used a combination of -n, -N, -R /some/Mountpoint, -o readonly=on and -T with the txg of an older uberblock to get to the point where the data was present, in read-only form. From there I could see with zpool status -v which files were corrupt, then decrypt the pool and copy the data out at the file level to an external HDD.
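
For anyone who ends up in the same spot, the sequence looked roughly like the sketch below. This is a reconstruction, not a copy of my shell history: the mountpoint, the txg value and the destination path are placeholders, and the final steps assume native ZFS dataset encryption (a legacy GELI-encrypted pool would be handled differently).

truenas% sudo zdb -u -l /dev/gptid/bb911e9d-c067-11ec-b393-734570047b00
# note the txg of an older, intact-looking uberblock in the output
truenas% sudo sysctl vfs.zfs.spa.load_verify_metadata=0
truenas% sudo sysctl vfs.zfs.spa.load_verify_data=0
truenas% sudo zpool import -n -fF Seabreeze
# dry run; then the real import, rolled back to the older txg, read-only, without mounting
truenas% sudo zpool import -f -N -o readonly=on -R /mnt/recovery -T 1234567 Seabreeze
truenas% sudo zpool status -v Seabreeze
# lists the files with unrecoverable errors
truenas% sudo zfs load-key -r Seabreeze
truenas% sudo zfs mount -a
truenas% rsync -a /mnt/recovery/ /mnt/external-hdd/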

Answered By: Harv