I/O errors, but after running badblocks everything works again: how is that possible?
HDD seemed damaged. Unable to format a partition (mkfs.ext4 I/O errors), even with a newly created GPT table. SMART test showed some errors. I was about to throw the disk away. Before that, out of curiosity, I ran a full badblocks test. Big surprise: it didn’t detect any bad blocks! Went back to GParted, created a GPT table + a few partitions. Everything works fine now! What did badblocks do?
The full story
I am trying to make sense of what just happened: I was about to throw an HDD away because I was unable to create partitions on it, and SMART showed some errors. Before throwing the disk away I just wanted to play a little with badblocks, and… big surprise: badblocks seemed to have repaired my disk! I didn’t even know that it could do that! So I am happy now, I can indeed use my disk, it works fine, but I am still trying to figure out what just happened!
It’s a 4TB Seagate HDD that I hadn’t used in a few years. I plugged it into a SATA ↔ USB adapter (the adapter works fine, I use it with several other HDDs). With GParted I created a new GPT partition table, and then a partition. It was unable to run to completion; there was a mkfs.ext4 I/O error:
    (...)
    Allocating group tables: done
    Writing inode tables: done
    Creating journal (131072 blocks): done
    Writing superblocks and filesystem accounting information:   0/895
    mke2fs 1.46.2 (28-Feb-2021)
    mkfs.ext4: Input/output error while writing out and closing file system
I tried several times, with different USB adapters, different USB cables, different USB ports. It never worked.
I then did a SMART short test:

    # smartctl -t short -C /dev/sde
    (...)
    # smartctl -a /dev/sde
    (...)
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short captive       Completed: read failure       90%       528         191105024
    (...)
Obviously the HDD seems defective, right? So I was about to throw it away, but ran a badblocks test first:
    # badblocks -wvs -t random -b 4096 /dev/sde
    Checking for bad blocks in read-write mode
    From block 0 to 976754645
    Testing with random pattern: done
    Reading and comparing: done
    Pass completed, 0 bad blocks found. (0/0/0 errors)
The test lasted about 19 hours (4TB disk) and didn’t show any errors. I was very surprised!
Back to GParted, I created a new GPT table and some partitions; everything went smoothly.
I ended up running the copy tests I usually do to check a disk’s performance, and everything seems normal (155MB/s R/W when copying big files).
Also did another SMART short test; it completed without error this time:

    # smartctl -t short -C /dev/sde
    (...)
    # smartctl -a /dev/sde
    (...)
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short captive       Completed without error       00%       549         -
    # 2  Short captive       Completed: read failure       90%       528         191105024
    (...)
Can someone make sense of that? It’s as if running badblocks somehow repaired my HDD. How is that possible? Is badblocks even supposed to do that?
Note: more info is available if needed (full SMART output and full GParted results).
badblocks can have that effect, not really by design, but because hard drives can remap failing blocks, and will do so when they encounter a failed block during a write (since at that point there’s no data that can be lost). By writing to every single accessible sector on the drive, badblocks gives the drive ample opportunity to do so; and if the drive’s spare capacity is sufficient to remap all the failed blocks, badblocks won’t see anything amiss.
If you run smartctl -a on the drive, you should see that it has a non-zero “reallocated sector count” (attribute 5). This indicates that it has remapped sectors.
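As a rough sketch of that check, the helper below pulls a single attribute’s raw value out of smartctl’s attribute table. The function name smart_raw is made up for illustration, and it assumes the usual ten-column layout printed by `smartctl -A`, with RAW_VALUE as the last column; the /dev/sde device name is the one from the question. Attribute 197 (Current_Pending_Sector) is not mentioned above and appears only as another counter often read alongside attribute 5.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# raw value (10th column) of the attribute whose ID is passed as $1.
# Assumes the standard attribute table layout:
#   ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Usage (needs root and a real device, so it is left commented out):
#   smartctl -A /dev/sde | smart_raw 5     # Reallocated_Sector_Ct
#   smartctl -A /dev/sde | smart_raw 197   # Current_Pending_Sector
```

A non-zero result for attribute 5 would be consistent with the drive having remapped sectors during the write pass.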
While the drive may work fine now, this does indicate that it has problems, so it should be treated with suspicion; if part of its storage has failed, more is liable to fail in the not-too-distant future.