Rock-stable filesystem for large files (backups) on Linux

What filesystem would be best for backups? I’m interested primarily in stability (especially that files not get corrupted during hard reboots, etc.), but how efficiently it handles large (>5 GB) files is also important.

Also, which mount parameters should I use?

Kernel is Linux >= 2.6.34.

EDIT: I do not want backup methods. I need the filesystem to store them.

Asked By: Maciej Piechotka


btrfs has transparent checksumming of data written to disk, an always-on fast ordered-writes mode, and many other backup-friendly features, which makes it appealing for backups. See https://btrfs.wiki.kernel.org/index.php/Main_Page for more details.
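For example (a minimal sketch; the device and mount point are placeholders), a btrfs scrub walks all data and metadata and verifies them against those checksums:

mkfs.btrfs /dev/sdb1              # create the filesystem
mount /dev/sdb1 /mnt/backup
btrfs scrub start -B /mnt/backup  # -B: stay in the foreground and print a summary
btrfs scrub status /mnt/backup    # show results of the last/ongoing scrub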

Answered By: durin42

Just use a backup tool that supports checksums. For example, Dar does, and it supports incremental backups. Then you can back up to a rock-solid filesystem like ext3.
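A rough sketch of that workflow with dar (archive names and paths are placeholders; the options are the basic ones from dar's manual, so double-check them against dar(1) on your system):

dar -c /backup/full -R /home                    # full backup of /home into /backup/full.*.dar
dar -c /backup/incr1 -R /home -A /backup/full   # incremental backup against the full one
dar -t /backup/full                             # test the archive's integrity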

For backups you want something rock solid/very stable, and btrfs and ZFS are simply not ready today.

Answered By: maxschlepzig

You can use ext4, but I would recommend mounting with data=journal (journalled data) mode, which also turns off delalloc (delayed allocation), which caused some problems in earlier kernels. Disabling delalloc will make new data writes slower, but it makes data loss in the event of a power failure less likely. I should also mention that you can disable delalloc without using data=journal (via the nodelalloc mount option), which has some other benefits (or at least it did in ext3), such as slightly improved reads and, I believe, better recovery.
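A minimal sketch of such a mount configuration, assuming a dedicated backup partition (the UUID, device name, and mount point are placeholders):

# /etc/fstab entry for a dedicated backup partition
UUID=XXXX-XXXX  /backup  ext4  data=journal,nodelalloc,errors=remount-ro  0  2

# or mount it by hand for testing
mount -t ext4 -o data=journal,nodelalloc /dev/sdb1 /backup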

Extents will still help with fragmentation. Extents also make deletes of large files much faster than on ext3: deleting a single file of any size should be near-instantaneous on ext4, but can take a long time on ext3. (Any extent-based filesystem has this advantage.)

ext4 also fscks faster than ext3.

One last note: there were important ext4 bug fixes up through roughly 2.6.31, so make sure you aren’t running a kernel older than 2.6.32, which is an LTS kernel.

Answered By: xenoterracide

XFS is rock solid and has been in the kernel for ages. Examine tools like xfs_freeze and see if it is what you are looking for. I know this is highly subjective but I have used XFS for data storage for years without incident.
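For example, xfs_freeze can quiesce the filesystem so that a consistent snapshot or block-level copy can be taken (the mount point is a placeholder):

xfs_freeze -f /backup   # freeze: flush pending writes and block new ones
# ... take an LVM/storage snapshot or a copy of the block device here ...
xfs_freeze -u /backup   # unfreeze: resume normal operation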

Answered By: dsp

An IMHO very important aspect I have not seen discussed in the other answers is the robustness of the filesystem’s on-disk layout (e.g. consider consulting the documentation of the possible candidates ext4 and btrfs).

While the codebase of the filesystem driver and the amount of testing it has received are indeed important, as other answers have already shown, they mainly protect the data while it is being read and written; the on-disk layout/format is what protects your data at rest against hardware defects such as unreadable sectors or silent bit rot.

With respect to ext4, which is said to have good characteristics thanks to its long-tested codebase (https://events.static.linuxfound.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf shows it took longer to find bugs in it than, for instance, in the more modern and more complex btrfs), I have looked into ext4’s resistance at rest and found some, IMHO, deficiencies in this otherwise praised filesystem.

If ext4 is chosen as the “rock-solid backup fs”, I would consider it prudent to improve its recoverability (thereby “hardening” it) by using the e2image tool that the ext4 developers provide (a sketch of the commands follows the quoted documentation below):

The e2image program will save critical ext2, ext3, or ext4 filesystem
metadata located on device to a file specified by image-file. The
image file may be examined by dumpe2fs and debugfs, by using the -i
option to those programs. This can assist an expert in recovering
catastrophically corrupted filesystems. In the future, e2fsck will be
enhanced to be able to use the image file to help recover a badly
damaged filesystem.

and they recommend:

It is a very good idea to create image files for all of filesystems on
a system and save the partition layout (which can be generated using
the fdisk -l command) at regular intervals — at boot time, and/or
every week or so. The image file should be stored on some filesystem
other than the filesystem whose data it contains, to ensure that this
data is accessible in the case where the filesystem has been badly
damaged.
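A minimal sketch of that advice (the device names and destination paths are placeholders; the image must live on a different filesystem than the one it describes):

e2image /dev/sdb1 /mnt/other/sdb1.e2i               # save the ext4 metadata of /dev/sdb1
fdisk -l /dev/sdb > /mnt/other/sdb-partitions.txt   # save the partition layout alongside it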

Considering that not all metadata in ext4’s on-disk layout is stored redundantly (the superblock is commonly stored as multiple backup copies, but inodes are stored in exactly one place), ext4 is surely inferior to btrfs, which provides at least checksums for all metadata plus the file content data.
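The backup superblock copies (but not the inodes) can be seen with dumpe2fs, for instance (the device is a placeholder, and the exact output wording may vary between e2fsprogs versions):

dumpe2fs /dev/sdb1 | grep -i superblock
# expect lines such as "Primary superblock at 0" and "Backup superblock at 32768, ..."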

To counteract this “shortcoming” of ext4 and make it more rock-solid with respect to the on-disk layout, it could be reasonable to supplement redundancy and recovery for the file contents via par2/parchive.
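A minimal sketch with par2cmdline, assuming the backup is a single tar archive (the file names and the 10% redundancy level are arbitrary choices):

par2 create -r10 backup.tar.par2 backup.tar   # create ~10% recovery data for the archive
par2 verify backup.tar.par2                   # later: check the archive against it
par2 repair backup.tar.par2                   # and repair it if sectors went bad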

Although the question asks specifically about filesystem solutions, I would like to point out that most of what a filesystem provides (caching, journals, reclaiming of allocated space, block allocation, etc.) is not necessarily something backup data benefits from much, since it is only written and read in bulk and rarely. For that reason I would consider a parchive-supplemented tar backup (as sketched above) the more optimal backup solution: the codebase used in the process is smaller, and fewer “features” mean fewer bugs.

Answered By: humanityANDpeace