Any Linux command to perform parallel decompression of a tar.bz2 file?

I have a rather large file (~50GB) and it takes some time to run

tar xvf file.tar.bz2

on it. I’m aware of programs that can do parallel compression for bzip2 files but unaware of programs that can do parallel decompression for bzip2 files.

Are there any programs that can achieve this? What is the exact syntax of the command to use to extract from the file?

I’m using Ubuntu 12.04.

Asked By: user784637

||

You can use pbzip2 with the -d flag to decompress.

From the man page:

  pbzip2 -d myfile.tar.bz2

This example will decompress the file “myfile.tar.bz2” into the
decompressed file “myfile.tar”. It will use the autodetected # of
processors (or 2 processors if autodetect not supported).

After decompressing, you need to untar the file with

 tar xf myfile.tar

A tar file is just a container, and different compression algorithms can be applied to it: a “.tar.gz” is a tar archive compressed with gzip, and a “.tar.bz2” is one compressed with bzip2. pbzip2 therefore only decompresses the archive; it does not extract the files, so use tar for that. Tar shouldn’t take long, since the archive is already decompressed and it only has to extract the files. (Note that we are not using the ‘z’ or ‘j’ flag in the tar command; those flags tell tar to decompress the archive as well.)
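If disk space for the intermediate .tar file is a concern, the two steps can also be chained with a pipe. The following is only a minimal sketch: the function name extract_streaming and its arguments are made up for illustration, and it assumes pbzip2 and tar are installed and that the target directory already exists.

```shell
#!/bin/sh
# Decompress and unpack in one pass, without writing an intermediate .tar file.
# $1: archive path (for example myfile.tar.bz2)
# $2: existing target directory (defaults to the current directory)
# The function name is hypothetical; it is not part of pbzip2 or tar.
extract_streaming() {
    # -d decompresses, -c writes the result to stdout;
    # "tar -xf -" reads the tar stream from stdin.
    pbzip2 -dc "$1" | tar -xf - -C "${2:-.}"
}
```

Called as extract_streaming myfile.tar.bz2 /some/dir, this never stores the decompressed tar file on disk.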

Answered By: Sam

lbzip2 and pbzip2 are the tools which you can use for parallel compression and decompression.

Usage:

lbzip2 -d <file.tar.bz2> 
pbzip2 -d <file.tar.bz2> 

The -d option is used for decompression.

To install these packages run one or both of the following commands:

sudo apt install lbzip2
sudo apt install pbzip2

Answered By: devav2

You can uncompress your archive with a single command using the tar -I option.
It gives you the ability to use any compression utility that supports the -d option.

tar -I lbzip2 -xvf <file.tar.bz2>

This is very useful when dealing with a big archive, as you don’t need to have twice the uncompressed size available on the target filesystem (for the temporary tar file and the extracted files).
It’s also faster, as you need far less disk I/O.

Of course, that works when compressing too:

tar -I lbzip2 -cvpf <file.tar.bz2> <file>

Check tar --help for more options.
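For scripts that may run on machines where lbzip2 is not installed, the -I approach can be wrapped so that it falls back to another bzip2-compatible tool. This is a rough sketch, assuming GNU tar (where -I is short for --use-compress-program); the function names pick_bz2_tool and extract_bz2_tar are invented for this example.

```shell
#!/bin/sh
# Print the first bzip2-compatible tool found, preferring the parallel ones.
# Both function names are hypothetical helpers, not part of any package.
pick_bz2_tool() {
    for tool in lbzip2 pbzip2 bzip2; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    echo "no bzip2-compatible tool found" >&2
    return 1
}

# Extract archive $1 into the existing directory $2 (default: current directory).
# Assumes GNU tar, which runs the program given to -I with -d when extracting.
extract_bz2_tar() {
    tool=$(pick_bz2_tool) || return 1
    tar -I "$tool" -xf "$1" -C "${2:-.}"
}
```

With these definitions, extract_bz2_tar file.tar.bz2 uses lbzip2 when it is available and plain bzip2 otherwise.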

Answered By: Ludovic Ronsin

lbzip2 seems a lot better than pbzip2 in your case, as it is able to speed up decompression of standard .bz2 files, while pbzip2 doesn’t do that. (Just tested it: 17 seconds for lbzip2 vs. 56 seconds for pbzip2 on a partially loaded quad core.)
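To repeat this kind of comparison on your own archive, something along these lines could be used. It is only a sketch: the function name time_decompress is hypothetical, it measures whole seconds with date +%s, and it discards the output so that disk writes do not affect the result.

```shell
#!/bin/sh
# Report roughly how many seconds a tool needs to decompress a .bz2 file.
# $1: tool name (lbzip2, pbzip2 or bzip2); $2: path to the .bz2 file
# The function name is made up for this example.
time_decompress() {
    start=$(date +%s)
    # -d decompresses, -c writes to stdout; the data is thrown away.
    "$1" -dc "$2" > /dev/null || return 1
    end=$(date +%s)
    echo "$1: $((end - start)) s"
}
```

Running time_decompress lbzip2 file.tar.bz2 and then time_decompress pbzip2 file.tar.bz2 gives two figures to compare; the numbers will depend on the machine, its load, and how the archive was compressed.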

Answered By: Stefan Reich