How to download an archive and extract it without saving the archive to disk?

I’d like to download and extract an archive under a given directory. Here is how I’ve been doing it so far:

wget http://downloads.mysql.com/source/dbt2-0.37.50.3.tar.gz
tar zxf dbt2-0.37.50.3.tar.gz
mv dbt2-0.37.50.3 dbt2

I’d like instead to download and extract the archive on the fly, without having the tar.gz written to the disk. I think this is possible by piping the output of wget to tar, and giving tar a target, but in practice I don’t know how to put the pieces together.

Asked By: BenMorel

||

You can do it by telling wget to output its payload to stdout (with the flag -O-) and suppress its own output (with the flag -q):

wget -qO- your_link_here | tar xvz

To specify a target directory:

wget -qO- your_link_here | tar xvz -C /target/directory

If you happen to have GNU tar, you can also rename the output dir:

wget -qO- your_link_here | tar --transform 's/^dbt2-0.37.50.3/dbt2/' -xvz
Answered By: Joseph R.
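
For trying the pipeline without network access, here is a minimal offline sketch. It assumes a tar that supports --strip-components (GNU tar and bsdtar both do), uses cat as a stand-in for wget -qO- your_link_here, and all directory and file names are hypothetical. Stripping the first path component is another way to land the files in a directory with the name you want.

```shell
# Offline sketch: `cat` stands in for `wget -qO- URL`; names are hypothetical.
work=$(mktemp -d)

# Build a small archive with a versioned top-level directory.
mkdir -p "$work/src/demo-1.0"
echo "hello" > "$work/src/demo-1.0/README"
tar -czf "$work/demo-1.0.tar.gz" -C "$work/src" demo-1.0

# The target directory must exist before tar -C can change into it.
mkdir -p "$work/out/demo"

# Stream the archive into tar; --strip-components=1 drops the leading
# "demo-1.0/" so the files land directly in out/demo.
cat "$work/demo-1.0.tar.gz" | tar -xz -C "$work/out/demo" --strip-components=1

result=$(cat "$work/out/demo/README")
echo "$result"
```

Unlike --transform, this does not depend on knowing the versioned directory name in advance, but it does require creating the target directory first.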

This one-liner does the trick:

tar xvzf - -C /tmp/ < <(wget -q -O - http://foo.com/myfile.tar.gz)

Short explanation:
The command inside the parentheses runs wget (-q tells wget to do it quietly, -O - writes the download to stdout). It runs concurrently with tar, not before it.

Bash's process substitution operator <( ... ) connects that output to a pipe and exposes the pipe as a file name (typically /dev/fd/N, or a named pipe on systems without /dev/fd).
The < file redirection operator then feeds the contents of that file to tar's stdin, which tar reads because of the "-f -" argument.

Answered By: ItsMe
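
To see what process substitution actually hands to the command, a small sketch may help. It assumes bash is installed; each command is wrapped in bash -c only so that it behaves the same when pasted into a different shell, and printf stands in for the wget call.

```shell
# Process substitution is a bash feature (not POSIX sh), so each command is
# wrapped in `bash -c` to behave the same from any invoking shell.

# <(...) expands to a path such as /dev/fd/63 that the command can open.
fd_path=$(bash -c 'echo <(true)')
echo "$fd_path"

# Reading from that path yields the inner command's output; printf stands in
# for `wget -q -O - URL` here.
streamed=$(bash -c 'cat < <(printf "payload\n")')
echo "$streamed"
```

The exact path printed by the first command varies by system, which is why scripts should pass the substitution straight to the consuming command instead of storing the path.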

Another option is to use curl, which writes to stdout by default:

curl -s -L https://example.com/archive.tar.gz | tar xvzf - -C /tmp
Answered By: Zlemini

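A solution using process substitution as tar’s archive file. Mind the order of tar’s flags: -f must be immediately followed by the archive argument, here the <(...) substitution.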

tar -xvz -C /tmp/ -f <(wget -q -O - https://github.com/user/repo/release/download/v/v.tar.gz)
Answered By: Khalil Gharbaoui

One-liner that handles redirects and extracts tar.bz2 files. Use tar xz instead of tar xj to extract gzip files.

curl -L https://downloads.getmonero.org/cli/linux64 | tar xj
Answered By: Elijah

The extraction part should take its input from STDIN. Because -f must be followed by the archive name, and - stands for standard input, we may need tar -xzvf - -C <output_dir>

Example:


# This may not work. tar takes -C as the archive name for -f and may complain:
# tar (child): -C: Cannot open: No such file or directory
wget -qO - https://dlcdn.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3-scala2.13.tgz | tar -xzvf -C /opt/spark --strip-components 1

# This should work, provided /opt/spark already exists.
wget -qO - https://dlcdn.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3-scala2.13.tgz | tar -xzvf - -C /opt/spark --strip-components 1


Answered By: Sairam Krish
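
The difference between the two forms can be checked offline. The following is a rough sketch, not a definitive test: cat stands in for wget -qO - URL, the archive and directory names are hypothetical, and it assumes no file named -C exists in the current directory.

```shell
# Offline sketch; `cat` stands in for `wget -qO - URL`, names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/pkg-1.0" "$work/out"
echo "data" > "$work/pkg-1.0/file.txt"
tar -czf "$work/pkg.tgz" -C "$work" pkg-1.0

# Broken: -f consumes the next word, so tar looks for an archive named "-C".
if cat "$work/pkg.tgz" | tar -xzf -C "$work/out" 2>/dev/null; then
  broken_status=ok
else
  broken_status=failed
fi

# Fixed: "-f -" names stdin explicitly, leaving -C free to take the directory.
cat "$work/pkg.tgz" | tar -xzf - -C "$work/out" --strip-components=1
fixed_content=$(cat "$work/out/file.txt")

echo "$broken_status $fixed_content"
```

Dropping f altogether (tar -xz -C ...) also reads from stdin on builds where that is the default archive, but spelling out "-f -" does not rely on that default.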