How to extract specific file(s) from tar.gz

How can we extract specific files from a large tar.gz file? I found the process of extracting files from a tar in this question but, when I tried the mentioned command there, I got the error:

$ tar --extract --file={test.tar.gz} {extract11}
tar: {test.tar.gz}: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now

How do I then extract a file from tar.gz?

Asked By: Ankit Vashistha

||

Your example works for me if you omit the braces:

$ tar --extract --file=test.tar.gz extract11

If your file extract11 is in a subfolder, you should specify the path within the tarball.

$ tar --extract --file=test.tar.gz subfolder/extract11
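
As a self-contained way to try this, the sketch below builds a throwaway archive and then extracts only the nested member. The temporary directory, the sample file other.txt, and the use of -C to pick a target directory are assumptions added for illustration; --gzip is spelled out so the command does not depend on tar auto-detecting compression.

```shell
# Build a throwaway archive so the example does not touch real data.
work=$(mktemp -d)
mkdir -p "$work/src/subfolder" "$work/out"
echo "top level" > "$work/src/other.txt"
echo "nested"    > "$work/src/subfolder/extract11"
tar -czf "$work/test.tar.gz" -C "$work/src" other.txt subfolder

# Extract only the nested member. The name is the path inside the
# tarball, not a path on disk. -C chooses where the file lands.
tar --extract --gzip --file="$work/test.tar.gz" -C "$work/out" subfolder/extract11

# Show what was extracted.
ls -R "$work/out"
```

Only subfolder/extract11 should appear under the output directory; other.txt stays in the archive.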
Answered By: Bernhard

You can also use tar -zxvf <tar filename> <file you want to extract>

You must write the file name exactly as tar ztf test.tar.gz shows it. If the listing says, for example, ./extract11 or some/bunch/of/dirs/extract11, that is the name you have to give. The file is extracted under exactly that name, and any needed directories are created automatically.

  • -x: extract files from the archive.
  • -f: specify the archive (tarball) file name.
  • -v: verbose; print each file name as it is processed.
  • -z: filter the archive through gzip; use it for .gz files.
  • -t: list the contents of an archive.
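
To make the "exact name" rule concrete, here is a minimal sketch that lists an archive, captures the stored member name, and passes that same string back to tar. The temporary directory and the variable name are hypothetical; the archive is deliberately created from ./extract11 so that the stored name will typically carry a ./ prefix.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out"
echo "payload" > "$work/src/extract11"

# Archiving "./extract11" typically stores the member with its "./" prefix.
tar -czf "$work/test.tar.gz" -C "$work/src" ./extract11

# List the archive and capture the member name exactly as tar prints it.
name=$(tar -ztf "$work/test.tar.gz" | grep 'extract11$')
echo "stored as: $name"

# Pass the listed name back unchanged; quoting protects spaces in names.
tar -zxvf "$work/test.tar.gz" -C "$work/out" "$name"
```

Taking the name from the listing instead of typing it avoids "Not found in archive" errors caused by a missing or extra prefix.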
Answered By: harish.venkat

Let’s assume you have a tarball called lotsofdata.tar.gz and you just know there is one file in there you want but all you can remember is that its name contains the word contract. You have two options:

First, you can use tar and grep to list the contents of the tarball, find the full path and name of any files that match the part you know, and then use tar to extract that one file by its exact name. Second, you can use two little-known switches to extract every file that matches what little you know of the file name; for this option you don't need to know the full name or any part of its path. The details are:

Option 1

$ tar -tzf lotsofdata.tar.gz | grep contract

This will list the details of all files whose names contain your known part. Then you extract what you want using:

$ tar -xzf lotsofdata.tar.gz <full path and filename from your list above>

If the listing shows ./ in front of the path, include it in the name you pass to tar.

Option 2

$ tar -xzf lotsofdata.tar.gz --wildcards --no-anchored '*contract*'

Use whichever option you find easier or more useful.
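
Both options can be tried side by side on a small sample archive. This is only a sketch: the directory layout and file names are made up, and Option 2 is guarded by a version check because --wildcards and --no-anchored are GNU tar options that other tar implementations may not accept.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/docs/legal" "$work/out1" "$work/out2"
echo "notes"    > "$work/src/docs/notes.txt"
echo "contract" > "$work/src/docs/legal/contract_final.txt"
tar -czf "$work/lotsofdata.tar.gz" -C "$work/src" docs

# Option 1: find the stored path first, then extract by exact name.
# (Assumes a single match; several matches would need a loop or a list file.)
match=$(tar -tzf "$work/lotsofdata.tar.gz" | grep contract)
echo "found: $match"
tar -xzf "$work/lotsofdata.tar.gz" -C "$work/out1" "$match"

# Option 2: let tar match the pattern anywhere in the path (GNU tar only).
if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    tar -xzf "$work/lotsofdata.tar.gz" -C "$work/out2" \
        --wildcards --no-anchored '*contract*'
fi
```

The pattern is single-quoted so the shell does not expand it against files in the current directory before tar sees it.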

Answered By: Wendy Cricks

Below are examples of extracting specific files from a tar.gz file.

From local file:

$ tar xvf file.tgz path/README.txt 2nd_file.txt

From remote URL:

$ curl -s http://example.com/file.tgz | tar xzvf - path/README.txt 2nd_file.txt
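
The streaming form can be tried without a network connection. In this sketch, cat stands in for curl -s, and the sample files and temporary directory are assumptions. The -z flag is given explicitly because GNU tar generally does not auto-detect compression when the archive arrives on standard input.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/path" "$work/out"
echo "readme"   > "$work/src/path/README.txt"
echo "second"   > "$work/src/2nd_file.txt"
echo "unwanted" > "$work/src/3rd_file.txt"
tar -czf "$work/file.tgz" -C "$work/src" path 2nd_file.txt 3rd_file.txt

# "cat" plays the role of "curl -s URL": the archive arrives on stdin,
# "-f -" tells tar to read it from there, and only the named members
# are written to disk.
cat "$work/file.tgz" | tar -xzvf - -C "$work/out" path/README.txt 2nd_file.txt
```

tar still has to read through the stream to find the members, but nothing other than the requested files is written.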
Answered By: kenorb

I was trying to extract a couple hundred files from a tarball with thousands of files the other day. The files I needed could not be matched by a single wildcard, so I googled and found this page.

However, none of the tricks above seemed suitable for my task. I ended up reading the man page and found the --files-from option, so my final solution is

gunzip < thousands.tar.gz | tar -x -v --files-from hundreds.list -f -

and it works like a charm.

Update: The list file should contain one member name per line, written exactly as tar -tf prints it (not the verbose tar -tvf output); otherwise tar will not find the files.
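
A small end-to-end sketch of the --files-from approach follows. The archive contents, the temporary directory, and the -C target are made up for illustration; the pipeline itself mirrors the command above.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/logs" "$work/out"
for n in 1 2 3 4 5; do
    echo "entry $n" > "$work/src/logs/file$n.txt"
done
tar -czf "$work/thousands.tar.gz" -C "$work/src" logs

# The list holds one member name per line, in the same form tar -tf prints.
printf '%s\n' logs/file2.txt logs/file4.txt > "$work/hundreds.list"

# Decompress once and let tar pick the listed members out of the stream.
gunzip < "$work/thousands.tar.gz" |
    tar -x -v -C "$work/out" --files-from "$work/hundreds.list" -f -
```

A convenient way to build the list is to save the output of tar -tzf to a file and delete the lines you do not want.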

Answered By: JJ Tang

To extract only files matching a certain pattern:

tar ztf test.tar.gz | grep 2021-01 | while IFS= read -r i; do tar -xzvf test.tar.gz "$i"; done

For multiple patterns:

tar ztf test.tar.gz | grep -E '2021-01|2021-02|2021-03' | while IFS= read -r i; do tar -xzvf test.tar.gz "$i"; done
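
A loop like this decompresses the archive once per matching file, which can be slow on a large tarball. A possible alternative, sketched here with made-up file names and a temporary directory, feeds the filtered listing to a single tar run through -T - (read member names from standard input), so the archive is read only twice in total.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out"
for m in 2020-12 2021-01 2021-02 2021-03 2021-04; do
    echo "$m" > "$work/src/report-$m.log"
done
tar -czf "$work/test.tar.gz" -C "$work/src" .

# Pass 1 lists the members, grep keeps the wanted ones,
# pass 2 extracts every name it receives on stdin in one run.
tar -ztf "$work/test.tar.gz" | grep -E '2021-01|2021-02|2021-03' |
    tar -xzvf "$work/test.tar.gz" -C "$work/out" -T -
```

Because names are passed one per line instead of through shell word splitting, file names containing spaces are handled as well.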
Answered By: hansaplast

To extract only some files from a large archive, use bsdtar --fast-read from libarchive.

Example:

$ du -sh chromium-124.0.6367.60.tar.zstd
2.2G    chromium-124.0.6367.60.tar.zstd

$ time bsdtar --fast-read -x -f chromium-124.0.6367.60.tar.zstd -- source/DEPS

real    0m0.034s

In this case, extraction is fast because the file is near the beginning of the archive and,
with --fast-read, bsdtar extracts only the first match for each name and then stops reading.

$ tar tf chromium-124.0.6367.60.tar.zstd | grep -n -m1 source/DEPS
20:source/DEPS

GNU tar does not have a --fast-read option, and by default it keeps reading to the end of the archive; its --occurrence option is the closest equivalent.

Answered By: milahu