How to redirect output of wget as input to unzip?
I have to download a file from this link. The download is a zip file, which I then have to unzip in the current folder.
Normally, I would download it first, then run the unzip command.
wget http://www.vim.org/scripts/download_script.php?src_id=11834 -O temp.zip
unzip temp.zip
But this way I need to execute two commands, wait for the first one to complete before executing the next one, and I must also know the name of the file (temp.zip) to give it to unzip.
Is it possible to redirect the output of wget to unzip? Something like:
unzip < `wget http://www.vim.org/scripts/download_script.php?src_id=11834`
But it didn’t work.
bash: `wget http://www.vim.org/scripts/download_script.php?src_id=11834 -O temp.zip`: ambiguous redirect
Also, wget got executed twice and downloaded the file twice.
You have to download your files to a temp file, because (quoting the unzip man page):
Archives read from standard input
are not yet supported, except with
funzip (and then only the first
member of the archive can be
extracted).
Just bring the commands together:
wget "http://www.vim.org/scripts/download_script.php?src_id=11834" -O temp.zip
unzip temp.zip
rm temp.zip
To make it more flexible, you should probably put it into a script, which also saves some typing. To make sure you don’t accidentally overwrite something, you can use the mktemp command to create a safe filename for your temp file:
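#!/bin/bash
# Usage: pass the download URL as the first argument.
TMPFILE=$(mktemp)
# Quote the variables so paths containing spaces survive word splitting.
wget "$1" -O "$TMPFILE"
# The shell already sets $PWD, so there is no need to reassign it.
unzip -d "$PWD" "$TMPFILE"
rm "$TMPFILE"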
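A variation on the same idea, written as a shell function so the temp file is removed even when the download or the extraction fails. This is only a sketch: fetch_unzip is a hypothetical name, and the optional second argument (destination directory) is an assumption not present in the script above.

```shell
#!/bin/bash
# fetch_unzip URL [DEST_DIR]
# Hypothetical helper: download a zip to a temp file, extract it, clean up.
# The body runs in a subshell ( ... ) so the EXIT trap fires when the
# function returns, not when the calling shell exits.
fetch_unzip() (
    url=$1
    dest=${2:-.}
    tmpfile=$(mktemp) || exit 1
    # Remove the temp file whether or not the steps below succeed.
    trap 'rm -f "$tmpfile"' EXIT
    # Only run unzip if wget finished successfully.
    wget -q "$url" -O "$tmpfile" && unzip -q -d "$dest" "$tmpfile"
)
```

It would be called as fetch_unzip "http://www.vim.org/scripts/download_script.php?src_id=11834", which is still two commands under the hood but needs neither a file name nor manual cleanup.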
I don’t think you even want to bother piping wget’s output into unzip.
From the Wikipedia “ZIP (file format)” article:
A ZIP file is identified by the presence of a central directory located at the end of the file.
wget has to completely finish the download before unzip can do any work, so they run sequentially, not interwoven as one might think.
This is a repost of my answer to a similar question:
The ZIP file format includes a directory (index) at the end of the archive. This directory says where within the archive each file is located, and thus allows for quick, random access without reading the entire archive.
This would appear to pose a problem when attempting to read a ZIP archive through a pipe, in that the index is not accessed until the very end and so individual members cannot be correctly extracted until after the file has been entirely read and is no longer available. As such it appears unsurprising that most ZIP decompressors simply fail when the archive is supplied through a pipe.
The directory at the end of the archive is not the only location where file meta information is stored in the archive. In addition, individual entries also include this information in a local file header, for redundancy purposes.
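Both kinds of record can be seen by looking at the raw bytes. The following is a minimal sketch, assuming the archive has no trailing zip comment (so the end of central directory record is exactly the last 22 bytes) and that head and tail accept -c; zip_signatures is a hypothetical helper name.

```shell
# Print the 4-byte signature at the start of a file and the 4-byte
# signature found 22 bytes before its end, both as hex.
# Assumption: no zip comment, so the end-of-central-directory record
# occupies exactly the final 22 bytes.
zip_signatures() {
    file=$1
    printf 'start: %s\n' "$(head -c 4 "$file" | od -An -tx1 | tr -d ' \n')"
    printf 'end:   %s\n' "$(tail -c 22 "$file" | head -c 4 | od -An -tx1 | tr -d ' \n')"
}
```

For an ordinary archive the start should read 504b0304 (PK\3\4, a local file header) and the end 504b0506 (PK\5\6, the end of central directory record). A streaming extractor only ever gets to rely on the first of these.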
Although not every ZIP decompressor will use local file headers when the index is unavailable, the tar and cpio front ends to libarchive (a.k.a. bsdtar and bsdcpio) can and will do so when reading through a pipe, meaning that the following is possible:
wget -qO- http://example.org/file.zip | bsdtar -xvf-
If you have the JDK installed, you can use jar:
wget -qO- http://example.org/file.zip | jar xvf /dev/stdin
The proper syntax would be:
$ unzip <(curl -sL https://www.winpcap.org/archive/1.0-docs.zip)
but it won’t work, because of the error (Info-ZIP on Debian):
lseek(3, 0, SEEK_SET) = -1 ESPIPE (Illegal seek)
Archive: /dev/fd/63
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of /dev/fd/63 or
/dev/fd/63.zip, and cannot find /dev/fd/63.ZIP, period.
or on BSD/OS X:
Trying to read large file (> 2 GiB) without large file support
This is because the standard zip tools mainly use the lseek function to move the file offset to the end of the file and read the end of central directory record. That record is located at the end of the archive structure and is required to read the list of files (see: Zip file format structure). Therefore the input cannot be a FIFO, pipe, terminal device or any other non-seekable object, because such input cannot be positioned by the lseek function.
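To check which kind of input a command would actually receive, the shell's file tests can be used. This is a rough sketch that assumes a system where /dev/stdin reflects the current standard input (as on Linux); stdin_kind is a hypothetical helper name.

```shell
# Report what standard input is connected to.
# Assumption: /dev/stdin resolves to the current stdin (true on Linux).
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo "pipe"            # not seekable: lseek fails with ESPIPE
    elif [ -f /dev/stdin ]; then
        echo "regular file"    # seekable
    else
        echo "other"
    fi
}
```

Running curl -sL URL | stdin_kind reports a pipe, while stdin_kind < temp.zip reports a regular file, and only the latter kind of input can be repositioned with lseek.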
So you have the following workarounds:
- use a different kind of compression (e.g. tar.gz),
- use two separate commands,
- use alternative tools (as suggested in other answers),
- create an alias or function to run multiple commands.
Repost of my answer:
BusyBox’s unzip can take stdin and extract all the files.
wget -qO- http://downloads.wordpress.org/plugin/akismet.2.5.3.zip | busybox unzip -
The dash after unzip tells it to use stdin as input.
You can even do:
cat file.zip | busybox unzip -
But that is just a redundant way of writing unzip file.zip.
If your distro uses BusyBox by default (e.g. Alpine), just run unzip -.
This works quite well for me on macOS:
tar xvf <(curl -sL http://www.vim.org/scripts/download_script.php?src_id=11834)
jar xvf <(curl -sL http://www.vim.org/scripts/download_script.php?src_id=11834)
wget -qO- http://www.vim.org/scripts/download_script.php?src_id=11834 | tar xvf -
wget -qO- http://www.vim.org/scripts/download_script.php?src_id=11834 | jar xvf -
BTW: the tar version used here is:
$ tar --version
bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6
A zip archive is not sequential because it has its table of contents at the end of the file, so it is difficult to stream-unzip it.
An alternative solution is to see if you can get another file format, like .tar.gz.
For example, if you’re downloading a .zip file from GitHub, there is almost always a .tar.gz version available.
For example,
- https://github.com/madler/zlib/archive/v1.2.11.zip
- https://github.com/madler/zlib/archive/v1.2.11.tar.gz
- https://github.com/curl/curl/archive/curl-7_68_0.zip
- https://github.com/curl/curl/archive/curl-7_68_0.tar.gz
Notice the pattern: just replace .zip with .tar.gz and pipe to | tar xzf -
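As a sketch of that pattern, assuming the server really does offer a .tar.gz next to the .zip (as in the GitHub URLs listed here): zip_url_to_targz and fetch_targz are hypothetical helper names.

```shell
# Rewrite a URL ending in .zip so that it ends in .tar.gz instead.
# Hypothetical helper: it only edits the suffix and does not check
# that the rewritten URL exists.
zip_url_to_targz() {
    printf '%s\n' "${1%.zip}.tar.gz"
}

# Stream the tarball straight into tar; no temp file is needed because
# tar.gz can be read sequentially.
fetch_targz() {
    wget -qO- "$(zip_url_to_targz "$1")" | tar xzf -
}
```

For example, fetch_targz https://github.com/madler/zlib/archive/v1.2.11.zip would download v1.2.11.tar.gz and extract it into the current directory.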
If there is only one file in the zip, you can use zcat, gunzip or funzip:
wget -qO- http://www.vim.org/scripts/download_script.php?src_id=11834 | funzip -
To unzip more than one file from a zip archive provided through a pipe, you can either wait for the unzip team to implement it or use the busybox unzip solution from Saftever.
Just for information: here are the definitions of gunzip and zcat on my system:
$ grep ^exec $(which gunzip zcat)
/bin/gunzip:exec gzip -d "$@"
/bin/zcat:exec gzip -cd "$@"