Get over 2 GB limit creating PDFs with ImageMagick
I am using convert to create a PDF file from about 2,000 images:
convert 0001.miff 0002.miff ... 2000.miff -compress jpeg -quality 80 out.pdf
The process terminates reproducibly when the output file reaches 2^31 − 1 bytes (2 GiB − 1) with the message
convert: unknown `out.pdf'.
The PDF file specification allows for ≈10 GB (cross-reference offsets are limited to ten decimal digits). I tried to pull more information from -debug all, but I didn't see anything helpful in the logging output. The file system is ext3, which allows files of at least 16 GiB (possibly more). According to ulimit, file size is unlimited. /etc/security/limits.conf contains only commented-out lines. What else can cause this, and how can I increase the limit?
ImageMagick version: 6.4.3 2016-08-05 Q16 OpenMP
Distribution: SLES 11.4 (i586)
Your limitation does not stem from the filesystem, nor, I think, from package versions.
Your 2 GB limit comes from using a 32-bit version of your OS.
The way to get past it is to install a 64-bit version, if the hardware supports it.
Traditionally, many operating systems and their underlying file system
implementations used 32-bit integers to represent file sizes and
positions. Consequently, no file could be larger than 2^32 − 1 bytes (4
GB − 1). In many implementations, the problem was exacerbated by
treating the sizes as signed numbers, which further lowered the limit
to 2^31 − 1 bytes (2 GB − 1).
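To confirm which case you are in, you can check both the kernel and the convert binary directly (file, getconf, and uname are standard tools; the path reported by which may differ on your system):

uname -m                  # i586/i686 indicates a 32-bit kernel
getconf LONG_BIT          # word size of the userland: 32 or 64
file "$(which convert)"   # reports ELF 32-bit or ELF 64-bit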
Try limiting the pixel cache used by convert to e.g. 1 GiB. The -limit settings should come before the input files so that they are already in effect while the images are being read:
convert -limit memory 1GiB -limit map 1GiB 0001.miff ... 2000.miff -compress jpeg -quality 80 out.pdf
Hopefully this will force ImageMagick to regularly dump already-processed data to disk instead of trying to fit more than 2 GiB in RAM buffers.
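To see what resource limits ImageMagick is actually running with, IM 6 can list them itself:

identify -list resource

This prints the area, memory, map, disk, and other limits currently in force, so you can verify the -limit options took effect.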
BTW, the amount of virtual memory available to a single process on 32-bit Linux is defined by the VMSPLIT
kernel config setting. This can be either 2G/2G (2 GB for the kernel + 2 GB for userland) or 1G/3G (1 GB for the kernel + 3 GB for userland). On a running system, the setting can be found via
zcat /proc/config.gz | grep VMSPLIT
On some systems the kernel config is stored in /boot/config-$(uname -r)
instead.
If it weren't for the huge number of photographs, you could use TeX/LaTeX to create the PDF and still get the same outcome (a PDF of images) without the converter crashing. The file limits in TeX should just be those of your system (hardware + OS).
But you can use a shell script to write the TeX:
0) Keep everything in one directory, for tidiness:
mkdir convert
pushd convert
PATH="$PWD:$PATH"    # so the loader script created below can be run by name
1) make a template
1.1) I'm sure there's a way to do this step in one go, by replacing the image name with a variable and inserting rather than appending, and by formatting $FOO to have the correct leading 0's (see the printf sketch after step 1.3.1), but the following is just what I know.
1.2) The template needs to be split so that the script can insert the file name.
1.3) nano tmplt1 (or editor of your choice, "OEOYC" from here on). Its contents, starting with a blank line:

\begin{figure}[h!]
\includegraphics[width=0.5\linewidth]{
(at this point the script will insert $FOO, the file-name variable)
1.3.1) However, your files go 0001.miff … 0010.miff … 0100.miff … 2000.miff, i.e. with a variable number of leading zeros. Workaround: 4 versions of tmplt1: tmplt1-9, tmplt10-99, tmplt100-999, tmplt1000-2000. tmplt1-9 ends "…width]{000" (i.e. it adds three 0's); tmplt10-99 ends "…width]{00" (two 0's); tmplt100-999 adds one zero, and tmplt1000-2000 is the same as tmplt1.
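(Aside: bash's printf can zero-pad for you, which would collapse the four template variants into one; a minimal sketch:

for n in {1..2000}
do
    printf -v FOO '%04d' "$n"    # FOO becomes 0001, 0010, 0100, 2000, ...
    echo "$FOO"
done

printf -v assigns the formatted result to the named variable instead of printing it.)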
1.4) next part of template: nano tmplt2 (OEOYC). Contents:

.miff}
\caption{
(if you want captions, otherwise skip to tmplt3; again, the script will insert $FOO here)
1.5) next part of template: nano tmplt3 (OEOYC). Contents:

}
\label{f:
(if you want the figures labelled; a label is actually an index/reference for the text to refer to, not a caption. Again, the script will insert $FOO here. If you do not want labels, skip to tmplt4)
1.6) next template: nano tmplt4 (OEOYC). Contents:

}
\end{figure}
2) make the beginning of the file: nano head (OEOYC). Contents, ending with a blank line:

\documentclass{article}    % or a more suitable class
\usepackage{graphicx}
\begin{document}
3) make the end of the file: nano foot (OEOYC). Contents:

\end{document}
4) make the script: nano loader (OEOYC)

#!/bin/bash
# Write the TeX source to out.tex. printf '%s' "$(cat …)" is used instead of
# plain cat/echo so that no stray newlines end up inside the file names:
# command substitution strips the trailing newline, and printf adds none.
cat head > out.tex
for FOO in {1..9}
do
    printf '%s' "$(cat tmplt1-9)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt2)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt3)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    cat tmplt4 >> out.tex
done
for FOO in {10..99}
do
    # this looks like a lot, but it is a copy-paste of the first block
    # with the relevant template swapped in
    printf '%s' "$(cat tmplt10-99)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt2)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt3)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    cat tmplt4 >> out.tex
done
for FOO in {100..999}
do
    printf '%s' "$(cat tmplt100-999)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt2)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt3)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    cat tmplt4 >> out.tex
done
for FOO in {1000..2000}
do
    printf '%s' "$(cat tmplt1000-2000)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt2)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    printf '%s' "$(cat tmplt3)" >> out.tex
    printf '%s' "$FOO" >> out.tex
    cat tmplt4 >> out.tex
done
cat foot >> out.tex
5) make script executable: chmod u+x loader
5.1) When I first tested this with echo "$FOO" | cat >> out.tex, every inserted $FOO came out spread over three lines, because echo and the trailing newlines of the template files each break the line; the printf '%s' "$(cat …)" form used above avoids this, so the generated out.tex should need no manual clean-up.
6) call the script: loader (or ./loader if you skipped the PATH line in step 0)
7) compile the TeX: pdflatex out.tex (the result is out.pdf)
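For reference, the whole generator can be collapsed into a single loop using printf zero-padding and a here-document; a sketch under the same assumptions (files 0001.miff … 2000.miff, captions and labels wanted):

#!/bin/bash
# write out.tex in one pass, then compile with: pdflatex out.tex
{
    printf '%s\n' '\documentclass{article}' '\usepackage{graphicx}' '\begin{document}'
    for n in {1..2000}
    do
        printf -v FOO '%04d' "$n"    # zero-pad: 1 -> 0001
        cat <<EOF
\begin{figure}[h!]
\includegraphics[width=0.5\linewidth]{$FOO.miff}
\caption{$FOO}
\label{f:$FOO}
\end{figure}
EOF
    done
    printf '%s\n' '\end{document}'
} > out.tex

One caveat either way: pdflatex itself cannot read MIFF images (it handles PNG, JPEG, and PDF), so the files would first need converting, e.g. with mogrify -format jpg *.miff, with the .miff extension in the templates changed to match.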