How to define multiple page ranges for pdftk with a bash variable

I’m using Arch Linux, the Openbox window manager, and bash.
Everything is up to date with the latest versions.

Can anyone tell me why I can’t get the "$page_range" variable to show up within pdftk when I specify a couple of page ranges as 3-5 7-9?

When I specify only one page range 3-5 in my yad pop-up box everything works as it should.

pdftk does allow more than one page range in a single command. Indeed, when I type the command on the command line without any bash variables, pdftk works as expected and takes the page ranges 3-5 7-9. It only fails when that value is held in the variable "$page_range".

All I want to do is extract page ranges 3-5 and 7-9 from one PDF file into another PDF file, using the variable $page_range to define my ranges.

Here is my simple script.


# collect the values with yad

extract_values=$(yad --form --width=200 \
--title="Enter the page ranges you wish to extract" \
--text="\n\n  Enter the page ranges you wish to extract\n    as eg 301-302\n    or 301-302 305-306\n     for grouping" \
--field="Page range":text "11-13 21-23" \
--button="Edit script":1)

# strip out the values from the string
page_range=$(echo $extract_values | cut -d '|' -f  1)
echo $page_range

# produce a unique file extender 
page_range_slugify="$(echo "$page_range" | sed 's/ /_/g')" 
echo;echo $page_range_slugify

# specify the filename

# get path and file name without pdf extension

# check everything is as it should be
yad --text="\n page range = $page_range\n page_range_slugify = $page_range_slugify\n file + path without file extension = $fz\n\n"

# below works only for one range but will not expand for two page ranges
pdftk "$f" cat "$page_range" output "$fz"_"$page_range_slugify".pdf

# below takes one range only as above 
#pdftk "$f" cat "$(printf %s "$page_range")" output "$fz"_"$page_range_slugify".pdf

# below takes both ranges when ranges are directly placed within the command
#pdftk "$f" cat 3-5 7-9 output "$fz"_"$page_range_slugify".pdf
Asked By: Kes


The solution is to not double quote the variable $page_range.
At least it gets the script working functionally.

Do this: $page_range
Not this: "$page_range"

For some reason pdftk does not like the double-quoted expansion of that particular variable.

My first guess was that some bug in pdftk made it eat one of the quotation marks but not the other at that position, causing it to fail.
But that can’t be it, because:
page_range="3-5" works when expanded as "$page_range" (double quoted, no space)
page_range="3-5 7-9" does not work when expanded as "$page_range" (double quoted)

So it must be something to do with the space in the middle of the page ranges when double quoting and the way this is expanded or the way pdftk sees it.

Anyone any ideas?

Even though it all now works without the quotation marks around the $page_range variable, this is very odd.

Normally, putting quotation marks around a variable in bash is the safe choice. We are all used to doing this in case a file path or name we are processing contains the dreaded spaces!
It’s thus very odd that not double quoting handles the spaces and quoting does not.
How strange.

Another thought is that the space may expand to a particular kind of space character, in a character encoding that pdftk does not like.

Answered By: Kes

This is happening because you are doing the right thing: you are quoting your variables. However, because the variable is quoted, the two ranges are passed to pdftk as a single argument, "3-5 7-9", while pdftk expects each range as a separate argument. In this specific case, where you know and control the variable’s value, you might be able to get away with not quoting. That does not hold in general, and since it looks like you’re asking users for input, they could pass anything to the script, which makes the unquoted variable a security risk. The clean solution is to use an array instead. Try this:

page_range=( $(printf '%s\n' "$extract_values" | cut -d '|' -f  1) )

You can then pass that as "${page_range[@]}" and have both the benefits of safely quoting your variables, and the ease of use of having multiple ranges in a variable.
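
To make the difference concrete, here is a minimal sketch that counts how many arguments a command receives under each form of expansion. count_args is a hypothetical stand-in for pdftk and is not part of the original script; the range values are the ones from the question.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

page_range="3-5 7-9"

# Quoted scalar: the whole string, space included, is a single argument.
quoted_count=$(count_args "$page_range")

# Unquoted scalar: the shell splits on whitespace (and would also expand globs).
unquoted_count=$(count_args $page_range)

# Array: one element per range; "${ranges[@]}" yields one argument per element.
ranges=( $page_range )
array_count=$(count_args "${ranges[@]}")

echo "quoted scalar:   $quoted_count"    # prints 1
echo "unquoted scalar: $unquoted_count"  # prints 2
echo "quoted array:    $array_count"     # prints 2
```

The quoted array gives the same argument count as the unquoted scalar, but each element is passed through untouched once it is in the array.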

So, the relevant lines in your script become:

page_range=( $(printf '%s\n' "$extract_values" | cut -d '|' -f  1) )

[ . . . ]
## With thanks to
page_range_slugify="$(IFS="_" ; printf '%s\n' "${page_range[*]}")"

[ . . . ]
pdftk "$f" cat "${page_range[@]}" output "${fz}_$page_range_slugify".pdf
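
Since the ranges come from user input, it may also be worth checking them before they reach pdftk. The following is only a sketch under stated assumptions: parse_ranges is a hypothetical helper, and its pattern deliberately accepts nothing but plain N or N-M tokens, which is narrower than what pdftk itself allows (keywords such as end would be rejected). It fills the array with read -a, which splits on whitespace without filename expansion.

```shell
#!/usr/bin/env bash
# parse_ranges is a hypothetical helper, not part of pdftk or yad.
# It splits its argument on whitespace into the global array page_range and
# returns non-zero if the input is empty or any token is not N or N-M.
parse_ranges() {
    local input=$1 token
    # read -a splits on IFS but does not perform filename expansion
    read -r -a page_range <<< "$input"
    (( ${#page_range[@]} > 0 )) || return 1
    for token in "${page_range[@]}"; do
        [[ $token =~ ^[0-9]+(-[0-9]+)?$ ]] || return 1
    done
}

if parse_ranges "3-5 7-9"; then
    # Join the elements with underscores for the output file name suffix
    slug=$(IFS=_; printf '%s' "${page_range[*]}")
    echo "ranges: ${#page_range[@]}, slug: $slug"   # prints: ranges: 2, slug: 3-5_7-9
else
    echo "invalid page range" >&2
fi
```

In the script, the literal "3-5 7-9" would be replaced by the first field of "$extract_values".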
Answered By: terdon