Unable to understand what the sed commands are doing in this script
I have a script with a function which is sourced in other scripts. I am trying to go line by line but the sed regex is too complicated.
#!/usr/bin/env bash
# This function will update the value associated with a key,
# remove a comment from the beginning of the line,
# or append the key value pair to the end of the file if the key is not found
# To use this function in a script
# source this script:
# . lineinfile
#
# To invoke the function:
# lineinfile "key=value" "filename"
# OR
# lineinfile "key value" "filename"
lineinfile() {
    [ -s $2 ] || echo "${1}" >> ${2}
    if [[ "$1" == *"="* ]]; then
        sed -i -e "/^#\?.*\(${1%%=*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
    elif [[ "$1" == *" "* ]]; then
        sed -i -e "/^#\?.*\(${1%% *}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
    elif [[ "$1" == *$'\t\t'* ]]; then
        sed -i -e "/^#\?.*\(${1%%$'\t\t'*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
    fi
}
The first line of the function, [ -s $2 ] || echo "${1}" >> ${2}, checks if the second positional argument is a file that exists and has a non-zero size, then appends the contents of $1 to the end of the $2 file. Why is || used here?
I am really not sure what the if-elif blocks are testing for. What are *"="*, *" "* and *$'\t\t'* trying to match in the if conditions?
Additionally, I have no idea what the sed commands are doing. The regex is complicated. Can anyone break down the sed commands for me?
1. The || runs the following command if the exit code of the previous command was non-zero (false/error).

    [ -s $2 ] || echo "${1}" >> ${2}

is equivalent to:

    if ! [ -s $2 ] ; then echo "${1}" >> ${2} ; fi

Either of these will append the first arg ($1) to the file ($2) if the file either doesn’t exist or is empty. BTW, see the next point, 2, below about quoting.

BTW, the function could (and should) return at this point. There’s no need to run sed on the file as it now contains the desired value. For example (and using printf instead of echo, see Why is printf better than echo?):

    [ -s "$2" ] || { printf '%s\n' "$1" >> "$2" && return ; }

or, better:

    if ! [ -s "$2" ] ; then printf '%s\n' "$1" >> "$2" ; return ; fi

With the first form, the return only executes if the previous command (printf) succeeded. The braces are required: || and && have equal precedence and associate left to right, so without them the return would also run whenever the [ -s "$2" ] test succeeded, i.e. whenever the file was already non-empty. The second form always returns, whether printf succeeded or not. There’s no good reason to make the return dependent on the printf succeeding (it’s just a common idiom for "chaining" commands in this kind of short-hand if/then/fi construct). Most of the time, the printf will succeed but sometimes (e.g. permissions or disk full) it will fail. If it fails, the function should return anyway: the sed scripts will also fail, so there’s no point running them. BTW, return without an argument will return the exit code of the last command to be executed, so the caller will be able to detect success or failure.
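The grouping of || and && in the one-line form is easy to get wrong, so here is a minimal sketch of the difference. The functions ungrouped and grouped, and the "seed" text they write, are made up for this demonstration and are not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical demo functions; names and the "seed" text are illustrative only.

# Without braces the shell parses this as ( test || printf ) && { ...return },
# so the early return also fires when the file is already non-empty.
ungrouped() {
    [ -s "$1" ] || printf '%s\n' seed >> "$1" && { echo "returned early"; return 0; }
    echo "continued"
}

# With braces the early return belongs only to the "file was empty" branch.
grouped() {
    [ -s "$1" ] || { printf '%s\n' seed >> "$1"; echo "returned early"; return 0; }
    echo "continued"
}

tmp=$(mktemp)
printf 'existing=1\n' > "$tmp"    # make the file non-empty

ungrouped_result=$(ungrouped "$tmp")
grouped_result=$(grouped "$tmp")
echo "ungrouped: $ungrouped_result"
echo "grouped:   $grouped_result"
rm -f "$tmp"
```

With a non-empty file, only the grouped version goes on to the code after the test, which is what the function needs in order to reach its sed commands.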
2. The author doesn’t seem to understand what quoting is for in shell, or how it works, or that curly braces, e.g. ${var}, are NOT a substitute for quoting, e.g. "$var". See $VAR vs ${VAR} and to quote or not to quote and Why does my shell script choke on whitespace or other special characters?.

The author consistently fails to quote $2 when it should be quoted (filenames can contain spaces, tabs, newlines, and shell metacharacters that can break a shell script if used unquoted).
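As a small illustration of the quoting point, the sketch below runs the same -s test against a file whose name contains a space. The directory and file name are invented for the demo, and it assumes the temporary directory path itself contains no whitespace.

```shell
#!/usr/bin/env bash
# Illustrative only: the file name below is made up for this demo.
dir=$(mktemp -d)
file="$dir/my config.txt"         # a file name containing a space
printf 'key=value\n' > "$file"

# Quoted: the whole name reaches [ as a single argument.
if [ -s "$file" ]; then quoted="non-empty"; else quoted="empty or missing"; fi

# Unquoted: word splitting turns the name into two arguments, so [ receives
# "-s", ".../my" and "config.txt" and fails with an error instead of testing the file.
if [ -s $file ] 2>/dev/null; then unquoted="non-empty"; else unquoted="empty or missing"; fi

# Braces alone do not help: unquoted ${file} is split exactly like $file.
set -- ${file}
brace_words=$#

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
echo "unquoted \${file} expands to $brace_words words"
rm -rf "$dir"
```

The unquoted test reports the file as empty or missing even though it exists and has content, which in the original function would cause the key to be appended when it should not be.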
3. The three if/elif tests check whether the first argument ($1) contains an = symbol, a space, or two tabs. It runs one of three slightly different versions of a sed script depending on which one it finds.

The sed scripts all check whether their variant of the "key" (the part of $1 before the first =, space, or pair of tabs) is in the file, optionally commented out with a #. If a match is found, it replaces the matching line with the value of $1 and runs a loop to just read and output the rest of the file. I think the purpose here is to replace only the first occurrence of key=value.

If the match wasn’t found (and thus the loop and quit weren’t executed), it appends $1 to the end of the file.
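The tests and the key extraction can be tried on their own, without sed. The sketch below uses a hypothetical helper named classify (not part of the original script) that repeats the three [[ ... == *sep* ]] pattern tests and prints which branch was taken along with the key produced by the matching ${1%%sep*} expansion. The sample arguments are invented.

```shell
#!/usr/bin/env bash
# classify is a hypothetical helper written only for this illustration.
classify() {
    if [[ "$1" == *"="* ]]; then           # $1 contains an = anywhere
        printf 'equals:%s\n' "${1%%=*}"     # drop everything from the first = onwards
    elif [[ "$1" == *" "* ]]; then         # $1 contains a space
        printf 'space:%s\n' "${1%% *}"      # drop everything from the first space onwards
    elif [[ "$1" == *$'\t\t'* ]]; then     # $1 contains two consecutive tabs
        printf 'tabs:%s\n' "${1%%$'\t\t'*}" # drop everything from the first tab pair onwards
    else
        printf 'none:%s\n' "$1"             # no separator: the original function does nothing
    fi
}

r1=$(classify "PermitRootLogin=no")
r2=$(classify "PermitRootLogin no")
r3=$(classify $'PermitRootLogin\t\tno')
r4=$(classify "PermitRootLogin")
r5=$(classify "a b=c")                      # = is tested first, so the key is "a b"
printf '%s\n' "$r1" "$r2" "$r3" "$r4" "$r5"
```

The last call shows that the order of the tests matters: an argument containing both a space and an = is always treated as the key=value form.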
4. The author seems to be overthinking this (or perhaps underthinking it). This could be done with just one sed script if $1 were first split into key and value variables, i.e. first extract and "normalise" the data from $1 into a consistent form, then use it in just one sed script.

Or just rewrite the whole thing in perl: it’s a good choice when you want to do something that requires the strengths of both shell and sed (and it has a far better and more capable regex engine than most versions of sed). E.g. replace the if/elif/elif/fi part of the function with something like:

    perl -0777 -i -pe '
      BEGIN { $r = shift; ($key, $val) = split /=| |\t\t/, $r, 2 };
      s/\z/$r\n/ unless (s/^#?.*\b$key\b.*$/$r/m)' key=value filename

This perl version works with all three variants (=, space, two tabs; the latter two need to be quoted). It slurps the entire file in at once (-0777 option), and tries to do a multi-line search and replace operation (/m regex modifier). If that operation fails, it appends the first arg (plus a newline) to the end of the file (\z). It also fixes a bug in the original, which fails to distinguish between, e.g., foo=123 and foobar=123. The \b word-boundary markers are used to do this. In sed, you’d use \< and \> to surround the key pattern.

BTW, the X unless Y construct is just perl syntactic sugar for "if not Y, then do X". It could have been written as if (! s/^#?.*\b$key\b.*$/$r/m) {s/\z/$r\n/} and would still work exactly the same.
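For comparison, the "normalise first, then one sed script" idea might look roughly like the sketch below. This is not the original author's code: set_key is an invented name, the script assumes GNU sed (for -i, \?, \< and \>, and the one-line a command), it treats a single tab rather than two as a separator, it only matches the key at the start of a line, and it assumes the key and line contain nothing special to sed such as /, @, & or backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: set_key is an illustrative name, not from the original script.
# Assumes GNU sed and input free of sed-special characters (/, @, &, backslash).
set_key() {
    local line="$1" file="$2" key tab
    tab=$(printf '\t')

    # Empty or missing file: write the line and stop.
    if ! [ -s "$file" ]; then
        printf '%s\n' "$line" >> "$file"
        return
    fi

    # Normalise: the key is whatever precedes the first =, space, or tab.
    key=${line%%=*}
    key=${key%% *}
    key=${key%%"$tab"*}

    # One sed script for all variants:
    #   address   an optionally commented-out line starting with the whole word $key
    #   s@.*@..@  replace that entire line with the new line
    #   :a n ba   loop: print the current line, read the next, until input runs out
    #   $a        only reached if nothing matched: append the line after the last line
    sed -i -e "/^#\?[[:space:]]*\<${key}\>/{
        s@.*@${line}@
        :a
        n
        ba
    }" -e "\$a${line}" "$file"
}

demo=$(mktemp)
printf '%s\n' '#foo=1' 'foobar=2' > "$demo"
set_key 'foo=9' "$demo"    # uncomments and updates foo, leaves foobar alone
set_key 'baz=3' "$demo"    # key not present, so the line is appended
result=$(cat "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

Putting each sed command on its own line avoids the question of where a label name ends, which is why the original needed a ; after ba.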
5. The function name lineinfile is very generic, but what it does is very specific. Worse, the name doesn’t match or even hint at what the function actually does. This is generally considered to be bad practice.