Unable to understand what the sed commands are doing in this script

I have a script with a function which is sourced in other scripts. I am trying to go line by line but the sed regex is too complicated.

#!/usr/bin/env bash

# This function will update the value associated with a key,
# remove a comment from the beginning of the line,
# or append the key value pair to the end of the file if the key is not found

# To use this function in a script
# source this script:
#   . lineinfile
# To invoke the function:
#   lineinfile "key=value" "filename"
# OR
#   lineinfile "key value" "filename"

lineinfile() {
  [ -s $2 ] || echo "${1}" >> ${2}
  if [[ "$1" == *"="* ]]; then
    sed -i -e "/^#\?.*\(${1%%=*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
  elif [[ "$1" == *" "* ]]; then
    sed -i -e "/^#\?.*\(${1%% *}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
  elif [[ "$1" == *$'\t\t'* ]]; then
    sed -i -e "/^#\?.*\(${1%%$'\t\t'*}\).*/{s@@${1}@;:a;n;ba;q}" -e "\$a${1}" ${2}
  fi
}

The first line of the function, [ -s $2 ] || echo "${1}" >> ${2}, checks if the second positional argument is a file that exists and has a non-zero size, then appends the contents of $1 to the end of the $2 file. Why is || used here?

I am really not sure what the if-elif blocks are testing for. What are *"="*, *" "* and *$'\t\t'* trying to match in the if conditions?
Additionally, I have no idea what the sed commands are doing. The regex is complicated. Can anyone break down the sed commands for me?

Asked By: Cruise5

  1. The || runs the following command if the exit code of the previous command was non-zero (false/error).

    [ -s $2 ] || echo "${1}" >> ${2}

    is equivalent to:

    if ! [ -s $2 ] ; then echo "${1}" >> ${2} ; fi

    Either of these will append the first arg ($1) to the file ($2) if the file either doesn’t exist or is empty. BTW, see the next point, 2, below about quoting.

    BTW, the function could (and should) return at this point. There’s no need to run sed on the file as it now contains the desired value. For example (and using printf instead of echo; see "Why is printf better than echo?"):

    [ -s "$2" ] || { printf '%s\n' "$1" >> "$2" && return; }

    or, better:

    if ! [ -s "$2" ] ; then printf '%s\n' "$1" >> "$2" ; return ; fi

    The braces in the first form matter: || and && have equal precedence in the shell and are evaluated left to right, so without the grouping the return would also run whenever the file is non-empty, and the sed scripts would never be reached.

    With the first form, the return only executes if the previous command (printf) succeeded. The second form always returns, whether printf succeeded or not. There’s no good reason to make the return dependent on the printf succeeding (it’s just a common idiom for "chaining" commands in this kind of short-hand if/then/fi construct). Most of the time, the printf will succeed but sometimes (e.g. permissions or disk full) it will fail. If it fails, the function should return anyway, because the sed scripts will also fail and there’s no point running them. BTW, return without an argument will return the exit code of the last command to be executed, so the caller will be able to detect success or failure.

  2. The author doesn’t seem to understand what quoting is for in shell, or how it works, or that curly-braces, e.g. ${var}, are NOT a substitute for quoting, e.g. "$var". See "$VAR vs ${VAR} and to quote or not to quote" and "Why does my shell script choke on whitespace or other special characters?".

    The author consistently fails to quote $2 when it should be quoted (filenames can contain spaces, tabs, newlines, and shell metacharacters that can break a shell script if used unquoted).

  3. The three if/elif tests check whether the first argument ($1) contains an = symbol, a space, or two tabs. It runs one of three slightly different versions of a sed script depending on which one it finds.

    The sed scripts all check whether the key (the part of $1 before the first =, space, or pair of tabs) appears anywhere on a line of the file, optionally commented out with a leading #. If a match is found, it replaces the whole line with the value of $1 and runs a loop to just read and output the rest of the file. I think the purpose here is to replace only the first occurrence of key=value.

    If the match wasn’t found (and thus the loop and quit wasn’t executed), it appends $1 to the end of the file.

  4. The author seems to be overthinking this (or perhaps underthinking it). This could be done with just one sed script if $1 were first split into key and value variables. i.e. first extract and "normalise" the data from $1 into a consistent form, then use it in just one sed script.

    Or just rewrite the whole thing in perl. It’s a good choice when you want to do something that requires the strengths of both shell and sed (and it has a far better and more capable regex engine than most versions of sed). e.g. replace the if/elif/elif/fi part of the function with something like:

     perl -0777 -i -pe '
          BEGIN { $r = shift; ($key,$val) = split /(=| |\t\t)/, $r };
          s/\z/$r\n/ unless (s/^#?.*\b$key\b.*$/$r/m)' key=value filename

    This perl version works with all three variants (=, space, two tabs; the latter two need to be quoted). It slurps the entire file in at once (-0777 option), and tries to do a multi-line search and replace operation (/m regex modifier). If that operation fails, it appends the first arg (plus a newline) to the end of the file (\z). It also fixes a bug in the original, which fails to distinguish between, e.g., foo=123 and foobar=123. The \b word-boundary markers are used to do this. In sed, you’d use \< and \> to surround the key pattern.

    BTW, the X unless Y construct is just perl syntactic sugar for if not Y, then do X. It could have been written as if (! s/^#?.*$key.*$/$r/m) {s/\z/$r\n/} and would still work exactly the same.

  5. The function name lineinfile is very generic, but what it does is very specific. Worse, the name doesn’t match or even hint at what the function actually does. This is generally considered to be bad practice.
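
For illustration, points 1 to 4 above could be combined roughly as follows. This is only a sketch, not the original author's code: the function name set_config_line and all variable names are made up for this example, and it assumes bash and GNU sed (for -i, the \< and \> word boundaries, and semicolons after labels). It splits the key out of $1 once, escapes it, and then needs only one editing sed command. Values containing newlines are not handled.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name and layout are assumptions for this sketch).
# Usage: set_config_line "key=value" file
#        the separator may be "=", a space, or two tabs
set_config_line() {
  local line=$1 file=$2 key key_re repl re

  # Normalise: extract the key, i.e. everything before the first separator.
  case $line in
    *=*)       key=${line%%=*} ;;
    *' '*)     key=${line%% *} ;;
    *$'\t\t'*) key=${line%%$'\t\t'*} ;;
    *) printf 'no separator in: %s\n' "$line" >&2; return 2 ;;
  esac

  # Missing or empty file: write the line and stop, no sed needed.
  if ! [ -s "$file" ]; then
    printf '%s\n' "$line" >> "$file"
    return
  fi

  # Escape characters that are special in a basic regex (key)
  # or in the replacement part of an s command (new line).
  key_re=$(printf '%s' "$key" | sed 's,[][\.*^$/],\\&,g')
  repl=$(printf '%s' "$line" | sed 's,[\&/],\\&,g')

  # A line, optionally commented out, that contains the key as a whole word.
  re="^#\{0,1\}.*\<${key_re}\>.*"

  if [ -n "$(sed -n "/${re}/{p;q;}" "$file")" ]; then
    # Replace the first matching line, then just copy the rest of the file.
    sed -i "/${re}/{s/.*/${repl}/;:a;n;ba;}" "$file"
  else
    # Key not present: append the line.
    printf '%s\n' "$line" >> "$file"
  fi
}
```

Sourced into a script, set_config_line "foo=9" "some file.conf" would replace the first foo line (commented out or not) and leave a foobar=... line alone, because the key is matched as a whole word. The append is done with printf rather than sed's $a so that the new line does not need a second round of escaping.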

Answered By: cas