How to make bash not assign a \n to a variable?

I have a variable that goes more or less like this:

$ echo "$LIST"
file1: ok
file2: ok
file3:
file4:
file5: ok

Then I need to get the list of files that are not ok:

$ sed '/:\s.\+$/d' <<< "$LIST"
file3:
file4:

That works fine, but when every file in the list is ok, this happens:

$ echo "$LIST"
file1: ok
file2: ok
file3: ok
file4: ok
file5: ok
$ sed '/:\s.\+$/d' <<< "$LIST"
$ NEWLIST=$(sed '/:\s.\+$/d' <<< "$LIST")
$ echo "$NEWLIST"

$ cat -A <<< "$NEWLIST"
$
$ wc -l <<< "$NEWLIST"
1
$ wc -c <<< "$NEWLIST"
1

This newline (I think it is a \n) added to the variable is making a mess of my program: it reports one file as listed, because I use wc -l to count the files there. I’m not entirely sure if the \n is assigned by bash or by sed. Does anyone know a workaround?

Asked By: Adriano_epifas

||

I’m not entirely sure if the \n is assigned by bash or by sed.

It is neither at this point. You can check it with, for example:

$ echo -n "$NEWLIST"  # echo without -n adds a newline
$ wc -l <<< ""
1
$ wc -c <<< ""
1

It is echo that intentionally prints your variable (which is empty) followed by a newline. For the here-string, Bash ensures that the input ends with a newline, because tools tend to behave surprisingly if the last line does not end with a \n.

The simplest way to fix it is probably to just check whether NEWLIST is empty. Alternatively, work with files directly. For example:

list_file="$(mktemp)"
new_list_file="$(mktemp)"

# cleanup
trap 'rm "$list_file" "$new_list_file"' EXIT

echo "$LIST" > "$list_file"
sed '/:\s.\+$/d' "$list_file" > "$new_list_file"
wc -l "$new_list_file"
# or wc -l < "$new_list_file" if you want to prevent it from printing the filename
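The empty-check mentioned above can be sketched roughly as follows. The helper name count_lines and the sample lists are made up for illustration, and the sed pattern is simplified to a portable /: ok$/ on the assumption that ok entries always end in ": ok".

```shell
#!/usr/bin/env bash
# count_lines is a hypothetical helper name, not something from the question.
# It reports 0 for an empty string instead of the 1 that `wc -l <<< ""` gives.
count_lines() {
    if [ -z "$1" ]; then
        echo 0
    else
        wc -l <<< "$1"
    fi
}

# Sample data (assumed): one file is not ok.
LIST=$'file1: ok\nfile2: ok\nfile3:'
NEWLIST=$(sed '/: ok$/d' <<< "$LIST")
count_lines "$NEWLIST"      # NEWLIST holds one line, "file3:"

# Sample data (assumed): every file is ok.
ALL_OK=$'file1: ok\nfile2: ok'
NEWLIST=$(sed '/: ok$/d' <<< "$ALL_OK")
count_lines "$NEWLIST"      # NEWLIST is empty, so the guard prints 0
```

The guard only handles the completely empty case; a non-empty value still goes through the here-string, which supplies the final newline that wc -l needs.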

For an example of an unexpected result when no trailing newline is added (as would happen if the here-string did not add one), think about what result you expect from the following command and then run it:

echo -n "Content" | wc -l

The answer is 0, because wc -l counts newline characters and the input contains none.

Answered By: Paul Pazderski

If you just want to exclude ok items, you could do this:

grep -cv ': ok$' <<< "$LIST"

Or, similarly, count the opposite case, the lines with nothing after the colon:

grep -c ':$' <<< "$LIST"

EDIT: based on comment from @ilkkachu

If the list is entirely empty, the -vc variant will erroneously report a count of 1, because <<< always appends a newline and grep then sees one empty line.

You could either add a guard to detect an empty LIST or simply use a pipe instead:

printf "%s" "$LIST" | grep -vc ": ok$"
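The guard option mentioned above might look something like this. It is only a sketch: count_not_ok is a hypothetical function name, and the sample list is invented for the demonstration.

```shell
#!/usr/bin/env bash
# count_not_ok is a hypothetical helper: it counts the entries that do not
# end in ": ok", and short-circuits on an empty list so that the newline
# added by <<< is never counted as an entry.
count_not_ok() {
    if [ -z "$1" ]; then
        echo 0
    else
        grep -vc ': ok$' <<< "$1"
    fi
}

# Sample data (assumed): two entries are not ok.
LIST=$'file1: ok\nfile2:\nfile3:'
count_not_ok "$LIST"

# An empty list hits the guard and never reaches grep.
count_not_ok ""
```

One thing to keep in mind: grep exits with status 1 when it selects no lines, even with -c, so the function returns non-zero when it prints 0 via grep. That matters if the script runs under set -e.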

If the list could contain blank lines, they would also cause a miscount when using -vc.

In either case, the pattern can be extended so that empty lines are not counted:

grep -vcE "^$|: ok$"

But now we are starting to jump through hoops, which makes the code harder to understand.

Answered By: bxm

A here-string <<< adds a newline at the end of the string before passing it to the command, and a command substitution removes any trailing newlines after reading the output of the command inside.

Also, echo adds a trailing newline to what it prints, but that’s not a major issue here.

So, let’s say you have one complete line in the variable, with the newline at the end, so the string is file1: ok<nl>

Now, you run sed '/ok/d' <<< "$LIST", and here, sed gets the input file1: ok<nl><nl>. It removes any lines that contain ok, and outputs the empty line <nl>.

You capture that with a command substitution, which removes the trailing newlines, giving the empty string. That’s then assigned to NEWLIST.

Then, echo "$NEWLIST" prints a single newline (because echo adds one), and wc -l <<< "$NEWLIST" gives a single newline as input to wc (because the here-string adds one).

You’d get the same end result if the variable was initially just file1: ok, without the trailing newline, except that the command substitution would have no newlines to remove from the end of the output of sed.
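The walk-through above can be checked by counting bytes at each step. This is just a sketch; the variable names bytes_in, bytes_out and lines_seen are made up for the demonstration, and the $(( )) wrappers are only there to strip the padding that some wc implementations print.

```shell
#!/usr/bin/env bash
# One complete line, trailing newline included: 9 characters + 1 newline.
LIST=$'file1: ok\n'

# The here-string appends another newline, so sed receives 11 bytes.
bytes_in=$(wc -c <<< "$LIST")

# sed deletes the "ok" line and passes the remaining empty line through: 1 byte.
bytes_out=$(sed '/ok/d' <<< "$LIST" | wc -c)

# The command substitution strips that trailing newline: empty string.
NEWLIST=$(sed '/ok/d' <<< "$LIST")

# The here-string adds a newline back, so wc -l counts one line.
lines_seen=$(wc -l <<< "$NEWLIST")

echo "sed input:      $((bytes_in)) bytes"
echo "sed output:     $((bytes_out)) bytes"
echo "NEWLIST length: ${#NEWLIST}"
echo "wc -l sees:     $((lines_seen)) line(s)"
```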

All of that means both command substitutions and here-strings mostly work well for single-line values, where you might not want the newline to appear in a variable but often do want to provide any commands with a complete line as input. As you saw, they do work for multiline strings too (if the final newline is again missing from the variable), but the odd dropping and adding of the newline still happens. It breaks down in the case where there is no newline to remove.

To see why that juggling is done, note that if command substitution didn’t remove the newline from the output of date here, this would print a newline just before the period at the end, breaking the line in the middle.

$ weekday=$(date +%A)
$ echo "today is a $weekday."
today is a Thursday.

The simplest workaround is likely to just store multiline data like that in files instead. With inputfile containing the five lines (with the appropriate newlines):

file1: ok
file2: ok
file3: ok
file4: ok
file5: ok

then:

tmpfile=$(mktemp)
sed -e '/ok/d' < inputfile > "$tmpfile"
wc -l < "$tmpfile"
rm -f "$tmpfile"

outputs 0.

(Note that neither \s nor \+ is standard POSIX basic or extended regex syntax, so they don’t work on all systems, e.g. with the sed on macOS. That’s why I used just /ok/ above.)

Answered By: ilkkachu