How to merge multiple piped awk commands into a single awk command

I am writing a script to filter a file that has contents like

a:10
b:20
c:60
# comment
{{# random mustache templating}}
d=4
e=6

to get the output which would look like

a
b
c
d
e

Here is my command

cat filename.txt | awk '{$1=$1;print}' | awk -F'{{' '{print $1}' | awk -F'=' '{print $1}' | awk -F':' '{print $1}' | awk -F'#' '{print $1}' | awk /./

Purpose:

  • Remove everything in a line from the first occurrence of ‘=’ or ‘:’ onward.
  • Remove lines that start with ‘{{’ to drop the templating.
  • Trim whitespaces at the beginning and end of each line.
  • Remove all blank lines.

As I am new to bash, how can I make this command shorter?

Asked By: borz

||

The field separator can be a full regex, so

awk -F'[:#=]' '!/^{{/ && length($1) > 0 { split($1, a, " "); print a[1] }' filename.txt

is sufficient: any one of ‘:’, ‘#’, ‘=’ will act as a separator. We exclude lines starting with “{{”, match lines where $1 is non-empty, split $1 on whitespace, and print the first resulting field.
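As a rough check of this approach, the sketch below runs it against the sample input from the question. It is not the exact command above: the brace test is written as the bracket expression [{][{], on the assumption that this sidesteps any difference in how awk implementations parse a bare { in a regex. The names sample, keys and parts are illustrative only.

```shell
# Sample input copied from the question.
sample=$(printf '%s\n' 'a:10' 'b:20' 'c:60' '# comment' \
    '{{# random mustache templating}}' 'd=4' 'e=6')

# Split on ':', '#' or '='; skip lines starting with two braces and lines
# whose first field is empty (the comment line, since it begins with '#').
keys=$(printf '%s\n' "$sample" |
    awk -F'[:#=]' '!/^[{][{]/ && length($1) > 0 { split($1, parts, " "); print parts[1] }')

printf '%s\n' "$keys"
```

Because split() with " " as the separator discards leading and trailing whitespace, parts[1] is already trimmed.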

Answered By: Stephen Kitt

Maybe this will help you achieve the expected result:

#!/bin/bash

dynamic_array=()

while read -r line 
do 
    var=$(echo "$line" | cut -c 1)    
    if ! { [ "$var" = '#' ] ||  [ "$var" = '{' ] || [ "$var" = '}' ]; }
    then
        dynamic_array+=("$var")
    fi 
done < A.txt

str_array_value="${dynamic_array[*]}" ; echo "$str_array_value" | tr ' ' '\n' | awk '!seen[$0]++'

Output :

a
b
c
d
e
Answered By: codeholic24

Keep it simple:

$ awk 'NF && ($1 !~ /^(#|{+)/) { sub(/[:=].*/,""); print $1 }' file
a
b
c
d
e
Answered By: Ed Morton

To achieve the result above, I just used regex for the field separator, regex to select the lines and {print $1} to print the first column.

I see no leading whitespace or blank lines in your example, but if you need to deal with these, see my variations to this command below.

awk -F'[:=]' '!/^[#{]/{print $1}' filename.txt

Result:

a
b
c
d
e

If you have leading or trailing whitespace, the following may work, though I will admit that without seeing an example it is tricky for me to visualise.

awk -F'[:=]' '{gsub(/^\s+|\s+$/,"",$1)} !/^[#{]/{print $1}' filename.txt

To cover every possible case, based on your comments, I have adapted the example. Now, we have leading and trailing whitespace and empty lines.

a:10
b :20
  c:60
# comment

 {{# random mustache templating}}
d=4
e =6   

This is the slightly altered command to deal with this:

awk -F'[:=]' '{gsub(/^\s+|\s+$/,"",$1)} !/^[#{]/ && !/^$/{print $1}' filename.txt

  1. The field separator regex separates the first field $1 from everything which comes after : or =
  2. gsub removes all leading and trailing whitespace from $1 (\s is a GNU awk extension)
  3. The patterns before {print $1} skip lines starting with # or { (comments and ‘templating’), and !/^$/ skips blank lines.

This produces the following result from the adapted example:

a
b
c
d
e
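
Since \s is specific to GNU awk, here is a hedged, more portable sketch of the same idea. It swaps \s for [ \t] and tests the trimmed $1 directly instead of matching against the whole record; both changes are my assumptions about what a non-GNU awk needs, not part of the command above. The names adapted and keys are illustrative.

```shell
# Adapted sample input, including the leading/trailing whitespace and blank line.
adapted=$(printf '%s\n' 'a:10' 'b :20' '  c:60' '# comment' '' \
    ' {{# random mustache templating}}' 'd=4' 'e =6   ')

# Trim spaces and tabs around the first field, then print it unless it is
# empty (blank line) or starts with '#' or '{' (comment or templating).
keys=$(printf '%s\n' "$adapted" |
    awk -F'[:=]' '{ gsub(/^[ \t]+|[ \t]+$/, "", $1) }
                  $1 != "" && $1 !~ /^[#{]/ { print $1 }')

printf '%s\n' "$keys"
```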
Answered By: Bumbling Badger

Using sed:

sed -E '{ s/\s*([^:=]*).*/\1/ }; /^({{|#|$)/d' infile

Swap the order of the commands above, as in sed -E '/.../d; { ... }', if you also want to keep lines where whitespace comes before the {{ or # characters instead of the line starting with them immediately.
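
To illustrate how the command order matters, the sketch below runs both orders over a small made-up sample (not the question's file). It uses [[:space:]] in place of \s and [{][{] in place of {{, assuming these are the more portable spellings. The names sample, subst_first and delete_first are illustrative.

```shell
# Small made-up sample with an indented template line.
sample=$(printf '%s\n' 'a:10' '  c:60' '# comment' ' {{# templating}}' '' 'd=4')

# Substitute first, then delete: the indented template line loses its leading
# whitespace before the delete test runs, so it is removed.
subst_first=$(printf '%s\n' "$sample" |
    sed -E 's/^[[:space:]]*([^:=]*).*/\1/; /^([{][{]|#|$)/d')

# Delete first, then substitute: the indented template line does not start
# with {{ when the delete test runs, so it survives and is only trimmed.
delete_first=$(printf '%s\n' "$sample" |
    sed -E '/^([{][{]|#|$)/d; s/^[[:space:]]*([^:=]*).*/\1/')

printf 'substitute first:\n%s\n' "$subst_first"
printf 'delete first:\n%s\n' "$delete_first"
```

Note that the capture group keeps any whitespace sitting directly before the ‘:’ or ‘=’, so a line such as ‘b :20’ would still need its trailing space trimmed separately.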

Answered By: αғsнιη
awk -F "[:=#/{]" '{print $1}' | awk NF
Answered By: ppppp