Using awk to sum the values of a column, based on the values of another column

I am trying to sum certain numbers in a column using awk. I would like to sum just column 3 of the “smiths” to get a total of 212. I can sum the whole column using awk but not just the “smiths”. I have:

awk 'BEGIN {FS = "|"} ; {sum+=$3} END {print sum}' filename.txt

Also I am using putty. Thank you for any help.

smiths|Login|2
olivert|Login|10
denniss|Payroll|100
smiths|Time|200
smiths|Logout|10
Asked By: jake

||

Very good so far. All you need to do is add a selector before the block to add the sum. Here we check that the first argument contains only “smiths”:

awk 'BEGIN {FS = "|"} ; $1 ~ /^smiths$/ {sum+=$3} END {print sum}'

You could shorten this a bit by specifying the field separator as an option. In awk it’s generally a good idea to initialize variables on the command line:

awk -F'|' '$1 ~ /^smiths$/ {sum+=$3} END {print sum}'
Answered By: RobertL
awk -F '|' '$1 ~ /smiths/ {sum += $3} END {print sum}' inputfilename
  • The -F flag sets the field separator; I put it in single quotes because it is a special shell character.
  • Then $1 ~ /smiths/ applies the following {code block} only to lines where the first field matches the regex /smiths/.
  • The rest is the same as your code.

Note that since you’re not really using a regex here, just a specific value, you could just as easily use:

awk -F '|' '$1 == "smiths" {sum += $3} END {print sum}' inputfilename

Which checks string equality. This is equivalent to using the regex /^smiths$/, as mentioned in another answer, which includes the ^ anchor to only match the start of the string (the start of field 1) and the $ anchor to only match the end of the string. Not sure how familiar you are with regexes. They are very powerful, but for this case you could use a string equality check just as easily.

Answered By: Wildcard

Another approach is to use awk associative arrays, more info here. This line produces the desired output:

awk -F '|' '{a[$1] += $3} END{print a["smiths"]}' filename.txt

As a side effect, the array stores all other values:

awk -F '|' '{a[$1] += $3} END{for (i in a) print i, a[i]}' filename.txt

Output:

smiths 212
denniss 100
olivert 10
Answered By: Andrey
cat filename.txt | grep smiths | awk -F '|' '{sum+=$NF} END {print sum}'
  • -F option to specify separator.
  • $NF is for “last column”.
Answered By: forzagreen

I personally would prefer to keep the awk section as simple as possible and do as much as you can without it. Comingled logic doesn’t take advantage of the power of Unix pipelines and is thus harder to understand, debug or modify for closely related use cases.

cat filename.txt | perl -pe 's{.*|}{}g' | awk '{sum+=$1} END {print sum}'
Answered By: Sridhar Sarnobat
Categories: Answers Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.