awk high precision arithmetic

I am looking for a way to tell awk to do high-precision arithmetic in a substitution operation. This involves reading a field from a file and replacing it with that value incremented by 1%. However, I am losing precision in the process. Here is a simplified reproduction of the problem:

 $ echo 0.4970436865354813 | awk '{gsub($1, $1*1.1)}; {print}'
   0.546748

The input has 16 digits after the decimal point, but awk prints only six. Using printf, I get the same result:

$ echo 0.4970436865354813 | awk '{gsub($1, $1*1.1)}; {printf("%.16G\n", $1)}'
0.546748

Any suggestions on how to get the desired precision?

Asked By: Ketan Maheshwari

$ echo 0.4970436865354813 | awk -v CONVFMT=%.17g '{gsub($1, $1*1.1)}; {print}'
0.54674805518902947

Or rather here:

$ echo 0.4970436865354813 | awk '{printf "%.17g\n", $1*1.1}'
0.54674805518902947

is probably the best you can achieve. Use bc instead for arbitrary precision.

$ echo '0.4970436865354813 * 1.1' | bc -l
.54674805518902943
Answered By: Stéphane Chazelas

For higher precision with (GNU) awk (with bignum compiled in) use:

$ echo '0.4970436865354813' | awk -M -v PREC=100 '{printf("%.18f\n", $1)}'
0.497043686535481300

The PREC=100 means 100 bits instead of the default 53 bits.
If that awk is not available, use bc:

$ echo '0.4970436865354813*1.1' | bc -l
.54674805518902943

Or you will need to learn to live with the inherent imprecision of floats.
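
As a rough illustration of that imprecision, the sketch below assumes an awk that stores numbers as IEEE 754 doubles (the usual case when -M is not in effect). The variable names `same` and `digits` are only for this example. It checks that 53 bits cannot hold 2^53 + 1, and estimates how many decimal digits 53 bits carry:

```shell
# 2^53 + 1 is not representable in a double, so it rounds back to 2^53
# and the comparison is true (awk prints 1 for true).
same=$(awk 'BEGIN { print (2^53 + 1 == 2^53) }')

# Decimal digits carried by 53 bits: 53 * log10(2), just under 16.
digits=$(awk 'BEGIN { printf "%.2f", 53 * log(2) / log(10) }')

echo "2^53 + 1 == 2^53: $same"
echo "decimal digits in 53 bits: $digits"
```

That figure of just under 16 digits is why 15 digits are safe while the 16th and 17th are not guaranteed.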


In your original lines there are several issues:

  • A factor of 1.1 is a 10% increase, not 1% (that would be a 1.01 multiplier). I’ll use 10%.
  • The format used to convert a number to a string is given by CONVFMT. Its default value is %.6g, which limits the result to 6 significant digits. That conversion is applied to $1*1.1 when gsub uses it as the replacement string.

    $ a='0.4970436865354813'
    $ echo "$a" | awk '{printf("%.16f\n", $1*1.1)}'
    0.5467480551890295
    
    $ echo "$a" | awk '{gsub($1, $1*1.1)}; {printf("%.16f\n", $1)}'
    0.5467480000000000
    
  • The printf format g removes trailing zeros:

    $ echo "$a" | awk '{gsub($1, $1*1.1)}; {printf("%.16g\n", $1)}'
    0.546748
    
    $ echo "$a" | awk '{gsub($1, $1*1.1)}; {printf("%.17g\n", $1)}'
    0.54674800000000001
    

    Both issues could be solved with:

    $ echo "$a" | awk '{printf("%.17g\n", $1*1.1)}'
    0.54674805518902947
    

    Or

    $ echo "$a" | awk -v CONVFMT=%.30g '{gsub($1, $1*1.1)}; {printf("%.17f\n", $1)}'
    0.54674805518902947 
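
When the value is one field among several, a possible alternative to gsub is to assign the formatted string to the field directly. This is only a sketch: the input line and the `some-label` second field are made up for illustration. Assigning the result of sprintf avoids the implicit CONVFMT conversion, and it also avoids gsub treating $1 as a regular expression (where the dot matches any character):

```shell
# Hypothetical input line with the value in the first field.
line='0.4970436865354813 some-label'

# sprintf formats the product explicitly; assigning to $1 rebuilds $0
# with the other fields left untouched.
result=$(echo "$line" | awk '{ $1 = sprintf("%.17g", $1 * 1.1); print }')

echo "$result"
```

This prints `0.54674805518902947 some-label`, the same 17-digit value as the printf version.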
    

But don’t get the idea that this means higher precision. Internally the number is still a double-precision float. That means 53 bits of precision, and with that you can only be sure of 15 correct decimal digits, even if up to 17 digits often look correct. That’s a mirage.

$ echo "$a" | awk -v CONVFMT=%.30g '{gsub($1, $1*1.1)}; {printf("%.30f\n", $1)}'
0.546748055189029469325134868996

The correct value is:

$ echo "scale=18; 0.4970436865354813 * 1.1" | bc
.54674805518902943

This could also be calculated with (GNU) awk if the bignum library has been compiled in:

$ echo "$a" | awk -M -v PREC=100 -v CONVFMT=%.30g '{printf("%.30f\n", $1)}'
0.497043686535481300000000000000
Answered By: user232326

My awk script is bigger than just a one-liner, so I used a combination of Stéphane Chazelas’s and Isaac’s answers:

  1. I set the CONVFMT variable, which globally takes care of the output formatting
  2. I also use the bignum parameter -M along with the PREC variable

Example snippet:

#!/usr/bin/awk -M -f
BEGIN {
  FS="<|>"
  CONVFMT="%.18g"
  PREC=100
}
{
  if ($2 == "LatitudeDegrees") {
    CORR = $3 # redacted specific corrections
    print("     <LatitudeDegrees>" CORR "</LatitudeDegrees>");
  } else if ($2 == "LongitudeDegrees") {
    CORR = $3 # redacted specific corrections
    print("     <LongitudeDegrees>" CORR "</LongitudeDegrees>");
  } else {
    print($0);
  }
}
END {
}

The OP simplified his example, but if the awk script is not a one-liner you don’t want to pollute it with printfs; instead, set the format once in the variable like this. Likewise, set the precision in the script so it doesn’t get lost in the actual command-line invocation.
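
One detail worth keeping in mind with this approach: CONVFMT applies when a number is converted to a string (for example by concatenation, as in the print statements above), while a bare number passed to print is formatted with OFMT. The sketch below uses a made-up value and a made-up `<v>` tag, and assumes ordinary double arithmetic without -M:

```shell
out=$(awk 'BEGIN {
  CONVFMT = "%.17g"
  x = 1 / 3
  print x                 # bare number: formatted with OFMT (default %.6g)
  print "<v>" x "</v>"    # concatenation: converted with CONVFMT
}')

echo "$out"
```

The first line comes out as `0.333333` and the second as `<v>0.33333333333333331</v>`, so a script that prints bare numbers may want to set OFMT as well.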

Answered By: Csaba Toth