I was halfway there. That's an old bug.
Philip Guenther <[email protected]> wrote:

> On Sat, Jun 6, 2020 at 5:08 PM Zé Loff <[email protected]> wrote:
>
> > On Sat, Jun 06, 2020 at 03:51:58PM -0700, Jordan Geoghegan wrote:
> > > I'm working on a simple awk snippet to convert the IP range data listed in
> > > the Extended Delegation Statistics data from ARIN [1] and convert it into
> > > CIDR blocks. I have a snippet that works perfectly fine on mawk and gawk,
> > > but not on the base system awk. I'm 99% sure I'm not using any GNUisms, as
> > > when I break the command up into two parts, it works perfectly.
> > >
> > > The snippet below does not work with base awk, but does work with gawk and
> > > mawk: (Running on 6.6 -stable system)
> > >
> > > awk -F '|' '{ if ( $3 == "ipv4" && $2 == "US") printf("%s/%d\n", $4,
> > > 32-log($5)/log(2))}' delegated-arin-extended-latest.txt
> > >
> > >
> > > The command does output data, but it also throws errors for certain lines:
> > >
> > > awk: log result out of range
> > >  input record number 94027, file delegated-arin-extended-latest.txt
> > >  source line number 1
> > >
> > > Most CIDR blocks are calculated correctly, but about 10% of them have errors
> > > (ie something that should calculated to be a /24 is instead calculated to be
> > > a /30).
> > > ...
> >
> > I have no idea about what is going on, but FWIW I can reproduce this on
> > i386 6.7-stable and amd64 6.7-current (well, current-ish, #232).
> > Truncating the file to a single offending line produces the same result:
> > log($5) is out of range.
> >
> > It appears to have something to do with the last field.  Removing it or
> > changing some of its characters seems to work, e.g.:
> >
> > arin|US|ipv4|216.250.144.0|4096|20050503|allocated|5e58386636aa775c2106140445cf2c30
> > arin|US|ipv4|216.250.144.0|4096|20050503|allocated|5a58386636aa775c2106140445cf2c30
> >                                                     ^
> > Fails on the first line but works on the second.
> >
>
> Hah!  Nice observation!
>
> The last field of the first line looks kinda like a number in scientific
> notation, but when awk internally tries to set up the fields it generates
> an ERANGE error...and the global errno variable is left with that value.
> Several builtins in awk, including log(), perform operations and then check
> whether errno is set to EDOM or ERANGE but fail to clear errno beforehand.
>
> The fix is to zero errno before all the code sequences that use the
> errcheck() function, ala:
>
> --- run.c	13 Aug 2019 10:45:56 -0000	1.44
> +++ run.c	7 Jun 2020 03:14:38 -0000
> @@ -26,6 +26,7 @@ THIS SOFTWARE.
>  #define DEBUG
>  #include <stdio.h>
>  #include <ctype.h>
> +#include <errno.h>
>  #include <setjmp.h>
>  #include <limits.h>
>  #include <math.h>
> @@ -1041,8 +1042,10 @@ Cell *arith(Node **a, int n)	/* a[0] + a
>  	case POWER:
>  		if (j >= 0 && modf(j, &v) == 0.0)	/* pos integer exponent */
>  			i = ipow(i, (int) j);
> -		else
> +		else {
> +			errno = 0;
>  			i = errcheck(pow(i, j), "pow");
> +		}
>  		break;
>  	default:	/* can't happen */
>  		FATAL("illegal arithmetic operator %d", n);
> @@ -1135,8 +1138,10 @@ Cell *assign(Node **a, int n)	/* a[0] =
>  	case POWEQ:
>  		if (yf >= 0 && modf(yf, &v) == 0.0)	/* pos integer exponent */
>  			xf = ipow(xf, (int) yf);
> -		else
> +		else {
> +			errno = 0;
>  			xf = errcheck(pow(xf, yf), "pow");
> +		}
>  		break;
>  	default:
>  		FATAL("illegal assignment operator %d", n);
> @@ -1499,12 +1504,15 @@ Cell *bltin(Node **a, int n)	/* builtin
>  		u = strlen(getsval(x));
>  		break;
>  	case FLOG:
> +		errno = 0;
>  		u = errcheck(log(getfval(x)), "log"); break;
>  	case FINT:
>  		modf(getfval(x), &u); break;
>  	case FEXP:
> +		errno = 0;
>  		u = errcheck(exp(getfval(x)), "exp"); break;
>  	case FSQRT:
> +		errno = 0;
>  		u = errcheck(sqrt(getfval(x)), "sqrt"); break;
>  	case FSIN:
>  		u = sin(getfval(x)); break;
>
>
> Todd, are we up to date with upstream, or is this latent there too?
>
>
> Philip Guenther

