I was halfway there.

That's an old bug.

Philip Guenther <[email protected]> wrote:

> On Sat, Jun 6, 2020 at 5:08 PM Zé Loff <[email protected]> wrote:
> 
> > On Sat, Jun 06, 2020 at 03:51:58PM -0700, Jordan Geoghegan wrote:
> > > I'm working on a simple awk snippet to convert the IP range data listed in
> > > the Extended Delegation Statistics data from ARIN [1] and convert it into
> > > CIDR blocks. I have a snippet that works perfectly fine on mawk and gawk,
> > > but not on the base system awk. I'm 99% sure I'm not using any GNUisms, as
> > > when I break the command up into two parts, it works perfectly.
> > >
> > > The snippet below does not work with base awk, but does work with gawk and
> > > mawk: (Running on 6.6 -stable system)
> > >
> > >   awk -F '|' '{ if ( $3 == "ipv4" && $2 == "US") printf("%s/%d\n", $4,
> > > 32-log($5)/log(2))}' delegated-arin-extended-latest.txt
> > >
> > >
> > > The command does output data, but it also throws errors for certain lines:
> > >
> > >   awk: log result out of range
> > >   input record number 94027, file delegated-arin-extended-latest.txt
> > >   source line number 1
> > >
> > > Most CIDR blocks are calculated correctly, but about 10% of them have errors
> > > (i.e. something that should be calculated as a /24 is instead calculated as
> > > a /30).
> >
> ...
> 
> > I have no idea about what is going on, but FWIW I can reproduce this on
> > i386 6.7-stable and amd64 6.7-current (well, current-ish, #232).
> > Truncating the file to a single offending line produces the same result:
> > log($5) is out of range.
> >
> > It appears to have something to do with the last field.  Removing it or
> > changing some of its characters seems to work, e.g.:
> >
> >
> > arin|US|ipv4|216.250.144.0|4096|20050503|allocated|5e58386636aa775c2106140445cf2c30
> >
> > arin|US|ipv4|216.250.144.0|4096|20050503|allocated|5a58386636aa775c2106140445cf2c30
> >                                                     ^
> > Fails on the first line but works on the second.
> >
> 
> Hah!  Nice observation!
> 
> The last field of the first line looks kinda like a number in scientific
> notation, but when awk internally tries to set up the fields it generates
> an ERANGE error...and the global errno variable is left with that value.
> Several builtins in awk, including log(), perform operations and then check
> whether errno is set to EDOM or ERANGE but fail to clear errno beforehand.
> 
> The fix is to zero errno before all the code sequences that use the
> errcheck() function, a la:
> 
> --- run.c       13 Aug 2019 10:45:56 -0000      1.44
> +++ run.c       7 Jun 2020 03:14:38 -0000
> @@ -26,6 +26,7 @@ THIS SOFTWARE.
>  #define DEBUG
>  #include <stdio.h>
>  #include <ctype.h>
> +#include <errno.h>
>  #include <setjmp.h>
>  #include <limits.h>
>  #include <math.h>
> @@ -1041,8 +1042,10 @@ Cell *arith(Node **a, int n)     /* a[0] + a
>         case POWER:
>                 if (j >= 0 && modf(j, &v) == 0.0)       /* pos integer exponent */
>                         i = ipow(i, (int) j);
> -               else
> +               else {
> +                       errno = 0;
>                         i = errcheck(pow(i, j), "pow");
> +               }
>                 break;
>         default:        /* can't happen */
>                 FATAL("illegal arithmetic operator %d", n);
> @@ -1135,8 +1138,10 @@ Cell *assign(Node **a, int n)    /* a[0] =
>         case POWEQ:
>                 if (yf >= 0 && modf(yf, &v) == 0.0)     /* pos integer exponent */
>                         xf = ipow(xf, (int) yf);
> -               else
> +               else {
> +                       errno = 0;
>                         xf = errcheck(pow(xf, yf), "pow");
> +               }
>                 break;
>         default:
>                 FATAL("illegal assignment operator %d", n);
> @@ -1499,12 +1504,15 @@ Cell *bltin(Node **a, int n)    /* builtin
>                         u = strlen(getsval(x));
>                 break;
>         case FLOG:
> +               errno = 0;
>                 u = errcheck(log(getfval(x)), "log"); break;
>         case FINT:
>                 modf(getfval(x), &u); break;
>         case FEXP:
> +               errno = 0;
>                 u = errcheck(exp(getfval(x)), "exp"); break;
>         case FSQRT:
> +               errno = 0;
>                 u = errcheck(sqrt(getfval(x)), "sqrt"); break;
>         case FSIN:
>                 u = sin(getfval(x)); break;
> 
> 
> Todd, are we up to date with upstream, or is this latent there too?
> 
> 
> Philip Guenther

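For anyone who wants to see the mechanism outside of awk, here is a
minimal standalone sketch of the stale-errno problem described above.
It is not code from run.c: errcheck_fails() and stale_errno_trips_check()
are hypothetical names, and it assumes the field conversion behaves like
strtod(), which sets errno to ERANGE on overflow.

```c
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for awk's errcheck(): it decides whether the
 * preceding math call failed purely by looking at errno.
 */
static int errcheck_fails(void)
{
	return errno == EDOM || errno == ERANGE;
}

/*
 * Parse `field` as a number (assumed here to be done with strtod()),
 * then take log(4096) and run the errno-based check.
 * If clear_first is non-zero, errno is zeroed right before log(),
 * which is the approach taken in the patch.
 * Returns 1 if the check reports an error, 0 otherwise.
 */
int stale_errno_trips_check(const char *field, int clear_first)
{
	double r;

	errno = 0;
	/* "5e58386636..." reads as 5 * 10^58386636, which overflows. */
	(void)strtod(field, NULL);

	if (clear_first)
		errno = 0;

	r = log(4096.0);	/* perfectly valid; does not touch errno */
	(void)r;

	return errcheck_fails();
}
```

With the last field from the failing record,
stale_errno_trips_check("5e58386636aa775c2106140445cf2c30", 0) should
report an error even though log(4096) is fine, while passing 1 for
clear_first, or using the "5a58..." variant of the field, should not.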