[Bug libfortran/93871] COTAN is slow for complex types

sgk at troutmask dot apl.washington.edu Fri, 28 Feb 2020 12:52:13 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93871


--- Comment #34 from Steve Kargl <sgk at troutmask dot apl.washington.edu> ---
On Fri, Feb 28, 2020 at 09:30:25AM +0000, thenlich at gcc dot gnu.org wrote:
>
> Going back one step, I wonder if it would be good enough
> to perform a correctly rounded conversion from degrees to
> radians, and just use sin() as is.
> 
>   ...
>      f = sgn * sin(deg2rad(arg))
>   end function fcn
> 
>   ! Compute correctly rounded value of x * pi / 180 for x = 0 ... 90
>   function deg2rad(x) result(y)
>     real(sp) y
>     real(sp), intent(in) :: x
>     real(sp), parameter :: a = atan(1._4) / 45
>     real(sp), parameter :: b = atan(1._16) / 45 - a
>     y = a * x + b * x
>   end function deg2rad
> 
> (The exact values of a and b still need some work)
> 

The only way I know of that will get the correctly rounded
value without punting to a higher precision is to decompose
the scaling and argumet to high and low precision, and then
do the multiplcation.

shi = high_bits_of(pi/180)  ! # of bits = half of precision
slo = pi/180 - shi
xhi = high_bits_of(arg)
xlo = arg - xhi

You then do

value = slo * xlo + slo * xhi + shi * xlo + shi * xhi

The last term is exact, and the summation is accumulated
from smallest terms to largest (ie., you get correct 
rounding). 

For float precision (and little endian), I use

static const float
deg2rad(float ax)
{
   /* Split pi/180 into 12 high bits and 24 low bits. */
   static const float pihi = 1.74560547e-02f, pilo =-2.76216747e-06f;

   float lo, hi;

   /* Need union to allow the clearly of low bits. */
   union {
      unsigned int u;
      float e;
   } v;

   v.e = ax;
   v.u = (v.u >> 12) << 12;  /* 12 high bits. */
   hi = v.e;                    
   lo = ax - hi;             /* low part.

   return (lo * pilo + lo * pihi + hi * pilo + hi * pihi);
}

Even this appears to have some irregularities as my exhaustive
test in the interval [1.e-8,1) with direct call to sinf() yields

count: 223622026
ulp > 1.0: 125
ulp > 1.5: 0
ulp > 1.6: 0
ulp > 1.7: 0
 max_ulp: 1.321531
x at max: 0.447627

[Bug libfortran/93871] COTAN is slow for complex types

Reply via email to