RE: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

Tamar Christina Tue, 13 Jun 2017 03:23:09 -0700

Hi Richard,

> > First, nowadays please add an internal function instead of builtins.
> > You can even take advantage of Richard's work to directly tie those to
> > optabs (he might want to chime in to tell you how).  You don't need
> > the Fortran FE changes in that case.
> 
> Yeah, it should just be a case of adding:
> 
> DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
> 
> to internal-fn.def.  The supposedly useful thing about this is that it
> automatically extends to vectors, so you shouldn't need the xorsign vector
> builtins or the aarch64_builtin_vectorized_function change.


Ah, ok, thanks! I'll change it to an internal function
and take a look at the testcases for the updated patch.
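
For reference, a minimal sketch of the identity the optimization relies on:
for non-NaN x, x * copysign (1.0, y) is x with its sign bit XORed with the
sign bit of y.  The helper names below (xorsign_f32, mul_copysign_f32,
count_mismatches) are hypothetical and only for illustration; they are not
part of the patch, and the sketch assumes IEEE 754 binary32 floats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: flip the sign of X when Y is negative by XORing
   the sign bits.  Assumes float is IEEE 754 binary32.  */
static float
xorsign_f32 (float x, float y)
{
  uint32_t xb, yb;
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & UINT32_C (0x80000000);
  memcpy (&x, &xb, sizeof x);
  return x;
}

/* The source-level form that the patch is meant to recognise.  */
static float
mul_copysign_f32 (float x, float y)
{
  return x * __builtin_copysignf (1.0f, y);
}

/* Compare both forms bit-for-bit over a few sample values (no NaNs)
   and return the number of mismatches.  */
static int
count_mismatches (void)
{
  static const float xs[] = { -0.1f, -3.2f, 24.9f, 45.7f, 0.0f, -0.0f };
  static const float ys[] = { -1.2f, 3.4f, -4.0f, 11.0f, -0.0f, 0.0f };
  int mismatches = 0;

  for (size_t i = 0; i < sizeof xs / sizeof xs[0]; i++)
    for (size_t j = 0; j < sizeof ys / sizeof ys[0]; j++)
      {
        float p = mul_copysign_f32 (xs[i], ys[j]);
        float q = xorsign_f32 (xs[i], ys[j]);
        if (memcmp (&p, &q, sizeof p) != 0)
          mismatches++;
      }
  return mismatches;
}
```

The comparison is done on the bit patterns so that the sign of zero is
checked as well.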

> However, we don't yet support SLP vectorisation of internal functions.
> I have a patch for that that I've been looking for an excuse to post (at the
> moment I think it only helps SVE).  If this goes in I can post it as a 
> follow-on.
> 
> In:
> 
> > diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..f8c8befd336c7f2743a1621d3b0f53d78bab9df7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > @@ -0,0 +1,53 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> > +/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* } } }*/
> > +
> > +extern void abort ();
> > +
> > +#define N 16
> > +float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
> > +         -12.5f, -15.6f, -18.7f, -21.8f,
> > +         24.9f, 27.1f, 30.2f, 33.3f,
> > +         36.4f, 39.5f, 42.6f, 45.7f};
> > +float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
> > +         -9.0f, 1.0f, -2.0f, 3.0f,
> > +         -4.0f, -5.0f, 6.0f, 7.0f,
> > +         -8.0f, -9.0f, 10.0f, 11.0f};
> > +float r[N];
> > +
> > +float ad[N] = {-0.1d,  -3.2d,  -6.3d,  -9.4d,
> > +               -12.5d, -15.6d, -18.7d, -21.8d,
> > +                24.9d,  27.1d,  30.2d,  33.3d,
> > +                36.4d,  39.5d,  42.6d, 45.7d};
> > +float bd[N] = {-1.2d,  3.4d, -5.6d,  7.8d,
> > +               -9.0d,  1.0d, -2.0d,  3.0d,
> > +               -4.0d, -5.0d,  6.0d,  7.0d,
> > +               -8.0d, -9.0d, 10.0d, 11.0d};
> > +float rd[N];
> 
> Looks like these last three were meant to be doubles.
> 
> > +
> > +int
> > +main (void)
> > +{
> > +  int i;
> > +
> > +  for (i = 0; i < N; i++)
> > +    r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
> > +      abort ();
> > +
> > +  for (i = 0; i < N; i++)
> > +    rd[i] = ad[i] * __builtin_copysignd (1.0d, bd[i]);
> > +
> > +  /* check results:  */
> > +  for (i = 0; i < N; i++)
> > +    if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
> > +      abort ();
> > +
> > +
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> 
> Why does only one loop get vectorised?
> 
> Thanks,
> Richard
