Hi Richard,

> > First, nowadays please add an internal function instead of builtins.
> > You can even take advantage of Richards work to directly tie those to
> > optabs (he might want to chime in to tell you how).  You don't need
> > the fortran FE changes in that case.
>
> Yeah, it should just be a case of adding:
>
>   DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
>
> to internal-fn.def.  The supposedly useful thing about this is that it
> automatically extends to vectors, so you shouldn't need the xorsign vector
> builtins or the aarch64_builtin_vectorized_function change.

Ah, ok, thanks! I'll change it to an internal function and take a look at
the testcases for the updated patch.

> However, we don't yet support SLP vectorisation of internal functions.
> I have a patch for that that I've been looking for an excuse to post (at the
> moment I think it only helps SVE).  If this goes in I can post it as a
> follow-on.
>
> In:
>
> > diff --git a/gcc/testsuite/gcc.dg/vec-xorsign_exec.c b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..f8c8befd336c7f2743a1621d3b0f53d78bab9df7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vec-xorsign_exec.c
> > @@ -0,0 +1,53 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> > +/* { dg-additional-options "-march=armv8-a" { target { aarch64*-*-* } } }*/
> > +
> > +extern void abort ();
> > +
> > +#define N 16
> > +float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
> > +              -12.5f, -15.6f, -18.7f, -21.8f,
> > +              24.9f, 27.1f, 30.2f, 33.3f,
> > +              36.4f, 39.5f, 42.6f, 45.7f};
> > +float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
> > +              -9.0f, 1.0f, -2.0f, 3.0f,
> > +              -4.0f, -5.0f, 6.0f, 7.0f,
> > +              -8.0f, -9.0f, 10.0f, 11.0f};
> > +float r[N];
> > +
> > +float ad[N] = {-0.1fd, -3.2d, -6.3d, -9.4d,
> > +               -12.5d, -15.6d, -18.7d, -21.8d,
> > +               24.9d, 27.1d, 30.2d, 33.3d,
> > +               36.4d, 39.5d, 42.6d, 45.7d};
> > +float bd[N] = {-1.2d, 3.4d, -5.6d, 7.8d,
> > +               -9.0d, 1.0d, -2.0d, 3.0d,
> > +               -4.0d, -5.0d, 6.0d, 7.0d,
> > +               -8.0d, -9.0d, 10.0d, 11.0d};
> > +float rd[N];
>
> Looks like these last three were meant to be doubles.
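
For reference, a minimal sketch of how the double half of the test could look
with matching types. It assumes __builtin_copysign is the intended builtin (as
far as I know there is no __builtin_copysignd) and uses unsuffixed literals,
which are already doubles in C. The helper name check_double_xorsign is
hypothetical and only exists so the check can be called on its own; it is not
part of the quoted patch.

```c
#include <assert.h>

#define N 16

/* Double-precision inputs: the element type is double and the literals
   carry no suffix.  */
double ad[N] = {-0.1, -3.2, -6.3, -9.4,
                -12.5, -15.6, -18.7, -21.8,
                24.9, 27.1, 30.2, 33.3,
                36.4, 39.5, 42.6, 45.7};
double bd[N] = {-1.2, 3.4, -5.6, 7.8,
                -9.0, 1.0, -2.0, 3.0,
                -4.0, -5.0, 6.0, 7.0,
                -8.0, -9.0, 10.0, 11.0};
double rd[N];

/* Hypothetical helper: computes r[i] = a[i] * copysign (1.0, b[i]) and
   returns the number of elements that fail the self-check.  */
int
check_double_xorsign (const double *a, const double *b, double *r, int n)
{
  int i, mismatches = 0;

  assert (a && b && r && n >= 0);

  for (i = 0; i < n; i++)
    r[i] = a[i] * __builtin_copysign (1.0, b[i]);

  /* Compare the double results against the double reference.  */
  for (i = 0; i < n; i++)
    if (r[i] != a[i] * __builtin_copysign (1.0, b[i]))
      mismatches++;

  return mismatches;
}
```

Calling check_double_xorsign (ad, bd, rd, N) should return 0, and the second
loop now compares rd[] rather than r[].
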
>
> > +
> > +int
> > +main (void)
> > +{
> > +  int i;
> > +
> > +  for (i = 0; i < N; i++)
> > +    r[i] = a[i] * _builtin_copysignf (1.0f, b[i]);
> > +
> > +  /* check results: */
> > +  for (i = 0; i < N; i++)
> > +    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
> > +      abort ();
> > +
> > +  for (i = 0; i < N; i++)
> > +    rd[i] = ad[i] * _builtin_copysignd (1.0d, bd[i]);
> > +
> > +  /* check results: */
> > +  for (i = 0; i < N; i++)
> > +    if (r[i] != ad[i] * __builtin_copysignd (1.0d, bd[i]))
> > +      abort ();
> > +
> > +
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>
> Why does only one loop get vectorised?
>
> Thanks,
> Richard
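
As a side note on what the test is exercising: the expression
a * copysign (1.0f, b) just flips the sign of a according to the sign bit of
b, which is where the xorsign name comes from. A rough sketch of that
equivalence in C, assuming float is IEEE 754 binary32; the names xorsign_ref
and xorsign_mul are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference: returns x with its sign bit XORed with the
   sign bit of y.  Assumes float is IEEE 754 binary32.  */
float
xorsign_ref (float x, float y)
{
  uint32_t xb, yb;
  float r;

  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  xb ^= yb & UINT32_C (0x80000000);   /* keep only y's sign bit */
  memcpy (&r, &xb, sizeof r);
  return r;
}

/* The multiplication form used in the test.  For non-NaN x this yields
   the same value as xorsign_ref.  */
float
xorsign_mul (float x, float y)
{
  return x * __builtin_copysignf (1.0f, y);
}
```

Both functions should agree on every input pair in the float arrays of the
test, since multiplying by 1.0f or -1.0f is exact.
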