On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> This patch is to add a test case similar to the one in i386
> to add testing coverage for 510.parest_r hotspots.
>
> As evaluated, the emulated gather capability of the vectorizer
> (r12-2733) can help to speed up SPEC2017 510.parest_r on
> Power8/9/10 by 5% to 9% with option sets Ofast unroll and
> Ofast lto.  But since rs6000 missed unpacking support for
> unsigned int before, it could not vectorize the hotspots
> until r12-3134.
>
> By checking why r12-2733 doesn't immediately show its impact
> for SPEC2017 510.parest_r while the associated test case
> already can get vectorized on rs6000 at that time, I realized
> the associated test case uses int as INDEXTYPE while the
> hotspots actually use unsigned int.  So different from the one
> in i386, this patch uses unsigned int as INDEXTYPE since the
> unpack support for unsigned int (r12-3134) also matters for
> the hotspots vectorization.  Not sure if it's worth updating
> the one in i386 as well?

It looks like the same testcase that was added in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531

>
> Tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/vect-gather-1.c: New test.
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
> new file mode 100644
> index 00000000000..bf98045ab03
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* Profitable from Power8 since it supports efficient unaligned load.  */
> +/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
> +
> +#ifndef INDEXTYPE
> +#define INDEXTYPE unsigned int
> +#endif
> +double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
> +            double *luval, double *dst)
> +{
> +  double res = 0;
> +  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
> +    res += *luval * dst[*col];
> +  return res;
> +}
> +
> +/* With gather emulation this should be profitable to vectorize from Power8.  */
> +/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
> +/* The index vector loads and promotions should be scalar after forwprop.  */
> +/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
> --
> 2.25.1
>
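For readers unfamiliar with the term, here is a rough hand-written sketch of what "gather emulation" amounts to for this loop. It is only an illustration, not the vectorizer's actual output: the names vmul_scalar and vmul_emulated, the two-lane width and the use of plain scalars to stand in for vector lanes are all assumptions made for the example. The point is that the index loads and the unsigned int to 64-bit promotions stay scalar, while the multiply-accumulate runs on two lanes.

```c
#include <stddef.h>

/* Scalar reference: the same loop as vmul in the test case, with
   unsigned int as the index type.  */
double vmul_scalar(const unsigned int *rowstart, const unsigned int *rowend,
                   const double *luval, const double *dst)
{
  double res = 0;
  for (const unsigned int *col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}

/* Hypothetical two-lane emulation of the gather.  acc0/acc1 stand in for
   the two lanes of one vector accumulator.  */
double vmul_emulated(const unsigned int *rowstart, const unsigned int *rowend,
                     const double *luval, const double *dst)
{
  double acc0 = 0, acc1 = 0;
  const unsigned int *col = rowstart;

  for (; rowend - col >= 2; col += 2, luval += 2)
    {
      /* Index loads and zero-extension are done per element (scalar).  */
      size_t i0 = col[0];
      size_t i1 = col[1];

      /* The "gather": two scalar loads that together form one vector.  */
      double g0 = dst[i0];
      double g1 = dst[i1];

      /* luval is contiguous, so a real vector loop would use a single,
         possibly unaligned, vector load here.  */
      acc0 += luval[0] * g0;
      acc1 += luval[1] * g1;
    }

  /* Scalar epilogue for an odd trailing element.  */
  for (; col != rowend; ++col, ++luval)
    acc0 += *luval * dst[*col];

  /* Final reduction across lanes.  */
  return acc0 + acc1;
}
```

Summing in two lanes reassociates the floating-point additions, which is one reason the test needs -Ofast (it enables -ffast-math) for the reduction to be vectorized at all.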
--
BR,
Hongtao