https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80248

            Bug ID: 80248
           Summary: sparse access to Array of structures does not
                    vectorize
           Product: gcc
           Version: 7.0.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincenzo.innocente at cern dot ch
  Target Milestone: ---

In the following example, "aos" does not vectorize, while the equivalent "aos2"
does vectorize, using the vgatherdps instruction.

On a slightly different matter:
"soa" vectorizes and produces code that is apparently 20% faster than "aos2".
I may open a different PR with a benchmark attached...


cat simpleGather.cc
struct float3 {
  float x;
  float y;
  float z;
};

#define N 1024
float fx[N], g[N];
float fy[N];
float fz[N]; 
int k[N];

float3 f3[N];


void
aos (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = f3[k[i]].x+f3[k[i]].y+f3[k[i]].z;
}


// use gather
void
aos2 (void)
{
  float * ff = &(f3[0].x);
  int i;
  for (i = 0; i < N; i++)
    g[i] = ff[3*k[i]]+ff[3*k[i]+1]+ff[3*k[i]+2];
}


// use gather
void
soa (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = fx[k[i]]+fy[k[i]]+fz[k[i]];
}

[innocent@vinavx3 vectorize]$ c++ -Ofast -Wall -march=haswell -S
simpleGather.cc -fopt-info-vec
simpleGather.cc:31:17: note: loop vectorized
simpleGather.cc:41:17: note: loop vectorized
[innocent@vinavx3 vectorize]$ c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/afs/cern.ch/work/i/innocent/public/w5/bin/../libexec/gcc/x86_64-pc-linux-gnu/7.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk//configure
--prefix=/afs/cern.ch/user/i/innocent/w5 -enable-languages=c,c++,lto,fortran
--enable-lto -enable-libitm -disable-multilib
Thread model: posix
gcc version 7.0.1 20170326 (experimental) [trunk revision 246485] (GCC)
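
The report calls "aos" and "aos2" equivalent. A minimal sketch of how that
could be checked is given below. It is not part of the original test case: the
helper names (fill_inputs, run_aos, run_aos2, run_soa, variants_agree,
check_variants), the separate output arrays, and the input values are all
assumptions made for illustration. Like the original aos2, it assumes float3
has no padding, so that f3 can be walked as a flat float array. The inputs are
small integers so that every sum is exact in float, which keeps the comparison
valid even when -Ofast reorders the additions.

```cpp
#include <cassert>

struct float3 {
  float x;
  float y;
  float z;
};

// aos2 indexes f3 as a flat float array, which only works without padding.
static_assert(sizeof(float3) == 3 * sizeof(float), "float3 must be packed");

constexpr int N = 1024;
float3 f3[N];
float fx[N], fy[N], fz[N];
int k[N];
// Hypothetical: one output array per variant so results can be compared.
float g_aos[N], g_aos2[N], g_soa[N];

// Fill the AoS and SoA inputs with identical, exactly representable data.
void fill_inputs() {
  for (int i = 0; i < N; i++) {
    float v = static_cast<float>(i);
    f3[i] = {v, 2.0f * v, 3.0f * v};
    fx[i] = f3[i].x;
    fy[i] = f3[i].y;
    fz[i] = f3[i].z;
    k[i] = (7 * i + 3) % N;  // a permutation of 0..N-1, since gcd(7, N) == 1
  }
}

// Same loop body as "aos" in the report.
void run_aos() {
  for (int i = 0; i < N; i++)
    g_aos[i] = f3[k[i]].x + f3[k[i]].y + f3[k[i]].z;
}

// Same loop body as "aos2" in the report.
void run_aos2() {
  const float *ff = &(f3[0].x);
  for (int i = 0; i < N; i++)
    g_aos2[i] = ff[3 * k[i]] + ff[3 * k[i] + 1] + ff[3 * k[i] + 2];
}

// Same loop body as "soa" in the report.
void run_soa() {
  for (int i = 0; i < N; i++)
    g_soa[i] = fx[k[i]] + fy[k[i]] + fz[k[i]];
}

// Returns true when all three variants produce identical output.
bool variants_agree() {
  fill_inputs();
  run_aos();
  run_aos2();
  run_soa();
  for (int i = 0; i < N; i++)
    if (g_aos[i] != g_aos2[i] || g_aos[i] != g_soa[i])
      return false;
  return true;
}

// Convenience wrapper: aborts if the variants disagree.
void check_variants() { assert(variants_agree()); }
```

Calling check_variants() from main and building with the same flags as above
(-Ofast -march=haswell -fopt-info-vec) should also show which of the three
loops the vectorizer handles.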
