https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80248
Bug ID: 80248
Summary: sparse access to Array of structures does not vectorize
Product: gcc
Version: 7.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vincenzo.innocente at cern dot ch
Target Milestone: ---
In the following example, "aos" does not vectorize, while the equivalent "aos2"
does vectorize, using the vgatherdps instruction.
On a slightly different matter:
"soa" vectorizes and produces code that is apparently 20% faster than "aos2".
I may open a different PR with a benchmark attached...
cat simpleGather.cc
struct float3 {
  float x;
  float y;
  float z;
};
#define N 1024
float fx[N], g[N];
float fy[N];
float fz[N];
int k[N];
float3 f3[N];
void
aos (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = f3[k[i]].x+f3[k[i]].y+f3[k[i]].z;
}
// use gather
void
aos2 (void)
{
  float * ff = &(f3[0].x);
  int i;
  for (i = 0; i < N; i++)
    g[i] = ff[3*k[i]]+ff[3*k[i]+1]+ff[3*k[i]+2];
}
// use gather
void
soa (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = fx[k[i]]+fy[k[i]]+fz[k[i]];
}
[innocent@vinavx3 vectorize]$ c++ -Ofast -Wall -march=haswell -S simpleGather.cc -fopt-info-vec
simpleGather.cc:31:17: note: loop vectorized
simpleGather.cc:41:17: note: loop vectorized
[innocent@vinavx3 vectorize]$ c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/afs/cern.ch/work/i/innocent/public/w5/bin/../libexec/gcc/x86_64-pc-linux-gnu/7.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk//configure --prefix=/afs/cern.ch/user/i/innocent/w5 -enable-languages=c,c++,lto,fortran --enable-lto -enable-libitm -disable-multilib
Thread model: posix
gcc version 7.0.1 20170326 (experimental) [trunk revision 246485] (GCC)
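As a sanity check that the three loops really compute the same values (so that the missed vectorization of "aos" is not down to differing semantics), a small harness along the following lines could be used. This is only a sketch and not part of the original test case: `fill_inputs` and `layouts_agree` are hypothetical helpers, the index pattern in `k` is arbitrary, and the `static_assert` spells out the assumption (which "aos2" already relies on) that `float3` is three packed floats. The input values are chosen to be exactly representable, so the results can be compared with `==` even under -Ofast.

```cpp
#include <cassert>

struct float3 {
  float x;
  float y;
  float z;
};

// aos2 treats the array as a flat float buffer; that only matches aos
// if float3 has no padding (an assumption of this sketch).
static_assert(sizeof(float3) == 3 * sizeof(float), "float3 must not be padded");

constexpr int N = 1024;
float fx[N], fy[N], fz[N], g[N];
int k[N];
float3 f3[N];

// Indexed access through the struct members (not vectorized in the report).
void aos() {
  for (int i = 0; i < N; i++)
    g[i] = f3[k[i]].x + f3[k[i]].y + f3[k[i]].z;
}

// Same access pattern, written as strided indexing into a float pointer.
void aos2() {
  float *ff = &(f3[0].x);
  for (int i = 0; i < N; i++)
    g[i] = ff[3 * k[i]] + ff[3 * k[i] + 1] + ff[3 * k[i] + 2];
}

// Structure-of-arrays layout with the same indices.
void soa() {
  for (int i = 0; i < N; i++)
    g[i] = fx[k[i]] + fy[k[i]] + fz[k[i]];
}

// Hypothetical helper: deterministic inputs, mirrored in both layouts.
void fill_inputs() {
  for (int i = 0; i < N; i++) {
    f3[i].x = 0.5f * i;
    f3[i].y = 1.0f + i;
    f3[i].z = 2.0f - 0.25f * i;
    fx[i] = f3[i].x;
    fy[i] = f3[i].y;
    fz[i] = f3[i].z;
    k[i] = (7 * i + 3) % N;  // arbitrary scattered index, always in range
    assert(k[i] >= 0 && k[i] < N);
  }
}

// Hypothetical helper: true if aos, aos2 and soa all write the same g[].
bool layouts_agree() {
  static float ref[N];
  fill_inputs();
  aos();
  for (int i = 0; i < N; i++)
    ref[i] = g[i];
  aos2();
  for (int i = 0; i < N; i++)
    if (g[i] != ref[i]) return false;
  soa();
  for (int i = 0; i < N; i++)
    if (g[i] != ref[i]) return false;
  return true;
}
```

Calling `layouts_agree()` from a `main` and checking that it returns true would confirm that the three variants are interchangeable for these inputs; it says nothing about their relative speed.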