http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50789

vincenzo Innocente <vincenzo.innocente at cern dot ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vincenzo.innocente at cern
                   |                            |dot ch

--- Comment #10 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 
2013-04-02 16:49:53 UTC ---
I was trying to see how gcc behaves with respect to this example:
http://software.intel.com/en-us/articles/bkm-coaxing-the-compiler-to-vectorize-structured-data-via-gathers

So I started from the example in comment 6 and "evolved" it as follows.
f21() and f22() look equivalent to me,
yet f21 vectorizes and f22 does not.
The variant f21b does not vectorize either.

c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin12.2.0/4.8.0/lto-wrapper
Target: x86_64-apple-darwin12.2.0
Configured with: ./configure --enable-languages=c,c++,fortran
--disable-multilib --disable-bootstrap --enable-lto -disable-libitm :
(reconfigured) ./configure --enable-languages=c,c++,fortran --disable-multilib
--disable-bootstrap --enable-lto -disable-libitm : (reconfigured) ./configure
--enable-languages=c,c++,fortran --disable-multilib --disable-bootstrap
--enable-lto -disable-libitm
Thread model: posix
gcc version 4.8.0 20130313 (experimental) [trunk revision 196633] (GCC) 

c++ -std=c++11 -Ofast -mavx2 -S gather.cc -ftree-vectorizer-verbose=2  

struct float3 {
  float x;
  float y;
  float z;
};

#define N 1024
float fx[N], g[N];
float fy[N];
float fz[N]; 
int k[N];

float ff[3*N];
float3 f3[N];

void
f20 (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = fx[k[i]]+fy[k[i]]+fz[k[i]];
}

void
f21 (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = ff[3*k[i]]+ff[3*k[i]+1]+ff[3*k[i]+2];
}

void
f22 (void)
{
  int i;
  for (i = 0; i < N; i++)
    g[i] = f3[k[i]].x+f3[k[i]].y+f3[k[i]].z;
}


void
f21b (void)
{
  int i;
  for (i = 0; i < N; i++) {
    auto j = ff+3*k[i];
    g[i] = j[0]+j[1]+j[2];
  }
}
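
To back up the "equivalent to my eyes" claim, here is a minimal sketch of a check that the f21, f22 and f21b loop bodies produce the same values. It is not part of the original testcase: the names fill_inputs, variants_agree, g21, g22 and g21b are made up for illustration, and it assumes float3 has no padding (the static_assert enforces that). It only compares results; it says nothing about whether any of the loops vectorize.

```cpp
#include <cassert>
#include <cstddef>

struct float3 {
  float x;
  float y;
  float z;
};

// Assumption of this sketch: float3 is three packed floats, so that
// f3[i] and ff[3*i .. 3*i+2] describe the same memory layout.
static_assert(sizeof(float3) == 3 * sizeof(float),
              "sketch assumes float3 has no padding");

constexpr int N = 1024;

float ff[3 * N];
float3 f3[N];
int k[N];
float g21[N], g22[N], g21b[N];  // hypothetical per-variant outputs

// Fill both layouts with the same data and k[] with scattered indices.
// Small integer values keep every sum exact, so the comparison below
// does not depend on the order of the additions.
void fill_inputs() {
  for (int i = 0; i < N; ++i) {
    f3[i].x = ff[3 * i]     = static_cast<float>(i % 7);
    f3[i].y = ff[3 * i + 1] = static_cast<float>(i % 11);
    f3[i].z = ff[3 * i + 2] = static_cast<float>(i % 13);
    k[i] = (i * 37 + 11) % N;
    assert(k[i] >= 0 && k[i] < N);  // gather index must stay in range
  }
}

// Run the three loop bodies from the testcase and compare their results.
bool variants_agree() {
  fill_inputs();
  for (int i = 0; i < N; ++i)  // body of f21
    g21[i] = ff[3 * k[i]] + ff[3 * k[i] + 1] + ff[3 * k[i] + 2];
  for (int i = 0; i < N; ++i)  // body of f22
    g22[i] = f3[k[i]].x + f3[k[i]].y + f3[k[i]].z;
  for (int i = 0; i < N; ++i) {  // body of f21b
    const float *j = ff + 3 * k[i];
    g21b[i] = j[0] + j[1] + j[2];
  }
  for (int i = 0; i < N; ++i)
    if (g21[i] != g22[i] || g21[i] != g21b[i])
      return false;
  return true;
}
```

Calling variants_agree() from a small main() and checking that it returns true should be enough to confirm that the three forms differ only in how the addresses are written, not in what they compute.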
