$ cat a.c
void foo (float * restrict x, float * restrict y)
{
  int i;

  for (i = 0; i < 10000; i++)
    x[i] = y[i] * y[i];
}
$ gcc a.c -O1 -ffast-math -msse -mfpmath=sse -ftree-vectorize -ftree-vectorizer-verbose=5 -std=c99 -c
a.c:5: note: Alignment of access forced using peeling.
a.c:5: note: Vectorizing an unaligned access.
a.c:5: note: not vectorized: relevant stmt not supported: D.1353_14 = __builtin_powf (D.1352_13, 2.0e+0)
a.c:5: note: vectorized 0 loops in function.

In fold-const.c:fold_binary, around line 9091, I found the following:

  /* Optimize x*x as pow(x,2.0), which is expanded as x*x.  */

So the multiplication has already been folded into a __builtin_powf call by
the time the vectorizer sees the loop, and the vectorizer rejects the loop
because it does not support that statement.

-- 
           Summary: x*x in a loop folded to powf(x,2.) which prevents
                    vectorization
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: fxcoudert at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28524