I've been doing some benchmarking of gfortran, and reducing the testcase leads to what seems to be a missed optimization in the following code:

$ cat a.c
void foo (float * restrict x, float * restrict y)
{
  int i;

  for (i = 0; i < 10000; i++)
    x[i] = y[i] * y[i];
}

$ gcc a.c -O1 -ffast-math -msse -mfpmath=sse -ftree-vectorize -ftree-vectorizer-verbose=5 -std=c99 -c
a.c:5: note: Alignment of access forced using peeling.
a.c:5: note: Vectorizing an unaligned access.
a.c:5: note: not vectorized: relevant stmt not supported: D.1353_14 = __builtin_powf (D.1352_13, 2.0e+0)
a.c:5: note: vectorized 0 loops in function.

So the multiplication y[i] * y[i] appears to have been turned into a __builtin_powf call by the time the vectorizer sees it, and the vectorizer does not support that statement. In fold-const.c:fold_binary, around line 9091, I found the following:

  /* Optimize x*x as pow(x,2.0), which is expanded as x*x. */

Should I file it as a missed-optimization bug in bugzilla, or is there some clever reason for that behaviour?

FX