On 2/6/2014 1:51 PM, Uros Bizjak wrote:
> Hello!
>
> 4.9 does not enable -ftree-vectorize for -O3 (and -Ofast) anymore. Is
> this intentional?
>
> $ /ssd/uros/gcc-build/gcc/xgcc -B /ssd/uros/gcc-build/gcc -O3 -Q --help=optimizers
> ...
> -ftree-vectorize [disabled]
> ...

I'm seeing vectorization but no output from -ftree-vectorizer-verbose,
and no dot product vectorization inside omp parallel regions, with gcc,
g++, or gfortran 4.9. Primary targets are cygwin64 and linux x86_64.

I've been unable to use -O3 vectorization with gcc, although it works
with gfortran and g++, so I use gcc -O2 -ftree-vectorize together with
additional optimization flags that don't break anything.
I've made source code changes to take advantage of the new vectorization
of merge() in Fortran and the ?: operator in C and C++; while that is
useful for -march=core-avx2, it's sometimes a loss for -msse4.1.
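
To illustrate the kind of loop meant here, a minimal C sketch follows. The
function name clamp_negatives and the data layout are hypothetical and not
taken from my actual sources. The ?: form assigns every element
unconditionally, which gives the vectorizer the chance to turn the loop into
a vector compare plus select; whether it does so depends on the target and
flags.

```c
#include <stddef.h>

/* Hypothetical example: copy in[] to out[], replacing negative entries
   with zero. The ?: select stores to every out[i], so there is no
   conditional store for the vectorizer to work around. */
void clamp_negatives(float *restrict out, const float *restrict in, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] > 0.0f ? in[i] : 0.0f;
}
```

Comparing timings of such a loop built with -march=core-avx2 and with
-msse4.1 is one way to check whether the rewrite pays off on a given target.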
In my tests, gcc vectorization with #pragma omp parallel for simd is
reasonably effective only on 12 or more cores.
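
For reference, a minimal sketch of a loop using the combined construct. The
function saxpy and its arguments are hypothetical, chosen only for
illustration. The iterations are divided among threads, and each thread's
chunk is a candidate for SIMD execution.

```c
/* Hypothetical example: y = a*x + y.
   With -fopenmp the loop is split across threads and each chunk may be
   vectorized; without -fopenmp the pragma is ignored and the loop runs
   serially with the same result. */
void saxpy(float a, const float *restrict x, float *restrict y, int n)
{
    #pragma omp parallel for simd
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

For short loops the cost of starting the parallel region can outweigh the
gain, so the trip count matters as much as the core count.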
#pragma omp simd reduction(max: ) gives correct results but poor
performance in my tests.
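
A minimal sketch of such a max reduction, assuming a non-empty float array.
The function name array_max and the variable m are hypothetical and stand in
for whatever the real code reduces over.

```c
/* Hypothetical example: largest element of a non-empty array.
   Each SIMD lane keeps a private running maximum, and the lanes are
   combined with the initial value of m when the loop finishes. */
float array_max(const float *x, int n)
{
    float m = x[0];
    #pragma omp simd reduction(max: m)
    for (int i = 1; i < n; ++i)
        if (x[i] > m)
            m = x[i];
    return m;
}
```

Built without -fopenmp, the pragma is ignored and this is an ordinary serial
loop, which makes a convenient baseline for timing comparisons.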
You've probably seen my gcc-testresults posts. The one major recent
improvement is the ability to skip cilkplus tests on targets where it's
totally unsupported. Without cilk_for et al., cilkplus seems useless
even on "supported" targets.
There are still lots of failing stabs tests on targets where those
apparently aren't supported.
So there are some mysteries about what the developers intend. I suppose
this was posted on the gcc list because such questions are ignored on
gcc-help.
--
Tim Prince