https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414
--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 7 Jun 2016, yyc1992 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414
>
> --- Comment #7 from Yichao Yu <yyc1992 at gmail dot com> ---
> If I add `-fvariable-expansion-in-unroller` (omg this options is like half the
> command line ;-p ...), the performance matches the clang one after the clang
> 3.8 regression.
>
> ```
> % gcc -funroll-loops -fvariable-expansion-in-unroller -Ofast -march=core-avx2
> benchmark.c -o benchmark2
> % ./benchmark2
> 45.588861
> % ./benchmark-gcc
> 80.518152
> % ./benchmark-clang38
> 41.920054
> % ./benchmark-clang37
> 25.093145
> ```

Yeah, but -fvariable-expansion-in-unroller is quite late.
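
For readers unfamiliar with the option: the GCC manual describes -fvariable-expansion-in-unroller as creating multiple copies of some local variables when unrolling a loop. The sketch below is a hand-written source-level analogue of that idea, not the code GCC actually generates and not the benchmark.c from this report (which is not shown here). The function names `sum_single` and `sum_expanded` and the unroll factor of 4 are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical reduction with one accumulator: every addition depends on
 * the result of the previous one, so the loop is bound by add latency. */
float sum_single(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 with the accumulator split into four copies.
 * The four dependency chains are independent of each other and are
 * combined only once, after the loop. */
float sum_expanded(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Remainder iterations when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += a[i];

    return (s0 + s1) + (s2 + s3);
}
```

Splitting the accumulator changes the order of the floating-point additions, so the two functions can differ in the last bits for general inputs. A compiler may only do this on its own when reassociation is permitted, which -Ofast allows through -ffast-math.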