http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51497
--- Comment #4 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-12 13:09:49 UTC ---
> I can't see any vectorizer differences for the testcase in comment #2 and the
> patch you cite only (should) have debuginfo changes, no changes to the
> produced IL at statement level (eventually it has better type-based alias
> analysis).
>
> Not confirmed.

I have just done the following check:

(1) gfc -Ofast -funroll-loops nf.f90 -ftree-vectorizer-verbose=1 >& tmp1

(2) gfc -Ofast -funroll-loops nf.f90 -ftree-vectorizer-verbose=1 -flto >& tmp2

I noticed that the tmp2 file contains two sets of annotations, likely one for
the usual vectorization (up to line 334) and a second one for the lto stage.

(3) I have split the file tmp2 into a new tmp2 keeping only the first 334
lines and a second file, tmp3, containing the second part.

(4) I have used diff to compare the files: tmp1 and the new tmp2 are
identical, while I see missing vectorizations in tmp3:

--- tmp1    2011-12-12 13:49:06.000000000 +0100
+++ tmp3    2011-12-12 13:54:12.000000000 +0100
...
-206: LOOP VECTORIZED.
-nf.f90:204: note: vectorized 7 loops in function.
...
-nf.f90:256: note: vectorized 3 loops in function.
+nf.f90:256: note: vectorized 2 loops in function.
...
-nf.f90:288: note: vectorized 3 loops in function.
+nf.f90:288: note: vectorized 2 loops in function.

This confirms what I have seen in the disassembled executable.

Questions:
(1) do you see the slowdown with -flto?
(2) can you reproduce the above?

> The two else if blocks are related, not independent, independently reverting
> them makes no sense.

I am not suggesting removing one block. I was only interested in finding which
part of the patch caused/exposed the problem (which looks like yet another
instance of a bad choice of optimization for size: as pointed out in 51499,
the vectorization generates two loops, one vectorized and one not, hence
~doubling the code size).
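
For anyone who wants to repeat steps (3) and (4), the split-and-compare can be sketched as a small POSIX shell helper. This is only an illustration: the function name split_and_diff and the output suffixes .compile and .lto are hypothetical (the comment above calls the second part tmp3), and the split line (334 here) has to be read off your own log.

```shell
#!/bin/sh
# Hypothetical helper (not part of GCC): split a combined vectorizer log at a
# given line and compare each part against a reference log.
# usage: split_and_diff REFERENCE COMBINED SPLIT_LINE
split_and_diff() {
    ref=$1
    combined=$2
    n=$3

    # First part: lines 1..n (assumed to be the ordinary compile stage).
    head -n "$n" "$combined" > "$combined.compile"
    # Second part: everything after line n (assumed to be the LTO stage).
    tail -n +"$((n + 1))" "$combined" > "$combined.lto"

    echo "== $ref vs compile-stage part =="
    diff -u "$ref" "$combined.compile" && echo "identical"

    echo "== $ref vs LTO-stage part =="
    diff -u "$ref" "$combined.lto" && echo "identical"

    # diff exits non-zero when the files differ; that is an expected outcome
    # here, so report success regardless.
    return 0
}
```

With the logs from steps (1) and (2) in place, the call would be `split_and_diff tmp1 tmp2 334`; any "vectorized N loops" lines that appear only with a leading "-" in the second diff are loops that were vectorized without -flto but not in the LTO stage.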