https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61619

            Bug ID: 61619
           Summary: Benefits from -ftree-vectorize lost easily when
                    changing unrelated code
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: shmueller2 at gmail dot com

Created attachment 33011
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33011&action=edit
Uncommenting (1), (2) or (3) significantly accelerates this code

When compiling the attached code with:

g++-4.9 -std=c++11 -O3 -Wall -Wextra -pedantic bug.cpp -o bug

I found that minor changes in seemingly irrelevant aspects of the code had a
strong effect on performance. When run with:

time ./bug

the code as attached gave the following best-of-10 timing on a 2011 MacBook
Air:

real   0m1.718s
user   0m1.395s
sys    0m0.306s

Uncommenting any one of the lines marked (1), (2) or (3) (each replaces the
line immediately above it) reduced the wall-clock time by roughly 20%:

Uncomment line (1):

real   0m1.343s
user   0m1.029s
sys    0m0.312s

Uncomment line (2):

real   0m1.364s
user   0m1.062s
sys    0m0.297s

Uncomment line (3):

real   0m1.332s
user   0m1.016s
sys    0m0.315s

The generated assembly code (-S) differs significantly in all cases. When using

-fno-tree-vectorize

the performance is similar to the first (slow) result for all variations. 
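
For anyone trying to narrow this down, one way to see what the vectorizer
decided is to ask GCC for its own report rather than diffing the -S output.
The kernel below is a hypothetical stand-in, not taken from the attachment
(the name scale_add and its signature are made up for illustration). It is
only a sketch of the kind of loop -ftree-vectorize normally handles, and it
assumes a GCC version that supports the -fopt-info family of options:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel, NOT the code from the attachment: a plain
// element-wise loop of the kind the auto-vectorizer usually handles.
void scale_add(std::vector<float>& out, const std::vector<float>& in,
               float factor)
{
    // Precondition for this sketch: both vectors have the same length.
    assert(out.size() == in.size());

    const std::size_t n = out.size();
    float* o = out.data();
    const float* i = in.data();

    // Raw pointers and a simple counted loop keep the trip count and
    // memory accesses easy for the compiler to analyze.
    for (std::size_t k = 0; k < n; ++k)
        o[k] += factor * i[k];
}
```

Compiling this as an object file with, for example,

g++ -std=c++11 -O3 -fopt-info-vec-optimized -fopt-info-vec-missed -c sketch.cpp

should print which loops were vectorized and which were not, with a reason for
the misses. Running the same command on each variant of the attached file
might show which of the changes (1), (2), (3) flips that decision.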

The bug I'm reporting is that the optimization benefits of -ftree-vectorize
appear to be lost easily, and without any indication to the user, when
seemingly unrelated high-level parts of the code are changed in ways that
should not affect performance.

I would have expected none of the changes (1), (2) and (3) to make a
difference in the generated assembly code, and it was very surprising to me
that such details mattered so much.
