https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61619
Bug ID: 61619
Summary: Benefits from -ftree-vectorize lost easily when changing unrelated code
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: shmueller2 at gmail dot com

Created attachment 33011
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33011&action=edit
Uncommenting (1), (2) or (3) significantly accelerates this code

When compiling the attached code with:

    g++-4.9 -std=c++11 -O3 -Wall -Wextra -pedantic bug.cpp -o bug

I found that minor changes in seemingly irrelevant aspects of the code had a strong effect on performance. When run with:

    time ./bug

the code as attached gave the following best-of-10 timing on a 2011 Macbook Air:

    real    0m1.718s
    user    0m1.395s
    sys     0m0.306s

Minor changes by uncommenting any of the lines marked with (1), (2), (3) (replacing the line immediately above) yielded significantly better results:

Uncomment line (1):

    real    0m1.343s
    user    0m1.029s
    sys     0m0.312s

Uncomment line (2):

    real    0m1.364s
    user    0m1.062s
    sys     0m0.297s

Uncomment line (3):

    real    0m1.332s
    user    0m1.016s
    sys     0m0.315s

The generated assembly code (-S) differs significantly in all cases. When using -fno-tree-vectorize the performance is similar to the first (slow) result for all variations.

The bug I'm reporting is that the optimization benefits from -ftree-vectorize are apparently lost easily and non-transparently when changing seemingly unrelated parts of the code on a high level, which should not affect performance. I would have expected that none of the changes (1), (2) and (3) would have resulted in a difference in the generated assembly code, and it was very surprising to me that such details mattered so much.
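
For anyone reproducing the timings, an in-process best-of-N measurement is one possible alternative to running `time ./bug` by hand ten times. The following is only a minimal sketch: `workload` is a hypothetical stand-in for the loop in the attachment (which is not reproduced here), and `best_of` is an illustrative helper name, not part of the attached code.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the hot loop in bug.cpp (assumption, not the
// attached code): a simple element-wise update followed by a reduction.
double workload(std::size_t n) {
    std::vector<double> a(n, 1.0), b(n, 2.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = a[i] * b[i] + 0.5;
        sum += a[i];
    }
    return sum;
}

// Call f() `runs` times and return the shortest wall-clock time in seconds.
// Taking the minimum mirrors the "best-of-10" figure quoted in the report.
template <typename F>
double best_of(int runs, F f) {
    assert(runs > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        best = std::min(best, std::chrono::duration<double>(t1 - t0).count());
    }
    return best;
}
```

A call such as `best_of(10, []{ volatile double s = workload(10000000); (void)s; })` would then give one number per build variant; the `volatile` store is there so the compiler cannot discard the result. Unlike `time`, this measures only the timed region, so process startup is excluded.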