[Bug c++/84280] New: Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: patrikhuber at gmail dot com
Target Milestone: ---

Hello, I noticed today what may look like quite a large performance regression with Eigen (3.3.4) matrix multiplication. It only seems to

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #3 from Patrik Huber --- @Richard: I'm not 100% sure what you mean by "preprocessed source", but I googled it and you probably mean the output of compiling with "-c -save-temps". Please see attached.

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #4 from Patrik Huber --- Created attachment 43363 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43363&action=edit gcc5_gemm_test.s

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #5 from Patrik Huber --- Created attachment 43364 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43364&action=edit gcc7_gemm_test.s

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #6 from Patrik Huber --- I could also upload the .ii files for you, but they are 5 MB, which the bug tracker doesn't allow (1 MB limit).

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #7 from Patrik Huber --- Created attachment 43365 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43365&action=edit gemm_test.cpp

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #8 from Patrik Huber --- Created attachment 43366 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43366&action=edit full_log.txt

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #10 from Patrik Huber --- Created attachment 43367 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43367&action=edit gcc5_gemm_test.ii

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #11 from Patrik Huber --- Created attachment 43368 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43368&action=edit gcc7_gemm_test.ii

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #13 from Patrik Huber ---
>> Did you try with FDO? (-fprofile-generate, run, -fprofile-use)
I just tried this with g++-7. It didn't help: the final executable has the same slower run time as in the attached log without FDO.
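For readers who want to try the FDO cycle mentioned above without the attachments, here is a minimal sketch of a timing harness. It is not the reporter's gemm_test.cpp and does not use Eigen: `multiply_naive` and `time_ms` are hypothetical names introduced only for illustration, and a naive triple loop will not necessarily reproduce the regression discussed in this bug.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GEMM test (NOT the attached gemm_test.cpp):
// naive product C = A * B of two row-major n x n matrices, without Eigen.
std::vector<double> multiply_naive(const std::vector<double>& a,
                                   const std::vector<double>& b,
                                   std::size_t n) {
    std::vector<double> c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            // Innermost loop walks contiguous rows of b and c.
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}

// Returns the wall-clock time of a single call to f, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Assuming these functions are called from a `main` in a file named `bench.cpp` (a made-up name), the three FDO steps quoted above would be: build with `g++-7 -O3 -fprofile-generate bench.cpp -o bench`, run `./bench` once so that the profile data (.gcda) is written, then rebuild with `g++-7 -O3 -fprofile-use bench.cpp -o bench` and time the result again.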

[Bug target/84280] [6/7/8 Regression] Performance regression in g++-7 with Eigen for non-AVX2 CPUs

2018-02-08 Thread patrikhuber at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84280 --- Comment #14 from Patrik Huber --- It even seems a few percent slower after the FDO stuff. But `-fprofile-use` is a bit weird. If there is no .gcda file, it doesn't complain. If you give it a file that doesn't exist (e.g. -fprofile-use=fo