On Mon, 20 Jan 2014, Tim Shen wrote:
> ...even the regex input is nothing to do with quantifiers at all (say regex re(" ")), g++ -O3 generates slower code than -O2:
>
> ~ # g++ -O2 perf.cc && time ./a.out
> ./a.out 0.46s user 0.00s system 99% cpu 0.461 total
> ~ # g++ -O3 perf.cc && time ./a.out
> ./a.out 0.56s user 0.00s system 99% cpu 0.569 total
>
> perf.cc is almost the same as testsuite/performance/28_regex/split.cc. Following the man page, I found that g++ claims that the difference between -O3 and -O2 are:
It is a FAQ that you can't get the effects of -Oy with -Ox and a bunch of -f flags. Some things depend directly on the optimization level. Note that you can also try the reverse: start from -O3 and use -fno-* flags.
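A minimal sketch of that reverse approach: start from -O3 and switch off one optimization at a time. The flag list is an assumption, taken from what the GCC manual of this era lists as enabled by -O3 on top of -O2; check the manual for your version. perf.cc and the default a.out output name come from the commands quoted above. If no single flag recovers the -O2 timing, the slowdown probably comes from something tied directly to the optimization level.

```shell
#!/bin/bash
# Sketch only: SRC and the flag list are assumptions, adjust for your setup.
SRC=${SRC:-perf.cc}
O3_ONLY_FLAGS="-fno-inline-functions -fno-unswitch-loops \
-fno-predictive-commoning -fno-gcse-after-reload \
-fno-tree-vectorize -fno-ipa-cp-clone"

# Print one compile command per flag, each starting from -O3.
bisect_commands() {
    for flag in $O3_ONLY_FLAGS; do
        echo "g++ -O3 $flag $SRC"
    done
}

# Show each command; build and time it only if the source file is present.
bisect_commands | while read -r cmd; do
    echo "== $cmd"
    if [ -f "$SRC" ]; then
        $cmd && time ./a.out
    fi
done
```

A variant whose timing drops back to the -O2 figure points at the optimization worth investigating.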
I would first add -march=native to make sure this isn't the result of optimizing for a different platform.
I would suggest you use -fdump-tree-optimized, look at the files generated at -O2 and -O3 (they are in a vaguely C-like dialect), and see if you can find a difference that might explain the result. If so, then you can use -fdump-tree-all and find out where exactly gcc is going wrong. If not, there is -da, but the files will be much harder to read.
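One way to set up that comparison, as a sketch: compile in a separate directory per optimization level so the two dumps do not overwrite each other, then diff them. The directory names and the output file O2-vs-O3.diff are made up for illustration, SRC is assumed to be a relative path, and the exact dump file name (it includes a pass number) varies between GCC versions, hence the glob.

```shell
#!/bin/bash
# Sketch only: directory and file names here are assumptions.
SRC=${SRC:-perf.cc}

# Print the command that writes the final tree dump for one level (O2 or O3).
# It is meant to run from inside a subdirectory, hence the ../ prefix.
dump_cmd() {
    echo "g++ -$1 -c -fdump-tree-optimized ../$SRC"
}

for level in O2 O3; do
    echo "== (in dump-$level) $(dump_cmd "$level")"
    if [ -f "$SRC" ]; then
        mkdir -p "dump-$level"
        (cd "dump-$level" && $(dump_cmd "$level"))
    fi
done

# Compare the two dumps; diff exits non-zero when they differ, which is expected.
if [ -f "$SRC" ]; then
    diff -u dump-O2/*.optimized dump-O3/*.optimized > O2-vs-O3.diff
fi
```

The same layout works with -fdump-tree-all in place of -fdump-tree-optimized, at the cost of many more files per directory.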
In any case, reducing the testcase can only make it easier to understand the issue.
-- Marc Glisse
