[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

2017-02-20 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
--- Comment #6 from Thomas Koenig ---
A few more test cases with a relatively recent trunk.
POWER7:
[tkoenig@gcc1-power7 ~]$ gcc -mcpu=power7 -O3 foo.c && time ./a.out
41.987257
real 0m3.688s
user 0m3.685s
sys 0m0.002s
[tkoenig@gcc1-

[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

2017-02-20 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
--- Comment #5 from Thomas Koenig ---
(In reply to Richard Biener from comment #3)
> The question is of course whether vector division has comparable latency /
> throughput as the scalar one.
Here's a test case on a rather old CPU, a Core 2 Q820

[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

2017-02-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
--- Comment #4 from Andrew Pinski ---
(In reply to Richard Biener from comment #3)
> The question is of course whether vector division has comparable latency /
> throughput as the scalar one.
On the cores that Cavium produces the answer is yes f

[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

2017-02-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
--- Comment #3 from Richard Biener ---
The question is of course whether vector division has comparable latency / throughput as the scalar one.

[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

2017-02-19 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
--- Comment #2 from Thomas Koenig ---
Another test case. It might even be profitable just to look for divisions, because these are so expensive that packing/unpacking should always be profitable.
double foo(double a, double b) { return 1/a +
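The test case in this comment is cut off in the archive snippet, so the following is not a reconstruction of it. It is only a rough sketch of the idea of packing two independent scalar divisions into one vector division and unpacking the result, written by hand with the GCC vector extension. The function names `recip_sum_scalar` and `recip_sum_packed` and the typedef `v2df` are hypothetical, and whether the packed form is actually faster depends on the target's vector division latency and throughput, which is the open question in this thread.

```c
/* Hypothetical illustration only; not the test case from the bug report. */

/* Two doubles in one 16-byte vector (GCC/Clang vector extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Scalar form: two independent divisions followed by an add. */
double recip_sum_scalar(double a, double b)
{
    return 1.0 / a + 1.0 / b;
}

/* Hand-packed form: pack both divisors into one vector, perform a single
   element-wise vector division, then unpack the two lanes and add them. */
double recip_sum_packed(double a, double b)
{
    v2df num = { 1.0, 1.0 };
    v2df den = { a, b };      /* pack */
    v2df q = num / den;       /* one vector division */
    return q[0] + q[1];       /* unpack and add */
}
```

Both functions should compute the same value for the same inputs; comparing the assembly from `gcc -O3 -S` for the two on a given target would be one way to see whether the scalar form is being packed automatically.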

[Bug tree-optimization/79151] Missed BB vectorization with strided/scalar stores

2017-01-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79151
Richard Biener changed:
What    |Removed |Added
Keywords|        |missed-optimization
Status|