https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
Richard Biener changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #11 from Richard Biener ---
Author: rguenth
Date: Thu Apr 20 14:26:26 2017
New Revision: 247026
URL: https://gcc.gnu.org/viewcvs?rev=247026&root=gcc&view=rev
Log:
2017-04-20 Richard Biener
PR tree-optimization/57796
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #10 from vincenzo Innocente ---
Added a self-contained "benchmark".
On my machine:
[innocent@vinavx3 ctest]$ c++ -Ofast -Wall SparseOnly.c -march=native ; time ./a.out
0.496u 0.000s 0:00.49 100.0%0+0k 0+0io 0pf+0w
[innocent@vinavx3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #9 from vincenzo Innocente ---
Created attachment 41070
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41070&action=edit
self contained benchmark of scimark2 SparseMat must
content is not randomized
param must be modified by ha
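For readers without the attachment, the kind of loop under discussion can be sketched as a compressed-sparse-row matrix-vector product. This is only a minimal sketch: the function name `sparse_matvec` and the parameter names are hypothetical and are not taken from attachment 41070. The indirect load `x[col[i]]` is the access a vectorizer would have to turn into a gather.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) matrix-vector product.
// Names are illustrative only; they do not come from the attached benchmark.
//   val : nonzero values, stored row by row
//   col : column index of each nonzero
//   row : row[r] .. row[r+1]-1 are the positions of row r's nonzeros
void sparse_matvec(const std::vector<double>& val,
                   const std::vector<int>& col,
                   const std::vector<int>& row,
                   const std::vector<double>& x,
                   std::vector<double>& y)
{
    const std::size_t rows = row.size() - 1;
    for (std::size_t r = 0; r < rows; ++r) {
        double sum = 0.0;
        for (int i = row[r]; i < row[r + 1]; ++i)
            sum += x[col[i]] * val[i];  // x[col[i]] is the indirect (gather-style) load
        y[r] = sum;
    }
}
```

Whether the inner loop is worth vectorizing depends on how many nonzeros each row has and on how scattered the `col` indices are, which is the cost-model question raised in this PR.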
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #8 from vincenzo Innocente ---
My understanding of the gather latency is that it essentially corresponds to a
load per cacheline: fast if all items are close by, slower than scalar loads if
items are all in different cachelines. Not su
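To make the "one load per cacheline" reading concrete, a small helper can count how many distinct cache lines a set of gather indices touches. This is a rough sketch under stated assumptions: 64-byte lines, 8-byte `double` elements, and a base address aligned to a line boundary. The name `cache_lines_touched` is hypothetical, and the count is not a measured latency model.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Count the distinct cache lines touched by loads of base[idx[k]].
// Assumptions (not from the thread): 64-byte cache lines, the array base is
// line-aligned, and all indices are non-negative.
std::size_t cache_lines_touched(const std::vector<int>& idx,
                                std::size_t elem_size = sizeof(double),
                                std::size_t line_size = 64)
{
    std::set<std::size_t> lines;
    for (int i : idx)
        lines.insert(static_cast<std::size_t>(i) * elem_size / line_size);
    return lines.size();
}
```

Under these assumptions, indices {0, 1, 2, 3} fall in a single line, while {0, 8, 16, 24} fall in four different lines, which is the slow case described above.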
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #5 from vincenzo Innocente ---
So with the latest 4.9
gcc version 4.10.0 20140611 (experimental) [trunk revision 211467] (GCC)
the situation has not changed much (the scalar version is now faster!):
I think that the cost of gather instructi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #4 from Yuri Rumyantsev ---
(In reply to Jakub Jelinek from comment #3)
> By tuning I've meant the vectorizer cost model. If the desirability of
> gathers vs. no vectorization at all doesn't depend only on the insns in the
> loop, but
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
--- Comment #3 from Jakub Jelinek ---
By tuning I've meant the vectorizer cost model. If the desirability of gathers
vs. no vectorization at all doesn't depend only on the insns in the loop, but
also on how many iterations the loop has, then perh
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57796
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment #