http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529
--- Comment #8 from Tobias Burnus ---
(In reply to H.J. Lu from comment #7)
> Can you add "-funroll-loops --param max-unroll-times=7"?
On Intel Core i5-3570 (glibc-2.18, openSUSE 13.1b1), I get with the attached
Intel .s file and today's GCC:
re
--- Comment #7 from H.J. Lu ---
Can you add "-funroll-loops --param max-unroll-times=7"?
--- Comment #6 from Marc Glisse ---
Please ignore my last comment. I now see the same 30% difference; the rest must
have been a user error on my part.
--- Comment #5 from Marc Glisse ---
I actually see gcc 4 times (not just 30%) slower than icpc here, using the same
command lines. The asm produced by gcc contains a large number of mov instructions.
--- Comment #4 from Tobias Burnus ---
(In reply to Marc Glisse from comment #3)
> Does it help if you pass the_bins_size as int*restrict (and adapt the uses)?
> Or use a local variable instead that you write at the end?
That doesn't have any effe
--- Comment #3 from Marc Glisse ---
Does it help if you pass the_bins_size as int*restrict (and adapt the uses)? Or
use a local variable instead that you write at the end? Gcc has a notoriously
restricted view of what restrict means, compared to m
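
For reference, the two variants asked about in this comment could look roughly like
the sketch below. The real loop in test.cc is not shown in this thread, so the
function names, the other parameters and the loop body here are hypothetical; only
the_bins_size is taken from the discussion. __restrict__ is the GNU spelling, since
C++ has no restrict keyword. Whether either variant changes the generated code is
exactly what the question is about, so this only illustrates the source change.

```cpp
#include <cstddef>

// Hypothetical stand-in for the routine in test.cc (not shown in this thread).
// Variant 1: the counter is passed as a restrict-qualified pointer, telling the
// compiler that writes to bins[] cannot alias *the_bins_size.
void fill_restrict(const int* __restrict__ values, std::size_t n, int threshold,
                   int* __restrict__ bins, int* __restrict__ the_bins_size)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] < threshold) {
            bins[*the_bins_size] = values[i];
            ++*the_bins_size;
        }
    }
}

// Variant 2: keep the counter in a local variable and write it back once at
// the end, so the loop never stores through the_bins_size.
void fill_local(const int* values, std::size_t n, int threshold,
                int* bins, int* the_bins_size)
{
    int count = *the_bins_size;
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] < threshold) {
            bins[count++] = values[i];
        }
    }
    *the_bins_size = count;
}
```

Both functions compute the same result when the arguments do not overlap; they
differ only in how much aliasing information the compiler is given.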
--- Comment #2 from Tobias Burnus ---
Created attachment 30895
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30895&action=edit
Assembler generated by Intel's icpc for test.cc
--- Comment #1 from Tobias Burnus ---
Created attachment 30894
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30894&action=edit
Main file (calls test file in a loop)