[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #8 from Tobias Burnus ---
(In reply to H.J. Lu from comment #7)
> Can you add "-funroll-loops --param max-unroll-times=7"?
On Intel Core i5-3570 (glibc-2.18, openSUSE 13.1b1), I get with the attached Intel .s file and today's GCC: re

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #7 from H.J. Lu --- Can you add "-funroll-loops --param max-unroll-times=7"?

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #6 from Marc Glisse --- Please ignore my last comment, I now see the same 30% difference, the rest must have been a user error on my part.

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #5 from Marc Glisse --- I actually see gcc 4 times (not just 30%) slower than icpc here using the same command lines. The asm produced by gcc contains tons of mov insn.

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #4 from Tobias Burnus ---
(In reply to Marc Glisse from comment #3)
> Does it help if you pass the_bins_size as int*restrict (and adapt the uses)?
> Or use a local variable instead that you write at the end?
That doesn't have any effe

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #3 from Marc Glisse --- Does it help if you pass the_bins_size as int*restrict (and adapt the uses)? Or use a local variable instead that you write at the end? Gcc has a notoriously restricted view of what restrict means, compared to m
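
For readers without the attachments, the two suggestions in comment #3 can be sketched roughly as follows. This is only an illustration under assumptions: the reported test.cc is not reproduced in this listing, so the kernel, the function names and every parameter other than the_bins_size are hypothetical. `__restrict__` is the compiler-extension spelling accepted in C++ by GCC, Clang and the Intel compiler; it is not part of standard C++.

```cpp
#include <cstddef>

// Hypothetical kernel, NOT the test.cc from the report: copy the positive
// entries of `values` into `bins` and track the fill count.

// Baseline: the count is updated through a plain pointer on every iteration,
// so the compiler must assume a store to bins[] may change *the_bins_size.
void fill_through_pointer(const int* values, std::size_t n,
                          int* bins, int* the_bins_size) {
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] > 0) {
            bins[*the_bins_size] = values[i];
            ++*the_bins_size;
        }
    }
}

// Suggestion 1: pass the_bins_size (and the arrays) as restrict-qualified
// pointers, promising the compiler that they do not alias each other.
void fill_restrict(const int* __restrict__ values, std::size_t n,
                   int* __restrict__ bins, int* __restrict__ the_bins_size) {
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] > 0) {
            bins[*the_bins_size] = values[i];
            ++*the_bins_size;
        }
    }
}

// Suggestion 2: keep the count in a local variable and write it back once
// at the end, so the loop body no longer touches *the_bins_size at all.
void fill_local_counter(const int* values, std::size_t n,
                        int* bins, int* the_bins_size) {
    int size = *the_bins_size;
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] > 0) {
            bins[size] = values[i];
            ++size;
        }
    }
    *the_bins_size = size;
}
```

All three variants compute the same result when the arguments do not overlap; whether either rewrite changes the generated code for the reported loop depends on the compiler and on the real source.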

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #2 from Tobias Burnus --- Created attachment 30895 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30895&action=edit Assembler generated by Intel's icpc for test.cc

[Bug middle-end/58529] Loop 30% faster with Intel than with GCC

2013-09-25 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529 --- Comment #1 from Tobias Burnus --- Created attachment 30894 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30894&action=edit Main file (calls test file in a loop)