http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58529
            Bug ID: 58529
           Summary: Loop 30% faster with Intel than with GCC
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
              Host: x86-64-gnu-linux

Created attachment 30893
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30893&action=edit
Test file

The Intel icpc 13.1.1 compiler generates code which is 30% faster than GCC 4.9
for the following function (see test2.cc):

  the_bins_size = 0;
  for (int i = 0; i < arraylength; i++) {
    if (coordexist[i]) {
      the_bins[the_bins_size] = i;
      coordexist[i] = the_bins_size++;
    }
  }

GCC:
real    0m2.493s
user    0m2.491s
sys     0m0.002s

-funroll-loops GCC:
real    0m1.494s
user    0m1.493s
sys     0m0.000s

ICC:
real    0m1.160s
user    0m1.157s
sys     0m0.001s

The main function (test.main.cc) has been compiled with g++; used system:
Intel(R) Xeon(R) CPU E5-2630, CentOS 6/x86-64-gnu-linux, glibc-2.12.

Compiler options:

  g++ -march=native -fno-rtti -fno-exceptions -Ofast -std=c++
  icpc -O3 -no-prec-div -xHost -fno-rtti -fno-exceptions
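
For readers without access to the attachment, the loop above can be wrapped in a self-contained function roughly as follows. This is only a sketch, not the contents of test2.cc: the function name compact_indices, the int element types, and the pointer-based signature are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical wrapper around the loop from the report. The name, the
// signature, and the int element types are assumptions; test2.cc may differ.
//
// For every nonzero entry coordexist[i], the index i is appended to the_bins
// and coordexist[i] is overwritten with the position at which it was stored.
// Returns the number of entries written to the_bins.
int compact_indices(int *coordexist, int *the_bins, int arraylength)
{
    assert(arraylength >= 0);

    int the_bins_size = 0;
    for (int i = 0; i < arraylength; i++) {
        if (coordexist[i]) {
            the_bins[the_bins_size] = i;      // record the index of the set entry
            coordexist[i] = the_bins_size++;  // replace the flag by its slot number
        }
    }
    return the_bins_size;
}
```

Note that the write position the_bins_size depends on how many earlier iterations took the branch, so each iteration carries a dependency on the previous ones; this is presumably part of why the generated code differs so much between compilers and with -funroll-loops.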