https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79390
            Bug ID: 79390
           Summary: 10% performance drop in SciMark2 LU after r242550
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krister.walfridsson at gmail dot com
  Target Milestone: ---

Created attachment 40677
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40677&action=edit
The relevant source code and generated asm before/after this change

The dense LU matrix factorization test from the old SciMark2
(http://math.nist.gov/scimark) used in the Phoronix compiler test suite has
regressed 10% compared to the November trunk when run on Intel i7 6800K
Broadwell (compiled with "-O3 -march=native"). GCC 6 generated much slower
code, so this is not a regression compared to released versions of the
compiler.

The regression was introduced in r242550:

------------------------------------------------------------------------
r242550 | wschmidt | 2016-11-17 15:22:17 +0100 (tor, 17 nov 2016) | 18 lines

[gcc]

2016-11-17  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
            Richard Biener  <rguent...@suse.de>

        PR tree-optimization/77848
        * tree-if-conv.c (tree_if_conversion): Always version loops
        unless the user specified -ftree-loop-if-convert.

[gcc/testsuite]

2016-11-17  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
            Richard Biener  <rguent...@suse.de>

        PR tree-optimization/77848
        * gfortran.dg/vect/pr77848.f: New test.
------------------------------------------------------------------------

and has the effect that the pivot-finding loop

    int LU_factor(int M, int N, double **A, int *pivot)
    {
      int minMN = M < N ? M : N;
      int j=0;

      for (j=0; j<minMN; j++)
        {
          /* find pivot in column j and test for singularity. */
          int jp=j;
          int i;

          double t = fabs(A[j][j]);
          for (i=j+1; i<M; i++)
            {
              double ab = fabs(A[i][j]);
              if ( ab > t)
                {
                  jp = i;
                  t = ab;
                }
            }

          pivot[j] = jp;
          ...

is transformed.

The perf output seems to say that this is due to bad branch prediction, but I
do not understand x86 assembly well enough to determine the cause (or to say
whether it really is a bug or just some random thing the compiler cannot know
about...)
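
For anyone who wants to look at the generated code without the rest of the benchmark, the pivot search quoted above can be pulled out into a function of its own. The following is only a sketch under that assumption: find_pivot is a hypothetical name that does not appear in SciMark2, and the loop body is copied from the excerpt rather than from the attachment, so the code GCC emits for it may differ from what it emits inside LU_factor.

```c
#include <math.h>

/* Hypothetical helper, not part of SciMark2: the pivot search from
   LU_factor, isolated for a single column j.  Returns the index of the
   row (at or below j) whose entry in column j has the largest
   absolute value.  */
int find_pivot(int M, double **A, int j)
{
  int jp = j;
  int i;

  double t = fabs(A[j][j]);
  for (i = j + 1; i < M; i++)
    {
      double ab = fabs(A[i][j]);
      /* This is the data-dependent conditional update that the
         report is concerned with.  */
      if (ab > t)
        {
          jp = i;
          t = ab;
        }
    }

  return jp;
}
```

Compiling this with the same "-O3 -march=native" options on a revision before and after r242550 should make it easier to compare how the conditional update of jp and t is handled, although an isolated function is not guaranteed to reproduce the slowdown.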