http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57534
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2013-06-05
Component|rtl-optimization |tree-optimization
Target Milestone|--- |4.8.2
Summary|Performance regression |[4.8. 4.9 Regression]:
|versus 4.7.3, 4.8.1 is ~15% |Performance regression
|slower |versus 4.7.3, 4.8.1 is ~15%
| |slower
Ever confirmed|0 |1
--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
Confirmed, for some reason tree optimizers CSE part of the address, resulting
in the _.optimized dump that shows:
index.6_13 = (unsigned int) index_1;
_14 = index.6_13 * 8; <- here
_16 = x_15(D) + _14;
_17 = *_16;
_20 = _14 + 8;
_21 = x_15(D) + _20;
_22 = *_21;
_23 = _17 + _22;
_26 = _14 + 16;
_27 = x_15(D) + _26;
_28 = *_27;
_29 = _23 + _28;
_32 = _14 + 24;
_33 = x_15(D) + _32;
_34 = *_33;
_35 = _29 + _34;
sum_36 = _35 + sum_3;
_38 = _14 + 32;
_39 = x_15(D) + _38;
_40 = *_39;
_42 = _14 + 40;
_43 = x_15(D) + _42;
_44 = *_43;
_45 = _40 + _44;
_47 = _14 + 48;
_48 = x_15(D) + _47;
_49 = *_48;
_50 = _45 + _49;
_52 = _14 + 56;
_53 = x_15(D) + _52;
_54 = *_53;
_55 = _50 + _54;
sum2_56 = _55 + sum2_4;
index_57 = index_1 + 8;
Starting from there, the final assembly results in:
.L16:
leal 0(,%esi,8), %eax <- this is CSEd part: %eax = %esi * 8
fldl (%ebx,%esi,8)
faddl 8(%ebx,%eax)
faddl 16(%ebx,%eax)
faddl 24(%ebx,%eax)
faddp %st, %st(2)
fldl 32(%ebx,%eax)
faddl 40(%ebx,%eax)
faddl 48(%ebx,%eax)
faddl 56(%ebx,%eax)
leal 8(%esi), %eax
cmpl %eax, %edi
faddp %st, %st(1)
jg .L17
movl keepgoing, %eax
testl %eax, %eax
je .L18
addl $1, %ebp
xorl %eax, %eax
.L17:
movl %eax, %esi
jmp .L16
Confirmed as tree optimizers problem.