https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68894
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, TREE

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #5)
> I think this is fixed on the trunk now.

Or rather improved enough so PHI_OPT could do something about it if needed:

  if (_6 < _7)
    goto <bb 6>;
  else
    goto <bb 7>;

  <bb 6>:
  _22 = MIN_EXPR <_6, pretmp_25>;
  goto <bb 8>;

  <bb 7>:
  _28 = MIN_EXPR <_7, pretmp_25>;

  <bb 8>:
  # d_2 = PHI <_22(6), _28(7)>

Though we get the following assembly code, which looks correct:

.L12:
        ldr     w4, [x2, x8]
        ldr     w0, [x2, x6]
        ldr     w3, [x2, x7]
        cmp     w0, w4
        csel    w0, w0, w4, le
        cmp     w0, w3
        csel    w0, w0, w3, le
        str     w0, [x1, x2]
        add     x2, x2, 4
        cmp     x5, x2
        bne     .L12

For aarch64 at -O3 we get:

.L18:
        ldr     q0, [x2, x7]
        add     w3, w3, 1
        ldr     q2, [x2, x5]
        ldr     q1, [x2, x6]
        smin    v0.4s, v0.4s, v2.4s
        smin    v0.4s, v0.4s, v1.4s
        str     q0, [x1, x2]
        add     x2, x2, 16
        cmp     w4, w3
        bhi     .L18

-O3 tree level:

  <bb 14>:
  # ivtmp.35_79 = PHI <0(13), ivtmp.35_78(14)>
  _42 = MEM[symbol: a1, index: ivtmp.35_79, offset: 0B];
  _43 = MEM[symbol: a2, index: ivtmp.35_79, offset: 0B];
  pretmp_44 = MEM[symbol: a3, index: ivtmp.35_79, offset: 0B];
  _45 = MIN_EXPR <_42, pretmp_44>;
  _46 = MIN_EXPR <_43, pretmp_44>;
  d_47 = _42 < _43 ? _45 : _46;
  MEM[base: c_12(D), index: ivtmp.35_79, offset: 0B] = d_47;
  ivtmp.35_78 = ivtmp.35_79 + 4;
  if (ivtmp.35_78 == _111)
    goto <bb 12>;
  else
    goto <bb 14>;

As you can see, it does the right thing for the vectorized code. Though in both
cases it is not done at the tree level, only at the RTL level.
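
For reference, the simplification discussed above can be sketched in C. This is not the testcase from the bug report; it is a minimal illustration reconstructed from the dumps. The helper names (min_int, min3_branchy, min3_nested, min3_loop, check_forms_agree) and the loop signature are assumptions. The branchy form mirrors the PHI of two MIN_EXPRs, and the nested form is what the csel pairs and the two smin instructions effectively compute.

```c
#include <assert.h>

/* Hypothetical helper: signed minimum of two ints. */
static int min_int(int x, int y) { return x < y ? x : y; }

/* Branchy form, mirroring the PHI in the dump:
   d = (a < b) ? MIN (a, c) : MIN (b, c).  */
static int min3_branchy(int a, int b, int c)
{
    return a < b ? min_int(a, c) : min_int(b, c);
}

/* Branch-free form: MIN (MIN (a, b), c).  */
static int min3_nested(int a, int b, int c)
{
    return min_int(min_int(a, b), c);
}

/* Loop in the shape suggested by the dumps (three loads, one store).
   The array names follow the -O3 dump; the signature is assumed.  */
void min3_loop(int *c, const int *a1, const int *a2, const int *a3, int n)
{
    for (int i = 0; i < n; i++)
        c[i] = min3_branchy(a1[i], a2[i], a3[i]);
}

/* Exhaustive check over a small range that both forms agree.  */
void check_forms_agree(void)
{
    for (int a = -3; a <= 3; a++)
        for (int b = -3; b <= 3; b++)
            for (int c = -3; c <= 3; c++)
                assert(min3_branchy(a, b, c) == min3_nested(a, b, c));
}
```

The two forms agree because in either arm the selected operand is already MIN (a, b), so taking the minimum with c afterwards gives the same value. A tree-level transform would presumably rewrite the PHI into the nested form directly.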