https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105715
Bug ID: 105715
Summary: [13 Regression] missed RTL if-conversion with COND_EXPR change
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

gcc.target/i386/pr45685.c with -march=cascadelake shows missing RTL
if-conversion. The crucial GIMPLE difference is

  _36 = _3 > 0;
  iftmp.0_13 = _36 ? 1 : -1;
  prephitmp_31 = ABS_EXPR <_3>;
  prephitmp_32 = _36 ? -1 : 1;
  prephitmp_33 = _36 ? 4294967295 : 1;
  prephitmp_35 = _36 ? 1 : 4294967295;
  ...
  _29 = prephitmp_31 != _42;
  val_12 = _29 ? prephitmp_32 : iftmp.0_13;
  prephitmp_37 = _29 ? prephitmp_33 : prephitmp_35;

vs.

  iftmp.0_13 = _3 > 0 ? 1 : -1;
  prephitmp_31 = ABS_EXPR <_3>;
  prephitmp_32 = _3 > 0 ? -1 : 1;
  prephitmp_33 = _3 > 0 ? 4294967295 : 1;
  prephitmp_35 = _3 > 0 ? 1 : 4294967295;
  ...
  val_12 = i.1_6 == prephitmp_31 ? iftmp.0_13 : prephitmp_32;
  prephitmp_37 = i.1_6 != prephitmp_31 ? prephitmp_33 : prephitmp_35;

where the split-out condition is now CSEd and the multiple uses keep us
from TERing the comparison. Previously we got two compare & jump
sequences, while now we get the compare computing a QImode value and
then the two compare & jump sequences.

While without -march=cascadelake we do get the desired number of cmovs,
the generated code is still worse.

The testcase is unfortunately a bit obfuscated due to the many
if-conversions taking place. Smaller GIMPLE testcases do not exhibit
jumpy RTL expansion.

void __GIMPLE(ssa, startwith("optimized"))
foo (long *p, long a, long b, long c, long d, long e, long f)
{
  _Bool _2;
  long _3;
  long _8;

  __BB(2):
  _2 = a_1(D) < b_10(D);
  _3 = _2 ? c_4(D) : d_5(D);
  _8 = _2 ? f_6(D) : e_7(D);
  __MEM <long> (p_9(D)) = _3;
  __MEM <long> (p_9(D) + 4) = _8;
  return;
}

#if __GNUC__ < 13
void __GIMPLE(ssa, startwith("optimized"))
bar (long *p, long a, long b, long c, long d, long e, long f)
{
  long _3;
  long _8;

  __BB(2):
  _3 = a_1(D) < b_10(D) ? c_4(D) : d_5(D);
  _8 = a_1(D) >= b_10(D) ? e_7(D) : f_6(D);
  __MEM <long> (p_9(D)) = _3;
  __MEM <long> (p_9(D) + 4) = _8;
  return;
}
#endif
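
For readers less used to the GIMPLE front-end syntax, the two shapes in the small testcase can be sketched in plain C. This is only an illustration: the names select_shared and select_repeated are hypothetical, the store offsets are simplified to p[0] and p[1], and an optimizing compiler is free to canonicalize both functions to the same GIMPLE, so this is not expected to reproduce the difference by itself.

```c
/* Shape of foo: the comparison is computed once into a boolean
   and that single value feeds both selects (multi-use).  */
void select_shared(long *p, long a, long b, long c, long d, long e, long f)
{
    _Bool lt = a < b;
    p[0] = lt ? c : d;
    p[1] = lt ? f : e;
}

/* Shape of bar: each select embeds its own comparison, the second
   one inverted with its arms swapped, so the results are the same.  */
void select_repeated(long *p, long a, long b, long c, long d, long e, long f)
{
    p[0] = a < b ? c : d;
    p[1] = a >= b ? e : f;
}
```

Both functions store the same values for any inputs; only the form of the condition differs, which is the property the report is about.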