On Sat, Apr 23, 2022 at 8:53 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> The following testcase regressed on x86_64 on the trunk; due to some GIMPLE
> pass changes (r12-7687) we end up with an *.optimized dump difference of:
> @@ -8,14 +8,14 @@ int foo (int i)
>
>    <bb 2> [local count: 1073741824]:
>    if (i_2(D) != 0)
> -    goto <bb 4>; [35.00%]
> +    goto <bb 3>; [35.00%]
>    else
> -    goto <bb 3>; [65.00%]
> +    goto <bb 4>; [65.00%]
>
> -  <bb 3> [local count: 697932184]:
> +  <bb 3> [local count: 375809640]:
>
>    <bb 4> [local count: 1073741824]:
> -  # iftmp.0_1 = PHI <5(2), i_2(D)(3)>
> +  # iftmp.0_1 = PHI <5(3), i_2(D)(2)>
>    return iftmp.0_1;
>
>  }
> and similarly for the other functions.  That is functionally equivalent and
> there is no canonical form for those.  The reason for i_2(D) in the PHI
> argument, as opposed to 0, is the uncprop pass, which is in many cases
> beneficial for expansion because we don't need to load the value into some
> pseudo in one of the if blocks.
> Now, for the 11.x ordering we have the pseudo = i insn in the extended basic
> block (it comes first), and so forwprop1 undoes what uncprop does by
> propagating constant 0 there.  But for the 12.x ordering, the extended basic
> block contains pseudo = 5, pseudo = i is in the other bb, and so fwprop1
> doesn't change it.
> During the ce1 pass, we attempt to emit a conditional move, and we have very
> nice code for the cases where both last operands of ?: are constant, and yet
> another path for !TARGET_CMOVE if at least one of them is.
>
> The following patch undoes the uncprop behavior during
> ix86_expand_int_movcc, but just for those spots that can benefit from both
> or at least one of the operands being constant, leaving the pure cmov case
> as is (because there it is useful not to have to load a constant into a
> pseudo, as it already is in one).  We can do that in the
>   op0 == op1 ? op0 : op3
> or
>   op0 != op1 ? op2 : op0
> cases if op1 is a CONST_INT, by pretending it is
>   op0 == op1 ? op1 : op3
> or
>   op0 != op1 ? op2 : op1
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-04-23  Jakub Jelinek  <ja...@redhat.com>
>
> 	PR target/105338
> 	* config/i386/i386-expand.cc (ix86_expand_int_movcc): Handle
> 	op0 == cst1 ? op0 : op3 like op0 == cst1 ? cst1 : op3 for the
> 	non-cmov cases.
>
> 	* gcc.target/i386/pr105338.c: New test.
OK.

Thanks,
Uros.

> --- gcc/config/i386/i386-expand.cc.jj	2022-04-13 15:42:39.000000000 +0200
> +++ gcc/config/i386/i386-expand.cc	2022-04-22 14:18:27.347135185 +0200
> @@ -3136,6 +3136,8 @@ ix86_expand_int_movcc (rtx operands[])
>    bool sign_bit_compare_p = false;
>    rtx op0 = XEXP (operands[1], 0);
>    rtx op1 = XEXP (operands[1], 1);
> +  rtx op2 = operands[2];
> +  rtx op3 = operands[3];
>
>    if (GET_MODE (op0) == TImode
>        || (GET_MODE (op0) == DImode
> @@ -3153,17 +3155,29 @@ ix86_expand_int_movcc (rtx operands[])
>        || (op1 == constm1_rtx && (code == GT || code == LE)))
>      sign_bit_compare_p = true;
>
> +  /* op0 == op1 ? op0 : op3 is equivalent to op0 == op1 ? op1 : op3,
> +     but if op1 is a constant, the latter form allows more optimizations,
> +     either through the last 2 ops being constant handling, or the one
> +     constant and one variable cases.  On the other side, for cmov the
> +     former might be better as we don't need to load the constant into
> +     another register.  */
> +  if (code == EQ && CONST_INT_P (op1) && rtx_equal_p (op0, op2))
> +    op2 = op1;
> +  /* Similarly for op0 != op1 ? op2 : op0 and op0 != op1 ? op2 : op1.  */
> +  else if (code == NE && CONST_INT_P (op1) && rtx_equal_p (op0, op3))
> +    op3 = op1;
> +
>    /* Don't attempt mode expansion here -- if we had to expand 5 or 6
>       HImode insns, we'd be swallowed in word prefix ops.  */
>
>    if ((mode != HImode || TARGET_FAST_PREFIX)
>        && (mode != (TARGET_64BIT ? TImode : DImode))
> -      && CONST_INT_P (operands[2])
> -      && CONST_INT_P (operands[3]))
> +      && CONST_INT_P (op2)
> +      && CONST_INT_P (op3))
>      {
>        rtx out = operands[0];
> -      HOST_WIDE_INT ct = INTVAL (operands[2]);
> -      HOST_WIDE_INT cf = INTVAL (operands[3]);
> +      HOST_WIDE_INT ct = INTVAL (op2);
> +      HOST_WIDE_INT cf = INTVAL (op3);
>        HOST_WIDE_INT diff;
>
>        diff = ct - cf;
> @@ -3559,6 +3573,9 @@ ix86_expand_int_movcc (rtx operands[])
>    if (BRANCH_COST (optimize_insn_for_speed_p (), false) <= 2)
>      return false;
>
> +  operands[2] = op2;
> +  operands[3] = op3;
> +
>    /* If one of the two operands is an interesting constant, load a
>       constant with the above and mask it in with a logical operation.  */
>
> --- gcc/testsuite/gcc.target/i386/pr105338.c.jj	2022-04-22 16:14:35.827045371 +0200
> +++ gcc/testsuite/gcc.target/i386/pr105338.c	2022-04-22 16:20:43.579913630 +0200
> @@ -0,0 +1,26 @@
> +/* PR target/105338 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-ipa-icf -masm=att" } */
> +/* { dg-final { scan-assembler-times "\tnegl\t" 3 } } */
> +/* { dg-final { scan-assembler-times "\tsbbl\t" 3 } } */
> +/* { dg-final { scan-assembler-times "\tandl\t" 3 } } */
> +
> +int
> +foo (int i)
> +{
> +  return i ? 5 : 0;
> +}
> +
> +int
> +bar (int b)
> +{
> +  return !!b * 5;
> +}
> +
> +int
> +baz (int b)
> +{
> +  if (!b)
> +    return 0;
> +  return 5;
> +}
>
> 	Jakub