On Sat, Apr 23, 2022 at 8:53 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> The following testcase regressed on x86_64 on the trunk: due to some GIMPLE
> pass changes (r12-7687) we end up with a *.optimized dump difference of:
> @@ -8,14 +8,14 @@ int foo (int i)
>
>    <bb 2> [local count: 1073741824]:
>    if (i_2(D) != 0)
> -    goto <bb 4>; [35.00%]
> +    goto <bb 3>; [35.00%]
>    else
> -    goto <bb 3>; [65.00%]
> +    goto <bb 4>; [65.00%]
>
> -  <bb 3> [local count: 697932184]:
> +  <bb 3> [local count: 375809640]:
>
>    <bb 4> [local count: 1073741824]:
> -  # iftmp.0_1 = PHI <5(2), i_2(D)(3)>
> +  # iftmp.0_1 = PHI <5(3), i_2(D)(2)>
>    return iftmp.0_1;
>
>  }
> and similarly for the other functions.  That is functionally equivalent and
> there is no canonical form for those.  The reason for i_2(D) in the PHI
> argument as opposed to 0 is the uncprop pass, that is in many cases
> beneficial for expansion as we don't need to load the value into some pseudo
> in one of the if blocks.
> Now, for the 11.x ordering we have the pseudo = i insn in the extended basic
> block (it comes first) and so forwprop1 undoes what uncprop does by
> propagating constant 0 there.  But for the 12.x ordering, the extended basic
> block contains pseudo = 5 and pseudo = i is in the other bb and so forwprop1
> doesn't change it.
> During the ce1 pass, we attempt to emit a conditional move and we have very
> nice code for the cases where both of the last two operands of ?: are
> constant, and yet another path for !TARGET_CMOVE if at least one of them is.
>
> The following patch will undo the uncprop behavior during
> ix86_expand_int_movcc, but just for those spots that can benefit from both
> or at least one operand being constant, leaving the pure cmov case as is
> (because there it is useful not to have to load a constant into a pseudo
> as it already is in one).  We can do that in the
> op0 == op1 ? op0 : op3
> or
> op0 != op1 ? op2 : op0
> cases if op1 is a CONST_INT by pretending it is
> op0 == op1 ? op1 : op3
> or
> op0 != op1 ? op2 : op1
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-04-23  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/105338
>         * config/i386/i386-expand.cc (ix86_expand_int_movcc): Handle
>         op0 == cst1 ? op0 : op3 like op0 == cst1 ? cst1 : op3 for the non-cmov
>         cases.
>
>         * gcc.target/i386/pr105338.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-expand.cc.jj   2022-04-13 15:42:39.000000000 +0200
> +++ gcc/config/i386/i386-expand.cc      2022-04-22 14:18:27.347135185 +0200
> @@ -3136,6 +3136,8 @@ ix86_expand_int_movcc (rtx operands[])
>    bool sign_bit_compare_p = false;
>    rtx op0 = XEXP (operands[1], 0);
>    rtx op1 = XEXP (operands[1], 1);
> +  rtx op2 = operands[2];
> +  rtx op3 = operands[3];
>
>    if (GET_MODE (op0) == TImode
>        || (GET_MODE (op0) == DImode
> @@ -3153,17 +3155,29 @@ ix86_expand_int_movcc (rtx operands[])
>        || (op1 == constm1_rtx && (code == GT || code == LE)))
>      sign_bit_compare_p = true;
>
> +  /* op0 == op1 ? op0 : op3 is equivalent to op0 == op1 ? op1 : op3,
> +     but if op1 is a constant, the latter form allows more optimizations,
> +     either through the last 2 ops being constant handling, or the one
> +     constant and one variable cases.  On the other side, for cmov the
> +     former might be better as we don't need to load the constant into
> +     another register.  */
> +  if (code == EQ && CONST_INT_P (op1) && rtx_equal_p (op0, op2))
> +    op2 = op1;
> +  /* Similarly for op0 != op1 ? op2 : op0 and op0 != op1 ? op2 : op1.  */
> +  else if (code == NE && CONST_INT_P (op1) && rtx_equal_p (op0, op3))
> +    op3 = op1;
> +
>    /* Don't attempt mode expansion here -- if we had to expand 5 or 6
>       HImode insns, we'd be swallowed in word prefix ops.  */
>
>    if ((mode != HImode || TARGET_FAST_PREFIX)
>        && (mode != (TARGET_64BIT ? TImode : DImode))
> -      && CONST_INT_P (operands[2])
> -      && CONST_INT_P (operands[3]))
> +      && CONST_INT_P (op2)
> +      && CONST_INT_P (op3))
>      {
>        rtx out = operands[0];
> -      HOST_WIDE_INT ct = INTVAL (operands[2]);
> -      HOST_WIDE_INT cf = INTVAL (operands[3]);
> +      HOST_WIDE_INT ct = INTVAL (op2);
> +      HOST_WIDE_INT cf = INTVAL (op3);
>        HOST_WIDE_INT diff;
>
>        diff = ct - cf;
> @@ -3559,6 +3573,9 @@ ix86_expand_int_movcc (rtx operands[])
>        if (BRANCH_COST (optimize_insn_for_speed_p (), false) <= 2)
>         return false;
>
> +      operands[2] = op2;
> +      operands[3] = op3;
> +
>        /* If one of the two operands is an interesting constant, load a
>          constant with the above and mask it in with a logical operation.  */
>
> --- gcc/testsuite/gcc.target/i386/pr105338.c.jj 2022-04-22 16:14:35.827045371 +0200
> +++ gcc/testsuite/gcc.target/i386/pr105338.c    2022-04-22 16:20:43.579913630 +0200
> @@ -0,0 +1,26 @@
> +/* PR target/105338 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-ipa-icf -masm=att" } */
> +/* { dg-final { scan-assembler-times "\tnegl\t" 3 } } */
> +/* { dg-final { scan-assembler-times "\tsbbl\t" 3 } } */
> +/* { dg-final { scan-assembler-times "\tandl\t" 3 } } */
> +
> +int
> +foo (int i)
> +{
> +  return i ? 5 : 0;
> +}
> +
> +int
> +bar (int b)
> +{
> +  return !!b * 5;
> +}
> +
> +int
> +baz (int b)
> +{
> +  if (!b)
> +    return 0;
> +  return 5;
> +}
>
>         Jakub
>
