Hi!

In expand_mul_overflow, we emit (depending on operand ranges etc.) more than
one multiplication to optimize for the common case.  One of those is where
we are guaranteed that both operands are either sign or zero extended from
half precision types to the full precision, we can then use much shorter
sequence.  opN_small_p is a compile time predicate whether the corresponding
argument is known to be in that range.  If yes, we just emit the simple
code, otherwise we emit a runtime conditional to check (for uns compare the
high half of bits against zero, for !uns against signbit of the low part
shifted arithmetically right).  In both cases we use SUBREG_PROMOTED_VAR_P
on the SUBREG to propagate this information down to the rest of the
expansion code.

Unfortunately, seems the RTL hoist pass hapilly hoists the subreg/s/v in
this case before the conditional branch, the r259649 change in
simplify-rtx.c optimizes more than in the past a ZERO_EXTENSION from
subreg/s/v into just a pseudo copy and later on yet another pass optimizes
the comparison away.

The following patch fixes that by emitting slightly different code if
!opN_small_p - instead of adding a subreg/s/v on the original pseudo, it
adds a new pseudo, copies original pseudo to it only in the bb guarded with
the condition and makes subreg/s/v on that new pseudo.  That is enough to
tell the hoisting that it shouldn't hoist it, if even that wouldn't work,
I'd say that SUBREG_PROMOTED_VAR_P would be pretty much useless.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-04-16  Jakub Jelinek  <ja...@redhat.com>

        PR middle-end/90095
        * internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P
        on lowpart SUBREG of op0 if !op0_small_p, instead copy op0 into a new
        pseudo and set SUBREG_PROMOTED_VAR_P on a lowpart SUBREG of that
        pseudo.  Similarly for op1.

        * gcc.dg/pr90095.c: New test.

--- gcc/internal-fn.c.jj        2019-01-07 09:47:33.000000000 +0100
+++ gcc/internal-fn.c   2019-04-15 17:41:50.915039293 +0200
@@ -1756,13 +1756,35 @@ expand_mul_overflow (location_t loc, tre
          rtx lopart0s = lopart0, lopart1s = lopart1;
          if (GET_CODE (lopart0) == SUBREG)
            {
-             lopart0s = shallow_copy_rtx (lopart0);
+             if (!op0_small_p)
+               {
+                 /* If !op0_small_p, the SUBREG_PROMOTED_VAR_P state we
+                    want to set is just local to the containing basic block,
+                    guarded with the above conditional branch.  Make sure
+                    to create a new pseudo, so that hoisting doesn't hoist
+                    the promoted subreg before the conditional and optimize
+                    away the conditional incorrectly.  See PR90095 for
+                    details.  */
+                 rtx op0_copy = gen_reg_rtx (mode);
+                 emit_move_insn (op0_copy, op0);
+                 lopart0s = convert_modes (hmode, mode, op0_copy, uns);
+                 gcc_assert (GET_CODE (lopart0s) == SUBREG);
+               }
+             lopart0s = shallow_copy_rtx (lopart0s);
              SUBREG_PROMOTED_VAR_P (lopart0s) = 1;
              SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED);
            }
          if (GET_CODE (lopart1) == SUBREG)
            {
-             lopart1s = shallow_copy_rtx (lopart1);
+             if (!op1_small_p)
+               {
+                 /* See above PR90095 comment.  */
+                 rtx op1_copy = gen_reg_rtx (mode);
+                 emit_move_insn (op1_copy, op1);
+                 lopart1s = convert_modes (hmode, mode, op1_copy, uns);
+                 gcc_assert (GET_CODE (lopart1s) == SUBREG);
+               }
+             lopart1s = shallow_copy_rtx (lopart1s);
              SUBREG_PROMOTED_VAR_P (lopart1s) = 1;
              SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED);
            }
--- gcc/testsuite/gcc.dg/pr90095.c.jj   2019-04-15 17:47:21.402673528 +0200
+++ gcc/testsuite/gcc.dg/pr90095.c      2019-04-15 17:52:38.111531221 +0200
@@ -0,0 +1,18 @@
+/* PR middle-end/90095 */
+/* { dg-do run } */
+/* { dg-options "-Os -fno-tree-bit-ccp" } */
+
+unsigned long long a;
+unsigned int b;
+
+int
+main ()
+{
+  unsigned int c = 255, d = c |= b;
+  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8)
+    return 0;
+  d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, 
&a);
+  if (d != 0)
+    __builtin_abort ();
+  return 0;
+}

        Jakub

Reply via email to