Hi! In expand_mul_overflow, we emit (depending on operand ranges etc.) more than one multiplication to optimize for the common case. One of those is where we are guaranteed that both operands are either sign or zero extended from half precision types to the full precision, we can then use much shorter sequence. opN_small_p is a compile time predicate whether the corresponding argument is known to be in that range. If yes, we just emit the simple code, otherwise we emit a runtime conditional to check (for uns compare the high half of bits against zero, for !uns against signbit of the low part shifted arithmetically right). In both cases we use SUBREG_PROMOTED_VAR_P on the SUBREG to propagate this information down to the rest of the expansion code.
Unfortunately, seems the RTL hoist pass hapilly hoists the subreg/s/v in this case before the conditional branch, the r259649 change in simplify-rtx.c optimizes more than in the past a ZERO_EXTENSION from subreg/s/v into just a pseudo copy and later on yet another pass optimizes the comparison away. The following patch fixes that by emitting slightly different code if !opN_small_p - instead of adding a subreg/s/v on the original pseudo, it adds a new pseudo, copies original pseudo to it only in the bb guarded with the condition and makes subreg/s/v on that new pseudo. That is enough to tell the hoisting that it shouldn't hoist it, if even that wouldn't work, I'd say that SUBREG_PROMOTED_VAR_P would be pretty much useless. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2019-04-16 Jakub Jelinek <ja...@redhat.com> PR middle-end/90095 * internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P on lowpart SUBREG of op0 if !op0_small_p, instead copy op0 into a new pseudo and set SUBREG_PROMOTED_VAR_P on a lowpart SUBREG of that pseudo. Similarly for op1. * gcc.dg/pr90095.c: New test. --- gcc/internal-fn.c.jj 2019-01-07 09:47:33.000000000 +0100 +++ gcc/internal-fn.c 2019-04-15 17:41:50.915039293 +0200 @@ -1756,13 +1756,35 @@ expand_mul_overflow (location_t loc, tre rtx lopart0s = lopart0, lopart1s = lopart1; if (GET_CODE (lopart0) == SUBREG) { - lopart0s = shallow_copy_rtx (lopart0); + if (!op0_small_p) + { + /* If !op0_small_p, the SUBREG_PROMOTED_VAR_P state we + want to set is just local to the containing basic block, + guarded with the above conditional branch. Make sure + to create a new pseudo, so that hoisting doesn't hoist + the promoted subreg before the conditional and optimize + away the conditional incorrectly. See PR90095 for + details. */ + rtx op0_copy = gen_reg_rtx (mode); + emit_move_insn (op0_copy, op0); + lopart0s = convert_modes (hmode, mode, op0_copy, uns); + gcc_assert (GET_CODE (lopart0s) == SUBREG); + } + lopart0s = shallow_copy_rtx (lopart0s); SUBREG_PROMOTED_VAR_P (lopart0s) = 1; SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED); } if (GET_CODE (lopart1) == SUBREG) { - lopart1s = shallow_copy_rtx (lopart1); + if (!op1_small_p) + { + /* See above PR90095 comment. */ + rtx op1_copy = gen_reg_rtx (mode); + emit_move_insn (op1_copy, op1); + lopart1s = convert_modes (hmode, mode, op1_copy, uns); + gcc_assert (GET_CODE (lopart1s) == SUBREG); + } + lopart1s = shallow_copy_rtx (lopart1s); SUBREG_PROMOTED_VAR_P (lopart1s) = 1; SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED); } --- gcc/testsuite/gcc.dg/pr90095.c.jj 2019-04-15 17:47:21.402673528 +0200 +++ gcc/testsuite/gcc.dg/pr90095.c 2019-04-15 17:52:38.111531221 +0200 @@ -0,0 +1,18 @@ +/* PR middle-end/90095 */ +/* { dg-do run } */ +/* { dg-options "-Os -fno-tree-bit-ccp" } */ + +unsigned long long a; +unsigned int b; + +int +main () +{ + unsigned int c = 255, d = c |= b; + if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8) + return 0; + d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, &a); + if (d != 0) + __builtin_abort (); + return 0; +} Jakub