https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77881
--- Comment #4 from Michael Matz <matz at gcc dot gnu.org> ---
Actually, it's merely a deficiency in current combine not simplifying
intermediate expressions enough. One of the things that needs to happen is
the following transformation:

  (compare:CCZ (subreg:QI (lshiftrt:DI (reg:DI 95)
                                       (const_int 63 [0x3f])) 0)
               (const_int 0 [0]))
  --> (remove subreg)
  (compare:CCZ (lshiftrt:DI (reg:DI 95)
                            (const_int 63 [0x3f]))
               (const_int 0 [0]))
  --> (recognize that it's a sign bit extract)
  (ge (reg:DI 95) (const_int 0))

(or 'lt', doesn't matter, depends on the outer code by which the compare:CCZ
result itself is compared).

In current combine.c this requires two steps: the removal of the irrelevant
subreg (irrelevant in this specific context), and then the recognition of the
sign bit extraction.

With the first function combine has the chance for two attempts at
simplification because we have a sequence of three instructions to start
with:

  t1 = ~a
  t2 = 255 & (t1 >> 63)
  flags = t2 != 0

The intermediate expression is combine_simplify_rtx'ed twice, so both steps
above happen, and we get good code.

In the second function we only have two instructions to start with (the not
is missing):

  t2 = 255 & (a >> 63)
  flags = t2 == 0

Only the first step above happens, we're left with the (lshiftrt(subreg)) and
nothing simplifies this further before it tries to recognize the insn (which
ultimately fails).

The same can also be seen when artificially forcing the expression to be a
tad more complicated:

  int bar2 (long long int a, long long int a2, int b)
  {
    if ((a+a2) < 0 || b)
      baz();
  }

Here, we also get good code again, simply because combine can have two
attempts at the intermediate expression.

After some amount of tracing combine I've come up with the below patch. The
simplify_comparison function already contains loops that effectively retry
simplification after a change occurred. But the code that actually removes a
useless subreg is after that loop. Putting it into the loop as well fixes the
problem.
It can't be removed from the old place because between the loop and subreg
removal it might actually change the expression further due to
make_compound_operation (though I don't know if that really creates subregs
often). Combine's normal facilities of not doing combines when intermediate
results are really used outside will take care of not creating useless code.

Index: combine.c
===================================================================
--- combine.c	(revision 235171)
+++ combine.c	(working copy)
@@ -11891,6 +11891,27 @@ simplify_comparison (enum rtx_code code,
 	  if (subreg_lowpart_p (op0)
 	      && GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))) < mode_width)
 	    /* Fall through */ ;
+	  else if (subreg_lowpart_p (op0)
+		   && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT
+		   && GET_MODE_CLASS (GET_MODE (SUBREG_REG (op0))) == MODE_INT
+		   && (code == NE || code == EQ)
+		   && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0)))
+		       <= HOST_BITS_PER_WIDE_INT)
+		   && !paradoxical_subreg_p (op0)
+		   && (nonzero_bits (SUBREG_REG (op0),
+				     GET_MODE (SUBREG_REG (op0)))
+		       & ~GET_MODE_MASK (GET_MODE (op0))) == 0)
+	    {
+	      tem = gen_lowpart (GET_MODE (SUBREG_REG (op0)), op1);
+
+	      if ((nonzero_bits (tem, GET_MODE (SUBREG_REG (op0)))
+		   & ~GET_MODE_MASK (GET_MODE (op0))) == 0)
+		{
+		  op0 = SUBREG_REG (op0), op1 = tem;
+		  continue;
+		}
+	      break;
+	    }
 	  else
 	    break;
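
For readers who want to convince themselves that the two simplification steps described in this comment preserve meaning, here is a minimal sketch in plain C. It is not GCC code and does not touch combine; the function names (narrow_shift_nonzero, wide_shift_nonzero, sign_test, count_mismatches) are made up for illustration. It models the DImode logical shift with uint64_t and the QImode lowpart subreg with a cast to uint8_t, and uses the NE form of the comparison, which corresponds to 'lt' rather than 'ge'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Starting shape: (subreg:QI (lshiftrt:DI a 63) 0) != 0.
   The cast to uint8_t stands in for the QImode lowpart subreg.  */
int narrow_shift_nonzero (int64_t a)
{
  uint64_t shifted = (uint64_t) a >> 63;   /* logical shift, result is 0 or 1 */
  uint8_t low = (uint8_t) shifted;         /* keep only the low 8 bits */
  return low != 0;
}

/* After step 1: subreg removed, comparison done in the wide mode.
   This is safe because the only possibly-nonzero bit of the shifted
   value is bit 0, which lies inside the 0xff mask of the narrow mode.  */
int wide_shift_nonzero (int64_t a)
{
  return ((uint64_t) a >> 63) != 0;
}

/* After step 2: the shift-and-compare is recognized as a sign test.  */
int sign_test (int64_t a)
{
  return a < 0;
}

/* Count sample inputs on which the three forms disagree.  */
int count_mismatches (void)
{
  static const int64_t samples[] = {
    INT64_MIN, -256, -2, -1, 0, 1, 2, 255, 256, INT64_MAX
  };
  int bad = 0;

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      int n = narrow_shift_nonzero (samples[i]);
      int w = wide_shift_nonzero (samples[i]);
      int s = sign_test (samples[i]);
      if (n != w || w != s)
        bad++;
    }
  return bad;
}
```

Calling count_mismatches () from a small driver and checking that it returns 0 confirms the equivalence on the sampled values. The sketch only checks a handful of inputs and says nothing about how combine arrives at the simplified form.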