https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115352
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I've first tried

--- gcc/gimple-lower-bitint.cc.jj	2024-04-12 10:59:48.233153262 +0200
+++ gcc/gimple-lower-bitint.cc	2024-06-06 11:05:29.845597763 +0200
@@ -4324,7 +4324,8 @@ bitint_large_huge::lower_addsub_overflow
       else
 	g = gimple_build_assign (this_ovf, NE_EXPR, l, cmp);
       insert_before (g);
-      if (cmp_code == GT_EXPR)
+      if (cmp_code == GT_EXPR
+	  || (cmp_code == GE_EXPR && single_comparison))
 	{
 	  tree t = make_ssa_name (boolean_type_node);
 	  g = gimple_build_assign (t, BIT_IOR_EXPR, ovf, this_ovf);

which fixes the #c0 testcase.  But on the reduced

int
foo (_BitInt (385) b)
{
  return __builtin_sub_overflow_p (0, b, (_BitInt (65)) 0);
}

int
main ()
{
  if (!foo (-(_BitInt (385)) 0x00000000000000000c377e8a3fd1881fff035bb487a51c9ed1f7350befa7ec4450000000000000000a3cf8d1ebb723981wb))
    __builtin_abort ();
  if (!foo (-0x1ffffffffffffffffc377e8a3fd1881fff035bb487a51c9ed1f7350befa7ec445ffffffffffffffffa3cf8d1ebb723981uwb))
    __builtin_abort ();
  if (!foo (-(_BitInt (385)) 0x00000000000000000ffffffffffffffffffffffffffffffff00000000000000000000000000000000a3cf8d1ebb723981wb))
    __builtin_abort ();
  if (!foo (-0x1ffffffffffffffff00000000000000000000000000000000ffffffffffffffffffffffffffffffffa3cf8d1ebb723981uwb))
    __builtin_abort ();
}

testcase it only fixes the first 2 calls but not the last 2.
So, I'm afraid I just need to kill the optimization instead:

--- gcc/gimple-lower-bitint.cc.jj	2024-04-12 10:59:48.233153262 +0200
+++ gcc/gimple-lower-bitint.cc	2024-06-06 12:06:57.065717651 +0200
@@ -4286,11 +4286,7 @@ bitint_large_huge::lower_addsub_overflow
 	  bool single_comparison
 	    = (startlimb + 2 >= fin || (startlimb & 1) != (i & 1));
 	  if (!single_comparison)
-	    {
-	      cmp_code = GE_EXPR;
-	      if (!check_zero && (start % limb_prec) == 0)
-		single_comparison = true;
-	    }
+	    cmp_code = GE_EXPR;
 	  else if ((startlimb & 1) == (i & 1))
 	    cmp_code = EQ_EXPR;
 	  else

The idea behind the optimization was that arith_overflow_extract_bits is the same in those cases -- we just extract all bits from the limb, whether it is the limb at the boundary (i.e. EQ_EXPR against the compared limb index) or above it (GT_EXPR) -- so GE_EXPR would do.  Except that without the first patch it completely ignored the previously accumulated mismatches (i.e. overflows) from lower limbs, and while that is no longer the case with the first patch, it still ignores whether the upper bits were previously all 0s or all 1s: as long as the current limb is again all 0s or all 1s, it happily treats it as the non-overflow case and as the pattern the next limb should be compared against.