On Tue, Jun 13, 2023 at 08:40:36AM +0000, Richard Biener wrote:
> I suspect re-association can wreck things even more here.  I have
> to say the matching code is very hard to follow, not sure if
> splitting out a function matching
> 
>    _22 = .{ADD,SUB}_OVERFLOW (_6, _5);
>    _23 = REALPART_EXPR <_22>;
>    _24 = IMAGPART_EXPR <_22>;
> 
> from _23 and _24 would help?

I've outlined the 3 most frequently used sequences of statements or checks
into 3 helper functions; I hope that helps.

> > +      while (TREE_CODE (rhs[0]) == SSA_NAME && !rhs[3])
> > +   {
> > +     gimple *g = SSA_NAME_DEF_STMT (rhs[0]);
> > +     if (has_single_use (rhs[0])
> > +         && is_gimple_assign (g)
> > +         && (gimple_assign_rhs_code (g) == code
> > +             || (code == MINUS_EXPR
> > +                 && gimple_assign_rhs_code (g) == PLUS_EXPR
> > +                 && TREE_CODE (gimple_assign_rhs2 (g)) == INTEGER_CST)))
> > +       {
> > +         rhs[0] = gimple_assign_rhs1 (g);
> > +         tree &r = rhs[2] ? rhs[3] : rhs[2];
> > +         r = gimple_assign_rhs2 (g);
> > +         if (gimple_assign_rhs_code (g) != code)
> > +           r = fold_build1 (NEGATE_EXPR, TREE_TYPE (r), r);
> 
> Can you use const_unop here?  In fact both will not reliably
> negate all constants (ick), so maybe we want a force_const_negate ()?

It is a NEGATE_EXPR of an INTEGER_CST with unsigned type, so I think it
should work.  That said, I've changed it to const_unop and simply give up
(as if it weren't a PLUS_EXPR with an INTEGER_CST addend) if const_unop
doesn't simplify.

> > +     else if (addc_subc)
> > +       {
> > +         if (!integer_zerop (arg2))
> > +           ;
> > +         /* x = y + 0 + 0; x = y - 0 - 0; */
> > +         else if (integer_zerop (arg1))
> > +           result = arg0;
> > +         /* x = 0 + y + 0; */
> > +         else if (subcode != MINUS_EXPR && integer_zerop (arg0))
> > +           result = arg1;
> > +         /* x = y - y - 0; */
> > +         else if (subcode == MINUS_EXPR
> > +                  && operand_equal_p (arg0, arg1, 0))
> > +           result = integer_zero_node;
> > +       }
> 
> So this all performs simplifications but also constant folding.  In
> particular the match.pd re-simplification will invoke fold_const_call
> on all-constant argument function calls but does not do extra folding
> on partially constant arg cases but instead relies on patterns here.
> 
> Can you add all-constant arg handling to fold_const_call and
> consider moving cases like y + 0 + 0 to match.pd?

The reason I've done this here is that this is the spot where all other
similar internal functions are handled, be it the ubsan ones
- IFN_UBSAN_CHECK_{ADD,SUB,MUL}, the __builtin_*_overflow ones
- IFN_{ADD,SUB,MUL}_OVERFLOW, or these 2 new ones.  The code there handles
the 2-constant-argument case as well as various patterns that can be
simplified, and has code to clean it up later, building a COMPLEX_CST,
COMPLEX_EXPR etc. as needed.  So, I think if we want to handle those
elsewhere, we should do it for all of those functions, but then
probably incrementally.
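To make the identities being folded concrete, here is a hypothetical plain-C model of the .UADDC/.USUBC semantics (names and signatures are mine, not a GCC API), against which the x = y + 0 + 0, x = 0 + y + 0 and x = y - y - 0 cases can be checked:

```c
#include <assert.h>

/* C model of .UADDC: r = a + b + carry_in; *carry_out is 1 iff one of
   the two additions wrapped (with carry_in in [0, 1], at most one can).  */
static unsigned int
uaddc (unsigned int a, unsigned int b, unsigned int carry_in,
       unsigned int *carry_out)
{
  unsigned int s = a + b;
  unsigned int r = s + carry_in;
  *carry_out = (s < a) | (r < s);
  return r;
}

/* C model of .USUBC: r = a - b - borrow_in; *borrow_out is 1 iff one of
   the two subtractions wrapped.  */
static unsigned int
usubc (unsigned int a, unsigned int b, unsigned int borrow_in,
       unsigned int *borrow_out)
{
  unsigned int s = a - b;
  unsigned int r = s - borrow_in;
  *borrow_out = (s > a) | (r > s);
  return r;
}
```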

> > +@cindex @code{addc@var{m}5} instruction pattern
> > +@item @samp{addc@var{m}5}
> > +Adds operands 2, 3 and 4 (where the last operand is guaranteed to have
> > +only values 0 or 1) together, sets operand 0 to the result of the
> > +addition of the 3 operands and sets operand 1 to 1 iff there was no
> > +overflow on the unsigned additions, and to 0 otherwise.  So, it is
> > +an addition with carry in (operand 4) and carry out (operand 1).
> > +All operands have the same mode.
> 
> operand 1 set to 1 for no overflow sounds weird when specifying it
> as carry out - can you double check?

Fixed.

> > +@cindex @code{subc@var{m}5} instruction pattern
> > +@item @samp{subc@var{m}5}
> > +Similarly to @samp{addc@var{m}5}, except subtracts operands 3 and 4
> > +from operand 2 instead of adding them.  So, it is
> > +a subtraction with carry/borrow in (operand 4) and carry/borrow out
> > +(operand 1).  All operands have the same mode.
> > +
> 
> I wonder if we want to name them uaddc and usubc?  Or is this supposed
> to be simply the twos-complement "carry"?  I think the docs should
> say so then (note we do have uaddv and addv).

Makes sense, I've actually renamed even the internal functions etc.

Here is an only lightly tested patch with everything but gimple-fold.cc
changed.
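For reference, the kind of source the new matching targets is a multi-limb addition written with __builtin_add_overflow, where the overflow flags of the partial additions are combined with + (or | or ^).  A sketch in the spirit of PR79173 (my own illustrative testcase, not one of the new pr79173-*.c tests):

```c
#include <assert.h>

/* Double-word addition using the idiom match_uaddc_usubc looks for:
   the high limb is a[1] + b[1] + carry, expressed as two .ADD_OVERFLOW
   calls whose overflow bits are combined; on x86 this should become
   one add plus one adc.  */
static void
add2 (const unsigned long long *a, const unsigned long long *b,
      unsigned long long *r)
{
  unsigned long long lo, hi;
  unsigned long long c = __builtin_add_overflow (a[0], b[0], &lo);
  unsigned long long c1 = __builtin_add_overflow (a[1], b[1], &hi);
  unsigned long long c2 = __builtin_add_overflow (hi, c, &hi);
  (void) (c1 | c2);		/* carry out of the high limb, ignored */
  r[0] = lo;
  r[1] = hi;
}
```

With the patch, the two .ADD_OVERFLOW calls feeding the high limb are expected to be rewritten into a single .UADDC call.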

2023-06-13  Jakub Jelinek  <ja...@redhat.com>

        PR middle-end/79173
        * internal-fn.def (UADDC, USUBC): New internal functions.
        * internal-fn.cc (expand_UADDC, expand_USUBC): New functions.
        (commutative_ternary_fn_p): Return true also for IFN_UADDC.
        * optabs.def (uaddc5_optab, usubc5_optab): New optabs.
        * tree-ssa-math-opts.cc (uaddc_cast, uaddc_ne0, uaddc_is_cplxpart,
        match_uaddc_usubc): New functions.
        (math_opts_dom_walker::after_dom_children): Call match_uaddc_usubc
        for PLUS_EXPR, MINUS_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR unless
        other optimizations have been successful for those.
        * gimple-fold.cc (gimple_fold_call): Handle IFN_UADDC and IFN_USUBC.
        * gimple-range-fold.cc (adjust_imagpart_expr): Likewise.
        * tree-ssa-dce.cc (eliminate_unnecessary_stmts): Likewise.
        * doc/md.texi (uaddc<mode>5, usubc<mode>5): Document new named
        patterns.
        * config/i386/i386.md (subborrow<mode>): Add alternative with
        memory destination.
        (uaddc<mode>5, usubc<mode>5): New define_expand patterns.
        (*sub<mode>_3, @add<mode>3_carry, addcarry<mode>, @sub<mode>3_carry,
        subborrow<mode>, *add<mode>3_cc_overflow_1): Add define_peephole2
        TARGET_READ_MODIFY_WRITE/-Os patterns to prefer using memory
        destination in these patterns.

        * gcc.target/i386/pr79173-1.c: New test.
        * gcc.target/i386/pr79173-2.c: New test.
        * gcc.target/i386/pr79173-3.c: New test.
        * gcc.target/i386/pr79173-4.c: New test.
        * gcc.target/i386/pr79173-5.c: New test.
        * gcc.target/i386/pr79173-6.c: New test.
        * gcc.target/i386/pr79173-7.c: New test.
        * gcc.target/i386/pr79173-8.c: New test.
        * gcc.target/i386/pr79173-9.c: New test.
        * gcc.target/i386/pr79173-10.c: New test.

--- gcc/internal-fn.def.jj      2023-06-12 15:47:22.190506569 +0200
+++ gcc/internal-fn.def 2023-06-13 12:30:22.951974357 +0200
@@ -416,6 +416,8 @@ DEF_INTERNAL_FN (ASAN_POISON_USE, ECF_LE
 DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (UADDC, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (USUBC, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
 DEF_INTERNAL_FN (VEC_CONVERT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
--- gcc/internal-fn.cc.jj       2023-06-07 09:42:14.680130597 +0200
+++ gcc/internal-fn.cc  2023-06-13 12:30:23.361968621 +0200
@@ -2776,6 +2776,44 @@ expand_MUL_OVERFLOW (internal_fn, gcall
   expand_arith_overflow (MULT_EXPR, stmt);
 }
 
+/* Expand UADDC STMT.  */
+
+static void
+expand_UADDC (internal_fn ifn, gcall *stmt)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree arg1 = gimple_call_arg (stmt, 0);
+  tree arg2 = gimple_call_arg (stmt, 1);
+  tree arg3 = gimple_call_arg (stmt, 2);
+  tree type = TREE_TYPE (arg1);
+  machine_mode mode = TYPE_MODE (type);
+  insn_code icode = optab_handler (ifn == IFN_UADDC
+                                  ? uaddc5_optab : usubc5_optab, mode);
+  rtx op1 = expand_normal (arg1);
+  rtx op2 = expand_normal (arg2);
+  rtx op3 = expand_normal (arg3);
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx re = gen_reg_rtx (mode);
+  rtx im = gen_reg_rtx (mode);
+  class expand_operand ops[5];
+  create_output_operand (&ops[0], re, mode);
+  create_output_operand (&ops[1], im, mode);
+  create_input_operand (&ops[2], op1, mode);
+  create_input_operand (&ops[3], op2, mode);
+  create_input_operand (&ops[4], op3, mode);
+  expand_insn (icode, 5, ops);
+  write_complex_part (target, re, false, false);
+  write_complex_part (target, im, true, false);
+}
+
+/* Expand USUBC STMT.  */
+
+static void
+expand_USUBC (internal_fn ifn, gcall *stmt)
+{
+  expand_UADDC (ifn, stmt);
+}
+
 /* This should get folded in tree-vectorizer.cc.  */
 
 static void
@@ -4049,6 +4087,7 @@ commutative_ternary_fn_p (internal_fn fn
     case IFN_FMS:
     case IFN_FNMA:
     case IFN_FNMS:
+    case IFN_UADDC:
       return true;
 
     default:
--- gcc/optabs.def.jj   2023-06-12 15:47:22.261505587 +0200
+++ gcc/optabs.def      2023-06-13 12:30:23.372968467 +0200
@@ -260,6 +260,8 @@ OPTAB_D (uaddv4_optab, "uaddv$I$a4")
 OPTAB_D (usubv4_optab, "usubv$I$a4")
 OPTAB_D (umulv4_optab, "umulv$I$a4")
 OPTAB_D (negv3_optab, "negv$I$a3")
+OPTAB_D (uaddc5_optab, "uaddc$I$a5")
+OPTAB_D (usubc5_optab, "usubc$I$a5")
 OPTAB_D (addptr3_optab, "addptr$a3")
 OPTAB_D (spaceship_optab, "spaceship$a3")
 
--- gcc/tree-ssa-math-opts.cc.jj        2023-06-07 09:41:49.573479611 +0200
+++ gcc/tree-ssa-math-opts.cc   2023-06-13 13:04:43.699152339 +0200
@@ -4441,6 +4441,434 @@ match_arith_overflow (gimple_stmt_iterat
   return false;
 }
 
+/* Helper of match_uaddc_usubc.  Look through an integral cast
+   which preserves a [0, 1] range value (unless the source has
+   a 1-bit signed type), provided the cast has a single use.  */
+
+static gimple *
+uaddc_cast (gimple *g)
+{
+  if (!gimple_assign_cast_p (g))
+    return g;
+  tree op = gimple_assign_rhs1 (g);
+  if (TREE_CODE (op) == SSA_NAME
+      && INTEGRAL_TYPE_P (TREE_TYPE (op))
+      && (TYPE_PRECISION (TREE_TYPE (op)) > 1
+         || TYPE_UNSIGNED (TREE_TYPE (op)))
+      && has_single_use (gimple_assign_lhs (g)))
+    return SSA_NAME_DEF_STMT (op);
+  return g;
+}
+
+/* Helper of match_uaddc_usubc.  Look through a NE_EXPR
+   comparison with 0 which also preserves [0, 1] value range.  */
+
+static gimple *
+uaddc_ne0 (gimple *g)
+{
+  if (is_gimple_assign (g)
+      && gimple_assign_rhs_code (g) == NE_EXPR
+      && integer_zerop (gimple_assign_rhs2 (g))
+      && TREE_CODE (gimple_assign_rhs1 (g)) == SSA_NAME
+      && has_single_use (gimple_assign_lhs (g)))
+    return SSA_NAME_DEF_STMT (gimple_assign_rhs1 (g));
+  return g;
+}
+
+/* Return true if G is {REAL,IMAG}PART_EXPR PART with SSA_NAME
+   operand.  */
+
+static bool
+uaddc_is_cplxpart (gimple *g, tree_code part)
+{
+  return (is_gimple_assign (g)
+         && gimple_assign_rhs_code (g) == part
+         && TREE_CODE (TREE_OPERAND (gimple_assign_rhs1 (g), 0)) == SSA_NAME);
+}
+
+/* Try to match e.g.
+   _29 = .ADD_OVERFLOW (_3, _4);
+   _30 = REALPART_EXPR <_29>;
+   _31 = IMAGPART_EXPR <_29>;
+   _32 = .ADD_OVERFLOW (_30, _38);
+   _33 = REALPART_EXPR <_32>;
+   _34 = IMAGPART_EXPR <_32>;
+   _35 = _31 + _34;
+   as
+   _36 = .UADDC (_3, _4, _38);
+   _33 = REALPART_EXPR <_36>;
+   _35 = IMAGPART_EXPR <_36>;
+   or
+   _22 = .SUB_OVERFLOW (_6, _5);
+   _23 = REALPART_EXPR <_22>;
+   _24 = IMAGPART_EXPR <_22>;
+   _25 = .SUB_OVERFLOW (_23, _37);
+   _26 = REALPART_EXPR <_25>;
+   _27 = IMAGPART_EXPR <_25>;
+   _28 = _24 | _27;
+   as
+   _29 = .USUBC (_6, _5, _37);
+   _26 = REALPART_EXPR <_29>;
+   _28 = IMAGPART_EXPR <_29>;
+   provided _38 or _37 above have [0, 1] range
+   and _3, _4 and _30 or _6, _5 and _23 have unsigned
+   integral types with the same precision.  Whether + or | or ^ is
+   used on the IMAGPART_EXPR results doesn't matter; with one of the
+   added or subtracted operands in the [0, 1] range, at most one
+   .ADD_OVERFLOW or .SUB_OVERFLOW will indicate overflow.  */
+
+static bool
+match_uaddc_usubc (gimple_stmt_iterator *gsi, gimple *stmt, tree_code code)
+{
+  tree rhs[4];
+  rhs[0] = gimple_assign_rhs1 (stmt);
+  rhs[1] = gimple_assign_rhs2 (stmt);
+  rhs[2] = NULL_TREE;
+  rhs[3] = NULL_TREE;
+  tree type = TREE_TYPE (rhs[0]);
+  if (!INTEGRAL_TYPE_P (type) || !TYPE_UNSIGNED (type))
+    return false;
+
+  if (code != BIT_IOR_EXPR && code != BIT_XOR_EXPR)
+    {
+      /* If overflow flag is ignored on the MSB limb, we can end up with
+        the most significant limb handled as r = op1 + op2 + ovf1 + ovf2;
+        or r = op1 - op2 - ovf1 - ovf2; or various equivalent expressions
+        thereof.  Handle those like the ovf = ovf1 + ovf2; case to recognize
+        the limb below the MSB, but also create another .UADDC/.USUBC call
+        for the last limb.  */
+      while (TREE_CODE (rhs[0]) == SSA_NAME && !rhs[3])
+       {
+         gimple *g = SSA_NAME_DEF_STMT (rhs[0]);
+         if (has_single_use (rhs[0])
+             && is_gimple_assign (g)
+             && (gimple_assign_rhs_code (g) == code
+                 || (code == MINUS_EXPR
+                     && gimple_assign_rhs_code (g) == PLUS_EXPR
+                     && TREE_CODE (gimple_assign_rhs2 (g)) == INTEGER_CST)))
+           {
+             tree r2 = gimple_assign_rhs2 (g);
+             if (gimple_assign_rhs_code (g) != code)
+               {
+                 r2 = const_unop (NEGATE_EXPR, TREE_TYPE (r2), r2);
+                 if (!r2)
+                   break;
+               }
+             rhs[0] = gimple_assign_rhs1 (g);
+             tree &r = rhs[2] ? rhs[3] : rhs[2];
+             r = r2;
+           }
+         else
+           break;
+       }
+      while (TREE_CODE (rhs[1]) == SSA_NAME && !rhs[3])
+       {
+         gimple *g = SSA_NAME_DEF_STMT (rhs[1]);
+         if (has_single_use (rhs[1])
+             && is_gimple_assign (g)
+             && gimple_assign_rhs_code (g) == PLUS_EXPR)
+           {
+             rhs[1] = gimple_assign_rhs1 (g);
+             if (rhs[2])
+               rhs[3] = gimple_assign_rhs2 (g);
+             else
+               rhs[2] = gimple_assign_rhs2 (g);
+           }
+         else
+           break;
+       }
+      if (rhs[2] && !rhs[3])
+       {
+         for (int i = (code == MINUS_EXPR ? 1 : 0); i < 3; ++i)
+           if (TREE_CODE (rhs[i]) == SSA_NAME)
+             {
+               gimple *im = uaddc_cast (SSA_NAME_DEF_STMT (rhs[i]));
+               im = uaddc_ne0 (im);
+               if (uaddc_is_cplxpart (im, IMAGPART_EXPR))
+                 {
+                   tree rhs1 = gimple_assign_rhs1 (im);
+                   gimple *ovf = SSA_NAME_DEF_STMT (TREE_OPERAND (rhs1, 0));
+                   if (gimple_call_internal_p (ovf, code == PLUS_EXPR
+                                                    ? IFN_UADDC : IFN_USUBC)
+                       && (optab_handler (code == PLUS_EXPR
+                                          ? uaddc5_optab : usubc5_optab,
+                                          TYPE_MODE (type))
+                           != CODE_FOR_nothing))
+                     {
+                       if (i != 2)
+                         std::swap (rhs[i], rhs[2]);
+                       gimple *g
+                         = gimple_build_call_internal (code == PLUS_EXPR
+                                                       ? IFN_UADDC
+                                                       : IFN_USUBC,
+                                                       3, rhs[0], rhs[1],
+                                                       rhs[2]);
+                       tree nlhs = make_ssa_name (build_complex_type (type));
+                       gimple_call_set_lhs (g, nlhs);
+                       gsi_insert_before (gsi, g, GSI_SAME_STMT);
+                       tree ilhs = gimple_assign_lhs (stmt);
+                       g = gimple_build_assign (ilhs, REALPART_EXPR,
+                                                build1 (REALPART_EXPR,
+                                                        TREE_TYPE (ilhs),
+                                                        nlhs));
+                       gsi_replace (gsi, g, true);
+                       return true;
+                     }
+                 }
+             }
+         return false;
+       }
+      if (code == MINUS_EXPR && !rhs[2])
+       return false;
+      if (code == MINUS_EXPR)
+       /* Code below expects rhs[0] and rhs[1] to have the IMAGPART_EXPRs.
+          So, for MINUS_EXPR swap the single added rhs operand (others are
+          subtracted) to rhs[3].  */
+       std::swap (rhs[0], rhs[3]);
+    }
+  gimple *im1 = NULL, *im2 = NULL;
+  for (int i = 0; i < (code == MINUS_EXPR ? 3 : 4); i++)
+    if (rhs[i] && TREE_CODE (rhs[i]) == SSA_NAME)
+      {
+       gimple *im = uaddc_cast (SSA_NAME_DEF_STMT (rhs[i]));
+       im = uaddc_ne0 (im);
+       if (uaddc_is_cplxpart (im, IMAGPART_EXPR))
+         {
+           if (im1 == NULL)
+             {
+               im1 = im;
+               if (i != 0)
+                 std::swap (rhs[0], rhs[i]);
+             }
+           else
+             {
+               im2 = im;
+               if (i != 1)
+                 std::swap (rhs[1], rhs[i]);
+               break;
+             }
+         }
+      }
+  if (!im2)
+    return false;
+  gimple *ovf1
+    = SSA_NAME_DEF_STMT (TREE_OPERAND (gimple_assign_rhs1 (im1), 0));
+  gimple *ovf2
+    = SSA_NAME_DEF_STMT (TREE_OPERAND (gimple_assign_rhs1 (im2), 0));
+  internal_fn ifn;
+  if (!is_gimple_call (ovf1)
+      || !gimple_call_internal_p (ovf1)
+      || ((ifn = gimple_call_internal_fn (ovf1)) != IFN_ADD_OVERFLOW
+         && ifn != IFN_SUB_OVERFLOW)
+      || !gimple_call_internal_p (ovf2, ifn)
+      || optab_handler (ifn == IFN_ADD_OVERFLOW ? uaddc5_optab : usubc5_optab,
+                       TYPE_MODE (type)) == CODE_FOR_nothing
+      || (rhs[2]
+         && optab_handler (code == PLUS_EXPR ? uaddc5_optab : usubc5_optab,
+                           TYPE_MODE (type)) == CODE_FOR_nothing))
+    return false;
+  tree arg1, arg2, arg3 = NULL_TREE;
+  gimple *re1 = NULL, *re2 = NULL;
+  for (int i = (ifn == IFN_ADD_OVERFLOW ? 1 : 0); i >= 0; --i)
+    for (gimple *ovf = ovf1; ovf; ovf = (ovf == ovf1 ? ovf2 : NULL))
+      {
+       tree arg = gimple_call_arg (ovf, i);
+       if (TREE_CODE (arg) != SSA_NAME)
+         continue;
+       re1 = SSA_NAME_DEF_STMT (arg);
+       if (uaddc_is_cplxpart (re1, REALPART_EXPR)
+           && (SSA_NAME_DEF_STMT (TREE_OPERAND (gimple_assign_rhs1 (re1), 0))
+               == (ovf == ovf1 ? ovf2 : ovf1)))
+         {
+           if (ovf == ovf1)
+             {
+               std::swap (rhs[0], rhs[1]);
+               std::swap (im1, im2);
+               std::swap (ovf1, ovf2);
+             }
+           arg3 = gimple_call_arg (ovf, 1 - i);
+           i = -1;
+           break;
+         }
+      }
+  if (!arg3)
+    return false;
+  arg1 = gimple_call_arg (ovf1, 0);
+  arg2 = gimple_call_arg (ovf1, 1);
+  if (!types_compatible_p (type, TREE_TYPE (arg1)))
+    return false;
+  int kind[2] = { 0, 0 };
+  /* At least one of arg2 and arg3 should have type compatible
+     with arg1/rhs[0], and the other one should have value in [0, 1]
+     range.  */
+  for (int i = 0; i < 2; ++i)
+    {
+      tree arg = i == 0 ? arg2 : arg3;
+      if (types_compatible_p (type, TREE_TYPE (arg)))
+       kind[i] = 1;
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (arg))
+         || (TYPE_PRECISION (TREE_TYPE (arg)) == 1
+             && !TYPE_UNSIGNED (TREE_TYPE (arg))))
+       continue;
+      if (tree_zero_one_valued_p (arg))
+       kind[i] |= 2;
+      if (TREE_CODE (arg) == SSA_NAME)
+       {
+         gimple *g = SSA_NAME_DEF_STMT (arg);
+         if (gimple_assign_cast_p (g))
+           {
+             tree op = gimple_assign_rhs1 (g);
+             if (TREE_CODE (op) == SSA_NAME
+                 && INTEGRAL_TYPE_P (TREE_TYPE (op)))
+               g = SSA_NAME_DEF_STMT (op);
+           }
+         g = uaddc_ne0 (g);
+         if (!uaddc_is_cplxpart (g, IMAGPART_EXPR))
+           continue;
+         g = SSA_NAME_DEF_STMT (TREE_OPERAND (gimple_assign_rhs1 (g), 0));
+         if (!is_gimple_call (g) || !gimple_call_internal_p (g))
+           continue;
+         switch (gimple_call_internal_fn (g))
+           {
+           case IFN_ADD_OVERFLOW:
+           case IFN_SUB_OVERFLOW:
+           case IFN_UADDC:
+           case IFN_USUBC:
+             break;
+           default:
+             continue;
+           }
+         kind[i] |= 4;
+       }
+    }
+  /* Make arg2 the one with compatible type and arg3 the one
+     with [0, 1] range.  If both are true for both operands,
+     prefer as arg3 the result of __imag__ of some ifn.  */
+  if ((kind[0] & 1) == 0 || ((kind[1] & 1) != 0 && kind[0] > kind[1]))
+    {
+      std::swap (arg2, arg3);
+      std::swap (kind[0], kind[1]);
+    }
+  if ((kind[0] & 1) == 0 || (kind[1] & 6) == 0)
+    return false;
+  if (!has_single_use (gimple_assign_lhs (im1))
+      || !has_single_use (gimple_assign_lhs (im2))
+      || !has_single_use (gimple_assign_lhs (re1))
+      || num_imm_uses (gimple_call_lhs (ovf1)) != 2)
+    return false;
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  tree lhs = gimple_call_lhs (ovf2);
+  FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
+    {
+      gimple *use_stmt = USE_STMT (use_p);
+      if (is_gimple_debug (use_stmt))
+       continue;
+      if (use_stmt == im2)
+       continue;
+      if (re2)
+       return false;
+      if (!uaddc_is_cplxpart (use_stmt, REALPART_EXPR))
+       return false;
+      re2 = use_stmt;
+    }
+  gimple_stmt_iterator gsi2 = gsi_for_stmt (ovf2);
+  gimple *g;
+  if ((kind[1] & 1) == 0)
+    {
+      if (TREE_CODE (arg3) == INTEGER_CST)
+       arg3 = fold_convert (type, arg3);
+      else
+       {
+         g = gimple_build_assign (make_ssa_name (type), NOP_EXPR, arg3);
+         gsi_insert_before (&gsi2, g, GSI_SAME_STMT);
+         arg3 = gimple_assign_lhs (g);
+       }
+    }
+  g = gimple_build_call_internal (ifn == IFN_ADD_OVERFLOW
+                                 ? IFN_UADDC : IFN_USUBC,
+                                 3, arg1, arg2, arg3);
+  tree nlhs = make_ssa_name (TREE_TYPE (lhs));
+  gimple_call_set_lhs (g, nlhs);
+  gsi_insert_before (&gsi2, g, GSI_SAME_STMT);
+  tree ilhs = rhs[2] ? make_ssa_name (type) : gimple_assign_lhs (stmt);
+  g = gimple_build_assign (ilhs, IMAGPART_EXPR,
+                          build1 (IMAGPART_EXPR, TREE_TYPE (ilhs), nlhs));
+  if (rhs[2])
+    gsi_insert_before (gsi, g, GSI_SAME_STMT);
+  else
+    gsi_replace (gsi, g, true);
+  tree rhs1 = rhs[1];
+  for (int i = 0; i < 2; i++)
+    if (rhs1 == gimple_assign_lhs (im2))
+      break;
+    else
+      {
+       g = SSA_NAME_DEF_STMT (rhs1);
+       rhs1 = gimple_assign_rhs1 (g);
+       gsi2 = gsi_for_stmt (g);
+       gsi_remove (&gsi2, true);
+      }
+  gcc_checking_assert (rhs1 == gimple_assign_lhs (im2));
+  gsi2 = gsi_for_stmt (im2);
+  gsi_remove (&gsi2, true);
+  gsi2 = gsi_for_stmt (re2);
+  tree rlhs = gimple_assign_lhs (re2);
+  g = gimple_build_assign (rlhs, REALPART_EXPR,
+                          build1 (REALPART_EXPR, TREE_TYPE (rlhs), nlhs));
+  gsi_replace (&gsi2, g, true);
+  if (rhs[2])
+    {
+      g = gimple_build_call_internal (code == PLUS_EXPR
+                                     ? IFN_UADDC : IFN_USUBC,
+                                     3, rhs[3], rhs[2], ilhs);
+      nlhs = make_ssa_name (TREE_TYPE (lhs));
+      gimple_call_set_lhs (g, nlhs);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      ilhs = gimple_assign_lhs (stmt);
+      g = gimple_build_assign (ilhs, REALPART_EXPR,
+                              build1 (REALPART_EXPR, TREE_TYPE (ilhs), nlhs));
+      gsi_replace (gsi, g, true);
+    }
+  if (TREE_CODE (arg3) == SSA_NAME)
+    {
+      gimple *im3 = SSA_NAME_DEF_STMT (arg3);
+      for (int i = 0; i < 2; ++i)
+       {
+         gimple *im4 = uaddc_cast (im3);
+         if (im4 == im3)
+           break;
+         else
+           im3 = im4;
+       }
+      im3 = uaddc_ne0 (im3);
+      if (uaddc_is_cplxpart (im3, IMAGPART_EXPR))
+       {
+         gimple *ovf3
+           = SSA_NAME_DEF_STMT (TREE_OPERAND (gimple_assign_rhs1 (im3), 0));
+         if (gimple_call_internal_p (ovf3, ifn))
+           {
+             lhs = gimple_call_lhs (ovf3);
+             arg1 = gimple_call_arg (ovf3, 0);
+             arg2 = gimple_call_arg (ovf3, 1);
+             if (types_compatible_p (type, TREE_TYPE (TREE_TYPE (lhs)))
+                 && types_compatible_p (type, TREE_TYPE (arg1))
+                 && types_compatible_p (type, TREE_TYPE (arg2)))
+               {
+                 g = gimple_build_call_internal (ifn == IFN_ADD_OVERFLOW
+                                                 ? IFN_UADDC : IFN_USUBC,
+                                                 3, arg1, arg2,
+                                                 build_zero_cst (type));
+                 gimple_call_set_lhs (g, lhs);
+                 gsi2 = gsi_for_stmt (ovf3);
+                 gsi_replace (&gsi2, g, true);
+               }
+           }
+       }
+    }
+  return true;
+}
+
 /* Return true if target has support for divmod.  */
 
 static bool
@@ -5068,8 +5496,9 @@ math_opts_dom_walker::after_dom_children
 
            case PLUS_EXPR:
            case MINUS_EXPR:
-             if (!convert_plusminus_to_widen (&gsi, stmt, code))
-               match_arith_overflow (&gsi, stmt, code, m_cfg_changed_p);
+             if (!convert_plusminus_to_widen (&gsi, stmt, code)
+                 && !match_arith_overflow (&gsi, stmt, code, m_cfg_changed_p))
+               match_uaddc_usubc (&gsi, stmt, code);
              break;
 
            case BIT_NOT_EXPR:
@@ -5085,6 +5514,11 @@ math_opts_dom_walker::after_dom_children
              convert_mult_to_highpart (as_a<gassign *> (stmt), &gsi);
              break;
 
+           case BIT_IOR_EXPR:
+           case BIT_XOR_EXPR:
+             match_uaddc_usubc (&gsi, stmt, code);
+             break;
+
            default:;
            }
        }
--- gcc/gimple-fold.cc.jj       2023-06-07 09:41:49.117485950 +0200
+++ gcc/gimple-fold.cc  2023-06-13 12:30:23.392968187 +0200
@@ -5585,6 +5585,7 @@ gimple_fold_call (gimple_stmt_iterator *
       enum tree_code subcode = ERROR_MARK;
       tree result = NULL_TREE;
       bool cplx_result = false;
+      bool uaddc_usubc = false;
       tree overflow = NULL_TREE;
       switch (gimple_call_internal_fn (stmt))
        {
@@ -5658,6 +5659,16 @@ gimple_fold_call (gimple_stmt_iterator *
          subcode = MULT_EXPR;
          cplx_result = true;
          break;
+       case IFN_UADDC:
+         subcode = PLUS_EXPR;
+         cplx_result = true;
+         uaddc_usubc = true;
+         break;
+       case IFN_USUBC:
+         subcode = MINUS_EXPR;
+         cplx_result = true;
+         uaddc_usubc = true;
+         break;
        case IFN_MASK_LOAD:
          changed |= gimple_fold_partial_load (gsi, stmt, true);
          break;
@@ -5677,6 +5688,7 @@ gimple_fold_call (gimple_stmt_iterator *
        {
          tree arg0 = gimple_call_arg (stmt, 0);
          tree arg1 = gimple_call_arg (stmt, 1);
+         tree arg2 = NULL_TREE;
          tree type = TREE_TYPE (arg0);
          if (cplx_result)
            {
@@ -5685,9 +5697,26 @@ gimple_fold_call (gimple_stmt_iterator *
                type = NULL_TREE;
              else
                type = TREE_TYPE (TREE_TYPE (lhs));
+             if (uaddc_usubc)
+               arg2 = gimple_call_arg (stmt, 2);
            }
          if (type == NULL_TREE)
            ;
+         else if (uaddc_usubc)
+           {
+             if (!integer_zerop (arg2))
+               ;
+             /* x = y + 0 + 0; x = y - 0 - 0; */
+             else if (integer_zerop (arg1))
+               result = arg0;
+             /* x = 0 + y + 0; */
+             else if (subcode != MINUS_EXPR && integer_zerop (arg0))
+               result = arg1;
+             /* x = y - y - 0; */
+             else if (subcode == MINUS_EXPR
+                      && operand_equal_p (arg0, arg1, 0))
+               result = integer_zero_node;
+           }
          /* x = y + 0; x = y - 0; x = y * 0; */
          else if (integer_zerop (arg1))
            result = subcode == MULT_EXPR ? integer_zero_node : arg0;
@@ -5702,8 +5731,11 @@ gimple_fold_call (gimple_stmt_iterator *
            result = arg0;
          else if (subcode == MULT_EXPR && integer_onep (arg0))
            result = arg1;
-         else if (TREE_CODE (arg0) == INTEGER_CST
-                  && TREE_CODE (arg1) == INTEGER_CST)
+         if (type
+             && result == NULL_TREE
+             && TREE_CODE (arg0) == INTEGER_CST
+             && TREE_CODE (arg1) == INTEGER_CST
+             && (!uaddc_usubc || TREE_CODE (arg2) == INTEGER_CST))
            {
              if (cplx_result)
                result = int_const_binop (subcode, fold_convert (type, arg0),
@@ -5717,6 +5749,15 @@ gimple_fold_call (gimple_stmt_iterator *
                  else
                    result = NULL_TREE;
                }
+             if (uaddc_usubc && result)
+               {
+                 tree r = int_const_binop (subcode, result,
+                                           fold_convert (type, arg2));
+                 if (r == NULL_TREE)
+                   result = NULL_TREE;
+                 else
+                   {
+                     if (arith_overflowed_p (subcode, type, result, arg2))
+                       overflow = build_one_cst (type);
+                     result = r;
+                   }
+               }
            }
          if (result)
            {
--- gcc/gimple-range-fold.cc.jj 2023-06-07 09:41:49.125485839 +0200
+++ gcc/gimple-range-fold.cc    2023-06-13 12:30:23.405968006 +0200
@@ -489,6 +489,8 @@ adjust_imagpart_expr (vrange &res, const
        case IFN_ADD_OVERFLOW:
        case IFN_SUB_OVERFLOW:
        case IFN_MUL_OVERFLOW:
+       case IFN_UADDC:
+       case IFN_USUBC:
        case IFN_ATOMIC_COMPARE_EXCHANGE:
          {
            int_range<2> r;
--- gcc/tree-ssa-dce.cc.jj      2023-06-07 09:41:49.272483796 +0200
+++ gcc/tree-ssa-dce.cc 2023-06-13 12:30:23.415967865 +0200
@@ -1481,6 +1481,14 @@ eliminate_unnecessary_stmts (bool aggres
                  case IFN_MUL_OVERFLOW:
                    maybe_optimize_arith_overflow (&gsi, MULT_EXPR);
                    break;
+                 case IFN_UADDC:
+                   if (integer_zerop (gimple_call_arg (stmt, 2)))
+                     maybe_optimize_arith_overflow (&gsi, PLUS_EXPR);
+                   break;
+                 case IFN_USUBC:
+                   if (integer_zerop (gimple_call_arg (stmt, 2)))
+                     maybe_optimize_arith_overflow (&gsi, MINUS_EXPR);
+                   break;
                  default:
                    break;
                  }
--- gcc/doc/md.texi.jj  2023-06-12 15:47:22.145507192 +0200
+++ gcc/doc/md.texi     2023-06-13 13:09:50.699868708 +0200
@@ -5224,6 +5224,22 @@ is taken only on unsigned overflow.
 @item @samp{usubv@var{m}4}, @samp{umulv@var{m}4}
 Similar, for other unsigned arithmetic operations.
 
+@cindex @code{uaddc@var{m}5} instruction pattern
+@item @samp{uaddc@var{m}5}
+Adds unsigned operands 2, 3 and 4 (where the last operand is guaranteed to
+have only values 0 or 1) together, sets operand 0 to the result of the
+addition of the 3 operands and sets operand 1 to 1 iff there was
+overflow on the unsigned additions, and to 0 otherwise.  So, it is
+an addition with carry in (operand 4) and carry out (operand 1).
+All operands have the same mode.
+
+@cindex @code{usubc@var{m}5} instruction pattern
+@item @samp{usubc@var{m}5}
+Similar to @samp{uaddc@var{m}5}, except it subtracts unsigned operands 3
+and 4 from operand 2 instead of adding them.  So, it is
+a subtraction with carry/borrow in (operand 4) and carry/borrow out
+(operand 1).  All operands have the same mode.
+
 @cindex @code{addptr@var{m}3} instruction pattern
 @item @samp{addptr@var{m}3}
 Like @code{add@var{m}3} but is guaranteed to only be used for address
--- gcc/config/i386/i386.md.jj  2023-06-12 15:47:21.894510663 +0200
+++ gcc/config/i386/i386.md     2023-06-13 12:30:23.465967165 +0200
@@ -7733,6 +7733,25 @@ (define_peephole2
   [(set (reg:CC FLAGS_REG)
        (compare:CC (match_dup 0) (match_dup 1)))])
 
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "memory_operand"))
+   (parallel [(set (reg:CC FLAGS_REG)
+                  (compare:CC (match_dup 0)
+                              (match_operand:SWI 2 "memory_operand")))
+             (set (match_dup 0)
+                  (minus:SWI (match_dup 0) (match_dup 2)))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (reg:CC FLAGS_REG)
+                  (compare:CC (match_dup 1) (match_dup 0)))
+             (set (match_dup 1)
+                  (minus:SWI (match_dup 1) (match_dup 0)))])])
+
 ;; decl %eax; cmpl $-1, %eax; jne .Lxx; can be optimized into
 ;; subl $1, %eax; jnc .Lxx;
 (define_peephole2
@@ -7818,6 +7837,59 @@ (define_insn "@add<mode>3_carry"
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "<MODE>")])
 
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "memory_operand"))
+   (parallel [(set (match_dup 0)
+                  (plus:SWI
+                    (plus:SWI
+                      (match_operator:SWI 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand")
+                         (const_int 0)])
+                      (match_dup 0))
+                    (match_operand:SWI 2 "memory_operand")))
+             (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (match_dup 1)
+                  (plus:SWI (plus:SWI (match_op_dup 4
+                                        [(match_dup 3) (const_int 0)])
+                                      (match_dup 1))
+                            (match_dup 0)))
+             (clobber (reg:CC FLAGS_REG))])])
+
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "memory_operand"))
+   (parallel [(set (match_dup 0)
+                  (plus:SWI
+                    (plus:SWI
+                      (match_operator:SWI 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand")
+                         (const_int 0)])
+                      (match_dup 0))
+                    (match_operand:SWI 2 "memory_operand")))
+             (clobber (reg:CC FLAGS_REG))])
+   (set (match_operand:SWI 5 "general_reg_operand") (match_dup 0))
+   (set (match_dup 1) (match_dup 5))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && peep2_reg_dead_p (4, operands[5])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[5], operands[1])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (match_dup 1)
+                  (plus:SWI (plus:SWI (match_op_dup 4
+                                        [(match_dup 3) (const_int 0)])
+                                      (match_dup 1))
+                            (match_dup 0)))
+             (clobber (reg:CC FLAGS_REG))])])
+
 (define_insn "*add<mode>3_carry_0"
   [(set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m")
        (plus:SWI
@@ -7918,6 +7990,159 @@ (define_insn "addcarry<mode>"
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "<MODE>")])
 
+;; Helper peephole2 for the addcarry<mode> and subborrow<mode>
+;; peephole2s, to optimize away a nop that resulted from the uaddc/usubc
+;; expansion optimization.
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+       (match_operand:SWI48 1 "memory_operand"))
+   (const_int 0)]
+  ""
+  [(set (match_dup 0) (match_dup 1))])
+
+(define_peephole2
+  [(parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (plus:SWI48
+                        (plus:SWI48
+                          (match_operator:SWI48 4 "ix86_carry_flag_operator"
+                            [(match_operand 2 "flags_reg_operand")
+                             (const_int 0)])
+                          (match_operand:SWI48 0 "general_reg_operand"))
+                        (match_operand:SWI48 1 "memory_operand")))
+                    (plus:<DWI>
+                      (zero_extend:<DWI> (match_dup 1))
+                      (match_operator:<DWI> 3 "ix86_carry_flag_operator"
+                        [(match_dup 2) (const_int 0)]))))
+             (set (match_dup 0)
+                  (plus:SWI48 (plus:SWI48 (match_op_dup 4
+                                            [(match_dup 2) (const_int 0)])
+                                          (match_dup 0))
+                              (match_dup 1)))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (2, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])"
+  [(parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (plus:SWI48
+                        (plus:SWI48
+                          (match_op_dup 4
+                            [(match_dup 2) (const_int 0)])
+                          (match_dup 1))
+                        (match_dup 0)))
+                    (plus:<DWI>
+                      (zero_extend:<DWI> (match_dup 0))
+                      (match_op_dup 3
+                        [(match_dup 2) (const_int 0)]))))
+             (set (match_dup 1)
+                  (plus:SWI48 (plus:SWI48 (match_op_dup 4
+                                            [(match_dup 2) (const_int 0)])
+                                          (match_dup 1))
+                              (match_dup 0)))])])
+
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+       (match_operand:SWI48 1 "memory_operand"))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (plus:SWI48
+                        (plus:SWI48
+                          (match_operator:SWI48 5 "ix86_carry_flag_operator"
+                            [(match_operand 3 "flags_reg_operand")
+                             (const_int 0)])
+                          (match_dup 0))
+                        (match_operand:SWI48 2 "memory_operand")))
+                    (plus:<DWI>
+                      (zero_extend:<DWI> (match_dup 2))
+                      (match_operator:<DWI> 4 "ix86_carry_flag_operator"
+                        [(match_dup 3) (const_int 0)]))))
+             (set (match_dup 0)
+                  (plus:SWI48 (plus:SWI48 (match_op_dup 5
+                                            [(match_dup 3) (const_int 0)])
+                                          (match_dup 0))
+                              (match_dup 2)))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (plus:SWI48
+                        (plus:SWI48
+                          (match_op_dup 5
+                            [(match_dup 3) (const_int 0)])
+                          (match_dup 1))
+                        (match_dup 0)))
+                    (plus:<DWI>
+                      (zero_extend:<DWI> (match_dup 0))
+                      (match_op_dup 4
+                        [(match_dup 3) (const_int 0)]))))
+             (set (match_dup 1)
+                  (plus:SWI48 (plus:SWI48 (match_op_dup 5
+                                            [(match_dup 3) (const_int 0)])
+                                          (match_dup 1))
+                              (match_dup 0)))])])
+
+(define_peephole2
+  [(parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (plus:SWI48
+                        (plus:SWI48
+                          (match_operator:SWI48 4 "ix86_carry_flag_operator"
+                            [(match_operand 2 "flags_reg_operand")
+                             (const_int 0)])
+                          (match_operand:SWI48 0 "general_reg_operand"))
+                        (match_operand:SWI48 1 "memory_operand")))
+                    (plus:<DWI>
+                      (zero_extend:<DWI> (match_dup 1))
+                      (match_operator:<DWI> 3 "ix86_carry_flag_operator"
+                        [(match_dup 2) (const_int 0)]))))
+             (set (match_dup 0)
+                  (plus:SWI48 (plus:SWI48 (match_op_dup 4
+                                            [(match_dup 2) (const_int 0)])
+                                          (match_dup 0))
+                              (match_dup 1)))])
+   (set (match_operand:QI 5 "general_reg_operand")
+       (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
+   (set (match_operand:SWI48 6 "general_reg_operand")
+       (zero_extend:SWI48 (match_dup 5)))
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (4, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[5])
+   && !reg_overlap_mentioned_p (operands[5], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[6])
+   && !reg_overlap_mentioned_p (operands[6], operands[1])"
+  [(parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (plus:SWI48
+                        (plus:SWI48
+                          (match_op_dup 4
+                            [(match_dup 2) (const_int 0)])
+                          (match_dup 1))
+                        (match_dup 0)))
+                    (plus:<DWI>
+                      (zero_extend:<DWI> (match_dup 0))
+                      (match_op_dup 3
+                        [(match_dup 2) (const_int 0)]))))
+             (set (match_dup 1)
+                  (plus:SWI48 (plus:SWI48 (match_op_dup 4
+                                            [(match_dup 2) (const_int 0)])
+                                          (match_dup 1))
+                              (match_dup 0)))])
+   (set (match_dup 5) (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
+   (set (match_dup 6) (zero_extend:SWI48 (match_dup 5)))])
+
 (define_expand "addcarry<mode>_0"
   [(parallel
      [(set (reg:CCC FLAGS_REG)
@@ -7988,6 +8213,59 @@ (define_insn "@sub<mode>3_carry"
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "<MODE>")])
 
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "memory_operand"))
+   (parallel [(set (match_dup 0)
+                  (minus:SWI
+                    (minus:SWI
+                      (match_dup 0)
+                      (match_operator:SWI 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand")
+                         (const_int 0)]))
+                    (match_operand:SWI 2 "memory_operand")))
+             (clobber (reg:CC FLAGS_REG))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (match_dup 1)
+                  (minus:SWI (minus:SWI (match_dup 1)
+                                        (match_op_dup 4
+                                          [(match_dup 3) (const_int 0)]))
+                             (match_dup 0)))
+             (clobber (reg:CC FLAGS_REG))])])
+
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "memory_operand"))
+   (parallel [(set (match_dup 0)
+                  (minus:SWI
+                    (minus:SWI
+                      (match_dup 0)
+                      (match_operator:SWI 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand")
+                         (const_int 0)]))
+                    (match_operand:SWI 2 "memory_operand")))
+             (clobber (reg:CC FLAGS_REG))])
+   (set (match_operand:SWI 5 "general_reg_operand") (match_dup 0))
+   (set (match_dup 1) (match_dup 5))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && peep2_reg_dead_p (4, operands[5])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[5], operands[1])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (match_dup 1)
+                  (minus:SWI (minus:SWI (match_dup 1)
+                                        (match_op_dup 4
+                                          [(match_dup 3) (const_int 0)]))
+                             (match_dup 0)))
+             (clobber (reg:CC FLAGS_REG))])])
+
 (define_insn "*sub<mode>3_carry_0"
   [(set (match_operand:SWI 0 "nonimmediate_operand" "=<r>m")
        (minus:SWI
@@ -8113,13 +8391,13 @@ (define_insn "subborrow<mode>"
   [(set (reg:CCC FLAGS_REG)
        (compare:CCC
          (zero_extend:<DWI>
-           (match_operand:SWI48 1 "nonimmediate_operand" "0"))
+           (match_operand:SWI48 1 "nonimmediate_operand" "0,0"))
          (plus:<DWI>
            (match_operator:<DWI> 4 "ix86_carry_flag_operator"
              [(match_operand 3 "flags_reg_operand") (const_int 0)])
            (zero_extend:<DWI>
-             (match_operand:SWI48 2 "nonimmediate_operand" "rm")))))
-   (set (match_operand:SWI48 0 "register_operand" "=r")
+             (match_operand:SWI48 2 "nonimmediate_operand" "r,rm")))))
+   (set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r")
        (minus:SWI48 (minus:SWI48
                       (match_dup 1)
                       (match_operator:SWI48 5 "ix86_carry_flag_operator"
@@ -8132,6 +8410,154 @@ (define_insn "subborrow<mode>"
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "<MODE>")])
 
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+       (match_operand:SWI48 1 "memory_operand"))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI> (match_dup 0))
+                    (plus:<DWI>
+                      (match_operator:<DWI> 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand") (const_int 0)])
+                      (zero_extend:<DWI>
+                        (match_operand:SWI48 2 "memory_operand")))))
+             (set (match_dup 0)
+                  (minus:SWI48
+                    (minus:SWI48
+                      (match_dup 0)
+                      (match_operator:SWI48 5 "ix86_carry_flag_operator"
+                        [(match_dup 3) (const_int 0)]))
+                    (match_dup 2)))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI> (match_dup 1))
+                    (plus:<DWI> (match_op_dup 4
+                                  [(match_dup 3) (const_int 0)])
+                                (zero_extend:<DWI> (match_dup 0)))))
+             (set (match_dup 1)
+                  (minus:SWI48 (minus:SWI48 (match_dup 1)
+                                            (match_op_dup 5
+                                              [(match_dup 3) (const_int 0)]))
+                               (match_dup 0)))])])
+
+(define_peephole2
+  [(set (match_operand:SWI48 6 "general_reg_operand")
+       (match_operand:SWI48 7 "memory_operand"))
+   (set (match_operand:SWI48 8 "general_reg_operand")
+       (match_operand:SWI48 9 "memory_operand"))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (match_operand:SWI48 0 "general_reg_operand"))
+                    (plus:<DWI>
+                      (match_operator:<DWI> 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand") (const_int 0)])
+                      (zero_extend:<DWI>
+                        (match_operand:SWI48 2 "general_reg_operand")))))
+             (set (match_dup 0)
+                  (minus:SWI48
+                    (minus:SWI48
+                      (match_dup 0)
+                      (match_operator:SWI48 5 "ix86_carry_flag_operator"
+                        [(match_dup 3) (const_int 0)]))
+                    (match_dup 2)))])
+   (set (match_operand:SWI48 1 "memory_operand") (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (4, operands[0])
+   && peep2_reg_dead_p (3, operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[2], operands[1])
+   && !reg_overlap_mentioned_p (operands[6], operands[9])
+   && (rtx_equal_p (operands[6], operands[0])
+       ? (rtx_equal_p (operands[7], operands[1])
+         && rtx_equal_p (operands[8], operands[2]))
+       : (rtx_equal_p (operands[8], operands[0])
+         && rtx_equal_p (operands[9], operands[1])
+         && rtx_equal_p (operands[6], operands[2])))"
+  [(set (match_dup 0) (match_dup 9))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI> (match_dup 1))
+                    (plus:<DWI> (match_op_dup 4
+                                  [(match_dup 3) (const_int 0)])
+                                (zero_extend:<DWI> (match_dup 0)))))
+             (set (match_dup 1)
+                  (minus:SWI48 (minus:SWI48 (match_dup 1)
+                                            (match_op_dup 5
+                                              [(match_dup 3) (const_int 0)]))
+                               (match_dup 0)))])]
+{
+  if (!rtx_equal_p (operands[6], operands[0]))
+    operands[9] = operands[7];
+})
+
+(define_peephole2
+  [(set (match_operand:SWI48 6 "general_reg_operand")
+       (match_operand:SWI48 7 "memory_operand"))
+   (set (match_operand:SWI48 8 "general_reg_operand")
+       (match_operand:SWI48 9 "memory_operand"))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI>
+                      (match_operand:SWI48 0 "general_reg_operand"))
+                    (plus:<DWI>
+                      (match_operator:<DWI> 4 "ix86_carry_flag_operator"
+                        [(match_operand 3 "flags_reg_operand") (const_int 0)])
+                      (zero_extend:<DWI>
+                        (match_operand:SWI48 2 "general_reg_operand")))))
+             (set (match_dup 0)
+                  (minus:SWI48
+                    (minus:SWI48
+                      (match_dup 0)
+                      (match_operator:SWI48 5 "ix86_carry_flag_operator"
+                        [(match_dup 3) (const_int 0)]))
+                    (match_dup 2)))])
+   (set (match_operand:QI 10 "general_reg_operand")
+       (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
+   (set (match_operand:SWI48 11 "general_reg_operand")
+       (zero_extend:SWI48 (match_dup 10)))
+   (set (match_operand:SWI48 1 "memory_operand") (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (6, operands[0])
+   && peep2_reg_dead_p (3, operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[2], operands[1])
+   && !reg_overlap_mentioned_p (operands[6], operands[9])
+   && !reg_overlap_mentioned_p (operands[0], operands[10])
+   && !reg_overlap_mentioned_p (operands[10], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[11])
+   && !reg_overlap_mentioned_p (operands[11], operands[1])
+   && (rtx_equal_p (operands[6], operands[0])
+       ? (rtx_equal_p (operands[7], operands[1])
+         && rtx_equal_p (operands[8], operands[2]))
+       : (rtx_equal_p (operands[8], operands[0])
+         && rtx_equal_p (operands[9], operands[1])
+         && rtx_equal_p (operands[6], operands[2])))"
+  [(set (match_dup 0) (match_dup 9))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (zero_extend:<DWI> (match_dup 1))
+                    (plus:<DWI> (match_op_dup 4
+                                  [(match_dup 3) (const_int 0)])
+                                (zero_extend:<DWI> (match_dup 0)))))
+             (set (match_dup 1)
+                  (minus:SWI48 (minus:SWI48 (match_dup 1)
+                                            (match_op_dup 5
+                                              [(match_dup 3) (const_int 0)]))
+                               (match_dup 0)))])
+   (set (match_dup 10) (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
+   (set (match_dup 11) (zero_extend:SWI48 (match_dup 10)))]
+{
+  if (!rtx_equal_p (operands[6], operands[0]))
+    operands[9] = operands[7];
+})
+
 (define_expand "subborrow<mode>_0"
   [(parallel
      [(set (reg:CC FLAGS_REG)
@@ -8142,6 +8568,67 @@ (define_expand "subborrow<mode>_0"
           (minus:SWI48 (match_dup 1) (match_dup 2)))])]
   "ix86_binary_operator_ok (MINUS, <MODE>mode, operands)")
 
+(define_expand "uaddc<mode>5"
+  [(match_operand:SWI48 0 "register_operand")
+   (match_operand:SWI48 1 "register_operand")
+   (match_operand:SWI48 2 "register_operand")
+   (match_operand:SWI48 3 "register_operand")
+   (match_operand:SWI48 4 "nonmemory_operand")]
+  ""
+{
+  rtx cf = gen_rtx_REG (CCCmode, FLAGS_REG), pat, pat2;
+  if (operands[4] == const0_rtx)
+    emit_insn (gen_addcarry<mode>_0 (operands[0], operands[2], operands[3]));
+  else
+    {
+      rtx op4 = copy_to_mode_reg (QImode,
+                                 convert_to_mode (QImode, operands[4], 1));
+      emit_insn (gen_addqi3_cconly_overflow (op4, constm1_rtx));
+      pat = gen_rtx_LTU (<DWI>mode, cf, const0_rtx);
+      pat2 = gen_rtx_LTU (<MODE>mode, cf, const0_rtx);
+      emit_insn (gen_addcarry<mode> (operands[0], operands[2], operands[3],
+                                    cf, pat, pat2));
+    }
+  rtx cc = gen_reg_rtx (QImode);
+  pat = gen_rtx_LTU (QImode, cf, const0_rtx);
+  emit_insn (gen_rtx_SET (cc, pat));
+  emit_insn (gen_zero_extendqi<mode>2 (operands[1], cc));
+  DONE;
+})
+
+(define_expand "usubc<mode>5"
+  [(match_operand:SWI48 0 "register_operand")
+   (match_operand:SWI48 1 "register_operand")
+   (match_operand:SWI48 2 "register_operand")
+   (match_operand:SWI48 3 "register_operand")
+   (match_operand:SWI48 4 "nonmemory_operand")]
+  ""
+{
+  rtx cf, pat, pat2;
+  if (operands[4] == const0_rtx)
+    {
+      cf = gen_rtx_REG (CCmode, FLAGS_REG);
+      emit_insn (gen_subborrow<mode>_0 (operands[0], operands[2],
+                                       operands[3]));
+    }
+  else
+    {
+      cf = gen_rtx_REG (CCCmode, FLAGS_REG);
+      rtx op4 = copy_to_mode_reg (QImode,
+                                 convert_to_mode (QImode, operands[4], 1));
+      emit_insn (gen_addqi3_cconly_overflow (op4, constm1_rtx));
+      pat = gen_rtx_LTU (<DWI>mode, cf, const0_rtx);
+      pat2 = gen_rtx_LTU (<MODE>mode, cf, const0_rtx);
+      emit_insn (gen_subborrow<mode> (operands[0], operands[2], operands[3],
+                                     cf, pat, pat2));
+    }
+  rtx cc = gen_reg_rtx (QImode);
+  pat = gen_rtx_LTU (QImode, cf, const0_rtx);
+  emit_insn (gen_rtx_SET (cc, pat));
+  emit_insn (gen_zero_extendqi<mode>2 (operands[1], cc));
+  DONE;
+})
+
 (define_mode_iterator CC_CCC [CC CCC])
 
 ;; Pre-reload splitter to optimize
@@ -8239,6 +8726,27 @@ (define_peephole2
                   (compare:CCC
                     (plus:SWI (match_dup 1) (match_dup 0))
                     (match_dup 1)))
+             (set (match_dup 1) (plus:SWI (match_dup 1) (match_dup 0)))])])
+
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+       (match_operand:SWI 1 "memory_operand"))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (plus:SWI (match_dup 0)
+                              (match_operand:SWI 2 "memory_operand"))
+                    (match_dup 0)))
+             (set (match_dup 0) (plus:SWI (match_dup 0) (match_dup 2)))])
+   (set (match_dup 1) (match_dup 0))]
+  "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
+   && peep2_reg_dead_p (3, operands[0])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(set (match_dup 0) (match_dup 2))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                  (compare:CCC
+                    (plus:SWI (match_dup 1) (match_dup 0))
+                    (match_dup 1)))
              (set (match_dup 1) (plus:SWI (match_dup 1) (match_dup 0)))])])
 
 (define_insn "*addsi3_zext_cc_overflow_1"
--- gcc/testsuite/gcc.target/i386/pr79173-1.c.jj        2023-06-13 12:30:23.466967151 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-1.c   2023-06-13 12:30:23.466967151 +0200
@@ -0,0 +1,59 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "subq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "subl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+
+static unsigned long
+uaddc (unsigned long x, unsigned long y, unsigned long carry_in, unsigned long *carry_out)
+{
+  unsigned long r;
+  unsigned long c1 = __builtin_add_overflow (x, y, &r);
+  unsigned long c2 = __builtin_add_overflow (r, carry_in, &r);
+  *carry_out = c1 + c2;
+  return r;
+}
+
+static unsigned long
+usubc (unsigned long x, unsigned long y, unsigned long carry_in, unsigned long *carry_out)
+{
+  unsigned long r;
+  unsigned long c1 = __builtin_sub_overflow (x, y, &r);
+  unsigned long c2 = __builtin_sub_overflow (r, carry_in, &r);
+  *carry_out = c1 + c2;
+  return r;
+}
+
+void
+foo (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = uaddc (p[0], q[0], 0, &c);
+  p[1] = uaddc (p[1], q[1], c, &c);
+  p[2] = uaddc (p[2], q[2], c, &c);
+  p[3] = uaddc (p[3], q[3], c, &c);
+}
+
+void
+bar (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = usubc (p[0], q[0], 0, &c);
+  p[1] = usubc (p[1], q[1], c, &c);
+  p[2] = usubc (p[2], q[2], c, &c);
+  p[3] = usubc (p[3], q[3], c, &c);
+}
--- gcc/testsuite/gcc.target/i386/pr79173-2.c.jj        2023-06-13 12:30:23.466967151 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-2.c   2023-06-13 12:30:23.466967151 +0200
@@ -0,0 +1,59 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "subq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "subl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+
+static unsigned long
+uaddc (unsigned long x, unsigned long y, _Bool carry_in, _Bool *carry_out)
+{
+  unsigned long r;
+  _Bool c1 = __builtin_add_overflow (x, y, &r);
+  _Bool c2 = __builtin_add_overflow (r, carry_in, &r);
+  *carry_out = c1 | c2;
+  return r;
+}
+
+static unsigned long
+usubc (unsigned long x, unsigned long y, _Bool carry_in, _Bool *carry_out)
+{
+  unsigned long r;
+  _Bool c1 = __builtin_sub_overflow (x, y, &r);
+  _Bool c2 = __builtin_sub_overflow (r, carry_in, &r);
+  *carry_out = c1 | c2;
+  return r;
+}
+
+void
+foo (unsigned long *p, unsigned long *q)
+{
+  _Bool c;
+  p[0] = uaddc (p[0], q[0], 0, &c);
+  p[1] = uaddc (p[1], q[1], c, &c);
+  p[2] = uaddc (p[2], q[2], c, &c);
+  p[3] = uaddc (p[3], q[3], c, &c);
+}
+
+void
+bar (unsigned long *p, unsigned long *q)
+{
+  _Bool c;
+  p[0] = usubc (p[0], q[0], 0, &c);
+  p[1] = usubc (p[1], q[1], c, &c);
+  p[2] = usubc (p[2], q[2], c, &c);
+  p[3] = usubc (p[3], q[3], c, &c);
+}
--- gcc/testsuite/gcc.target/i386/pr79173-3.c.jj        2023-06-13 12:30:23.467967137 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-3.c   2023-06-13 12:30:23.467967137 +0200
@@ -0,0 +1,61 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "subq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "subl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+
+static unsigned long
+uaddc (unsigned long x, unsigned long y, unsigned long carry_in, unsigned long *carry_out)
+{
+  unsigned long r;
+  unsigned long c1 = __builtin_add_overflow (x, y, &r);
+  unsigned long c2 = __builtin_add_overflow (r, carry_in, &r);
+  *carry_out = c1 + c2;
+  return r;
+}
+
+static unsigned long
+usubc (unsigned long x, unsigned long y, unsigned long carry_in, unsigned long *carry_out)
+{
+  unsigned long r;
+  unsigned long c1 = __builtin_sub_overflow (x, y, &r);
+  unsigned long c2 = __builtin_sub_overflow (r, carry_in, &r);
+  *carry_out = c1 + c2;
+  return r;
+}
+
+unsigned long
+foo (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = uaddc (p[0], q[0], 0, &c);
+  p[1] = uaddc (p[1], q[1], c, &c);
+  p[2] = uaddc (p[2], q[2], c, &c);
+  p[3] = uaddc (p[3], q[3], c, &c);
+  return c;
+}
+
+unsigned long
+bar (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = usubc (p[0], q[0], 0, &c);
+  p[1] = usubc (p[1], q[1], c, &c);
+  p[2] = usubc (p[2], q[2], c, &c);
+  p[3] = usubc (p[3], q[3], c, &c);
+  return c;
+}
--- gcc/testsuite/gcc.target/i386/pr79173-4.c.jj        2023-06-13 12:30:23.467967137 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-4.c   2023-06-13 12:30:23.467967137 +0200
@@ -0,0 +1,61 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "subq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "subl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+
+static unsigned long
+uaddc (unsigned long x, unsigned long y, _Bool carry_in, _Bool *carry_out)
+{
+  unsigned long r;
+  _Bool c1 = __builtin_add_overflow (x, y, &r);
+  _Bool c2 = __builtin_add_overflow (r, carry_in, &r);
+  *carry_out = c1 ^ c2;
+  return r;
+}
+
+static unsigned long
+usubc (unsigned long x, unsigned long y, _Bool carry_in, _Bool *carry_out)
+{
+  unsigned long r;
+  _Bool c1 = __builtin_sub_overflow (x, y, &r);
+  _Bool c2 = __builtin_sub_overflow (r, carry_in, &r);
+  *carry_out = c1 ^ c2;
+  return r;
+}
+
+_Bool
+foo (unsigned long *p, unsigned long *q)
+{
+  _Bool c;
+  p[0] = uaddc (p[0], q[0], 0, &c);
+  p[1] = uaddc (p[1], q[1], c, &c);
+  p[2] = uaddc (p[2], q[2], c, &c);
+  p[3] = uaddc (p[3], q[3], c, &c);
+  return c;
+}
+
+_Bool
+bar (unsigned long *p, unsigned long *q)
+{
+  _Bool c;
+  p[0] = usubc (p[0], q[0], 0, &c);
+  p[1] = usubc (p[1], q[1], c, &c);
+  p[2] = usubc (p[2], q[2], c, &c);
+  p[3] = usubc (p[3], q[3], c, &c);
+  return c;
+}
--- gcc/testsuite/gcc.target/i386/pr79173-5.c.jj        2023-06-13 12:30:23.467967137 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-5.c   2023-06-13 12:30:23.467967137 +0200
@@ -0,0 +1,32 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+
+static unsigned long
+uaddc (unsigned long x, unsigned long y, unsigned long carry_in, unsigned long *carry_out)
+{
+  unsigned long r = x + y;
+  unsigned long c1 = r < x;
+  r += carry_in;
+  unsigned long c2 = r < carry_in;
+  *carry_out = c1 + c2;
+  return r;
+}
+
+void
+foo (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = uaddc (p[0], q[0], 0, &c);
+  p[1] = uaddc (p[1], q[1], c, &c);
+  p[2] = uaddc (p[2], q[2], c, &c);
+  p[3] = uaddc (p[3], q[3], c, &c);
+}
--- gcc/testsuite/gcc.target/i386/pr79173-6.c.jj        2023-06-13 12:30:23.467967137 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-6.c   2023-06-13 12:30:23.467967137 +0200
@@ -0,0 +1,33 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } } */
+
+static unsigned long
+uaddc (unsigned long x, unsigned long y, unsigned long carry_in, unsigned long *carry_out)
+{
+  unsigned long r = x + y;
+  unsigned long c1 = r < x;
+  r += carry_in;
+  unsigned long c2 = r < carry_in;
+  *carry_out = c1 + c2;
+  return r;
+}
+
+unsigned long
+foo (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = uaddc (p[0], q[0], 0, &c);
+  p[1] = uaddc (p[1], q[1], c, &c);
+  p[2] = uaddc (p[2], q[2], c, &c);
+  p[3] = uaddc (p[3], q[3], c, &c);
+  return c;
+}
--- gcc/testsuite/gcc.target/i386/pr79173-7.c.jj        2023-06-13 12:30:23.468967123 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-7.c   2023-06-13 12:30:23.468967123 +0200
@@ -0,0 +1,31 @@
+/* PR middle-end/79173 */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "subq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 } } */
+
+#include <x86intrin.h>
+
+void
+foo (unsigned long long *p, unsigned long long *q)
+{
+  unsigned char c = _addcarry_u64 (0, p[0], q[0], &p[0]);
+  c = _addcarry_u64 (c, p[1], q[1], &p[1]);
+  c = _addcarry_u64 (c, p[2], q[2], &p[2]);
+  _addcarry_u64 (c, p[3], q[3], &p[3]);
+}
+
+void
+bar (unsigned long long *p, unsigned long long *q)
+{
+  unsigned char c = _subborrow_u64 (0, p[0], q[0], &p[0]);
+  c = _subborrow_u64 (c, p[1], q[1], &p[1]);
+  c = _subborrow_u64 (c, p[2], q[2], &p[2]);
+  _subborrow_u64 (c, p[3], q[3], &p[3]);
+}
--- gcc/testsuite/gcc.target/i386/pr79173-8.c.jj        2023-06-13 12:30:23.468967123 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-8.c   2023-06-13 12:30:23.468967123 +0200
@@ -0,0 +1,31 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "subl\t%e\[^\n\r]*, \\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 4\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 8\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 12\\\(%\[^\n\r]*\\\)" 1 } } */
+
+#include <x86intrin.h>
+
+void
+foo (unsigned int *p, unsigned int *q)
+{
+  unsigned char c = _addcarry_u32 (0, p[0], q[0], &p[0]);
+  c = _addcarry_u32 (c, p[1], q[1], &p[1]);
+  c = _addcarry_u32 (c, p[2], q[2], &p[2]);
+  _addcarry_u32 (c, p[3], q[3], &p[3]);
+}
+
+void
+bar (unsigned int *p, unsigned int *q)
+{
+  unsigned char c = _subborrow_u32 (0, p[0], q[0], &p[0]);
+  c = _subborrow_u32 (c, p[1], q[1], &p[1]);
+  c = _subborrow_u32 (c, p[2], q[2], &p[2]);
+  _subborrow_u32 (c, p[3], q[3], &p[3]);
+}
--- gcc/testsuite/gcc.target/i386/pr79173-9.c.jj        2023-06-13 12:30:23.468967123 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-9.c   2023-06-13 12:30:23.468967123 +0200
@@ -0,0 +1,31 @@
+/* PR middle-end/79173 */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "subq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 } } */
+
+#include <x86intrin.h>
+
+unsigned long long
+foo (unsigned long long *p, unsigned long long *q)
+{
+  unsigned char c = _addcarry_u64 (0, p[0], q[0], &p[0]);
+  c = _addcarry_u64 (c, p[1], q[1], &p[1]);
+  c = _addcarry_u64 (c, p[2], q[2], &p[2]);
+  return _addcarry_u64 (c, p[3], q[3], &p[3]);
+}
+
+unsigned long long
+bar (unsigned long long *p, unsigned long long *q)
+{
+  unsigned char c = _subborrow_u64 (0, p[0], q[0], &p[0]);
+  c = _subborrow_u64 (c, p[1], q[1], &p[1]);
+  c = _subborrow_u64 (c, p[2], q[2], &p[2]);
+  return _subborrow_u64 (c, p[3], q[3], &p[3]);
+}
--- gcc/testsuite/gcc.target/i386/pr79173-10.c.jj       2023-06-13 12:30:23.468967123 +0200
+++ gcc/testsuite/gcc.target/i386/pr79173-10.c  2023-06-13 12:30:23.468967123 +0200
@@ -0,0 +1,31 @@
+/* PR middle-end/79173 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-protector -masm=att" } */
+/* { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "subl\t%e\[^\n\r]*, \\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 4\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 8\\\(%\[^\n\r]*\\\)" 1 } } */
+/* { dg-final { scan-assembler-times "sbbl\t%e\[^\n\r]*, 12\\\(%\[^\n\r]*\\\)" 1 } } */
+
+#include <x86intrin.h>
+
+unsigned int
+foo (unsigned int *p, unsigned int *q)
+{
+  unsigned char c = _addcarry_u32 (0, p[0], q[0], &p[0]);
+  c = _addcarry_u32 (c, p[1], q[1], &p[1]);
+  c = _addcarry_u32 (c, p[2], q[2], &p[2]);
+  return _addcarry_u32 (c, p[3], q[3], &p[3]);
+}
+
+unsigned int
+bar (unsigned int *p, unsigned int *q)
+{
+  unsigned char c = _subborrow_u32 (0, p[0], q[0], &p[0]);
+  c = _subborrow_u32 (c, p[1], q[1], &p[1]);
+  c = _subborrow_u32 (c, p[2], q[2], &p[2]);
+  return _subborrow_u32 (c, p[3], q[3], &p[3]);
+}


        Jakub
