The PowerPC ISA has Load-And-Reserve and Store-Conditional instructions
which can be used to construct a sequence of instructions that appears
to perform an atomic update operation on an aligned storage location.

The larx (load-and-reserve) instructions support an Exclusive Access
Hint (EH). A value of 0 indicates that other programs might attempt to
modify the storage location. A value of 1 indicates that other programs
will not attempt to modify it until the program that performed the load
does a subsequent store. EH = 1 should be used when a program is
obtaining a lock variable that it will release before another program
attempts to modify it. When contention for a lock is significant, this
hint may reduce the number of times the cache block is transferred
between processor caches.
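
The access pattern this hint targets can be illustrated with a small
test-and-set spinlock built on the existing GCC atomic builtins (an
illustrative sketch, not part of this patch; the names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A minimal test-and-set spinlock.  The compare-and-swap in
   lock_acquire is the "obtain a lock variable which is subsequently
   released" pattern for which EH = 1 is intended: on PowerPC this
   builtin expands to a larx/stcx. retry loop.  */

typedef struct { uint32_t held; } spinlock_t;

static void
lock_acquire (spinlock_t *l)
{
  uint32_t expected = 0;
  /* Retry until the lock word is flipped from 0 to 1.  On failure the
     builtin writes the observed value into EXPECTED, so reset it.  */
  while (!__atomic_compare_exchange_n (&l->held, &expected, 1,
				       true /* weak */,
				       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    expected = 0;
}

static void
lock_release (spinlock_t *l)
{
  __atomic_store_n (&l->held, 0, __ATOMIC_RELEASE);
}
```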

This patch introduces a new built-in function:
 __atomic_compare_exchange_local()

It behaves like __atomic_compare_exchange(), but it uses an EH value of
1 in the larx (load-and-reserve) instruction. The new builtin helps
optimize lock contention on PowerPC by keeping the lock cacheline in
the local processor longer, reducing performance penalties from
cacheline movement.
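
A sketch of the intended call form (the fallback macro and the
try_lock helper are illustrative only, since no released compiler
provides the new builtin yet):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The new builtin has the same signature as __atomic_compare_exchange.
   To keep this example compilable on compilers that lack it, fall back
   to the standard builtin; the only intended difference is the EH hint
   in the generated larx.  */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
#if !__has_builtin (__atomic_compare_exchange_local)
#define __atomic_compare_exchange_local __atomic_compare_exchange
#endif

static bool
try_lock (uint64_t *lock)
{
  uint64_t expected = 0, desired = 1;
  /* Succeeds (and takes the lock) only if *lock was 0.  */
  return __atomic_compare_exchange_local (lock, &expected, &desired,
					  false /* weak */,
					  __ATOMIC_ACQUIRE,
					  __ATOMIC_RELAXED);
}
```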

This patch also provides a hook to specify if a target supports load
instructions with exclusive access hints. For targets that do not
support such load instructions, calling the new builtin will result in
an error.

The existing infrastructure for supporting __atomic_compare_exchange
is reused, with some modifications, to accommodate the new builtin.
In the expand pass, additional parameters are introduced in functions
wherever necessary to indicate that the builtin being processed is
__atomic_compare_exchange_local.

Bootstrapped and regtested on powerpc64le and aarch64. Ok for trunk?

2025-08-05  Surya Kumari Jangala  <jskum...@linux.ibm.com>

gcc:
        * builtins.cc (expand_builtin_atomic_compare_exchange): Add a new
        parameter 'local'. Pass new parameter 'local' to
        expand_atomic_compare_and_swap().
        (expand_builtin): Pass parameter 'local' to
        expand_builtin_atomic_compare_exchange(). Expand call to
        __atomic_compare_exchange_local().
        * c-family/c-common.cc (get_atomic_generic_size): Add new case
        for BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL in switch statement.
        (resolve_overloaded_atomic_compare_exchange): Issue an error if a
        lock-free size is not specified for __atomic_compare_exchange_local.
        (resolve_overloaded_builtin): Check if target supports loads with
        exclusive access hints. Convert builtin to _N variant.
        * config/rs6000/rs6000-protos.h (rs6000_expand_atomic_compare_and_swap):
        Add additional parameter 'local' to the prototype.
        * config/rs6000/rs6000.cc (rs6000_have_load_with_exclusive_access):
        New function.
        (emit_load_locked): Add new parameter. Pass new parameter to generate
        load-locked instruction.
        (rs6000_expand_atomic_compare_and_swap): Add new parameter. Call
        emit_load_locked() with additional parameter value of EH bit.
        (rs6000_expand_atomic_exchange): Pass EH value 0 to emit_load_locked().
        (rs6000_expand_atomic_op): Likewise.
        * config/rs6000/rs6000.h (TARGET_HAVE_LOAD_WITH_EXCLUSIVE_ACCESS):
        Define.
        * config/rs6000/sync.md (load_locked<mode>): Add new operand in RTL
        template. Specify EH bit in the larx instruction.
        (load_locked<QHI:mode>_si): Likewise.
        (load_lockedpti): Likewise.
        (load_lockedti): Add new operand in RTL template. Pass EH bit to
        gen_load_lockedpti().
        (atomic_compare_and_swap<mode>): Pass new parameter 'false' to
        rs6000_expand_atomic_compare_and_swap.
        (atomic_compare_and_swap_local<mode>): New define_expand.
        * doc/tm.texi: Regenerate.
        * doc/tm.texi.in (TARGET_HAVE_LOAD_WITH_EXCLUSIVE_ACCESS): New hook.
        * optabs.cc (expand_atomic_compare_and_swap): Expand the new builtin.
        * optabs.def (atomic_compare_and_swap_local_optab): New entry.
        * optabs.h (expand_atomic_compare_and_swap): Add additional parameter
        'local' with default value false.
        * predict.cc (expr_expected_value_1): Set up predictor for the new
        builtin.
        * sync-builtins.def (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL): Define
        new enum.
        (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_N): Likewise.
        (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_1): Likewise.
        (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_2): Likewise.
        (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_4): Likewise.
        (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_8): Likewise.
        (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_16): Likewise.
        * target.def (have_load_with_exclusive_access): New hook.

gcc/testsuite:
        * gcc.target/powerpc/acmp-tst.c: New test.
---
 gcc/builtins.cc                             | 32 +++++++++++---
 gcc/c-family/c-common.cc                    | 48 ++++++++++++++++++---
 gcc/config/rs6000/rs6000-protos.h           |  2 +-
 gcc/config/rs6000/rs6000.cc                 | 30 +++++++++----
 gcc/config/rs6000/rs6000.h                  |  3 ++
 gcc/config/rs6000/sync.md                   | 37 ++++++++++++----
 gcc/doc/tm.texi                             |  8 ++++
 gcc/doc/tm.texi.in                          |  2 +
 gcc/optabs.cc                               | 10 ++++-
 gcc/optabs.def                              |  1 +
 gcc/optabs.h                                |  2 +-
 gcc/predict.cc                              |  7 +++
 gcc/sync-builtins.def                       | 28 ++++++++++++
 gcc/target.def                              | 11 +++++
 gcc/testsuite/gcc.target/powerpc/acmp-tst.c | 12 ++++++
 15 files changed, 201 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/acmp-tst.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 7f580a3145f..e44b2f2db9d 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -6691,17 +6691,25 @@ expand_builtin_atomic_exchange (machine_mode mode, tree exp, rtx target)
   return expand_atomic_exchange (target, mem, val, model);
 }
 
-/* Expand the __atomic_compare_exchange intrinsic:
+/* Expand the __atomic_compare_exchange and the
+   __atomic_compare_exchange_local intrinsics:
        bool __atomic_compare_exchange (TYPE *object, TYPE *expect,
                                        TYPE desired, BOOL weak,
                                        enum memmodel success,
                                        enum memmodel failure)
+       bool __atomic_compare_exchange_local (TYPE *object, TYPE *expect,
+                                             TYPE desired, BOOL weak,
+                                             enum memmodel success,
+                                             enum memmodel failure)
    EXP is the CALL_EXPR.
-   TARGET is an optional place for us to store the results.  */
+   TARGET is an optional place for us to store the results.
+   LOCAL indicates which builtin is being expanded. A value of true
+   means __atomic_compare_exchange_local is being expanded, while a
+   value of false indicates expansion of __atomic_compare_exchange.  */
 
 static rtx
 expand_builtin_atomic_compare_exchange (machine_mode mode, tree exp,
-                                       rtx target)
+                                       rtx target, bool local)
 {
   rtx expect, desired, mem, oldval;
   rtx_code_label *label;
@@ -6745,7 +6753,7 @@ expand_builtin_atomic_compare_exchange (machine_mode mode, tree exp,
   oldval = NULL;
 
   if (!expand_atomic_compare_and_swap (&target, &oldval, mem, expect, desired,
-                                      is_weak, success, failure))
+                                      is_weak, success, failure, local))
     return NULL_RTX;
 
   /* Conditionally store back to EXPECT, lest we create a race condition
@@ -8711,7 +8719,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 
        mode =
            get_builtin_sync_mode (fcode - BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1);
-       target = expand_builtin_atomic_compare_exchange (mode, exp, target);
+       target = expand_builtin_atomic_compare_exchange (mode, exp, target, false);
        if (target)
          return target;
 
@@ -8728,6 +8736,20 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
        break;
       }
 
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_1:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_2:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_4:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_8:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_16:
+      {
+       mode =
+           get_builtin_sync_mode (fcode - BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_1);
+       target = expand_builtin_atomic_compare_exchange (mode, exp, target, true);
+       if (target)
+         return target;
+       break;
+      }
+
     case BUILT_IN_ATOMIC_LOAD_1:
     case BUILT_IN_ATOMIC_LOAD_2:
     case BUILT_IN_ATOMIC_LOAD_4:
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index e7dd4602ac1..965b17947d3 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -7807,6 +7807,11 @@ get_atomic_generic_size (location_t loc, tree function,
       n_model = 2;
       outputs = 3;
       break;
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL:
+      n_param = 6;
+      n_model = 2;
+      outputs = 3;
+      break;
     default:
       gcc_unreachable ();
     }
@@ -8118,13 +8123,22 @@ resolve_overloaded_atomic_exchange (location_t loc, tree function,
   return false;
 }
 
-/* This will process an __atomic_compare_exchange function call, determine
-   whether it needs to be mapped to the _N variation, or turned into a lib call.
+/* This will process __atomic_compare_exchange and __atomic_compare_exchange_local
+   function calls and determine whether they can be mapped to the _N variation,
+   or in the case of __atomic_compare_exchange, turned into a lib call.
    LOC is the location of the builtin call.
    FUNCTION is the DECL that has been invoked;
-   PARAMS is the argument list for the call.  The return value is non-null
+   PARAMS is the argument list for the call.
+   The return value is as follows:
+   For __atomic_compare_exchange:
    TRUE is returned if it is translated into the proper format for a call to the
    external library, and NEW_RETURN is set the tree for that function.
+   FALSE is returned if processing for the _N variation is required.
+   For __atomic_compare_exchange_local:
+   TRUE is returned if a lock-free size is not specified, and NEW_RETURN is
+   set to error_mark_node.  Library support is not provided for this builtin
+   since its intent is to provide exclusive access hints on the machine
+   instructions implementing it.
    FALSE is returned if processing for the _N variation is required.  */
 
 static bool
@@ -8146,6 +8160,14 @@ resolve_overloaded_atomic_compare_exchange (location_t loc, tree function,
   /* If not a lock-free size, change to the library generic format.  */
   if (!atomic_size_supported_p (n))
     {
+      enum built_in_function fn_code = DECL_FUNCTION_CODE (function);
+      if (fn_code == BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL)
+       {
+         error_at (loc, "lock-free size not specified for built-in function %qE", function);
+         *new_return = error_mark_node;
+         return true;
+       }
+
       /* The library generic format does not have the weak parameter, so
         remove it from the param list.  Since a parameter has been removed,
         we can be sure that there is room for the SIZE_T parameter, meaning
@@ -8640,10 +8662,11 @@ resolve_overloaded_builtin (location_t loc, tree function,
 
     case BUILT_IN_ATOMIC_EXCHANGE:
     case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
+    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL:
     case BUILT_IN_ATOMIC_LOAD:
     case BUILT_IN_ATOMIC_STORE:
       {
-       /* Handle these 4 together so that they can fall through to the next
+       /* Handle these 5 together so that they can fall through to the next
           case if the call is transformed to an _N variant.  */
         switch (orig_code)
          {
@@ -8666,6 +8689,20 @@ resolve_overloaded_builtin (location_t loc, tree function,
              orig_code = BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N;
              break;
            }
+         case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL:
+           {
+             if (!targetm.have_load_with_exclusive_access ())
+               {
+                 error_at (loc, "unsupported built-in function %qE", function);
+                 return error_mark_node;
+               }
+             if (resolve_overloaded_atomic_compare_exchange (
+                   loc, function, params, &new_return, complain))
+               return new_return;
+             /* Change to the _N variant.  */
+             orig_code = BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_N;
+             break;
+           }
          case BUILT_IN_ATOMIC_LOAD:
            {
              if (resolve_overloaded_atomic_load (loc, function, params,
@@ -8771,7 +8808,8 @@ resolve_overloaded_builtin (location_t loc, tree function,
        if (orig_code != BUILT_IN_SYNC_BOOL_COMPARE_AND_SWAP_N
            && orig_code != BUILT_IN_SYNC_LOCK_RELEASE_N
            && orig_code != BUILT_IN_ATOMIC_STORE_N
-           && orig_code != BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N)
+           && orig_code != BUILT_IN_ATOMIC_COMPARE_EXCHANGE_N
+           && orig_code != BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_N)
          result = sync_resolve_return (first_param, result, orig_format);
 
        if (fetch_op)
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 234eb0ae2b3..f4b9b4ee922 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -127,7 +127,7 @@ extern bool rs6000_emit_set_const (rtx, rtx);
 extern bool rs6000_emit_cmove (rtx, rtx, rtx, rtx);
 extern bool rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
-extern void rs6000_expand_atomic_compare_and_swap (rtx op[]);
+extern void rs6000_expand_atomic_compare_and_swap (rtx op[], bool local);
 extern rtx swap_endian_selector_for_mode (machine_mode mode);
 
 extern void rs6000_expand_atomic_exchange (rtx op[]);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 764b4992fb5..0f4a5d2c4fb 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -395,6 +395,12 @@ mode_supports_dq_form (machine_mode mode)
          != 0);
 }
 
+bool
+rs6000_have_load_with_exclusive_access ()
+{
+  return true;
+}
+
 /* Given that there exists at least one variable that is set (produced)
    by OUT_INSN and read (consumed) by IN_INSN, return true iff
    IN_INSN represents one or more memory store operations and none of
@@ -16749,12 +16755,13 @@ emit_unlikely_jump (rtx cond, rtx label)
 
 /* A subroutine of the atomic operation splitters.  Emit a load-locked
    instruction in MODE.  For QI/HImode, possibly use a pattern than includes
-   the zero_extend operation.  */
+   the zero_extend operation.  LOCAL indicates the EH bit value for the
+   load-locked instruction.  */
 
 static void
-emit_load_locked (machine_mode mode, rtx reg, rtx mem)
+emit_load_locked (machine_mode mode, rtx reg, rtx mem, rtx local)
 {
-  rtx (*fn) (rtx, rtx) = NULL;
+  rtx (*fn) (rtx, rtx, rtx) = NULL;
 
   switch (mode)
     {
@@ -16781,7 +16788,7 @@ emit_load_locked (machine_mode mode, rtx reg, rtx mem)
     default:
       gcc_unreachable ();
     }
-  emit_insn (fn (reg, mem));
+  emit_insn (fn (reg, mem, local));
 }
 
 /* A subroutine of the atomic operation splitters.  Emit a store-conditional
@@ -16948,10 +16955,12 @@ rs6000_finish_atomic_subword (rtx narrow, rtx wide, rtx shift)
   emit_move_insn (narrow, gen_lowpart (GET_MODE (narrow), wide));
 }
 
-/* Expand an atomic compare and swap operation.  */
+/* Expand an atomic compare and swap operation.
+   If LOCAL is true, the load-locked (larx) instruction should have
+   an EH value of 1.  */
 
 void
-rs6000_expand_atomic_compare_and_swap (rtx operands[])
+rs6000_expand_atomic_compare_and_swap (rtx operands[], bool local)
 {
   rtx boolval, retval, mem, oldval, newval, cond;
   rtx label1, label2, x, mask, shift;
@@ -17014,7 +17023,10 @@ rs6000_expand_atomic_compare_and_swap (rtx operands[])
     }
   label2 = gen_rtx_LABEL_REF (VOIDmode, gen_label_rtx ());
 
-  emit_load_locked (mode, retval, mem);
+  if (local)
+    emit_load_locked (mode, retval, mem, const1_rtx);
+  else
+    emit_load_locked (mode, retval, mem, const0_rtx);
 
   x = retval;
   if (mask)
@@ -17112,7 +17124,7 @@ rs6000_expand_atomic_exchange (rtx operands[])
   label = gen_rtx_LABEL_REF (VOIDmode, gen_label_rtx ());
   emit_label (XEXP (label, 0));
 
-  emit_load_locked (mode, retval, mem);
+  emit_load_locked (mode, retval, mem, const0_rtx);
 
   x = val;
   if (mask)
@@ -17217,7 +17229,7 @@ rs6000_expand_atomic_op (enum rtx_code code, rtx mem, rtx val,
   if (before == NULL_RTX)
     before = gen_reg_rtx (mode);
 
-  emit_load_locked (mode, before, mem);
+  emit_load_locked (mode, before, mem, const0_rtx);
 
   if (code == NOT)
     {
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index db6112a09e1..6b86949f963 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -597,6 +597,9 @@ extern unsigned char rs6000_recip_bits[];
 #define TARGET_CPU_CPP_BUILTINS() \
   rs6000_cpu_cpp_builtins (pfile)
 
+#define TARGET_HAVE_LOAD_WITH_EXCLUSIVE_ACCESS \
+  rs6000_have_load_with_exclusive_access
+
 /* This is used by rs6000_cpu_cpp_builtins to indicate the byte order
    we're compiling for.  Some configurations may need to override it.  */
 #define RS6000_CPU_CPP_ENDIAN_BUILTINS()       \
diff --git a/gcc/config/rs6000/sync.md b/gcc/config/rs6000/sync.md
index f0ac3348f7b..2be7828d049 100644
--- a/gcc/config/rs6000/sync.md
+++ b/gcc/config/rs6000/sync.md
@@ -278,17 +278,19 @@
 (define_insn "load_locked<mode>"
   [(set (match_operand:ATOMIC 0 "int_reg_operand" "=r")
        (unspec_volatile:ATOMIC
-         [(match_operand:ATOMIC 1 "memory_operand" "Z")] UNSPECV_LL))]
+         [(match_operand:ATOMIC 1 "memory_operand" "Z")
+          (match_operand:QI 2 "u1bit_cint_operand" "n")] UNSPECV_LL))]
   ""
-  "<larx> %0,%y1"
+  "<larx> %0,%y1,%2"
   [(set_attr "type" "load_l")])
 
 (define_insn "load_locked<QHI:mode>_si"
   [(set (match_operand:SI 0 "int_reg_operand" "=r")
        (unspec_volatile:SI
-         [(match_operand:QHI 1 "memory_operand" "Z")] UNSPECV_LL))]
+         [(match_operand:QHI 1 "memory_operand" "Z")
+           (match_operand:QI 2 "u1bit_cint_operand" "n")] UNSPECV_LL))]
   "TARGET_SYNC_HI_QI"
-  "<QHI:larx> %0,%y1"
+  "<QHI:larx> %0,%y1,%2"
   [(set_attr "type" "load_l")])
 
 ;; Use PTImode to get even/odd register pairs.
@@ -302,7 +304,8 @@
 
 (define_expand "load_lockedti"
   [(use (match_operand:TI 0 "quad_int_reg_operand"))
-   (use (match_operand:TI 1 "memory_operand"))]
+   (use (match_operand:TI 1 "memory_operand"))
+   (use (match_operand:QI 2 "u1bit_cint_operand"))]
   "TARGET_SYNC_TI"
 {
   rtx op0 = operands[0];
@@ -316,7 +319,7 @@
       operands[1] = op1 = change_address (op1, TImode, new_addr);
     }
 
-  emit_insn (gen_load_lockedpti (pti, op1));
+  emit_insn (gen_load_lockedpti (pti, op1, operands[2]));
   if (WORDS_BIG_ENDIAN)
     emit_move_insn (op0, gen_lowpart (TImode, pti));
   else
@@ -330,11 +333,12 @@
 (define_insn "load_lockedpti"
   [(set (match_operand:PTI 0 "quad_int_reg_operand" "=&r")
        (unspec_volatile:PTI
-         [(match_operand:TI 1 "indexed_or_indirect_operand" "Z")] UNSPECV_LL))]
+         [(match_operand:TI 1 "indexed_or_indirect_operand" "Z")
+          (match_operand:QI 2 "u1bit_cint_operand" "n")] UNSPECV_LL))]
   "TARGET_SYNC_TI
    && !reg_mentioned_p (operands[0], operands[1])
    && quad_int_reg_operand (operands[0], PTImode)"
-  "lqarx %0,%y1"
+  "lqarx %0,%y1,%2"
   [(set_attr "type" "load_l")
    (set_attr "size" "128")])
 
@@ -411,7 +415,22 @@
    (match_operand:SI 7 "const_int_operand")]           ;; model fail
   ""
 {
-  rs6000_expand_atomic_compare_and_swap (operands);
+  rs6000_expand_atomic_compare_and_swap (operands, false);
+  DONE;
+})
+
+(define_expand "atomic_compare_and_swap_local<mode>"
+  [(match_operand:SI 0 "int_reg_operand")              ;; bool out
+   (match_operand:AINT 1 "int_reg_operand")            ;; val out
+   (match_operand:AINT 2 "memory_operand")             ;; memory
+   (match_operand:AINT 3 "reg_or_short_operand")       ;; expected
+   (match_operand:AINT 4 "int_reg_operand")            ;; desired
+   (match_operand:SI 5 "const_int_operand")            ;; is_weak
+   (match_operand:SI 6 "const_int_operand")            ;; model succ
+   (match_operand:SI 7 "const_int_operand")]           ;; model fail
+  ""
+{
+  rs6000_expand_atomic_compare_and_swap (operands, true);
   DONE;
 })
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 4d4e676aadf..05e1c15062b 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12496,6 +12496,14 @@ enabled, such as in response to command-line flags. The default implementation
 returns true iff @code{TARGET_GEN_CCMP_FIRST} is defined.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_HAVE_LOAD_WITH_EXCLUSIVE_ACCESS (void)
+This target hook returns true if the target supports load instructions
+with exclusive access hints that optimize how a cache block is transferred
+between processor caches. Such hints are helpful, for example, to reduce the
+number of times a cache block is transferred between processor caches when
+there is significant lock contention.
+@end deftypefn
+
 @deftypefn {Target Hook} unsigned TARGET_LOOP_UNROLL_ADJUST (unsigned @var{nunroll}, class loop *@var{loop})
 This target hook returns a new value for the number of times @var{loop}
 should be unrolled. The parameter @var{nunroll} is the number of times
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 1a51ad54817..8b953042335 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7944,6 +7944,8 @@ lists.
 
 @hook TARGET_HAVE_CCMP
 
+@hook TARGET_HAVE_LOAD_WITH_EXCLUSIVE_ACCESS
+
 @hook TARGET_LOOP_UNROLL_ADJUST
 
 @defmac POWI_MAX_MULTS
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 5c9450f6145..fe6840d21ed 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -7121,6 +7121,8 @@ expand_atomic_exchange (rtx target, rtx mem, rtx val, enum memmodel model)
    success to the actual location of the corresponding result.
 
    MEMMODEL is the memory model variant to use.
+   A true value for LOCAL indicates expansion of the builtin
+   __atomic_compare_exchange_local.
 
    The return value of the function is true for success.  */
 
@@ -7128,7 +7130,7 @@ bool
 expand_atomic_compare_and_swap (rtx *ptarget_bool, rtx *ptarget_oval,
                                rtx mem, rtx expected, rtx desired,
                                bool is_weak, enum memmodel succ_model,
-                               enum memmodel fail_model)
+                               enum memmodel fail_model, bool local)
 {
   machine_mode mode = GET_MODE (mem);
   class expand_operand ops[8];
@@ -7157,7 +7159,11 @@ expand_atomic_compare_and_swap (rtx *ptarget_bool, rtx *ptarget_oval,
       || reg_overlap_mentioned_p (expected, target_oval))
     target_oval = gen_reg_rtx (mode);
 
-  icode = direct_optab_handler (atomic_compare_and_swap_optab, mode);
+  if (!local)
+    icode = direct_optab_handler (atomic_compare_and_swap_optab, mode);
+  else
+    icode = direct_optab_handler (atomic_compare_and_swap_local_optab, mode);
+
   if (icode != CODE_FOR_nothing)
     {
       machine_mode bool_mode = insn_data[icode].operand[0].mode;
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 87a8b85da15..1e730069caf 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -512,6 +512,7 @@ OPTAB_D (atomic_bit_test_and_set_optab, "atomic_bit_test_and_set$I$a")
 OPTAB_D (atomic_bit_test_and_complement_optab, "atomic_bit_test_and_complement$I$a")
 OPTAB_D (atomic_bit_test_and_reset_optab, "atomic_bit_test_and_reset$I$a")
 OPTAB_D (atomic_compare_and_swap_optab, "atomic_compare_and_swap$I$a")
+OPTAB_D (atomic_compare_and_swap_local_optab, "atomic_compare_and_swap_local$I$a")
 OPTAB_D (atomic_exchange_optab,         "atomic_exchange$I$a")
 OPTAB_D (atomic_fetch_add_optab, "atomic_fetch_add$I$a")
 OPTAB_D (atomic_fetch_and_optab, "atomic_fetch_and$I$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index a8b0e93d60b..6f7e0f5a027 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -356,7 +356,7 @@ extern rtx expand_sync_lock_test_and_set (rtx, rtx, rtx);
 extern rtx expand_atomic_test_and_set (rtx, rtx, enum memmodel);
 extern rtx expand_atomic_exchange (rtx, rtx, rtx, enum memmodel);
 extern bool expand_atomic_compare_and_swap (rtx *, rtx *, rtx, rtx, rtx, bool,
-                                           enum memmodel, enum memmodel);
+                                           enum memmodel, enum memmodel, bool local = false);
 /* Generate memory barriers.  */
 extern void expand_mem_thread_fence (enum memmodel);
 extern void expand_mem_signal_fence (enum memmodel);
diff --git a/gcc/predict.cc b/gcc/predict.cc
index 5639d81d277..1006bdf3d3c 100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -2672,6 +2672,13 @@ expr_expected_value_1 (tree type, tree op0, enum tree_code code,
              case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
              case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
              case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_N:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_1:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_2:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_4:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_8:
+             case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_16:
                /* Assume that any given atomic operation has low contention,
                   and thus the compare-and-swap operation succeeds.  */
                *predictor = PRED_COMPARE_AND_SWAP;
diff --git a/gcc/sync-builtins.def b/gcc/sync-builtins.def
index 0f058187a20..ad1dd5e2d1f 100644
--- a/gcc/sync-builtins.def
+++ b/gcc/sync-builtins.def
@@ -338,6 +338,34 @@ DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16,
                  BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT,
                  ATTR_NOTHROWCALL_LEAF_LIST)
 
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL,
+                 "__atomic_compare_exchange_local",
+                 BT_FN_BOOL_SIZE_VPTR_PTR_PTR_INT_INT,
+                 ATTR_NOTHROWCALL_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_N,
+                 "__atomic_compare_exchange_local_n",
+                 BT_FN_VOID_VAR, ATTR_NOTHROWCALL_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_1,
+                 "__atomic_compare_exchange_local_1",
+                 BT_FN_BOOL_VPTR_PTR_I1_BOOL_INT_INT,
+                 ATTR_NOTHROWCALL_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_2,
+                 "__atomic_compare_exchange_local_2",
+                 BT_FN_BOOL_VPTR_PTR_I2_BOOL_INT_INT,
+                 ATTR_NOTHROWCALL_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_4,
+                 "__atomic_compare_exchange_local_4",
+                 BT_FN_BOOL_VPTR_PTR_I4_BOOL_INT_INT,
+                 ATTR_NOTHROWCALL_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_8,
+                 "__atomic_compare_exchange_local_8",
+                 BT_FN_BOOL_VPTR_PTR_I8_BOOL_INT_INT,
+                 ATTR_NOTHROWCALL_LEAF_LIST)
+DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_COMPARE_EXCHANGE_LOCAL_16,
+                 "__atomic_compare_exchange_local_16",
+                 BT_FN_BOOL_VPTR_PTR_I16_BOOL_INT_INT,
+                 ATTR_NOTHROWCALL_LEAF_LIST)
+
 DEF_SYNC_BUILTIN (BUILT_IN_ATOMIC_STORE,
                  "__atomic_store",
                  BT_FN_VOID_SIZE_VPTR_PTR_INT, ATTR_NOTHROWCALL_LEAF_LIST)
diff --git a/gcc/target.def b/gcc/target.def
index 5dd8f253ef6..fad306b0199 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2828,6 +2828,17 @@ returns true iff @code{TARGET_GEN_CCMP_FIRST} is defined.",
  bool, (void),
  default_have_ccmp)
 
+/* Return true if the target supports load instructions with exclusive access.  */
+DEFHOOK
+(have_load_with_exclusive_access,
+ "This target hook returns true if the target supports load instructions\n\
+with exclusive access hints that optimize how a cache block is transferred\n\
+between processor caches. Such hints are helpful, for example, to reduce the\n\
+number of times a cache block is transferred between processor caches when\n\
+there is significant lock contention.",
+ bool, (void),
+ hook_bool_void_false)
+
 /* Return a new value for loop unroll size.  */
 DEFHOOK
 (loop_unroll_adjust,
diff --git a/gcc/testsuite/gcc.target/powerpc/acmp-tst.c b/gcc/testsuite/gcc.target/powerpc/acmp-tst.c
new file mode 100644
index 00000000000..a4b5861216b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/acmp-tst.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
+
+#include <stdint.h>
+
+bool
+word_exchange (uint64_t *ptr, uint64_t *expected, uint64_t *desired)
+{
+  return __atomic_compare_exchange_local (ptr, expected, desired, 0, __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE);
+}
+
+/* { dg-final { scan-assembler {\mldarx +[0-9]+,[0-9]+,[0-9]+,1} } } */
-- 
2.47.3
