On 22/05/15 09:28, Matthew Wahab wrote:
[Added PR number and updated patches]

This patch changes the code generated for __sync_type_compare_and_swap to

    ldxr reg; cmp; bne label; stlxr; cbnz; label: dmb ish; mov .., reg

This removes the acquire-barrier from the load and ends the operation with a
fence to prevent memory references appearing after the __sync operation from
being moved ahead of the store-release.

This also strengthens the acquire barrier generated for __sync_lock_test_and_set
(which, like compare-and-swap, is implemented as a form of atomic exchange):

    ldaxr; stxr; cbnz
becomes
    ldxr; stxr; cbnz; dmb ish


Updated patch:
- Used 'barrier' rather than 'fence' in comments.
- Simplified the code for the initial load.

Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

2015-06-01  Matthew Wahab  <matthew.wa...@arm.com>

        PR target/65697
        * config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Check
        for __sync memory models, emit initial loads and final barriers as
        appropriate.

From e1f68896db3b367d43a7ae863339dbe100244360 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wa...@arm.com>
Date: Fri, 15 May 2015 09:31:06 +0100
Subject: [PATCH 2/3] [AArch64] Strengthen barriers for sync-compare-swap
 builtins.

Change-Id: I335771f2f42ea951d227f20f6cb9daa07330614d
---
 gcc/config/aarch64/aarch64.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b083e12..2db8f4f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9436,14 +9436,18 @@ aarch64_split_compare_and_swap (rtx operands[])
   bool is_weak;
   rtx_code_label *label1, *label2;
   rtx x, cond;
+  enum memmodel model;
+  rtx model_rtx;
 
   rval = operands[0];
   mem = operands[1];
   oldval = operands[2];
   newval = operands[3];
   is_weak = (operands[4] != const0_rtx);
+  model_rtx = operands[5];
   scratch = operands[7];
   mode = GET_MODE (mem);
+  model = memmodel_from_int (INTVAL (model_rtx));
 
   label1 = NULL;
   if (!is_weak)
@@ -9453,7 +9457,13 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
   label2 = gen_label_rtx ();
 
-  aarch64_emit_load_exclusive (mode, rval, mem, operands[5]);
+  /* The initial load can be relaxed for a __sync operation since a final
+     barrier will be emitted to stop code hoisting.  */
+  if (is_mm_sync (model))
+    aarch64_emit_load_exclusive (mode, rval, mem,
+				 GEN_INT (MEMMODEL_RELAXED));
+  else
+    aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
   cond = aarch64_gen_compare_reg (NE, rval, oldval);
   x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
@@ -9461,7 +9471,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
-  aarch64_emit_store_exclusive (mode, scratch, mem, newval, operands[5]);
+  aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
   if (!is_weak)
     {
@@ -9478,6 +9488,10 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
 
   emit_label (label2);
+
+  /* Emit any final barrier needed for a __sync operation.  */
+  if (is_mm_sync (model))
+    aarch64_emit_post_barrier (model);
 }
 
 /* Split an atomic operation.  */
-- 
1.9.1

Reply via email to