Hi all,

In this PR, shrink-wrapping ends up duplicating the load-exclusive part of a
load-exclusive/store-exclusive loop used to implement an atomic compare-exchange
operation; see Bugzilla for the kind of sequences generated.
The load-exclusive paths are duplicated, but not the store-exclusive ones, with
some very counter-intuitive control flow between them.

The suggested solution is to catch the load-exclusive instructions in
TARGET_CANNOT_COPY_INSN_P so that the RTL optimisers don't do any funny tricks
with them. Since these instructions only really appear when implementing atomic
operations, it's unlikely that the user will want particularly fancy
transformations involving duplication of exclusive-access sequences anyway
(though examples to the contrary would certainly be appreciated), and this fix
seems like the safest at this point. This is a regression from GCC 5, as
shrink-wrapping got more aggressive for GCC 6.

With this patch the sequence is no longer shrink-wrapped and matches the
sequence GCC 5 generates, as shown in the PR.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill


2016-03-03  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    PR rtl-optimization/69904
    * config/arm/arm.c (arm_cannot_copy_insn_p):
    Return true for load-exclusive instructions.

2016-03-03  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    PR rtl-optimization/69904
    * gcc.target/arm/pr69904.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 10559625cd74c99b786009143a8543deb196462e..3d14ccfb3b2bfdaa1e9c0f58556d6dc5898bf142 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13270,7 +13270,11 @@ tls_mentioned_p (rtx x)
     }
 }
 
-/* Must not copy any rtx that uses a pc-relative address.  */
+/* Must not copy any rtx that uses a pc-relative address.
+   Also, disallow copying of load-exclusive instructions that
+   may appear after splitting of compare-and-swap-style operations
+   so as to prevent those loops from being transformed away from their
+   canonical forms (see PR 69904).  */
 
 static bool
 arm_cannot_copy_insn_p (rtx_insn *insn)
@@ -13289,6 +13293,20 @@ arm_cannot_copy_insn_p (rtx_insn *insn)
 	      || XINT (x, 1) == UNSPEC_PIC_UNIFIED))
 	return true;
     }
+
+  rtx set = single_set (insn);
+  if (set)
+    {
+      rtx src = SET_SRC (set);
+      if (GET_CODE (src) == ZERO_EXTEND)
+	src = XEXP (src, 0);
+
+      /* Catch the load-exclusive and load-acquire operations.  */
+      if (GET_CODE (src) == UNSPEC_VOLATILE
+	  && (XINT (src, 1) == VUNSPEC_LL
+	      || XINT (src, 1) == VUNSPEC_LAX))
+	return true;
+    }
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.target/arm/pr69904.c b/gcc/testsuite/gcc.target/arm/pr69904.c
new file mode 100644
index 0000000000000000000000000000000000000000..24fe844d58eba445ddbe3877e227d0086310192a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr69904.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -marm" } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-add-options arm_arch_v7a } */
+
+/* Make sure that RTL optimizers don't do any unexpected transformations
+   on the compare_exchange loop.  */
+
+#include <stdatomic.h>
+
+atomic_uint foo;
+atomic_uint bar;
+int glob;
+
+int
+main (void)
+{
+  glob = atomic_compare_exchange_strong (&foo, &bar, 0);
+  return glob;
+}
+
+/* { dg-final { scan-assembler-times "dmb\tish" 2 } } */
+/* { dg-final { scan-assembler-times "ldrex\t" 1 } } */
+/* { dg-final { scan-assembler-times "strex\t" 1 } } */
