saiislam updated this revision to Diff 258368.
saiislam marked an inline comment as done.
saiislam added a comment.

1. Moved the test case to clang/test/CodeGenCXX.
2. Added a failing test case with an invalid sync scope, which is detected by
the implementation of the fence instruction.
3. Updated the change description of the builtin.
4. Updated the clang documentation to describe the mapping of C++ memory orderings to
LLVM memory orderings.
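
For reviewers, the mapping from item 4 can be sketched as a small standalone C++ function. This is only an illustration, not the code in the patch: the names `LLVMOrdering` and `mapFenceOrdering` are hypothetical, the numeric values are copied from the table added to LanguageExtensions.rst, and treating every value outside that table as unsupported is an assumption of the sketch.

```cpp
#include <cassert>

// Numeric values of the LLVM atomic orderings, as listed in the
// documentation table in this revision (hypothetical enum name).
enum class LLVMOrdering : unsigned {
  Acquire = 4,
  Release = 5,
  AcquireRelease = 6,
  SequentiallyConsistent = 7
};

// Hypothetical helper: translate a C ABI ordering value (the value of an
// __ATOMIC_* macro) into the ordering used on the fence instruction.
// Returns false for values that are not in the documented table.
bool mapFenceOrdering(unsigned cABIOrdering, LLVMOrdering &out) {
  switch (cABIOrdering) {
  case 2: // __ATOMIC_ACQUIRE
    out = LLVMOrdering::Acquire;
    return true;
  case 3: // __ATOMIC_RELEASE
    out = LLVMOrdering::Release;
    return true;
  case 4: // __ATOMIC_ACQ_REL
    out = LLVMOrdering::AcquireRelease;
    return true;
  case 5: // __ATOMIC_SEQ_CST
    out = LLVMOrdering::SequentiallyConsistent;
    return true;
  default: // assumption: anything else is rejected in this sketch
    return false;
  }
}

// Small self-check showing how the helper is meant to be called.
void checkFenceOrderingMapping() {
  LLVMOrdering ordering;
  assert(mapFenceOrdering(5, ordering));
  assert(ordering == LLVMOrdering::SequentiallyConsistent);
}
```

The literal values 2 to 5 are used instead of the `__ATOMIC_*` macros so the sketch does not depend on compiler-predefined macros.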


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75917/new/

https://reviews.llvm.org/D75917

Files:
  clang/docs/LanguageExtensions.rst
  clang/test/CodeGenCXX/builtin-memory-fence-failure.cpp
  clang/test/CodeGenCXX/builtin-memory-fence.cpp
  clang/test/CodeGenHIP/builtin_memory_fence.cpp

Index: clang/test/CodeGenHIP/builtin_memory_fence.cpp
===================================================================
--- clang/test/CodeGenHIP/builtin_memory_fence.cpp
+++ /dev/null
@@ -1,25 +0,0 @@
-// REQUIRES: amdgpu-registered-target
-// RUN: %clang_cc1 %s -x hip -emit-llvm -O0 -o - \
-// RUN:   -triple=amdgcn-amd-amdhsa  | opt -instnamer -S | FileCheck %s
-
-void test_memory_fence_success() {
-// CHECK-LABEL: test_memory_fence_success
-
-  // CHECK: fence syncscope("workgroup") seq_cst
-  __builtin_memory_fence(__ATOMIC_SEQ_CST,  "workgroup");
-  
-   // CHECK: fence syncscope("agent") acquire
-  __builtin_memory_fence(__ATOMIC_ACQUIRE, "agent");
-
-  // CHECK: fence seq_cst
-  __builtin_memory_fence(__ATOMIC_SEQ_CST, "");
-
-  // CHECK: fence syncscope("agent") acq_rel
-  __builtin_memory_fence(4, "agent");
-
-    // CHECK: fence syncscope("workgroup") release
-  __builtin_memory_fence(3, "workgroup");
-
-  // CHECK: fence syncscope("foobar") release
-  __builtin_memory_fence(3, "foobar");
-}
\ No newline at end of file
Index: clang/test/CodeGenCXX/builtin-memory-fence-failure.cpp
===================================================================
--- /dev/null
+++ clang/test/CodeGenCXX/builtin-memory-fence-failure.cpp
@@ -0,0 +1,9 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: not %clang_cc1 %s -S \
+// RUN:   -triple=amdgcn-amd-amdhsa 2>&1 | FileCheck %s
+
+void test_memory_fence_failure() {
+
+  // CHECK: error: Unsupported atomic synchronization scope 
+  __builtin_memory_fence(__ATOMIC_SEQ_CST, "foobar");
+}
\ No newline at end of file
Index: clang/docs/LanguageExtensions.rst
===================================================================
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -2455,6 +2455,59 @@
 and ``__OPENCL_MEMORY_SCOPE_SUB_GROUP`` are provided, with values
 corresponding to the enumerators of OpenCL's ``memory_scope`` enumeration.)
 
+``__builtin_memory_fence``
+--------------------------
+
+``__builtin_memory_fence`` exposes the LLVM `fence instruction <https://llvm.org/docs/LangRef.html#fence-instruction>`_
+in Clang. It takes a C++11-compatible memory ordering and a target-specific
+synchronization scope as arguments, and generates a fence instruction in the IR.
+
+**Syntax**:
+
+.. code-block:: c++
+
+    __builtin_memory_fence(unsigned int memory_ordering, String sync_scope)
+
+**Example of use**:
+
+.. code-block:: c++
+
+  void my_fence(int i) {
+    i++;
+    __builtin_memory_fence(__ATOMIC_ACQUIRE,  "workgroup");
+    i--;
+    __builtin_memory_fence(__ATOMIC_SEQ_CST,  "agent");
+  }
+
+**Description**:
+
+The first argument of the ``__builtin_memory_fence()`` builtin is one of the
+memory-ordering specifiers ``__ATOMIC_ACQUIRE``, ``__ATOMIC_RELEASE``,
+``__ATOMIC_ACQ_REL``, or ``__ATOMIC_SEQ_CST``, following C++11 memory model
+semantics. The equivalent integer values of these specifiers can also be
+used. The builtin maps each C++ memory ordering to the corresponding LLVM
+atomic memory ordering for the fence instruction, using the LLVM atomic C
+ABI, as given in the table below. The second argument is a target-specific
+synchronization scope given as a string. The builtin passes the second
+argument through to the fence instruction unchanged and relies on the target
+implementation to check its validity.
+
++------------------------------+--------------------------------+
+| Input in clang               | Output in IR                   |
+| (C++11 Memory-ordering)      | (LLVM Atomic Memory-ordering)  |
++----------------------+-------+------------------------+-------+
+| Enum                 | Value | Enum                   | Value |
++======================+=======+========================+=======+
+| ``__ATOMIC_ACQUIRE`` | 2     | Acquire                | 4     |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_RELEASE`` | 3     | Release                | 5     |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_ACQ_REL`` | 4     | AcquireRelease         | 6     |
++----------------------+-------+------------------------+-------+
+| ``__ATOMIC_SEQ_CST`` | 5     | SequentiallyConsistent | 7     |
++----------------------+-------+------------------------+-------+
+
+
 Low-level ARM exclusive memory builtins
 ---------------------------------------
 
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits