[PATCH] D153924: Allowing exception handling in OpenMP target regions when offloading to AMDGCN or NVPTX targets

Anton Rydahl via Phabricator via cfe-commits Tue, 27 Jun 2023 14:46:39 -0700

AntonRydahl created this revision.
AntonRydahl added reviewers: jdoerfert, jhuber6.
Herald added subscribers: sunshaoce, mattd, gchakrabarti, asavonic, guansong, 
tpr, yaxunl, rampitec.
Herald added a project: All.
AntonRydahl requested review of this revision.
Herald added subscribers: cfe-commits, jplehr, sstefan1.
Herald added a project: clang.


The motivation for this patch is that many code bases use exception handling. 
As GPUs are not expected to support exception handling in the near future, we 
can experiment with compiling the code for GPU targets anyway. This will 
allow us to run the code, as long as no exception is thrown.

The overall idea is very simple:

- If a throw expression is compiled to AMDGCN or NVPTX, it is replaced with a 
trap during code generation.
- If a try/catch statement is compiled to AMDGCN or AMDHSA, we ganerate code 
for the try statement as if it were a basic block.

While I have almost not modified any code, I have made tests that verify that 
we still do not allow compilation of exceptions, unless the user explicitly 
enables `-fcxx-exceptions` or `-fexceptions`.

Please let me know what you think of this patch and if the warnings could be 
improved.

Example
-------

With this patch, the compilation of the following example

  C++
  #include <iostream>
  #pragma omp declare target
  int gaussian_sum(int a,int b){
        if ((a + b) % 2 == 0) {throw -1;};
        return (a+b) * ((a+b)/2);
  }
  #pragma omp end declare target
  
  int main(void) {
        int gauss = 0;
        #pragma omp target map(from:gauss)
        {
                try {
                        gauss = gaussian_sum(1,100);
                }
                catch (int e){
                        gauss = e;
                }
        }
        std::cout << "GaussianSum(1,100)="<<gauss<<std::endl;
          #pragma omp target map(from:gauss)
          {
                  try {
                        gauss = gaussian_sum(1,101);
                  }
                  catch (int e){
                          gauss = e;
                  }
          }
        std::cout << "GaussianSum(1,101)="<<gauss<<std::endl;
        return (gauss > 1) ? 0 : 1;
  }

with offloading to `gfx906` results in

  bash
  ./bin/target_try_minimal_fail      
  GaussianSum(1,100)=5050
  AMDGPU fatal error 1: Received error in queue 0x155555506000: 
HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception.
  zsh: abort (core dumped)



Issues
------

The patch make 11 of the tests from `clang/test/OpenMP` fail. It seems to be 
related only with the values of `Opts.Exceptions` and `Opts.Exceptions`. I have 
tested that this change alone breaks the aforementioned tests. It would 
be nice if somebody with a better knowledge of Clang and NVPTX would help me 
understand to what degree I have done something wrong, and when, if in any 
cases, it would be allowed to modify the tests.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D153924

Files:
  clang/include/clang/Basic/DiagnosticCommonKinds.td
  clang/lib/CodeGen/CGException.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/OpenMP/target_throw_message.cpp
  clang/test/OpenMP/target_throw_message_fun_call.cpp
  clang/test/OpenMP/target_try_catch_message.cpp
  clang/test/OpenMP/target_try_catch_message_fun_call.cpp
  clang/test/OpenMP/target_try_catch_throw_message.cpp
  clang/test/OpenMP/target_try_catch_throw_message_no_exceptions.cpp

Index: clang/test/OpenMP/target_try_catch_throw_message_no_exceptions.cpp
===================================================================
--- /dev/null
+++ clang/test/OpenMP/target_try_catch_throw_message_no_exceptions.cpp
@@ -0,0 +1,22 @@
+// This test makes sure that exception handling is turned off
+// on the device and that the two expected errors appear.
+
+// RUN: %clang_cc1 -x c++ -verify -fopenmp \
+// RUN: -fopenmp-targets=nvptx64-nvidia-cuda %s
+
+// RUN: %clang_cc1 -x c++ -verify -fopenmp \
+// RUN: -fopenmp-targets=amdgcn-amd-amdhsa %s
+
+int main(void) {
+  int retval = -1;
+  #pragma omp target map(always,from:retval)
+  {
+    try { //expected-error{{with exceptions disabled}}
+      throw 404; //expected-error{{with exceptions disabled}}
+    }
+    catch (int e) {
+      retval = e;
+    }
+  }
+  return 0;
+}
\ No newline at end of file
Index: clang/test/OpenMP/target_try_catch_throw_message.cpp
===================================================================
--- /dev/null
+++ clang/test/OpenMP/target_try_catch_throw_message.cpp
@@ -0,0 +1,36 @@
+// We first test that we treat 'try' as a basic block on AMD and NVIDIA GPUs in TEST1 and TEST2
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx906 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST1
+// TEST1: {{Target 'amdgcn-amd-amdhsa' does not support exception handling. To allow code generation for 'amdgcn-amd-amdhsa', 'try' statements are treated as basic blocks}}
+// TEST1: {{Target 'amdgcn-amd-amdhsa' does not support exception handling. To allow code generation for 'amdgcn-amd-amdhsa', 'throw' expressions will be replaced by traps}}
+
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_70 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST2
+// TEST2: {{Target 'nvptx64-nvidia-cuda' does not support exception handling. To allow code generation for 'nvptx64-nvidia-cuda', 'try' statements are treated as basic blocks}}
+// TEST2: {{Target 'nvptx64-nvidia-cuda' does not support exception handling. To allow code generation for 'nvptx64-nvidia-cuda', 'throw' expressions will be replaced by traps}}
+
+// Then, we test that exception handling is still allowed on code offloaded to a CPU in TEST3
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=x86_64-pc-linux-gnu --offload-arch=x86_64-pc-linux-gnu \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST3
+// TEST3-NOT: {{Target 'x86_64-pc-linux-gnu' does not support exception handling. To allow code generation for 'x86_64-pc-linux-gnu', 'try' statements are treated as basic blocks}}
+// TEST3-NOT: {{Target 'x86_64-pc-linux-gnu' does not support exception handling. To allow code generation for 'x86_64-pc-linux-gnu', 'throw' expressions will be replaced by traps}}
+
+int main(void) {
+  int retval = -1;
+  #pragma omp target map(always,from:retval)
+  {
+    try { 
+      throw 404; 
+    }
+    catch (int e) {
+      retval = e;
+    }
+  }
+  return 0;
+}
\ No newline at end of file
Index: clang/test/OpenMP/target_try_catch_message_fun_call.cpp
===================================================================
--- /dev/null
+++ clang/test/OpenMP/target_try_catch_message_fun_call.cpp
@@ -0,0 +1,41 @@
+// We first test that we treat 'try' as a basic block on AMD and NVIDIA GPUs in TEST1 and TEST2
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx906 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST1
+// TEST1: {{Target 'amdgcn-amd-amdhsa' does not support exception handling. To allow code generation for 'amdgcn-amd-amdhsa', 'try' statements are treated as basic blocks}}
+
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_70 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST2
+// TEST2: {{Target 'nvptx64-nvidia-cuda' does not support exception handling. To allow code generation for 'nvptx64-nvidia-cuda', 'try' statements are treated as basic blocks}}
+
+// Then, we test that exception handling is still allowed on code offloaded to a CPU in TEST3
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=x86_64-pc-linux-gnu --offload-arch=x86_64-pc-linux-gnu \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST3
+// TEST3-NOT: {{Target 'x86_64-pc-linux-gnu' does not support exception handling. To allow code generation for 'x86_64-pc-linux-gnu', 'try' statements are treated as basic blocks}}
+
+#pragma omp declare target
+int foo(){
+ int retval = -1;
+  try {
+    retval = 0;
+  }
+  catch (int e) {
+    retval = e;
+  }
+  return retval;
+}
+#pragma omp end declare target
+
+int main(void) {
+  int retval = -1;
+  #pragma omp target map(always,from:retval)
+  {
+    retval = foo();
+  }
+  return 0;
+}
\ No newline at end of file
Index: clang/test/OpenMP/target_try_catch_message.cpp
===================================================================
--- /dev/null
+++ clang/test/OpenMP/target_try_catch_message.cpp
@@ -0,0 +1,33 @@
+// We first test that we treat 'try' as a basic block on AMD and NVIDIA GPUs in TEST1 and TEST2
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx906 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST1
+// TEST1: {{Target 'amdgcn-amd-amdhsa' does not support exception handling. To allow code generation for 'amdgcn-amd-amdhsa', 'try' statements are treated as basic blocks}}
+
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_70 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST2
+// TEST2: {{Target 'nvptx64-nvidia-cuda' does not support exception handling. To allow code generation for 'nvptx64-nvidia-cuda', 'try' statements are treated as basic blocks}}
+
+// Then, we test that exception handling is still allowed on code offloaded to a CPU in TEST3
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=x86_64-pc-linux-gnu --offload-arch=x86_64-pc-linux-gnu \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST3
+// TEST3-NOT: {{Target 'x86_64-pc-linux-gnu' does not support exception handling. To allow code generation for 'x86_64-pc-linux-gnu', 'try' statements are treated as basic blocks}}
+
+int main(void) {
+  int retval = -1;
+  #pragma omp target map(always,from:retval)
+  {
+    try {
+      retval = 0;
+    }
+    catch (int e) {
+      retval = e;
+    }
+  }
+  return 0;
+}
\ No newline at end of file
Index: clang/test/OpenMP/target_throw_message_fun_call.cpp
===================================================================
--- /dev/null
+++ clang/test/OpenMP/target_throw_message_fun_call.cpp
@@ -0,0 +1,32 @@
+// We first test that we treat 'throw' as a trap on AMD and NVIDIA GPUs in TEST1 and TEST2
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx906 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST1
+// TEST1: {{Target 'amdgcn-amd-amdhsa' does not support exception handling. To allow code generation for 'amdgcn-amd-amdhsa', 'throw' expressions will be replaced by traps}}
+
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_70 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST2
+// TEST2: {{Target 'nvptx64-nvidia-cuda' does not support exception handling. To allow code generation for 'nvptx64-nvidia-cuda', 'throw' expressions will be replaced by traps}}
+
+// Then, we test that exception handling is still allowed on code offloaded to a CPU in TEST3
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=x86_64-pc-linux-gnu --offload-arch=x86_64-pc-linux-gnu \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST3
+// TEST3-NOT: {{Target 'x86_64-pc-linux-gnu' does not support exception handling. To allow code generation for 'x86_64-pc-linux-gnu', 'throw' expressions will be replaced by traps}}
+
+
+#pragma omp declare target
+void foo() {throw 'e';}
+#pragma omp end declare target
+
+int main(void) {
+#pragma omp target
+  {
+    foo();
+  }
+  return 0;
+}
\ No newline at end of file
Index: clang/test/OpenMP/target_throw_message.cpp
===================================================================
--- /dev/null
+++ clang/test/OpenMP/target_throw_message.cpp
@@ -0,0 +1,27 @@
+// We first test that we treat 'throw' as a trap on AMD and NVIDIA GPUs in TEST1 and TEST2
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx906 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST1
+// TEST1: {{Target 'amdgcn-amd-amdhsa' does not support exception handling. To allow code generation for 'amdgcn-amd-amdhsa', 'throw' expressions will be replaced by traps}}
+
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_70 \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST2
+// TEST2: {{Target 'nvptx64-nvidia-cuda' does not support exception handling. To allow code generation for 'nvptx64-nvidia-cuda', 'throw' expressions will be replaced by traps}}
+
+// Then, we test that exception handling is still allowed on code offloaded to a CPU in TEST3
+// RUN: %clangxx -O0 -Wall -std=c++17 -fopenmp \
+// RUN: -fopenmp-targets=x86_64-pc-linux-gnu --offload-arch=x86_64-pc-linux-gnu \
+// RUN: -fopenmp-offload-mandatory -Xopenmp-target -fexceptions \
+// RUN: -Xopenmp-target -fcxx-exceptions -emit-llvm -S %s -o - 2>&1 | FileCheck %s -v --check-prefix=TEST3
+// TEST3-NOT: {{Target 'x86_64-pc-linux-gnu' does not support exception handling. To allow code generation for 'x86_64-pc-linux-gnu', 'throw' expressions will be replaced by traps}}
+
+int main(void) {
+#pragma omp target
+  {
+    throw 404;
+  }
+  return 0;
+}
\ No newline at end of file
Index: clang/lib/Frontend/CompilerInvocation.cpp
===================================================================
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -3826,12 +3826,15 @@
   }
 
   // Set the flag to prevent the implementation from emitting device exception
-  // handling code for those requiring so.
+  // handling code for those requiring so. However, if the user explicitly
+  // enabled exception handling on the device, we will allow exceptions during
+  // Sema and handle the exceptions differently in CodeGen.
   if ((Opts.OpenMPIsDevice && (T.isNVPTX() || T.isAMDGCN())) ||
       Opts.OpenCLCPlusPlus) {
-
-    Opts.Exceptions = 0;
-    Opts.CXXExceptions = 0;
+    bool exceptions_user_enabled = Args.hasFlag(
+        options::OPT_fexceptions, options::OPT_fno_exceptions, false);
+    Opts.Exceptions = exceptions_user_enabled;
+    Opts.CXXExceptions = exceptions_user_enabled;
   }
   if (Opts.OpenMPIsDevice && T.isNVPTX()) {
     Opts.OpenMPCUDANumSMs =
Index: clang/lib/CodeGen/CGException.cpp
===================================================================
--- clang/lib/CodeGen/CGException.cpp
+++ clang/lib/CodeGen/CGException.cpp
@@ -9,7 +9,6 @@
 // This contains code dealing with C++ exception related code generation.
 //
 //===----------------------------------------------------------------------===//
-
 #include "CGCXXABI.h"
 #include "CGCleanup.h"
 #include "CGObjCRuntime.h"
@@ -440,7 +439,16 @@
 
 void CodeGenFunction::EmitCXXThrowExpr(const CXXThrowExpr *E,
                                        bool KeepInsertionPoint) {
-  if (const Expr *SubExpr = E->getSubExpr()) {
+  // If the exception is being emitted in an OpenMP target region,
+  // and the target is a GPU, we do not support exception handling.
+  // Therefore, we emit a trap which will abort the program, and
+  // prompt a warning indicating that a trap will be emitted.
+  const llvm::Triple &T = Target.getTriple();
+  if (CGM.getLangOpts().OpenMPIsDevice && (T.isNVPTX() || T.isAMDGCN())) {
+    CGM.getDiags().Report(E->getExprLoc(), diag::warn_throw_not_valid_on_target)
+        << T.str();
+    EmitTrapCall(llvm::Intrinsic::trap);
+  } else if (const Expr *SubExpr = E->getSubExpr()) {
     QualType ThrowType = SubExpr->getType();
     if (ThrowType->isObjCObjectPointerType()) {
       const Stmt *ThrowStmt = E->getSubExpr();
@@ -609,9 +617,20 @@
 }
 
 void CodeGenFunction::EmitCXXTryStmt(const CXXTryStmt &S) {
-  EnterCXXTryStmt(S);
+  const llvm::Triple &T = Target.getTriple();
+  // If we encounter a try statement on in an OpenMP target region offloaded to
+  // a GPU, we treat it as a basic block.
+  const bool is_omp_gpu_target =
+      (CGM.getLangOpts().OpenMPIsDevice && (T.isNVPTX() || T.isAMDGCN()));
+  if (is_omp_gpu_target) {
+    CGM.getDiags().Report(S.getTryLoc(), diag::warn_try_not_valid_on_target)
+        << T.str();
+  }
+  if (!is_omp_gpu_target)
+    EnterCXXTryStmt(S);
   EmitStmt(S.getTryBlock());
-  ExitCXXTryStmt(S);
+  if (!is_omp_gpu_target)
+    ExitCXXTryStmt(S);
 }
 
 void CodeGenFunction::EnterCXXTryStmt(const CXXTryStmt &S, bool IsFnTryBlock) {
Index: clang/include/clang/Basic/DiagnosticCommonKinds.td
===================================================================
--- clang/include/clang/Basic/DiagnosticCommonKinds.td
+++ clang/include/clang/Basic/DiagnosticCommonKinds.td
@@ -420,4 +420,14 @@
   "options %0 and %1 are set to different values">;
 def err_opencl_feature_requires : Error<
   "feature %0 requires support of %1 feature">;
+
+def warn_throw_not_valid_on_target : Warning<
+  "Target '%0' does not support exception handling."
+  " To allow code generation for '%0', 'throw' expressions will be replaced by traps.">;
+def warn_try_not_valid_on_target : Warning<
+  "Target '%0' does not support exception handling."
+  " To allow code generation for '%0', 'try' statements are treated as basic blocks.">;
+def warn_catch_not_valid_on_target : Warning<
+  "Target '%0' does not support exception handling."
+  " To allow code generation for '%0', 'catch' statement will be replaced by a no operation instruction.">;
 }

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153924: Allowing exception handling in OpenMP target regions when offloading to AMDGCN or NVPTX targets

Reply via email to