ABataev marked an inline comment as done.
ABataev added a comment.

In D82324#2112388 <https://reviews.llvm.org/D82324#2112388>, @jdoerfert wrote:

> Let me rephrase. Does the user need to request the fast path, or does the 
> user need to request the slow but correct path? Only the former is 
> acceptable IMHO.


By default, the universal but slower option is enabled. If the user is sure 
that there are no parallel target regions in their code, they can compile with 
`-fno-openmp-cuda-parallel-target-regions` to get better performance. I.e., 
`-fopenmp-cuda-parallel-target-regions` is enabled by default (slow, but 
reliable).



================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5250
+                       options::OPT_fno_openmp_cuda_parallel_target_regions,
+                       /*Default=*/true))
+        CmdArgs.push_back("-fopenmp-cuda-parallel-target-regions");
----------------
The slow but reliable option is enabled by default here.
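
To illustrate how a positive/negative flag pair with `/*Default=*/true` resolves, here is a minimal standalone sketch. It is not the LLVM `ArgList` implementation: `hasFlagSketch` and `parallelTargetRegionsEnabled` are hypothetical names introduced only for illustration, and the sketch assumes the usual driver convention that the last occurrence of either spelling on the command line wins.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not the LLVM API). Returns true if the positive
// spelling is the last of the two seen, false if the negative spelling is
// the last one seen, and Default if neither appears.
bool hasFlagSketch(const std::vector<std::string> &Args,
                   const std::string &Pos, const std::string &Neg,
                   bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;   // positive spelling seen; overrides earlier ones
    else if (A == Neg)
      Result = false;  // negative spelling seen; overrides earlier ones
  }
  return Result;
}

// Hypothetical wrapper for the option pair discussed in this review.
// With no flag given, the slow but reliable mode stays enabled.
bool parallelTargetRegionsEnabled(const std::vector<std::string> &Args) {
  return hasFlagSketch(Args, "-fopenmp-cuda-parallel-target-regions",
                       "-fno-openmp-cuda-parallel-target-regions",
                       /*Default=*/true);
}
```

Under these assumptions, an empty argument list yields `true`, so the user only has to act to request the fast path, by passing `-fno-openmp-cuda-parallel-target-regions`.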


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82324/new/

https://reviews.llvm.org/D82324


