ABataev marked an inline comment as done.
ABataev added a comment.

In D82324#2112388 <https://reviews.llvm.org/D82324#2112388>, @jdoerfert wrote:

> Let me rephrase. Does the user need to request the fast path or does the user
> need to request the slow but correct path? Only the former is acceptable
> IMHO.

By default, the universal but slower option is enabled. If the user is sure that there are no parallel target regions in his code, he can compile with `-fno-openmp-cuda-parallel-target-regions` to get better performance. I.e. `-fopenmp-cuda-parallel-target-regions` is enabled by default (slow, but reliable).

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5250
+                    options::OPT_fno_openmp_cuda_parallel_target_regions,
+                    /*Default=*/true))
+    CmdArgs.push_back("-fopenmp-cuda-parallel-target-regions");
----------------
The slow but reliable option is enabled by default here.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82324/new/

https://reviews.llvm.org/D82324

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
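
For readers following the thread, here is a minimal standalone sketch of the default-on behaviour described in the reply. It does not use the LLVM option library. `resolveFlag` and `buildCC1Args` are hypothetical names made up for illustration, and the "last spelling on the command line wins" rule is an assumption modelled on how paired `-f`/`-fno-` driver flags usually behave. The lookup call in the quoted hunk is cut off, so this may not match the patch exactly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's paired-flag lookup (not the LLVM API).
// Returns true if `pos` is the last of the two spellings seen, false if `neg`
// is, and `defaultValue` if neither spelling appears.
bool resolveFlag(const std::vector<std::string> &args, const std::string &pos,
                 const std::string &neg, bool defaultValue) {
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      value = true;
    else if (arg == neg)
      value = false;
  }
  return value;
}

// Mirrors the shape of the reviewed hunk: the option is forwarded unless the
// user explicitly opts out with the -fno- spelling.
std::vector<std::string>
buildCC1Args(const std::vector<std::string> &driverArgs) {
  std::vector<std::string> cmdArgs;
  if (resolveFlag(driverArgs, "-fopenmp-cuda-parallel-target-regions",
                  "-fno-openmp-cuda-parallel-target-regions",
                  /*defaultValue=*/true))
    cmdArgs.push_back("-fopenmp-cuda-parallel-target-regions");
  return cmdArgs;
}
```

Under these assumptions, an empty command line yields the slow but reliable option, and only an explicit `-fno-openmp-cuda-parallel-target-regions` drops it.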