JonChesterfield added a comment. Couple of minor suggestions inline, but overall this looks pretty good. Hopefully device-only compilation works already, the rest could be left for after this patch.
================ Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:100 + -fopenmp -Xclang -fopenmp-is-device + -D__CUDACC__ -I${devicertl_base_directory} ---------------- jdoerfert wrote: > JonChesterfield wrote: > > tianshilei1992 wrote: > > > JonChesterfield wrote: > > > > This is suspect - why does openmp want to claim to be cuda? > > > To maintain minimal change. There is an include wrapped into a macro in > > > `interface.h`. For AMD GPU, it includes one header in AMD implementation, > > > and for CUDA device, it includes a header in NVPTX implementation. > > Ah, that's probably my fault. May as well leave it for now. > > > > I think we should expose a macro for openmp that indicates whether we're > > doing offloading to nvptx, or offloading to amdgpu, or just compiling for > > the host. Or, I think equivalently, replace some `#if` with variant. > Please don't use defines if we have `begin/end declare variant` for it. Variant sounds good. Should also be able to use `#ifdef __CUDA_ARCH__`, as amdgpu shouldn't be setting that ================ Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:16 +# TODO: This part can also be removed if we can change the clang driver to make +# it support device only compilation. +if(CMAKE_HOST_SYSTEM_PROCESSOR MATCHES "x86_64") ---------------- From @jdoerfert, ```-Xclang -fopenmp-is-device -Xclang -emit-llvm-bc``` should do device-only ================ Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:115 - # Generate a Bitcode library for all the compute capabilities the user requested. + # This correlation is from clang/lib/Driver/ToolChains/Cuda.cpp. + # The last element is the default case. ---------------- s/correlation/correspondence, or maybe mapping ================ Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:25 +#define NOINLINE __attribute__((noinline)) +#define ALIGN(N) __attribute__((aligned(N))) ---------------- Suggest we drop the DEVICE annotation and change ALIGN to alignas() or similar, but in a later patch. This one is already quite noisy. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D94745/new/ https://reviews.llvm.org/D94745 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits