jdoerfert added inline comments.
================ Comment at: openmp/libomptarget/deviceRTLs/common/target_region.h:27 + +/// The target region _kernel_ interface for GPUs +/// ---------------- ABataev wrote: > jdoerfert wrote: > > ABataev wrote: > > > All exported functions are declared in the `interface.h` file. I don't > > > think we need an extra interface file here > > `interface.h`, or to be more precise for people that do not know, > > `deviceRTLs/nvptx/src/interface.h`, is nvptx specific. This file, > > `deviceRTLs/common/target_region.h`, is by design target agnostic and not > > placed _under_ the nvptx subfolder. If you are willing to move > > `interface.h` into a common space and remove the nvptx specific functions > > we can merge the two. Otherwise, I have strong reservations agains that and > > good reason not to do it. > I see that currently it is written in Cuda. It means, it targets NVidia GPUs, > at least at the moment. I'm fine to put this header file into the common > directory, if you're sure that this is really target agnostic. But maybe just > for a start we should put it to NVPTX directory? Later, when you or somebody > else will add support for other GPUs and he/she will find out that these > functions are really target agnostic, they can be moved into the common > directory? > I see that currently it is written in Cuda. It means, it targets NVidia GPUs, > at least at the moment How do you see that? (I hope we both talk about this file, correct?) > But maybe just for a start we should put it to NVPTX directory? Why? What is the benefit? If we want it to be agnostic, regardless of the current state, it should be developed _outside_ of the target specific directories. ================ Comment at: openmp/libomptarget/deviceRTLs/common/target_region.h:100 +/// +EXTERN int8_t __kmpc_target_region_kernel_init(bool UseSPMDMode, + bool RequiresOMPRuntime, ---------------- ABataev wrote: > jdoerfert wrote: > > ABataev wrote: > > > Better to use `ident_loc` for passing info about execution mode and > > > full/lightweight runtime. > > Could you please explain why you think that? Adding indirection through a > > structure does not really seem beneficial to me. > Almost all function from libomp rely on `ident_loc`. The functions, which > were added for NVPTX without this parameter had a lot of problems later and > most of them were replaced with the functions with this parameter type. Plus, > this parameter is used for OMPD/OMPT and it may be important for future > OMPD/OMPT support. > Almost all function from libomp rely on ident_loc. If you look at the implementation of this interface for NVPTX you will see that the called functions do not take `ident_loc` values. When you create the calls from the existing NVPTX code generation in clang, the current code **does not use** `ident_loc` for similar functions, see: `___kmpc_kernel_init(kmp_int32 thread_limit, int16_t RequiresOMPRuntime)`, `__kmpc_kernel_deinit(int16_t IsOMPRuntimeInitialized)`, `__kmpc_spmd_kernel_init(kmp_int32 thread_limit, int16_t RequiresOMPRuntime, int16_t RequiresDataSharing)`, `__kmpc_kernel_parallel(void **outlined_function, int16_t IsOMPRuntimeInitialized)`, ... > Plus, this parameter is used for OMPD/OMPT and it may be important for future > OMPD/OMPT support. If we at some point need to make the options permanent in an `ident_loc` we can simply pass an `ident_loc` and require it to be initialized by the call. Cluttering the user code with stores and indirection is exactly what I do want to avoid. ================ Comment at: openmp/libomptarget/deviceRTLs/common/target_region.h:124 +/// unpacking code. +typedef void (*ParallelWorkFnTy)(char * /* SharedValues */, + char * /* PrivateValues */); ---------------- ABataev wrote: > We used `void *` for buffers usually, I think it is better to use `void *` > here too instead of `char *`. Thanks, fixed. ================ Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/omp_data.cu:70 +//////////////////////////////////////////////////////////////////////////////// +__device__ __shared__ target_region_shared_buffer _target_region_shared_memory; + ---------------- ABataev wrote: > It would be good to store it the global memory rather than in the shared to > save th shared memory. Also, we already are using several shared memory > buffers for different purposes, it would be good to merge them somehow to > reduce pressure on shared memory. I would have reused your buffer but it is for reasons unclear to me, not a byte-wise buffer but an array of `void *` and also used as such. Using it as a byte-wise buffer might cause problems or at least confusion. Changing it to a byte-wise buffer would be fine with me. I don't need a separate buffer but just one with the functionality implemented in this one. ================ Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_region.cu:64 + +/// Filter threads into masters and workers. If \p UseStateMachine is true, +/// required workers will enter a state machine through and be trapped there. ---------------- ABataev wrote: > What is the criteria for `UseStateMachine`? Under what conditions it can be > set to `true` and `false`? Also, what if have several parallel regions in > non-SPMD kernel and `UseStateMachine` is `true`? > What is the criteria for UseStateMachine? Under what conditions it can be set > to true and false? `UseStateMachine` is an option exposed to the outer world through the `__kmpc_target_region_kernel_init` call. The semantics are explained here and in the declaration of `__kmpc_target_region_kernel_init`. > Also, what if have several parallel regions in non-SPMD kernel and > UseStateMachine is true? I don't see the problem, I expect all kernels having threads in their own state machine and no interference between them. That is at least what should happen. Maybe I miss something. Do you see a problem? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59319/new/ https://reviews.llvm.org/D59319 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits