[PATCH] D59319: [OpenMP][Offloading][1/3] A generic and simple target region interface

Johannes Doerfert via Phabricator via cfe-commits Thu, 14 Mar 2019 15:04:33 -0700

jdoerfert marked 3 inline comments as done.
jdoerfert added inline comments.



================
Comment at: openmp/libomptarget/deviceRTLs/common/target_region.h:100
+///
+EXTERN int8_t __kmpc_target_region_kernel_init(bool UseSPMDMode,
+                                               bool RequiresOMPRuntime,
----------------
ABataev wrote:
> jdoerfert wrote:
> > ABataev wrote:
> > > jdoerfert wrote:
> > > > ABataev wrote:
> > > > > jdoerfert wrote:
> > > > > > ABataev wrote:
> > > > > > > Better to use `ident_loc` for passing info about execution mode 
> > > > > > > and full/lightweight runtime.
> > > > > > Could you please explain why you think that? Adding indirection 
> > > > > > through a structure does not really seem beneficial to me.
> > > > > Almost all functions from libomp rely on `ident_loc`. The functions 
> > > > > that were added for NVPTX without this parameter caused a lot of 
> > > > > problems later, and most of them were replaced with functions that 
> > > > > take this parameter. Plus, this parameter is used for OMPD/OMPT and 
> > > > > it may be important for future OMPD/OMPT support.
> > > > > > Almost all functions from libomp rely on ident_loc.
> > > > 
> > > > If you look at the implementation of this interface for NVPTX you will 
> > > > see that the called functions do not take `ident_loc` values. When you 
> > > > create the calls from the existing NVPTX code generation in clang, the 
> > > > current code **does not use** `ident_loc` for similar functions, see:
> > > > `__kmpc_kernel_init(kmp_int32 thread_limit, int16_t 
> > > > RequiresOMPRuntime)`,
> > > > `__kmpc_kernel_deinit(int16_t IsOMPRuntimeInitialized)`,
> > > > `__kmpc_spmd_kernel_init(kmp_int32 thread_limit, int16_t 
> > > > RequiresOMPRuntime, int16_t RequiresDataSharing)`,
> > > > `__kmpc_kernel_parallel(void **outlined_function, int16_t 
> > > > IsOMPRuntimeInitialized)`,
> > > > ...
> > > > 
> > > > 
> > > > 
> > > > > Plus, this parameter is used for OMPD/OMPT and it may be important 
> > > > > for future OMPD/OMPT support.
> > > > 
> > > > If we at some point need to make the options permanent in an 
> > > > `ident_loc` we can simply pass an `ident_loc` and require it to be 
> > > > initialized by the call. Cluttering the user code with stores and 
> > > > indirection is exactly what I do want to avoid.
> > > 1. The new functions rely on `ident_loc`. We had to add those new 
> > > functions because the old ones did not use it, which was a bad design 
> > > decision. Now we need to fix this. I suggest you do everything right from 
> > > the very beginning rather than fixing this later by adding extra entry 
> > > points to support OMPT/OMPD or something else, for example.
> > > 2. No, you cannot simply change the interface of the library, because it 
> > > must stay compatible with previous versions of the compiler/library. You 
> > > will need to add new entry points.
> > Let's start this one again because I still haven't understood. Why do we 
> > need to populate the `ident_loc` again? What information has to be in there 
> > at which point? I want this to be clear because a lot of other "design 
> > decisions" of the existing code base are in my opinion not necessary and 
> > consequently missing here. That includes, for example, various global 
> > variables. If we have a description of the problem you try to solve with 
> > the `ident_loc` we might be able to find a way that cuts down on state.
> > 
> > 
> > Regarding the "compatibility", this is not a stable interface people can 
> > rely on. Whatever is committed in this first patch __is not__ set in stone. 
> > Also, we can _always_ add a `__kmpc_init_ident_loc(....)` function after 
> > the fact.
> `ident_loc` holds the data about the current source code location, the 
> execution mode, and whether the full runtime is required. It is also used in 
> OMPT/OMPD support.
> Regarding "compatibility": libraries must be the most stable part of the 
> compiler, because the user might need to link an old object file/library with 
> the new one. Because of this, new versions of the libraries must be 
> compatible with old ones, and you need to maintain the deprecated parts to 
> keep compatibility with previous versions. All these libs already contain a 
> lot of old code that exists only because of the initial poor design, and we 
> need to maintain it. I would like to avoid this situation with this patch.
> `ident_loc` holds the data about the current source code location, the 
> execution mode, and whether the full runtime is required. It is also used in 
> OMPT/OMPD support.

We can store that information through a `__kmpc_init_ident_loc(....)` call once 
needed.


> Regarding "compatibility": libraries must be the most stable part of the 
> compiler, because the user might need to link an old object file/library with 
> the new one. Because of this, new versions of the libraries must be 
> compatible with old ones, and you need to maintain the deprecated parts to 
> keep compatibility with previous versions. All these libs already contain a 
> lot of old code that exists only because of the initial poor design, and we 
> need to maintain it. I would like to avoid this situation with this patch.

The way I understand you now is that you want a way to extend the interface in 
the future, and adding a changeable `ident_loc` pointer is your proposed way to 
do so. Do I understand your reasoning for `ident_loc` correctly, or is there 
something else in addition to this?
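
To make the two interface shapes under discussion concrete, here is a rough host-side sketch. It is not code from the patch or from libomp: `ident_sketch`, `init_with_flags`, `init_with_ident`, `init_ident_loc_sketch`, and the flag encoding are all hypothetical stand-ins, and the real `ident_loc` layout differs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's location struct (assumption:
// the real struct has a different layout and more fields).
struct ident_sketch {
  int32_t flags = 0;        // bit 0: SPMD mode, bit 1: full runtime required
  const char *psource = ""; // source location string
};

constexpr int32_t kSPMD = 1 << 0;
constexpr int32_t kFullRuntime = 1 << 1;

// Shape A: the options are plain arguments, as in the reviewed signature.
int32_t init_with_flags(bool UseSPMDMode, bool RequiresOMPRuntime) {
  return (UseSPMDMode ? kSPMD : 0) | (RequiresOMPRuntime ? kFullRuntime : 0);
}

// Shape B: the options travel inside a struct that the caller must
// populate (a store plus an indirection at every call site).
int32_t init_with_ident(const ident_sketch *Loc) {
  return Loc->flags & (kSPMD | kFullRuntime);
}

// A separate entry point that fills the struct on demand, in the spirit
// of the `__kmpc_init_ident_loc(....)` idea mentioned above.
void init_ident_loc_sketch(ident_sketch *Loc, bool UseSPMDMode,
                           bool RequiresOMPRuntime, const char *Source) {
  Loc->flags = init_with_flags(UseSPMDMode, RequiresOMPRuntime);
  Loc->psource = Source;
}
```

Both shapes carry the same two bits of information; the open question in this thread is only whether the struct has to be part of the entry point from day one or can be introduced by an additional call later.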


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/omp_data.cu:70
+////////////////////////////////////////////////////////////////////////////////
+__device__ __shared__ target_region_shared_buffer _target_region_shared_memory;
+
----------------
ABataev wrote:
> jdoerfert wrote:
> > ABataev wrote:
> > > jdoerfert wrote:
> > > > ABataev wrote:
> > > > > It would be good to store it in global memory rather than in 
> > > > > shared memory, to save shared memory. Also, we are already using 
> > > > > several shared memory buffers for different purposes; it would be 
> > > > > good to merge them somehow to reduce pressure on shared memory.
> > > > I would have reused your buffer, but it is, for reasons unclear to me, 
> > > > not a byte-wise buffer but an array of `void *`, and it is used as such. 
> > > > Using it as a byte-wise buffer might cause problems or at least 
> > > > confusion. Changing it to a byte-wise buffer would be fine with me. I 
> > > > don't need a separate buffer, just one with the functionality 
> > > > implemented in this one.
> > > I don't know what `my` buffer you are talking about. I'm just saying that 
> > > we are already using a lot of shared memory, and adding another shared 
> > > memory buffer of ~150 bytes per team increases pressure on the shared 
> > > memory. It would be good to reuse the existing buffers somehow. It was 
> > > just a suggestion.
> > > > I don't know what my buffer you are talking about.
> > 
> > Sorry, my bad. The one you see in the (last part of the) implementation 
> > below in the beginning of the shown lines of 
> > `openmp/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h`. It is called 
> > `omptarget_nvptx_SharedArgs` and it does (a subset of) what this new buffer 
> > does, providing space for shared variables in parallel regions.
> > 
> > > > I'm just saying that we are already using a lot of shared memory, and 
> > > > adding another shared memory buffer of ~150 bytes per team increases 
> > > > pressure on the shared memory. It would be good to reuse the existing 
> > > > buffers somehow. It was just a suggestion.
> > 
> > I understand and I agree. My comment explained why I didn't do that in the 
> > first place, hoping that you see the problem and agree we should rewrite 
> > the users of `omptarget_nvptx_SharedArgs` to use 
> > `target_region_shared_buffer`[1], thereby reducing the required shared 
> > memory.
> > 
> > [1] The name is subject to change! I don't care much.
> > 
> This is not `my` buffer. Unfortunately, I have not worked on this library 
> from the very beginning. There are some other buffers, generated by the 
> compiler, for example, and we can try to reuse them.
> > This is not my buffer.

My "you" was not directed at you but a general one. The wording was bad, my 
apologies.


> There are some other buffers, generated by the compiler, for example, and we 
> can try to reuse them.

I'm not 100% sure which buffers you refer to here, but I think those are the 
ones the new code generation does not emit anymore.

I'm all for merging/replacing multiple buffers implemented in the device RTL. I 
didn't do it because it either breaks compatibility or forces me to inherit 
design choices I dislike (the `void**` buffer). From my perspective, we could 
get rid of the existing `omptarget_nvptx_SharedArgs` space by letting it use the 
`target_region_shared_buffer` internally. That solves the problem for now, and 
once `omptarget_nvptx_SharedArgs` isn't directly needed anymore it can be 
removed.
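
As an illustration of that layering, here is a host-side sketch in plain C++ rather than device code. The types `byte_buffer_sketch` and `shared_args_sketch`, the 128-byte capacity, and the pointer-size rounding are assumptions made for the example and do not mirror the actual `target_region_shared_buffer` or `omptarget_nvptx_SharedArgs` implementations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical byte-wise buffer; on the device this storage would live in
// shared memory. The capacity is illustrative only.
struct byte_buffer_sketch {
  static constexpr std::size_t Capacity = 128;
  alignas(alignof(void *)) unsigned char Data[Capacity];
  std::size_t Used = 0;

  // Reserve NumBytes, rounded up to pointer alignment.
  // Returns nullptr if the request does not fit.
  void *reserve(std::size_t NumBytes) {
    const std::size_t Align = alignof(void *);
    const std::size_t Rounded = (NumBytes + Align - 1) & ~(Align - 1);
    if (Rounded > Capacity - Used)
      return nullptr;
    void *P = Data + Used;
    Used += Rounded;
    return P;
  }

  // Drop all reservations, e.g. at the end of a parallel region.
  void release() { Used = 0; }
};

// Pointer-array view layered on the byte buffer, so that only one
// buffer occupies the underlying storage.
struct shared_args_sketch {
  byte_buffer_sketch *Storage;
  void **Args = nullptr;
  std::size_t NumArgs = 0;

  explicit shared_args_sketch(byte_buffer_sketch *S) : Storage(S) {}

  // Make room for N argument pointers inside the byte buffer.
  bool ensure(std::size_t N) {
    void *P = Storage->reserve(N * sizeof(void *));
    if (!P)
      return false;
    Args = static_cast<void **>(P);
    NumArgs = N;
    return true;
  }
};
```

With this shape, existing users keep the pointer-array interface while the byte buffer is the only thing that actually owns storage, so the old view can be dropped later without touching the buffer itself.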


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59319/new/

https://reviews.llvm.org/D59319



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
