jdoerfert marked an inline comment as done.
jdoerfert added inline comments.
================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.h:73
+/// Note: Only the team master is allowed to call non-const functions!
+struct shared_bytes_buffer {
+
----------------
ABataev wrote:
> jdoerfert wrote:
> > > What is this buffer used for? Transferring pointers to the shared
> > > variables to the parallel regions? If so, it must be handled by the
> > > compiler. There are several reasons to do this:
> > > 1) You're using malloc/free functions for large buffers. The size of
> > > this buffer is known at compile time, so the compiler can generate a
> > > fixed-size buffer in global memory if required. We already have a
> > > similar implementation for target regions, globalized variables, etc.
> > > You can take a look and adapt it for your purpose.
> > > 2) Malloc/free are not very fast on the GPU, so preallocated buffers
> > > would give additional performance.
> > > 3) Another problem with malloc/free is that they use preallocated
> > > memory whose size is limited to 8 MB (if I recall correctly). This
> > > memory is required for the correct support of local variable
> > > globalization, and we already ran into situations where malloc could
> > > not allocate enough memory for it with some previous implementations.
> > > 4) You can reuse the shared memory buffers already generated by the
> > > compiler and save shared memory.
> >
> > [Quote by ABataev copied from
> > https://reviews.llvm.org/D59319?id=190767#inline-525900 after the patch was
> > split.]
> >
> >
> > This buffer is supposed to be used to communicate variables in shared and
> > firstprivate clauses between threads in a team. In this patch it is simply
> > used to implement the old `void**` buffer. How, when, and whether we use
> > it is part of the interface implementation. For now, this buffer simply
> > serves the users of the `omptarget_nvptx_globalArgs` global.
> >
> > If you want to provide compiler-allocated memory to avoid using the
> > buffer, no problem: the `__kmpc_target_region_kernel_parallel` function
> > allows you to do so; see the `SharedMemPointers` flag. I wouldn't want to
> > put the logic to generate these buffers in the front-end though.
> Why?
Why what?
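
For readers following the thread, the trade-off in the quoted points 1) to 3) can be sketched in plain host C++. This is only an illustration under stated assumptions: `SketchBytesBuffer`, `acquire`, `release`, and `usesFallback` are hypothetical names, and this is not the `shared_bytes_buffer` from the patch nor what the compiler generates. It shows a buffer that serves requests from fixed-size, preallocated storage and falls back to malloc only when a request exceeds that size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical illustration only; not the shared_bytes_buffer from the patch.
template <std::size_t StaticBytes>
struct SketchBytesBuffer {
  // Storage whose size is fixed at compile time (no allocation needed).
  alignas(std::max_align_t) unsigned char Static[StaticBytes];
  // Set only when a request does not fit into the static storage.
  void *Dynamic = nullptr;

  // Return storage for NumBytes, preferring the preallocated array.
  void *acquire(std::size_t NumBytes) {
    assert(NumBytes > 0 && "empty request");
    release();
    if (NumBytes <= StaticBytes)
      return Static;
    Dynamic = std::malloc(NumBytes); // slow path; returns nullptr on failure
    return Dynamic;
  }

  // Free the fallback allocation, if there is one.
  void release() {
    std::free(Dynamic); // free(nullptr) is a no-op
    Dynamic = nullptr;
  }

  bool usesFallback() const { return Dynamic != nullptr; }
};
```

With a capacity such as `SketchBytesBuffer<64>`, a 16-byte request is served from the static array, while a 1024-byte request takes the malloc path and has to be released again.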
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D59424/new/
https://reviews.llvm.org/D59424
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits