ABataev added inline comments.

================
Comment at: openmp/libomptarget/src/omptarget.cpp:233
         MapperComponents
-            .Components[target_data_function == targetDataEnd ? I : E - I - 1];
+            .Components[target_data_function == targetDataEnd ? E - I - 1 : I];
     MapperArgsBase[I] = C.Base;
----------------
ye-luo wrote:
> ABataev wrote:
> > ye-luo wrote:
> > > ABataev wrote:
> > > > ye-luo wrote:
> > > > > ye-luo wrote:
> > > > > > ye-luo wrote:
> > > > > > > ABataev wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > ye-luo wrote:
> > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > > > > grokos wrote:
> > > > > > > > > > > > > > > > > > > > > > What is the current status of the 
> > > > > > > > > > > > > > > > > > > > > > order of the arguments clang emits? 
> > > > > > > > > > > > > > > > > > > > > > Is it still necessary to traverse 
> > > > > > > > > > > > > > > > > > > > > > arguments in reverse order here?
> > > > > > > > > > > > > > > > > > > > > Yes, still required
> > > > > > > > > > > > > > > > > > > > Based on the conversation in
> > > > > > > > > > > > > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > > > > > > > > > > > > This line of code neither before nor 
> > > > > > > > > > > > > > > > > > > > after the change plays well.
> > > > > > > > > > > > > > > > > > > > Shall we fix the order in targetDataEnd 
> > > > > > > > > > > > > > > > > > > > first?
> > > > > > > > > > > > > > > > > > > This change is part of this patch and 
> > > > > > > > > > > > > > > > > > > cannot be committed separately. 
> > > > > > > > > > > > > > > > > > I mean could you fix that issue as a parent 
> > > > > > > > > > > > > > > > > > of this patch?
> > > > > > > > > > > > > > > > > > This change is part of this patch and 
> > > > > > > > > > > > > > > > > > cannot be committed separately. 
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > If fixing the reordering is part of this 
> > > > > > > > > > > > > > > > > patch, I should have seen 
> > > > > > > > > > > > > > > > > "target_data_function == targetDataEnd ?" 
> > > > > > > > > > > > > > > > > branches disappear.
> > > > > > > > > > > > > > > > Nope, just with this patch. It reorders the 
> > > > > > > > > > > > > > > > maps and need to change the cleanup order too. 
> > > > > > > > > > > > > > > It works just like constructors/destructors: 
> > > > > > > > > > > > > > > allocate in direct order, deallocate in reversed 
> > > > > > > > > > > > > > > to correctly handle map order. 
> > > > > > > > > > > > > > The description says that "present and alloc 
> > > > > > > > > > > > > > mappings are processed first and then all others."
> > > > > > > > > > > > > > Why the order of arguments in targetDataBegin, 
> > > > > > > > > > > > > > targetDataEnd and targetDataUpdate all get reversed.
> > > > > > > > > > > > > Because this is for mappers. Mapper maps are ordered 
> > > > > > > > > > > > > by the compiler in the direct order (alloc, maps, 
> > > > > > > > > > > > > delete) but when we need to do exit, we need to 
> > > > > > > > > > > > > release the data in reversed order (deletes, maps, 
> > > > > > > > > > > > > allocs). 
> > > > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > > > > targetDataBegin. My question was about "reversing" from 
> > > > > > > > > > > > the the old code. Your change put the opposite order 
> > > > > > > > > > > > for targetDataBegin, targetDataEnd and targetDataUpdate 
> > > > > > > > > > > > cases. 
> > > > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > > > > targetDataBegin. My question was about "reversing" from 
> > > > > > > > > > > > the the old code. Your change put the opposite order 
> > > > > > > > > > > > for targetDataBegin, targetDataEnd and targetDataUpdate 
> > > > > > > > > > > > cases. 
> > > > > > > > > > > 
> > > > > > > > > > > typo correction
> > > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > > > **targetDataEnd**. My question was about "reversing" from 
> > > > > > > > > > > the the old code. Your change put the opposite order for 
> > > > > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate cases.
> > > > > > > > > > My separate question specifically for targetDataEnd is the 
> > > > > > > > > > following.
> > > > > > > > > > 
> > > > > > > > > > In target(), we call
> > > > > > > > > > ```
> > > > > > > > > > targetDataBegin(args)
> > > > > > > > > > {  // forward order
> > > > > > > > > >   for (int32_t i = 0; i < arg_num; ++i) { ... }
> > > > > > > > > > }
> > > > > > > > > > launch_kernels
> > > > > > > > > > targetDataEnd(args)
> > > > > > > > > > {  // reverse order
> > > > > > > > > >   for (int32_t I = ArgNum - 1; I >= 0; --I) { }
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > At a mapper,
> > > > > > > > > > ```
> > > > > > > > > > targetDataMapper
> > > > > > > > > > {
> > > > > > > > > >   // generate args_reverse in reverse order for 
> > > > > > > > > > targetDataEnd
> > > > > > > > > >   targetDataEnd(args_reverse)
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > Are we actually getting the original forward order due to 
> > > > > > > > > > one reverse in targetDataMapper and second reverse in 
> > > > > > > > > > targetDataEnd? Is this the desired behavior? This part 
> > > > > > > > > > confused me. Do I miss something? Could you explain a bit?
> > > > > > > > > Yes, something like this. targetDataEnd reverses the order of 
> > > > > > > > > mapping arrays. But mapper generator always generates mapping 
> > > > > > > > > arrays in the direct order (it fills mapping arrays that 
> > > > > > > > > later processed by the targetDataEnd function). We could fix 
> > > > > > > > > this by passing extra Boolean flag to the generator function 
> > > > > > > > > but it means the redesign of the mappers. That's why we have 
> > > > > > > > > to reverse it in the libomptarget.
> > > > > > > > You can check it yourself. Apply the  patch, restore the 
> > > > > > > > original behavior in libomptarget and run libomptarget tests. 
> > > > > > > > Mapper related tests will crash.
> > > > > > > Stick with mapper generator always generating mapping arrays in 
> > > > > > > the direct order. The targetDataMapper reverse the mapping array 
> > > > > > > and then passes args_reverse into targetDataEnd. Inside 
> > > > > > > targetDataEnd, mapping
> > > > > > > Yes, something like this. targetDataEnd reverses the order of 
> > > > > > > mapping arrays. But mapper generator always generates mapping 
> > > > > > > arrays in the direct order (it fills mapping arrays that later 
> > > > > > > processed by the targetDataEnd function). We could fix this by 
> > > > > > > passing extra Boolean flag to the generator function but it means 
> > > > > > > the redesign of the mappers. That's why we have to reverse it in 
> > > > > > > the libomptarget.
> > > > > > 
> > > > > > Stick with mapper generator always generating mapping arrays in the 
> > > > > > direct order.
> > > > > > 
> > > > > > In the targetDataBegin case, targetDataMapper keep direct order 
> > > > > > args and calls targetDataBegin(args) and targetDataBegin process 
> > > > > > args in direct order.
> > > > > > 
> > > > > > In the targetDataEnd case, targetDataMapper reverses the mapping 
> > > > > > array and then passes args_reverse into targetDataEnd. Inside 
> > > > > > targetDataEnd, args_reverse are processed in reverse order. So 
> > > > > > targetDataEnd is actually processing the args in original direct 
> > > > > > order. This seems contradictory to the constructor/deconstructor 
> > > > > > like behavior that all the mappings must be processed in the actual 
> > > > > > reverse order in targetDataEnd. 
> > > > > > 
> > > > > > This is my understanding. The current code should be wrong but 
> > > > > > obviously the current code is working. So why the current code is 
> > > > > > working? what is inconsistent in my analysis. Could you point out 
> > > > > > the missing piece. 
> > > > > > You can check it yourself. Apply the  patch, restore the original 
> > > > > > behavior in libomptarget and run libomptarget tests. Mapper related 
> > > > > > tests will crash.
> > > > > 
> > > > > For sure without this line, tests would crash and that is why you 
> > > > > included this line of change in the patch. Since you made the change, 
> > > > > you could explain why, right?
> > > > I changed and simplified codegen for the mapper generator without 
> > > > changing its interface. I could do this because of the new ordering, 
> > > > before we could not rely on it. But it also requires a change in the 
> > > > runtime.
> > > > targetDataEnd calls targetDataMapper and targetDataMapper fills the 
> > > > array in the direct order, but targetDataEnd processes them in the 
> > > > reverse order, but mapper generator does not know about it. It also has 
> > > > to generate the data in the reverse order, just like targetDataEnd does.
> > > > 
> > > > Before this patch mapper generator tried to do some ordering but it was 
> > > > not always correct. It was not expecting something like map(alloc:s) 
> > > > map(s.a) because it was not allowed by the compiler. That's why it 
> > > > worked before and won't work with this patch.
> > > > PS. The change in the mapper generator is also required and cannot be 
> > > > separated. Without this mappers tests won't work.
> > > I played a bit with your patch.
> > > ```
> > > #pragma omp target exit data map(from: c.a[0:NUM], c.b[0:NU2M]) 
> > > map(delete: c)
> > > ```
> > > I put NUM=1024 and NU2M = 2048.
> > > LIBOMPTARGET_DEBUG reports 
> > > ```
> > > Libomptarget --> Entry  0: Base=0x00007fff064080a8, 
> > > Begin=0x00007fff064080a8, Size=16, Type=0x0, Name=(null)
> > > Libomptarget --> Entry  1: Base=0x00007fff064080a8, 
> > > Begin=0x0000000000f9cbd0, Size=4096, Type=0x1000000000012, Name=(null)
> > > Libomptarget --> Entry  2: Base=0x00007fff064080b0, 
> > > Begin=0x0000000000f86e10, Size=8192, Type=0x1000000000012, Name=(null)
> > > Libomptarget --> Entry  3: Base=0x00007fff064080a8, 
> > > Begin=0x00007fff064080a8, Size=16, Type=0x1000000000008, Name=(null)
> > > ```
> > > Since targetDataEnd internally reverse the processing order, could you 
> > > confirm that the frontend was emitting entries 3,2,1,0?
> > > I'm wondering if the frontend could emit 3, 0, 1, 2 so the processing 
> > > order is 2,1,0,3? The spec requires struct element processed before the 
> > > struct in "target exit data"
> > No, the frontend emits in the order 0, 1, 2, 3. targetDataEnd process in 
> > reversed order 3, 2, 1, 0, but the mapper does not know about it and still 
> > emits the data in the order 0, 1, 2, 3.
> > And it is only for mappers!
> > So, say you have an extra map something like map(a).
> > ```
> > map (a) map(mapper(id), tofrom: c)
> > ```
> > where mapper for с does something like you wrote.
> > 
> > In this case the order would be 0, 1, 2, 3, 4, where 0 is mapping of a and 
> > 1-4 is mapping of c.
> > When we need to delete the data, the mapper still would generate 1,2,3,4 + 
> > 0 for mapping of a, but targetDataEnd expects 4,3,2,1,0. That's why we have 
> > to reverse the mapping data, produced by the mapper generator for 
> > targetDataEnd.
> Let us leave the mapper case aside which has extra mess.
> Double checked that "Libomptarget --> Entry  0" is printed at the 
> __tgt_target_data_end_mapper. So the order is as you said 0, 1, 2, 3 from the 
> frontend. What is the "Entry 0"? more specifically is the difference between 
> entry 0 and 3? Entry 0 seems to be an implicit map while 3 is explicit.
> 
> #pragma omp target exit data map(from: c.a[0:NUM], c.b[0:NU2M]) map(delete: c)
> the "map(delete: c)" has some state machine to protect the delete due to 
> ordering.
> So I'm wondering why the frontend must issue both 0 and 3. Can the front end 
> fuse 0 and 3?
> I mean the frontend generates 3, 1, 2 and the runtime processing 2,1,3 
> without the deleting issue?
Entry 0 is the address that should be passed to the kernel (for the captured 
variable, that's why it is marked as TGT_TARGET_PARAM - target kernel 
argument), entries 1-3 are the actual mappings. Yes, I assume the frontend can 
fuse these entries in many cases, but it is different problem that should be 
addressed in a separate patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to