carlo.bertolli added a comment.

Hi Alexey

Thanks for your comment. The suggested change will not behave as my patch 
intends when the host is used as a device. This happens when you select the 
following options:
-fomptargets=powerpc64le-ibm-linux-gnu -fopenmp-is-device

In this case we generate device code and the target is ppc64. On ppc64 we need 
to generate a call to __kmpc_fork_teams. Your proposed change treats all 
devices uniformly, so no call to __kmpc_fork_teams is generated.
There are various reasons why we should not do that. The clearest ones to me 
are:

- We would generate different code for the host used as host and for the host 
used as a target device. This would interfere with performance profiling.

- On a host it is still important to have teams, as that may be where the 
coarse-grained parallelism comes from.

If you still want no specialization in CGOpenMPRuntimeNVPTX, we will need to 
check if we are targeting a device and if that device is an nvptx one.
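
To make the check concrete, here is a minimal, self-contained sketch of the 
decision I have in mind. It is only a model: the names CompileMode, Arch, 
RuntimeKind, selectRuntime and emitsForkTeams are hypothetical and are not 
Clang's API, and it assumes the nvptx device path is the only one that skips 
the fork-teams call.

```cpp
// Hypothetical model of the proposed check; none of these names exist in Clang.
enum class Arch { PPC64LE, NVPTX, NVPTX64 };

struct CompileMode {
  bool isDevice; // true when compiling device code (cf. -fopenmp-is-device)
  Arch arch;     // architecture of the triple we are generating code for
};

enum class RuntimeKind { Generic, NVPTX };

// True only for the nvptx architectures in this model.
constexpr bool isNVPTX(Arch a) {
  return a == Arch::NVPTX || a == Arch::NVPTX64;
}

// Specialize only when we are targeting a device AND that device is nvptx.
// The host used as a device (e.g. ppc64le) keeps the generic path.
constexpr RuntimeKind selectRuntime(CompileMode m) {
  return (m.isDevice && isNVPTX(m.arch)) ? RuntimeKind::NVPTX
                                         : RuntimeKind::Generic;
}

// Assumption of this sketch: only the generic path emits the fork-teams call.
constexpr bool emitsForkTeams(CompileMode m) {
  return selectRuntime(m) == RuntimeKind::Generic;
}
```

With this shape, ppc64le gets the same teams code generation whether it is 
compiled as the host or as a device, which is the property the two points 
above ask for.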

I know the problem is that two CodeGen objects are created in different 
places depending on whether we target nvptx or the host. However, given the 
way the interface is currently structured, I do not see any way around this 
duplication.

Thanks!


Repository:
  rL LLVM

http://reviews.llvm.org/D18286



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
