dhruvachak wrote:

With reference to the performance degradation, this patch introduces an 
additional allocation/data-submit/deallocation for every kernel 
(GenericKernelTy::getKernelLaunchEnvironment(), PluginInterface.cpp).

Analysis suggests that this overhead is the primary reason for the perf 
degradation. Is it possible to incur this additional overhead only when we need 
it? For example, can it be avoided for non-reduction kernels?
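
To make the question concrete, here is a rough sketch of the kind of gating I have in mind. It is not the plugin code: CountingDevice, KernelDesc, NeedsLaunchEnv, and launchKernel are all hypothetical names made up for illustration, and the "device" only counts operations. The assumption is that the compiler or runtime could tell us, per kernel, whether the launch environment is actually read.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a device; it only counts the operations
// that the per-kernel launch environment would trigger.
struct CountingDevice {
  std::size_t Allocs = 0, Submits = 0, Frees = 0;
  char Storage[64] = {};

  void *allocate() { ++Allocs; return Storage; }
  void submit(void *Ptr) { assert(Ptr && "submit of null environment"); ++Submits; }
  void release(void *Ptr) { assert(Ptr && "release of null environment"); ++Frees; }
};

// Hypothetical kernel descriptor. NeedsLaunchEnv is an assumed flag that
// would be set only for kernels (e.g. reductions) that use the environment.
struct KernelDesc {
  bool NeedsLaunchEnv;
};

// Launch a kernel, paying for the environment only when it is required.
void launchKernel(CountingDevice &Dev, const KernelDesc &Kernel) {
  void *Env = nullptr;
  if (Kernel.NeedsLaunchEnv) {
    Env = Dev.allocate();
    Dev.submit(Env);
  }

  // ... the actual kernel launch would go here ...

  if (Env)
    Dev.release(Env);
}
```

With this shape, a non-reduction kernel would go through launchKernel without any allocation, data submit, or deallocation, while a reduction kernel would still pay for all three.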

@jdoerfert 



https://github.com/llvm/llvm-project/pull/70401