hliao added a comment.

This is an experimental, demo-only patch, written in my spare time, that 
eliminates the private memory usage in https://godbolt.org/z/EPPn6h. The 
attachment F14026286: sample.tar.xz <https://reviews.llvm.org/F14026286> 
includes the reference and new IR, PTX, and SASS (sm_60) output. In the new 
code, the aggregate argument is loaded through the `LDC` instruction in SASS 
instead of `MOV` because the address is non-static. I don't have sm_60 
hardware to verify that. Could you try it on real hardware?

BTW, according to the PTX ISA document, the parameter space is read-only for 
input parameters and write-only for output parameters. If that's right, 
non-kernel functions may also require a similar change, since those semantics 
differ from the language model, where an argument variable can be modified in 
the function body.
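
To make the language-model side of that concrete, here is a minimal sketch in 
plain C++ (not CUDA, and not taken from the patch). The `Params` struct and 
both function names are hypothetical and only serve as an illustration: a 
by-value aggregate parameter is an ordinary local object that the callee may 
write to, which is what a read-only parameter space cannot express directly.

```cpp
// Hypothetical aggregate, standing in for a by-value function argument.
struct Params {
  int values[4];
};

// In the C++ language model the parameter `p` is a modifiable local copy:
// the callee may write to it, and the caller's object is unaffected.
int sum_after_local_update(Params p, int idx) {
  p.values[idx & 3] += 1;  // legal write to the by-value parameter
  int sum = 0;
  for (int v : p.values) sum += v;
  return sum;
}

// Helper showing that the caller's object keeps its original contents.
bool caller_copy_unchanged() {
  Params original{{1, 2, 3, 4}};
  (void)sum_after_local_update(original, 2);
  return original.values[2] == 3;
}
```

If the parameter space really is read-only, a write like the one to 
`p.values` would presumably need a writable copy of the argument first, at 
least in functions that actually modify it.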


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91590/new/

https://reviews.llvm.org/D91590

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits