tra added a comment.

In D127901#3602771 <https://reviews.llvm.org/D127901#3602771>, @jdoerfert wrote:

> Do we want/need PTX, I do not, but I don't mind having it. Someone will ask
> for it eventually.

Fair enough.

> However, if we embed bitcode via LTO we can use the
> single-linked PTX image for the whole module and include it in the
> fatbinary. This allows us to do the following and have it execute even
> without the correct architecture specified.
>
>   `clang foo.cu -foffload-lto -fgpu-rdc --offload-new-driver -lcudart`

Then we do need a knob controlling whether we do want to embed PTX or not. The default should be "off" IMO. We currently have `--[no-]cuda-include-ptx=` we may reuse for that purpose.

This brings another question -- which GPU variant will we generate PTX for? One? All (if more than one is specified)? The ones specified by `--[no-]cuda-include-ptx=`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127901/new/

https://reviews.llvm.org/D127901

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits