[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.
In D127901#3603467, @tra wrote:
> I'm not sure I follow. WDYM by "go inside the binary itself"? I assume you
> mean the per-GPU offload binaries inside per-TU .o. so that it could be used
> when that GPU object gets linked int

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Artem Belevich via Phabricator via cfe-commits
tra added a comment.
In D127901#3603118, @jhuber6 wrote:
> In D127901#3603006, @tra wrote:
>
>> Then we do need a knob controlling whether we do want to embed PTX or not.
>> The default should be "off" IMO.
>>

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.
In D127901#3603006, @tra wrote:
> Then we do need a knob controlling whether we do want to embed PTX or not.
> The default should be "off" IMO.
> We currently have `--[no-]cuda-include-ptx=` we may reuse for that purpose.
We co

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Artem Belevich via Phabricator via cfe-commits
tra added a comment.
In D127901#3602771, @jdoerfert wrote:
> Do we want/need PTX, I do not, but I don't mind having it. Someone will ask
> for it eventually.
Fair enough.
> However, if we embed bitcode via LTO we can use the
> single-linked PTX image

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment.
Do we want JIT -> YES, but specializing LLVM-IR JIT. Do we want/need PTX, I do not, but I don't mind having it. Someone will ask for it eventually.
Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127901/new/ https://reviews

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-16 Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.
In D127901#3590402, @tra wrote:
> Playing devil's advocate, I've got to ask -- do we even want to support JIT?
>
> JIT brings more trouble than benefits.
>
> - substantial start-up time on nontrivial apps. Last time I tried launc

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-16 Artem Belevich via Phabricator via cfe-commits
tra added a comment.
Playing devil's advocate, I've got to ask -- do we even want to support JIT?
JIT brings more trouble than benefits.
- substantial start-up time on nontrivial apps. Last time I tried launching a tensorflow app and needed to JIT its kernels, it took about half an hour until

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-15 Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, JonChesterfield, tra, yaxunl.
Herald added subscribers: mattd, gchakrabarti, asavonic, inglorion.
Herald added a project: All.
jhuber6 requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-