================
@@ -498,12 +498,16 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
   };
 
   // Forward all of the `--offload-opt` and similar options to the device.
-  CmdArgs.push_back("-flto");
   for (auto &Arg : Args.filtered(OPT_offload_opt_eq_minus, OPT_mllvm))
     CmdArgs.append(
         {"-Xlinker",
          Args.MakeArgString("--plugin-opt=" + StringRef(Arg->getValue()))});
 
+  if (Triple.isNVPTX() || Triple.isAMDGPU())
+    CmdArgs.push_back("-foffload-lto");
+  else
+    CmdArgs.push_back("-flto");
----------------
jhuber6 wrote:

Clang 19 has already been released and can't be modified. Does this happen with 20 or `main`? Also, this example uses the `ptx_kernel` calling convention, which I think was only introduced after the 19 release. It works with my installation on `main`, so my guess is that you're either using an older version of `clang` or your fork is missing something.

```console
> clang test.ll --target=nvptx64-nvidia-cuda -march=sm_50 -O2 -flto
> llvm-readelf -h a.out
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 33 07 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1
  OS/ABI:                            NVIDIA - CUDA
  ABI Version:                       7
  Type:                              EXEC (Executable file)
  Machine:                           NVIDIA CUDA architecture
  Version:                           0x7E
  Entry point address:               0x0
  Start of program headers:          1888 (bytes into file)
  Start of section headers:          1248 (bytes into file)
  Flags:                             0x320532, sm_50
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         3
  Size of section headers:           64 (bytes)
  Number of section headers:         10
  Section header string table index: 1
```

https://github.com/llvm/llvm-project/pull/125243
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits