saiislam added inline comments.
================ Comment at: clang/lib/Driver/ToolChains/Clang.cpp:8475 "triple=" + TC->getTripleString(), - "arch=" + Arch.str(), + "arch=" + getProcessorFromTargetID(TC->getTriple(), Arch).str(), "kind=" + Kind.str(), ---------------- jhuber6 wrote: > yaxunl wrote: > > jhuber6 wrote: > > > saiislam wrote: > > > > Shouldn't Arch (targetID here) should be passed along instead of just > > > > the processor? > > > > > > > > For example, `gfx90a:xnack+` and `gfx90a:xnack-` should be treated > > > > differently. > > > So the problem there is that this will cause us to no longer link in > > > something like the OpenMP runtime library since `gfx90a` != > > > `gfx90a:xnack+`. Right now the behavior is that we will link them both > > > together since the architecture matches but then the attributes will get > > > resolved the same way we handle `-mattr=+x,-x`. I'm not sure what the > > > expected behaviour is here. > > targetID is part of ROCm ABI as it is returned as part of Isa::GetIsaName > > (https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/rocm-5.5.x/src/core/runtime/isa.cpp#L98) > > . > > > > the compatibility rule for targetID is specified by > > https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id . For > > example, bundle entry with gfx90a can be consumed by device with GetIsaName > > gfx90a:xnack+ or gfx90a:xnack- . but bundle entry with gfx90a:xnack+ can > > only be consumed by device with GetIsaName gfx90a:xnack+. > > > > Language runtime is supposed to do a compatibility check for bundle entry > > with the device GetIsaName. Isa::IsCompatible > > (https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/3b939c398bdac0c2b9a860ff9a0ed0be0c80f911/src/core/runtime/isa.cpp#L73) > > can be used to do that. For convenience, language runtime is expected to > > use targetID for identifying bundle entries instead of re-construct > > targetID from features when needed. > > > > targetID is also used for compatibility checks when linking bitcode. > > > So what we need is some more sophisticated logic in the linker wrapper to > merge the binaries according to these rules. However the handling will > definitely require pulling this apart when we send it to LTO. Some logic is given in [[ https://github.com/llvm/llvm-project/blob/111d27484132c0692c214880576dc4a37fd6d645/clang/lib/Driver/OffloadBundler.cpp#L155 | ClangOffloadBundler ]] and in [[ https://github.com/llvm/llvm-project/blob/74c2ec50f393bad8b31d0dd0bd8b2ff44d361198/openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h#L80 | AMDGPU plugin ]] Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150998/new/ https://reviews.llvm.org/D150998 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits