saiislam added inline comments.

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:8475
         "triple=" + TC->getTripleString(),
-        "arch=" + Arch.str(),
+        "arch=" + getProcessorFromTargetID(TC->getTriple(), Arch).str(),
         "kind=" + Kind.str(),
----------------
jhuber6 wrote:
> yaxunl wrote:
> > jhuber6 wrote:
> > > saiislam wrote:
> > > > Shouldn't Arch (targetID here) should be passed along instead of just 
> > > > the processor?
> > > > 
> > > > For example, `gfx90a:xnack+` and `gfx90a:xnack-` should be treated 
> > > > differently.
> > > So the problem there is that this will cause us to no longer link in 
> > > something like the OpenMP runtime library since `gfx90a` != 
> > > `gfx90a:xnack+`. Right now the behavior is that we will link them both 
> > > together since the architecture matches but then the attributes will get 
> > > resolved the same way we handle `-mattr=+x,-x`. I'm not sure what the 
> > > expected behaviour is here.
> > targetID is part of ROCm ABI as it is returned as part of Isa::GetIsaName 
> > (https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/rocm-5.5.x/src/core/runtime/isa.cpp#L98)
> >  . 
> > 
> > the compatibility rule for targetID is specified by 
> > https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id . For 
> > example, bundle entry with gfx90a can be consumed by device with GetIsaName 
> > gfx90a:xnack+ or gfx90a:xnack- . but bundle entry with gfx90a:xnack+ can 
> > only be consumed by device with GetIsaName gfx90a:xnack+.
> > 
> > Language runtime is supposed to do a compatibility check for bundle entry 
> > with the device GetIsaName. Isa::IsCompatible 
> > (https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/3b939c398bdac0c2b9a860ff9a0ed0be0c80f911/src/core/runtime/isa.cpp#L73)
> >  can be used to do that. For convenience, language runtime is expected to 
> > use targetID for identifying bundle entries instead of re-construct 
> > targetID from features when needed.
> > 
> > targetID is also used for compatibility checks when linking bitcode.
> > 
> So what we need is some more sophisticated logic in the linker wrapper to 
> merge the binaries according to these rules. However the handling will 
> definitely require pulling this apart when we send it to LTO.
Some logic is given in [[ 
https://github.com/llvm/llvm-project/blob/111d27484132c0692c214880576dc4a37fd6d645/clang/lib/Driver/OffloadBundler.cpp#L155
 | ClangOffloadBundler  ]] and in [[ 
https://github.com/llvm/llvm-project/blob/74c2ec50f393bad8b31d0dd0bd8b2ff44d361198/openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h#L80
 | AMDGPU plugin ]]


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150998/new/

https://reviews.llvm.org/D150998

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to