jrbyrnes added a comment.
Hey Austin -- I like the removal of canAddMIs. In the original design, I was
leaving open the possibility for users to pass in canAddMIs rather than a mask
/ SchedGroup name, but it looks like this isn't the direction we're going, and
the classification functions defined in a general canAddMI makes things easier.
I see this is a WIP, but I've added some thoughts I had from reading it over. I
may have more as I use the design for my patch.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:199
+ // SchedGroupMask of instructions that should be barred.
+ SchedGroupMask invertSchedBarrierMask(SchedGroupMask Mask) const;
+
----------------
I find it confusing that SchedBarrier uses inversion while SchedGroupBarrier
doesn't.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:306
+bool SchedGroup::isFull() const {
+ return MaxSize.hasValue() && Collection.size() >= *MaxSize;
+}
----------------
As in the update to IGroupLP.cpp in trunk, seems like we are not supposed to
use hasValue.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:349
+ add(InitSU);
+ assert(MaxSize.hasValue());
+ (*MaxSize)++;
----------------
Not possible to have unsized groups?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp:445
+ // initialized all of the SCHED_GROUP_BARRIER SchedGroups.
+ addSchedGroupBarrierEdges();
}
----------------
If both types of barriers are present -- the SchedBarriers are handled first.
However, if there is a conflict between SchedBarrier and SchedGroupBarrier,
should SchedBarrier always get the priority? Maybe SchedBarrier should only
handle groups not present in SchedGroupBarrier?
================
Comment at: llvm/test/CodeGen/AMDGPU/sched-group-barrier-pre-RA.mir:104
+ GLOBAL_STORE_DWORD_SADDR %1, %13, %0, 512, 0, implicit $exec :: (store
(s32) into %ir.out, !noalias !0, addrspace 1)
+ ; 1 VMEM_READ
+ SCHED_GROUP_BARRIER 32, 1, 0
----------------
I think you are aware of this issue. But the ability for the mutation to match
the pipeline is dependent upon which instructions go into which group (when an
instruction can be mapped to multiple groups).
If we had SchedGroups: 2 VMEM_READ, 1 VALU, 1 MFMA, 2 VMEM_READ
and initial schedule: VMEMR, VALU, VMEMR, MFMA, VMEMR, with a dependency
between middle VMEMR->MFMA.
initSchedGroup will add the middle VMEMR to the last VMEMR group, but we could
get a more accurate pipeline by adding it to the first group.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D128158/new/
https://reviews.llvm.org/D128158
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits