Hi Jennifer, Hi Kyrylo,

The logic behind this patch is sound, but I suggest doing this fix in choose_ready() to avoid proliferating special cases for SCHED_GROUP_P() -- see attached patch. As far as I can see, the attached version has the same semantics as your original patch -- please let me know if that's not the case.
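
To illustrate the intent, here is a simplified standalone model. It is not the attached patch and not real scheduler code: ToyInsn, remove_at, pick_dispatch and toy_choose_ready are made-up names, and the two bool fields merely stand in for SCHED_GROUP_P (insn) and targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW). The fallback when nothing fits the window is also reduced to "take the head".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an rtx_insn; every field is a hypothetical simplification.
struct ToyInsn {
  int uid;             // identifier, only used to inspect the result
  bool sched_group;    // models SCHED_GROUP_P (insn)
  bool fits_dispatch;  // models dispatch (insn, FITS_DISPATCH_WINDOW)
};

// Remove and return element i of the ready list (index 0 = highest priority).
static ToyInsn remove_at(std::vector<ToyInsn>& ready, std::size_t i) {
  ToyInsn insn = ready[i];
  ready.erase(ready.begin() + static_cast<std::ptrdiff_t>(i));
  return insn;
}

// Dispatch-aware pick: take the first insn that fits the dispatch window,
// otherwise fall back to the head of the list. No sched-group check here.
static ToyInsn pick_dispatch(std::vector<ToyInsn>& ready) {
  for (std::size_t i = 0; i < ready.size(); ++i)
    if (ready[i].fits_dispatch)
      return remove_at(ready, i);
  return remove_at(ready, 0);
}

// Caller-side model: the sched-group check happens once, before any
// dispatch logic, so the dispatch helper needs no special case.
static ToyInsn toy_choose_ready(std::vector<ToyInsn>& ready, bool dispatch_on) {
  assert(!ready.empty());
  if (ready[0].sched_group || !dispatch_on)
    return remove_at(ready, 0);
  return pick_dispatch(ready);
}
```

In this model, a head insn with sched_group set is always issued next, even if a later insn would fit the dispatch window better, which is the behaviour the original patch wants for the second insn of a fusion pair.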

I don't have approval ACLs on scheduler patches, so would one of the maintainers please rubber-stamp this?

[Slightly off-topic]
The attached patch also makes it explicit that dispatch scheduling is active only when lookahead multipass scheduling is disabled (dfa_lookahead <= 0). I wonder whether the two can co-exist or whether they are mutually exclusive.

Looking at the i386 backend, I see that dfa_lookahead == 0 for pre-reload scheduling, but lookahead is enabled for post-reload scheduling. This means that dispatch scheduling is done before reload for BDVERx, but not after.

For aarch64, dispatch scheduling seems to be active when sched_fusion is enabled AND AARCH64_EXTRA_TUNE_DISPATCH_SCHED is set -- this currently means Neoverse-V2, both before and after reload.

Kind regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On Nov 21, 2025, at 23:41, Kyrylo Tkachov <[email protected]> wrote:
>
> Adding a couple more global reviewers on CC.
>
> Ping on this patch. We need it to avoid a performance regression relating to
> fusing instructions when enabling dispatch scheduling for a new core in
> AArch64.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698465.html
> Thanks,
> Kyrill
>
>
>> On 23 Oct 2025, at 16:18, Jennifer Schmitz <[email protected]> wrote:
>>
>> While looking at codegen effects of dispatch scheduling in the aarch64
>> backend, I noticed that many fusion pairs were split, in particular
>> CMP+CSEL and CMP+CSET.
>> The reason is that the information that an instruction is part of a
>> fusion pair is not considered in the function
>>
>> /* This function returns a candidate satisfying dispatch constraints from
>> the ready list. */
>> static rtx_insn *
>> ready_remove_first_dispatch (struct ready_list *ready)
>>
>> I propose to fix this issue by adding a check for SCHED_GROUP_P (insn)
>> (this is true for the second instruction in a fusion pair) such that
>> the instruction is scheduled immediately after its partner without
>> considering dispatch constraints. With this change I did not see
>> splitting of fusion pairs anymore.
>>
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>>
>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>
>> gcc/
>> * haifa-sched.cc (ready_remove_first_dispatch): Add check for
>> fusion pairs.
>> ---
>> gcc/haifa-sched.cc | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
>> index 63eb06b2d82..163b538c528 100644
>> --- a/gcc/haifa-sched.cc
>> +++ b/gcc/haifa-sched.cc
>> @@ -9224,6 +9224,7 @@ ready_remove_first_dispatch (struct ready_list *ready)
>> || !INSN_P (insn)
>> || INSN_CODE (insn) < 0
>> || !active_insn_p (insn)
>> + || SCHED_GROUP_P (insn)
>> || targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW))
>> return ready_remove_first (ready);
>>
>> --
>> 2.34.1
>>
>
dispatch-sched-group.diff
