Hi Jennifer,
Hi Kyrylo,

The logic behind this patch is sound, but I suggest doing this fix in 
choose_ready() to avoid proliferation of special-casing of SCHED_GROUP_P() -- 
see attached patch.  As far as I can see, the attached version has the same 
semantics as your original patch -- please let me know if that's not the case.
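
To illustrate the control flow I have in mind, here is a minimal toy model in
plain C.  It is not the attached patch and not GCC code: toy_insn, toy_ready
and the toy_* functions are hypothetical stand-ins, with sched_group modelling
SCHED_GROUP_P (insn) and fits_window modelling
targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW).  The "first insn that
fits, else the head" policy is likewise a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an instruction; NOT GCC's rtx_insn.  */
struct toy_insn
{
  int id;
  bool sched_group;  /* Models SCHED_GROUP_P (insn).  */
  bool fits_window;  /* Models the FITS_DISPATCH_WINDOW query.  */
};

/* Toy ready list; index 0 is the highest-priority instruction.  */
struct toy_ready
{
  struct toy_insn insns[8];
  size_t n;
};

/* Remove and return the element at index I, keeping the order of the rest.  */
static struct toy_insn
toy_remove_at (struct toy_ready *r, size_t i)
{
  assert (i < r->n);
  struct toy_insn x = r->insns[i];
  for (size_t j = i + 1; j < r->n; j++)
    r->insns[j - 1] = r->insns[j];
  r->n--;
  return x;
}

/* Assumed dispatch policy: take the first instruction that fits the
   dispatch window, falling back to the head of the list.  */
static struct toy_insn
toy_remove_first_dispatch (struct toy_ready *r)
{
  for (size_t i = 0; i < r->n; i++)
    if (r->insns[i].fits_window)
      return toy_remove_at (r, i);
  return toy_remove_at (r, 0);
}

/* The group check lives in the caller, so the dispatch helper needs no
   special case: a grouped head is always issued right after its partner.  */
static struct toy_insn
toy_choose_ready (struct toy_ready *r, int dfa_lookahead, bool dispatch_on)
{
  if (r->insns[0].sched_group)
    return toy_remove_at (r, 0);
  if (dfa_lookahead <= 0 && dispatch_on)
    return toy_remove_first_dispatch (r);
  return toy_remove_at (r, 0);
}
```

In this model a grouped instruction at the head of the list is returned even
when another ready instruction would fit the dispatch window better, which is
the behaviour the original patch is after.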

I don't have approval ACLs for scheduler patches, so would one of the 
maintainers please rubber-stamp this?


[Slightly off-topic]
The attached patch also makes it explicit that dispatch scheduling is active 
only when lookahead multipass scheduling is disabled (dfa_lookahead <= 0).  I 
wonder whether the two can co-exist or whether they are mutually exclusive.

Looking at the i386 backend, I see that dfa_lookahead==0 for pre-reload 
scheduling, but lookahead is enabled for post-reload.  This means that dispatch 
scheduling is done before reload for BDVERx, but not after.

For aarch64, dispatch scheduling seems to be active when sched_fusion is 
enabled AND AARCH64_EXTRA_TUNE_DISPATCH_SCHED is set -- this currently means 
Neoverse-V2, both before and after reload.

Kind regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On Nov 21, 2025, at 23:41, Kyrylo Tkachov <[email protected]> wrote:
> 
> Adding a couple more global reviewers on CC.
> 
> Ping on this patch. We need it to avoid a performance regression relating to 
> fusing instructions when enabling dispatch scheduling for a new core in 
> AArch64.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698465.html
> Thanks,
> Kyrill
> 
> 
>> On 23 Oct 2025, at 16:18, Jennifer Schmitz <[email protected]> wrote:
>> 
>> While looking at codegen effects of dispatch scheduling in the aarch64
>> backend, I noticed that many fusion pairs were split, in particular
>> CMP+CSEL and CMP+CSET.
>> The reason is that the information that an instruction is part of a
>> fusion pair is not considered in the function
>> 
>> /* This function returns a candidate satisfying dispatch constraints from
>>  the ready list.  */
>> static rtx_insn *
>> ready_remove_first_dispatch (struct ready_list *ready)
>> 
>> I propose to fix this issue by adding a check for SCHED_GROUP_P (insn)
>> (this is true for the second instruction in a fusion pair) such that
>> the instruction is scheduled immediately after its partner without
>> considering dispatch constraints. With this change I did not see
>> splitting of fusion pairs anymore.
>> 
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>> 
>> Signed-off-by: Jennifer Schmitz <[email protected]>
>> 
>> gcc/
>> * haifa-sched.cc (ready_remove_first_dispatch): Add check for
>> fusion pairs.
>> ---
>> gcc/haifa-sched.cc | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
>> index 63eb06b2d82..163b538c528 100644
>> --- a/gcc/haifa-sched.cc
>> +++ b/gcc/haifa-sched.cc
>> @@ -9224,6 +9224,7 @@ ready_remove_first_dispatch (struct ready_list *ready)
>>      || !INSN_P (insn)
>>      || INSN_CODE (insn) < 0
>>      || !active_insn_p (insn)
>> +      || SCHED_GROUP_P (insn)
>>      || targetm.sched.dispatch (insn, FITS_DISPATCH_WINDOW))
>>    return ready_remove_first (ready);
>> 
>> -- 
>> 2.34.1
>> 
> 

Attachment: dispatch-sched-group.diff
Description: Binary data
