Hi All,

On 21.06.25 17:14, Koichiro Den wrote:
> When a running unit is about to be scheduled out due to a competing unit
> with the highest remaining credit, the residual credit of the previous
> unit is currently ignored in csched2_runtime() because it hasn't yet
> been reinserted into the runqueue.
>
> As a result, two equally weighted, busy units can often each be granted
> almost the maximum possible runtime (i.e. consuming CSCHED2_CREDIT_INIT
> in one shot) when only those two are active. In broad strokes two units
> switch back and forth every 10ms (CSCHED2_MAX_TIMER). In contrast, when
> more than two busy units are competing, such coarse runtime allocations
> are rarely seen, since at least one active unit remains in the runqueue.
>
> To ensure consistent behavior, have csched2_runtime() take into account
> the previous unit's latest credit when it still can/wants to run.
>
> Signed-off-by: Koichiro Den <[email protected]>
> Reviewed-by: Juergen Gross <[email protected]>
> ---
>   xen/common/sched/credit2.c | 28 +++++++++++++++++++++-------
>   1 file changed, 21 insertions(+), 7 deletions(-)


We observe a regression on ARM64 with this patch:
commit ae648e9f8013 ("xen/credit2: factor in previous active unit's credit in
csched2_runtime()")

General observation:
 This commit increases Linux guest boot time by more than 5x in some of our
 credit2-specific tests.
 Reverting it makes the issue go away.

 - normal log
   (XEN) DOM1: [    6.496166] io scheduler bfq registered
   ...
   (XEN) DOM1: [    9.845108] Freeing unused kernel memory: 9216K
   (XEN) DOM1: [    9.874792] Run /init as init process
   (XEN) sched_smt_power_savings: disabled
   (XEN) NOW=16800131328

 - failed log
   (XEN) DOM1: [   37.281776] io scheduler bfq registered
   (XEN) DOM1: [   61.856512] EINJ: ACPI disabled.
   test: timed out

Run Details:
 Platform: ARM64 (Device Tree)
 Execution platform: qemu 6.0 (2 pCPU, 2G)
 Boot: dom0less, 1 domain (2 vCPU)
 Command line: "console=dtuart guest_loglvl=debug conswitch=ax"

 Dom0less cfg:
    chosen {
        xen,xen-bootargs = "console=dtuart guest_loglvl=debug conswitch=ax";
        #size-cells = <0x00000002>;
        #address-cells = <0x00000002>;
        stdout-path = "/pl011@9000000";
        kaslr-seed = <0x5a7b5649 0x9122e194>;
        cpupool_0 {
            cpupool-sched = "credit2";
            cpupool-cpus = <0x00008001>;
            compatible = "xen,cpupool";
            phandle = <0xfffffffe>;
        };
        domU0 {
            domain-cpupool = <0xfffffffe>;
            vpl011;
            cpus = <0x00000002>;
            memory = <0x00000000 0x00040000>;
            #size-cells = <0x00000002>;
            #address-cells = <0x00000002>;
            compatible = "xen,domain";
            module@42E00000 {
                reg = <0x00000000 0x42e00000 0x00000000 0x000f1160>;
                compatible = "multiboot,ramdisk", "multiboot,module";
            };
            module@40400000 {
                bootargs = "console=ttyAMA0";
                reg = <0x00000000 0x40400000 0x00000000 0x02920000>;
                compatible = "multiboot,kernel", "multiboot,module";
            };
        };
    };

Investigation:
 The issue was narrowed down to a specific configuration with a cpupool
 assigned to the domain (100% reproducible):
 Host has 2 pCPU
 Domain has 2 vCPU
 cpupool_0 has 1 pCPU (cpu@1 credit2)
 domain <- cpupool_0

 If the domain is assigned 1 vCPU: no issues.
 If cpupool_0 is assigned 2 pCPUs: no issues (it seems a bit slower, but
 within the margin of error).
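
 For reference, the 1-vCPU case corresponds to a domU0 node along these
 lines. This is only a sketch, assuming nothing but the `cpus` property
 changes relative to the Dom0less cfg above:

```dts
domU0 {
    domain-cpupool = <0xfffffffe>;
    vpl011;
    cpus = <0x00000001>;    /* assumed: 1 vCPU instead of 2 */
    /* remaining properties and module nodes unchanged */
};
```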

I'd appreciate any help with this (or a revert :().

--
Best regards,
-grygorii
