On 15/05/2024 2:38 pm, Jürgen Groß wrote:
> On 15.05.24 15:22, Andrew Cooper wrote:
>> On 15/05/2024 1:39 pm, Jan Beulich wrote:
>>> On 15.05.2024 13:58, Andrew Cooper wrote:
>>>> Just so it doesn't get lost.  In XenRT, we've found:
>>>>
>>>>> (XEN) ----[ Xen-4.19.0-1-d  x86_64  debug=y  Tainted:     H  ]----
>>>>> (XEN) CPU:    45
>>>>> (XEN) RIP:    e008:[<ffff82d040244cbf>] common/sched/credit.c#csched_load_balance+0x41/0x877
>>>>> (XEN) RFLAGS: 0000000000010092   CONTEXT: hypervisor
>>>>> (XEN) rax: ffff82d040981618   rbx: ffff82d040981618   rcx: 0000000000000000
>>>>> (XEN) rdx: 0000003ff68cd000   rsi: 000000000000002d   rdi: ffff83103723d450
>>>>> (XEN) rbp: ffff83207caa7d48   rsp: ffff83207caa7b98   r8:  0000000000000000
>>>>> (XEN) r9:  ffff831037253cf0   r10: ffff83103767c3f0   r11: 0000000000000009
>>>>> (XEN) r12: ffff831037237990   r13: ffff831037237990   r14: ffff831037253720
>>>>> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 0000000000f526e0
>>>>> (XEN) cr3: 000000005bc2f000   cr2: 0000000000000010
>>>>> (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>>>> (XEN) Xen code around <ffff82d040244cbf> (common/sched/credit.c#csched_load_balance+0x41/0x877):
>>>>> (XEN)  48 8b 0c 10 48 8b 49 08 <48> 8b 79 10 48 89 bd b8 fe ff ff 49 8b 4e 28 48
>>>>> <snip>
>>>>> (XEN) Xen call trace:
>>>>> (XEN)    [<ffff82d040244cbf>] R common/sched/credit.c#csched_load_balance+0x41/0x877
>>> While this is of course not much to go on, I've still tried to
>>> decipher it, first noticing that it's credit1 being used here. Once
>>> forcing csched_load_balance() non-inline (no idea why it is a separate
>>> function in your build), I can see a sufficiently matching pattern at
>>> approximately the same offset into the function. That's
>>>
>>>      const struct cpupool *c = get_sched_res(cpu)->cpupool;
>>>      ...
>>>      const cpumask_t *online = c->res_valid;
>>>      ...
>>>      BUG_ON(get_sched_res(cpu) != snext->unit->res);
>>>
>>> overlapping, with the crash being in the middle of the quoted lines.
>>> IOW the CPU pool is still NULL for this sched resource. Cc-ing
>>> Jürgen for possible clues ...
>>
>> We've seen it in 4.13, 4.17 and upstream, after Roger extended the
>> existing CPU hotplug testing to try to reproduce the MTRR watchdog
>> failure.  We've found yet another "no irq for handler" from this too.
>>
>> It's always a dereference at NULL+0x10, somewhere within
>> csched_schedule().
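
(Aside for the archive: cr2 == 0x10 is what the `online = c->res_valid`
read would produce with c == NULL, assuming res_valid happens to live
0x10 bytes into struct cpupool in this build -- I haven't checked the
real layout.  A throwaway model with an invented layout, purely to show
the arithmetic:

    #include <stddef.h>
    #include <stdio.h>

    /* Invented layout -- NOT Xen's struct cpupool; only here to show how
     * a NULL pool pointer plus a ->res_valid load faults at 0x10.       */
    struct cpupool_model {
        unsigned int  cpupool_id;  /* 0x00, padded to 0x08 on x86_64     */
        void         *sched;       /* 0x08                               */
        const void   *res_valid;   /* 0x10 -- matches cr2 in the dump    */
    };

    int main(void)
    {
        /* The faulting instruction is effectively "load c->res_valid"
         * with c == NULL, so the page-fault address is just the field
         * offset.                                                       */
        printf("fault address for a NULL pool: %#zx\n",
               offsetof(struct cpupool_model, res_valid));
        return 0;
    }
)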
>
> I think I've found the reason.
>
> In schedule_cpu_add() the cpupool and granularity are set only after
> releasing the scheduling lock. I think those must be inside the locked
> region.
>
> Can you give this one a try (not tested at all)?
>
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 0cb33831d2..babac7aad6 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -3176,6 +3176,8 @@ int schedule_cpu_add(unsigned int cpu, struct cpupool *c)
>
>      sr->scheduler = new_ops;
>      sr->sched_priv = ppriv;
> +    sr->granularity = cpupool_get_granularity(c);
> +    sr->cpupool = c;
>
>      /*
>       * Reroute the lock to the per pCPU lock as /last/ thing. In fact,
> @@ -3188,8 +3190,6 @@ int schedule_cpu_add(unsigned int cpu, struct cpupool *c)
>      /* _Not_ pcpu_schedule_unlock(): schedule_lock has changed! */
>      spin_unlock_irqrestore(old_lock, flags);
>
> -    sr->granularity = cpupool_get_granularity(c);
> -    sr->cpupool = c;
>      /* The  cpu is added to a pool, trigger it to go pick up some work */
>      cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
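
For the archives, here's how I read the window this closes.  A toy
model, not Xen code -- the types are invented and the interleaving is
my reading of schedule_cpu_add(), not something I've traced:

    /* Invented stand-ins, only to illustrate the ordering problem. */
    struct cpupool_model   { const void *res_valid; };
    struct sched_res_model {
        struct cpupool_model *cpupool;   /* NULL until schedule_cpu_add() fills it in */
        unsigned int          granularity;
    };

    /* Writer, old ordering: by this point the schedule_lock has already
     * been rerouted to the per-CPU lock and the old lock dropped, so
     * another pCPU is free to run the scheduler on this sched_resource. */
    static void old_tail_of_schedule_cpu_add(struct sched_res_model *sr,
                                             struct cpupool_model *c)
    {
        /* <-- window: a racing csched_schedule()/csched_load_balance()
         *     can still observe sr->cpupool == NULL here.               */
        sr->granularity = 1;   /* stands in for cpupool_get_granularity(c) */
        sr->cpupool     = c;   /* assigned too late for the racing reader  */
    }

    /* Reader, modelling the faulting access Jan decoded above. */
    static const void *racing_reader(const struct sched_res_model *sr)
    {
        const struct cpupool_model *c = sr->cpupool; /* NULL inside the window */
        return c->res_valid;   /* => the NULL+0x10 dereference in the dump */
    }

    int main(void)
    {
        struct cpupool_model c = { 0 };
        struct sched_res_model sr = { 0 };

        old_tail_of_schedule_cpu_add(&sr, &c);
        (void)racing_reader(&sr);        /* safe here: cpupool already set */
        return 0;
    }

With the patch above, both assignments happen before the lock is
rerouted and released, so any pCPU that subsequently takes the per-CPU
lock sees a fully populated sched_resource.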

This change seems to be standing up to the test, in a way that the
previous form very much didn't.

Thanks for the quick fix.

~Andrew
