On 26.11.25 12:09, Andrew Cooper wrote:
On 26/11/2025 10:13 am, Jan Beulich wrote:The lock-less list updating isn't safe against racing for_each_vcpu(), unless done (by hardware) in exactly the order written.Fixes: 3037c5a2cb82 ("arm: domain") Signed-off-by: Jan Beulich <[email protected]> --- The Fixes: tag is pretty arbitrary; the issue was becoming non-latent when Arm support was added. (Strictly speaking IA-64 and PPC would have been affected too afaict, just that now that doesn't matter anymore [or, for PPC, not yet, seeing that its support is being re-built from scratch].) I'm not quite happy about prev_id being plain int, but changing it to unsigned (with suitable other adjustments) actually makes gcc15 generate worse code on x86. --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -464,6 +464,7 @@ struct vcpu *vcpu_create(struct domain * prev_id--; BUG_ON(prev_id < 0); v->next_in_list = d->vcpu[prev_id]->next_in_list; + smp_wmb(); d->vcpu[prev_id]->next_in_list = v; }Where's the matching smp_rmb()? There needs to be one for this smp_wmb() to be correct. It's rather rhetorical, because clearly the smp_rmb() needs to be in for_each_vcpu() given your commit message, but we obviously don't want to do that. This list can only be changed once during a VM's lifecycle, and even then it only gets appended to. i.e. this particular piece of logic to splice a vCPU into the middle of a single linked list can be simplified to the second assignment, as the first is always copying NULL.
This is not true for cpu hotplug: the associated idle vcpu will be inserted at the appropriate place in the list. The question is whether we call for_each_vcpu() for the idle domain at all. Juergen
OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key
OpenPGP_signature.asc
Description: OpenPGP digital signature
