On 18.02.2026 15:33, Daniel P. Smith wrote:
> On 2/17/26 04:34, Jan Beulich wrote:
>> On 16.02.2026 22:57, Daniel P. Smith wrote:
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -210,7 +210,7 @@ static void set_domain_state_info(struct xen_domctl_get_domain_state *info,
>>> int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>> domid_t *domid)
>>> {
>>> - unsigned int dom;
>>> + unsigned int dom = 0;
>>> int rc = -ENOENT;
>>> struct domain *hdl;
>>>
>>> @@ -219,6 +219,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>
>>> if ( d )
>>> {
>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>> + if ( rc )
>>> + return rc;
>>> +
>>> set_domain_state_info(info, d);
>>>
>>> return 0;
>>> @@ -238,10 +242,10 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>
>>> while ( dom_state_changed )
>>> {
>>> - dom = find_first_bit(dom_state_changed, DOMID_MASK + 1);
>>> + dom = find_next_bit(dom_state_changed, DOMID_MASK + 1, dom);
>>> if ( dom >= DOMID_FIRST_RESERVED )
>>> break;
>>> - if ( test_and_clear_bit(dom, dom_state_changed) )
>>> + if ( test_bit(dom, dom_state_changed) )
>>> {
>>> *domid = dom;
>>
>> This is problematic wrt other work (already talked about in the distant past,
>> but sadly only making little progress) towards trying to pull some of the
>> sub-ops out of the domctl-locked region. This subop is one of the prime
>> candidates, yet only if the test_and_clear_bit() remains here.
>
> Okay, but we can't be clearing the bit if the src domain doesn't have
> access. When considering that xsm_domctl() does a no-op check for
> XEN_DOMCTL_get_domain_state, deferring to xsm_get_domain_state(), then
> any domain could invoke the OP with DOMID_INVALID and clear the bit
> before access is checked.
>
> If you want to ensure atomic operations on the bit field, while I am not
> a fan of this, a combination with set_bit() could be done. Let the
> test_and_clear_bit() remain and then if access check fails, use
> set_bit() to put it back. Would that be sufficient for your objective?

No, that could then confuse a legitimate (for that domain) caller. IOW
you would still build upon the domctl lock serializing things. I think
you want to do the XSM check first, and only then use test_and_clear_bit().

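To illustrate the ordering only, here is a rough standalone sketch. It is not
Xen code: may_see(), test_and_clear(), next_changed(), the 64-bit "changed"
word and the "allowed" mask are all made-up stand-ins (for the XSM hook,
test_and_clear_bit() and dom_state_changed respectively), and C11 atomics
take the place of Xen's bitops. The point is that a bit the caller may not
see is never cleared, so there is nothing to put back afterwards.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_DOMS 64u

/* Hypothetical stand-in for the dom_state_changed bitmap. */
static _Atomic uint64_t changed;

/* Hypothetical stand-in for the XSM check: one permission bit per domain. */
static bool may_see(uint64_t allowed, unsigned int dom)
{
    return (allowed >> dom) & 1;
}

/* Atomically clear bit 'dom'; report whether it was set beforehand. */
static bool test_and_clear(unsigned int dom)
{
    uint64_t mask = UINT64_C(1) << dom;

    return atomic_fetch_and(&changed, ~mask) & mask;
}

/* Return the first changed domain the caller may see, or -1 if none. */
static int next_changed(uint64_t allowed)
{
    for ( unsigned int dom = 0; dom < NR_DOMS; dom++ )
    {
        /* Cheap peek; the atomic operation below is what decides. */
        if ( !((atomic_load(&changed) >> dom) & 1) )
            continue;

        /* Permission check first: a denied bit stays set for others. */
        if ( !may_see(allowed, dom) )
            continue;

        /* Only a permitted caller consumes the bit. */
        if ( test_and_clear(dom) )
            return (int)dom;
    }

    return -1;
}
```

With bits 3 and 5 set and a caller permitted to see only domain 5, the first
call yields 5 and leaves bit 3 in place for whoever is allowed to see it.
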
>>> @@ -249,6 +253,15 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>>
>>> if ( d )
>>> {
>>> + rc = xsm_get_domain_state(XSM_XS_PRIV, d);
>>> + if ( rc )
>>> + {
>>> + rcu_unlock_domain(d);
>>> + rc = -ENOENT;
>>
>> As you don't otherwise use xsm_get_domain_state()'s return value, the need
>> for this assignment can be eliminated by putting the function call straight
>> in the if(). Then again, to address the remark above, overall code structure
>> will need to change quite a bit anyway (so the remark here may be moot).
>
> I can drop the use of rc here and inline it.
>
>>> + dom++;
>>
>> It may be nice to eliminate the need to have this in two places (here and ...
>>
>>> + continue;
>>> + }
>>> +
>>> set_domain_state_info(info, d);
>>>
>>> rcu_unlock_domain(d);
>>> @@ -256,10 +269,13 @@ int get_domain_state(struct xen_domctl_get_domain_state *info, struct domain *d,
>>> else
>>> memset(info, 0, sizeof(*info));
>>>
>>> + clear_bit(dom, dom_state_changed);
>>> rc = 0;
>>>
>>> break;
>>> }
>>> +
>>> + dom++;
>>> }
>>
>> ... here), by having the variable's initializer be -1 and then using dom + 1
>> in the find_next_bit() invocation.
>
> If you want it done this way, there are two options: make dom no longer
> unsigned, or be willing to allow unsigned int overflow. If you agree, I
> would go with the former and make it a plain int, as that covers the
> range of valid domids.

I wouldn't outright nak use of plain int, but I'm putting in effort to remove
such undue uses of that type. Unsigned overflow is well-defined aiui, so I
see no reason why the variable can't remain "unsigned int".

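A small standalone sketch of what I mean, with the caveat that find_next()
and count_set() below are made-up helpers over a single 64-bit word, merely
mimicking the shape of find_next_bit(); they are not the Xen functions.
Converting -1 to "unsigned int" gives UINT_MAX, and UINT_MAX + 1 wraps to 0
by definition, so the first search starts at bit 0 and the increment lives in
exactly one place.

```c
#define NBITS 64u

/*
 * Hypothetical helper: index of the first set bit at or above 'start',
 * or 'size' if there is none.
 */
static unsigned int find_next(unsigned long long map, unsigned int size,
                              unsigned int start)
{
    for ( unsigned int i = start; i < size; i++ )
        if ( (map >> i) & 1 )
            return i;

    return size;
}

/* Walk all set bits; the only "+ 1" is in the find_next() call. */
static unsigned int count_set(unsigned long long map)
{
    unsigned int dom = -1;   /* UINT_MAX; dom + 1 wraps to 0 */
    unsigned int n = 0;

    while ( (dom = find_next(map, NBITS, dom + 1)) < NBITS )
        n++;

    return n;
}
```

Bit 0 is visited on the first iteration because of the wrap, and the loop
terminates once find_next() returns NBITS.
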
Jan