Hello Roger,

Roger Pau Monné <[email protected]> writes:

> On Tue, Aug 29, 2023 at 11:19:42PM +0000, Volodymyr Babchuk wrote:
>> Add per-domain d->pci_lock that protects access to
>> d->pdev_list. Purpose of this lock is to give guarantees to VPCI code
>> that underlying pdev will not disappear under feet. This is a rw-lock,
>> but this patch adds only write_lock()s. There will be read_lock()
>> users in the next patches.
>> 
>> This lock should be taken in write mode every time d->pdev_list is
>> altered. This covers both accesses to d->pdev_list and accesses to
>> pdev->domain_list fields.
>
> Why do you mention pdev->domain_list here?  I don't think the lock
> covers accesses to pdev->domain_list, unless that domain_list field
> happens to be part of the linked list in d->pdev_list.  I find it kind
> of odd to mention here.

You are correct. I was referring very specific case in reassign_device()
IOMMU functions. It seemed important for me when I wrote this. But you
are correct, no need to mention pdev->domain_list explicitly.

>
>> All write accesses also should be protected
>> by pcidevs_lock() as well. Idea is that any user that wants read
>> access to the list or to the devices stored in the list should use
>> either this new d->pci_lock or old pcidevs_lock(). Usage of any of
>> this two locks will ensure only that pdev of interest will not
>> disappear from under feet and that the pdev still will be assigned to
>> the same domain. Of course, any new users should use pcidevs_lock()
>> when it is appropriate (e.g. when accessing any other state that is
>> protected by the said lock). In case both the newly introduced
>> per-domain rwlock and the pcidevs lock is taken, the later must be
>> acquired first.
>> 
>> Any write access to pdev->domain_list should be protected by both
>> pcidevs_lock() and d->pci_lock in the write mode.
>> 
>> Suggested-by: Roger Pau Monné <[email protected]>
>> Suggested-by: Jan Beulich <[email protected]>
>> Signed-off-by: Volodymyr Babchuk <[email protected]>
>> 
>> ---
>> 
>> Changes in v9:
>>  - returned back "pdev->domain = target;" in AMD IOMMU code
>>  - used "source" instead of pdev->domain in IOMMU functions
>>  - added comment about lock ordering in the commit message
>>  - reduced locked regions
>>  - minor changes non-functional changes in various places
>> 
>> Changes in v8:
>>  - New patch
>> 
>> Changes in v8 vs RFC:
>>  - Removed all read_locks after discussion with Roger in #xendevel
>>  - pci_release_devices() now returns the first error code
>>  - extended commit message
>>  - added missing lock in pci_remove_device()
>>  - extended locked region in pci_add_device() to protect list_del() calls
>> ---
>>  xen/common/domain.c                         |  1 +
>>  xen/drivers/passthrough/amd/pci_amd_iommu.c |  9 ++-
>>  xen/drivers/passthrough/pci.c               | 71 +++++++++++++++++----
>>  xen/drivers/passthrough/vtd/iommu.c         |  9 ++-
>>  xen/include/xen/sched.h                     |  1 +
>>  5 files changed, 78 insertions(+), 13 deletions(-)
>> 
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index 304aa04fa6..9b04a20160 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -651,6 +651,7 @@ struct domain *domain_create(domid_t domid,
>>  
>>  #ifdef CONFIG_HAS_PCI
>>      INIT_LIST_HEAD(&d->pdev_list);
>> +    rwlock_init(&d->pci_lock);
>>  #endif
>>  
>>      /* All error paths can depend on the above setup. */
>> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
>> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> index bea70db4b7..d219bd9453 100644
>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> @@ -476,7 +476,14 @@ static int cf_check reassign_device(
>>  
>>      if ( devfn == pdev->devfn && pdev->domain != target )
>>      {
>> -        list_move(&pdev->domain_list, &target->pdev_list);
>> +        write_lock(&source->pci_lock);
>> +        list_del(&pdev->domain_list);
>> +        write_unlock(&source->pci_lock);
>> +
>> +        write_lock(&target->pci_lock);
>> +        list_add(&pdev->domain_list, &target->pdev_list);
>> +        write_unlock(&target->pci_lock);
>> +
>>          pdev->domain = target;
>
> While I don't think this is strictly an issue right now, it would be
> better to set pdev->domain before the device is added to domain_list.
> A pattern like:
>
> read_lock(d->pci_lock);
> for_each_pdev(d, pdev)
>     foo(pdev->domain);
> read_unlock(d->pci_lock);
>
> Wouldn't work currently if the pdev is added to domain_list before the
> pdev->domain field is updated to reflect the new owner.

Agree. I moved `pdev->domain = target` so it sits between list_del() and
list_add() calls


-- 
WBR, Volodymyr

Reply via email to