On 20.02.2026 10:28, Roger Pau Monné wrote:
> On Tue, Feb 10, 2026 at 11:55:34AM +0100, Jan Beulich wrote:
>> When Dom0 informs us about MMCFG usability, this may change whether
>> extended capabilities are available (accessible) for devices. Zap what
>> might be on record, and re-initialize things.
>>
>> No synchronization is added for the case where devices may already be in
>> use. That'll need sorting when (a) DomU support was added and (b) DomU-s
>> may run already while Dom0 / hwdom still boots (dom0less, Hyperlaunch).
>>
>> Signed-off-by: Jan Beulich <[email protected]>
>> ---
>> vpci_reinit_ext_capabilities()'es return value isn't checked, as it
>> doesn't feel quite right to fail the hypercall because of this. At the
>> same time it also doesn't feel quite right to have the function return
>> "void". Thoughts?
> 
> For the non hardwware domain case we could deassign the device from
> the domain?

Will need to check. De-assigning is generally done only from domctl context,
I think. I'm also uncertain what other things may break (in Xen or the
toolstacks) if we take away a device in such a pretty much uncontrolled way.

> And print a warning message for both cases.

Can do, albeit I'm unsure what "both" refers to - I see only ...

>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -8,6 +8,8 @@
>>  #include <xen/guest_access.h>
>>  #include <xen/iocap.h>
>>  #include <xen/serial.h>
>> +#include <xen/vpci.h>
>> +
>>  #include <asm/current.h>
>>  #include <asm/io_apic.h>
>>  #include <asm/msi.h>
>> @@ -169,7 +171,10 @@ int cf_check physdev_check_pci_extcfg(st
>>  
>>      ASSERT(pdev->seg == info->segment);
>>      if ( pdev->bus >= info->start_bus && pdev->bus <= info->end_bus )
>> +    {
>>          pci_check_extcfg(pdev);
>> +        vpci_reinit_ext_capabilities(pdev);
>> +    }

... this.

>> @@ -376,6 +379,20 @@ void vpci_cleanup_capabilities(struct pc
>>      }
>>  }
>>  
>> +int vpci_reinit_ext_capabilities(struct pci_dev *pdev)
>> +{
>> +    if ( !pdev->vpci )
>> +        return 0;
>> +
>> +    vpci_cleanup_capabilities(pdev, true);
>> +
>> +    if ( vpci_remove_registers(pdev->vpci, PCI_CFG_SPACE_SIZE,
>> +                               PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE) 
>> )
>> +        ASSERT_UNREACHABLE();
>> +
>> +    return vpci_init_capabilities(pdev, true);
> 
> I wonder here, in the context here, where the device is already
> assigned to a domain you likely need to take the vPCI lock to safely
> perform (parts of?) the cleanup and reinit.  Otherwise you could free
> capability data while it's being accessed by the handlers.

The lock isn't recursive, so I fear we'd deadlock if it was taken here.
Furthermore this falls into "DomU support needs dealing with"; right
now we assume Dom0 tells us about its final MCFG verdict ahead of
putting devices in use. Once we need to consider devices already in
use, I think we would further need to pause the owning domain. Also ...

> The only current extended capability (reBAR) doesn't have any internal
> state to free on cleanup, so it's all safe.  But a cleanup like the
> MSI(-X) ones would be racy, as they free the structure without holding
> the vPCI lock.  I think we need to be careful, and possibly adjust the
> cleanup functions so they can tolerate cleanup with possible
> concurrent accesses.

... to cover such. (For something like MSI(-X) it might then further be
necessary to mask/disable interrupts, but hopefully we'll never have to
deal with extended capabilities that would require this.)

Jan

Reply via email to