On 20.02.2026 10:28, Roger Pau Monné wrote:
> On Tue, Feb 10, 2026 at 11:55:34AM +0100, Jan Beulich wrote:
>> When Dom0 informs us about MMCFG usability, this may change whether
>> extended capabilities are available (accessible) for devices. Zap what
>> might be on record, and re-initialize things.
>>
>> No synchronization is added for the case where devices may already be in
>> use. That'll need sorting when (a) DomU support was added and (b) DomU-s
>> may run already while Dom0 / hwdom still boots (dom0less, Hyperlaunch).
>>
>> Signed-off-by: Jan Beulich <[email protected]>
>> ---
>> vpci_reinit_ext_capabilities()'es return value isn't checked, as it
>> doesn't feel quite right to fail the hypercall because of this. At the
>> same time it also doesn't feel quite right to have the function return
>> "void". Thoughts?
>
> For the non hardwware domain case we could deassign the device from
> the domain?
Will need to check. De-assigning is generally done only from domctl context,
I think. I'm also uncertain what other things may break (in Xen or the
toolstacks) if we take away a device in such a pretty much uncontrolled way.
> And print a warning message for both cases.
Can do, albeit I'm unsure what "both" refers to - I see only ...
>> --- a/xen/arch/x86/physdev.c
>> +++ b/xen/arch/x86/physdev.c
>> @@ -8,6 +8,8 @@
>> #include <xen/guest_access.h>
>> #include <xen/iocap.h>
>> #include <xen/serial.h>
>> +#include <xen/vpci.h>
>> +
>> #include <asm/current.h>
>> #include <asm/io_apic.h>
>> #include <asm/msi.h>
>> @@ -169,7 +171,10 @@ int cf_check physdev_check_pci_extcfg(st
>>
>> ASSERT(pdev->seg == info->segment);
>> if ( pdev->bus >= info->start_bus && pdev->bus <= info->end_bus )
>> + {
>> pci_check_extcfg(pdev);
>> + vpci_reinit_ext_capabilities(pdev);
>> + }
... this.
>> @@ -376,6 +379,20 @@ void vpci_cleanup_capabilities(struct pc
>> }
>> }
>>
>> +int vpci_reinit_ext_capabilities(struct pci_dev *pdev)
>> +{
>> + if ( !pdev->vpci )
>> + return 0;
>> +
>> + vpci_cleanup_capabilities(pdev, true);
>> +
>> + if ( vpci_remove_registers(pdev->vpci, PCI_CFG_SPACE_SIZE,
>> + PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE)
>> )
>> + ASSERT_UNREACHABLE();
>> +
>> + return vpci_init_capabilities(pdev, true);
>
> I wonder here, in the context here, where the device is already
> assigned to a domain you likely need to take the vPCI lock to safely
> perform (parts of?) the cleanup and reinit. Otherwise you could free
> capability data while it's being accessed by the handlers.
The lock isn't recursive, so I fear we'd deadlock if it was taken here.
Furthermore this falls into "DomU support needs dealing with"; right
now we assume Dom0 tells us about its final MCFG verdict ahead of
putting devices in use. Once we need to consider devices already in
use, I think we would further need to pause the owning domain. Also ...
> The only current extended capability (reBAR) doesn't have any internal
> state to free on cleanup, so it's all safe. But a cleanup like the
> MSI(-X) ones would be racy, as they free the structure without holding
> the vPCI lock. I think we need to be careful, and possibly adjust the
> cleanup functions so they can tolerate cleanup with possible
> concurrent accesses.
... to cover such. (For something like MSI(-X) it might then further be
necessary to mask/disable interrupts, but hopefully we'll never have to
deal with extended capabilities that would require this.)
Jan