On Mon, Oct 24, 2022 at 01:04:01PM +0200, Jan Beulich wrote:
> On 20.10.2022 11:46, Roger Pau Monne wrote:
> > It's possible for a device to be assigned to a domain but have no
> > vpci structure if vpci_process_pending() failed and called
> > vpci_remove_device() as a result.  The unconditional accesses done by
> > vpci_{read,write}() and vpci_remove_device() to pdev->vpci would
> > then trigger a NULL pointer dereference.
> > 
> > Add checks for pdev->vpci presence in the affected functions.
> > 
> > Fixes: 9c244fdef7 ('vpci: add header handlers')
> > Signed-off-by: Roger Pau Monné <[email protected]>
> 
> Reviewed-by: Jan Beulich <[email protected]>
> 
> I wonder though whether these changes are enough. Is
> vpci_process_pending() immune to a pdev losing its ->vpci?

I think this is safe so far because the only place where
vpci_remove_device() gets called that doesn't also deassign the device
from the domain is vpci_process_pending(), and in that error path it
also clears any pending work.  Since the device no longer has ->vpci
handlers  no further calls to vpci_process_pending() can happen.

> Furthermore msix_find() iterates over d->arch.hvm.msix_tables, which
> looks to only ever be added to. Doesn't this list need pruning by
> vpci_remove_device()? I've noticed this only because of looking at
> derefs of ->vpci in msix.c - I don't think I can easily see that all
> of those derefs are once again immune to a pdev losing its ->vpci.

I think you are correct, we are missing a
list_del(&pdev->vpci->msix->next) in vpci_remove_device().  We will
however have locking issues with this, as msix_find() doesn't take any
locks, neither do it's callers.  I guess this will be fixed as part of
the lager add vPCI locking series.  Will add another patch to the
series with the MSIX table removal.

Thanks, Roger.

Reply via email to