On 08.02.2022 11:22, Oleksandr Andrushchenko wrote:
> 
> 
> On 08.02.22 12:09, Jan Beulich wrote:
>> On 08.02.2022 10:55, Oleksandr Andrushchenko wrote:
>>>
>>> On 08.02.22 11:44, Jan Beulich wrote:
>>>> On 08.02.2022 10:27, Oleksandr Andrushchenko wrote:
>>>>> On 08.02.22 11:13, Jan Beulich wrote:
>>>>>> On 08.02.2022 09:32, Oleksandr Andrushchenko wrote:
>>>>>>> On 07.02.22 18:28, Jan Beulich wrote:
>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>>> @@ -1507,6 +1511,8 @@ static int assign_device(struct domain *d, u16 
>>>>>>>>> seg, u8 bus, u8 devfn, u32 flag)
>>>>>>>>>                              pci_to_dev(pdev), flag);
>>>>>>>>>          }
>>>>>>>>>      
>>>>>>>>> +    rc = vpci_assign_device(d, pdev);
>>>>>>>>> +
>>>>>>>>>       done:
>>>>>>>>>          if ( rc )
>>>>>>>>>              printk(XENLOG_G_WARNING "%pd: assign (%pp) failed 
>>>>>>>>> (%d)\n",
>>>>>>>> There's no attempt to undo anything in the case of getting back an
>>>>>>>> error. ISTR this being deemed okay on the basis that the tool stack
>>>>>>>> would then take whatever action, but whatever it is that is supposed
>>>>>>>> to deal with errors here wants spelling out in the description.
>>>>>>> Why? I don't change the previously expected decision and implementation
>>>>>>> of the assign_device function: I use error paths as they were used 
>>>>>>> before
>>>>>>> for the existing code. So, I see no clear reason to stress that the 
>>>>>>> existing
>>>>>>> and new code relies on the toolstack
>>>>>> Saying half a sentence on this is helping review.
>>>>> Ok
>>>>>>>> What's important is that no caller up the call tree may be left with
>>>>>>>> the impression that the device is still owned by the original
>>>>>>>> domain. With how you have it, the device is going to be owned by the
>>>>>>>> new domain, but not really usable.
>>>>>>> This is not true: vpci_assign_device will call vpci_deassign_device
>>>>>>> internally if it fails. So, the device won't be assigned in this case
>>>>>> No. The device is assigned to whatever pdev->domain holds. Calling
>>>>>> vpci_deassign_device() there merely makes sure that the device will
>>>>>> have _no_ vPCI data and hooks in place, rather than something
>>>>>> partial.
>>>>> So, this patch is only dealing with vpci assign/de-assign
>>>>> And it rolls back what it did in case of a failure
>>>>> It also returns rc in assign_device to signal it has failed
>>>>> What else is expected from this patch??
>>>> Until now if assign_device() returns an error, this tells the caller
>>>> that the device did not change ownership;
>>> Not sure this is the case:
>>>       if ( (rc = iommu_call(hd->platform_ops, assign_device, d, devfn,
>>>                             pci_to_dev(pdev), flag)) )
>>> iommu_call can leave the new ownership even now without
>>> vpci_assign_device.
>> Did you check the actual hook functions for when exactly the ownership
>> change happens. For both VT-d and AMD it is the last thing they do,
>> when no error can occur anymore.
> This functionality does not exist for Arm yet, so this is up to the
> future series to add that.
> 
> WRT to the existing code:
> 
> static int amd_iommu_assign_device(struct domain *d, u8 devfn,
>                                     struct pci_dev *pdev,
>                                     u32 flag)
> {
>      if ( !rc )
>          rc = reassign_device(pdev->domain, d, devfn, pdev); <<<<< this will 
> set pdev->domain
> 
>      if ( rc && !is_hardware_domain(d) )
>      {
>          int ret = amd_iommu_reserve_domain_unity_unmap(
>                        d, ivrs_mappings[req_id].unity_map);
> 
>          if ( ret )
>          {
>              printk(XENLOG_ERR "AMD-Vi: "
>                     "unity-unmap for %pd/%04x:%02x:%02x.%u failed (%d)\n",
>                     d, pdev->seg, pdev->bus,
>                     PCI_SLOT(devfn), PCI_FUNC(devfn), ret);
>              domain_crash(d);
>          }
> So....
> 
> This is IMO wrong in the first place to let IOMMU code assign pdev->domain.
> This is something that needs to be done by the PCI code itself and
> not relying on each IOMMU callback implementation
>>
>>   My understanding is that the roll-back is
>>> expected to be performed by the toolstack and vpci_assign_device
>>> doesn't prevent that by returning rc. Even more, before we discussed
>>> that it would be good for vpci_assign_device to try recovering from
>>> a possible error early which is done by calling vpci_deassign_device
>>> internally.
>> Yes, but that's only part of it. It at least needs considering what
>> effects have resulted from operations prior to vpci_assign_device().
> Taking into account the code snippet above: what is your expectation
> from this patch with this respect?

You did note the domain_crash() in there, didn't you? The snippet above
still matches the "device not assigned to an alive DomU" criteria (which
can be translated to "no exposure of a device to an untrusted entity in
case of error"). Such domain_crash() uses aren't nice, and I'd prefer to
see them go away, but said property needs to be retained with any
alternative solutions.

Jan


Reply via email to