> On 04-Jul-2023, at 5:20 PM, Akihiko Odaki <[email protected]> wrote:
>
> On 2023/07/04 20:38, Igor Mammedov wrote:
>> On Sat, 1 Jul 2023 16:28:30 +0900
>> Akihiko Odaki <[email protected]> wrote:
>>> On 2023/07/01 0:29, Michael S. Tsirkin wrote:
>>>> On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote:
>>>>> On 2023/06/30 19:37, Ani Sinha wrote:
>>>>>>
>>>>>>
>>>>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <[email protected]> wrote:
>>>>>>>
>>>>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf()
>>>>>>>>>>>>> instead of checking ARI capability, and that can happen in
>>>>>>>>>>>>> do_pci_register_device().
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also where do you propose we move the check?
>>>>>>>>>>>>>
>>>>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before
>>>>>>>>>>>>> option ROM loading.
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm, I tried this. The issue here is something like this would be
>>>>>>>>>>>> now allowed since the PF has ARI capability:
>>>>>>>>>>>>
>>>>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0
>>>>>>>>>>>>
>>>>>>>>>>>> The above should not be allowed and when used, we do not see the
>>>>>>>>>>>> igb ethernet device from the guest OS.
>>>>>>>>>>>
>>>>>>>>>>> I think it's allowed because it expects you to hotplug function 0
>>>>>>>>>>> later,
>>>>>>>>>>
>>>>>>>>>> This is about the igb device being plugged into the non-zero slot of
>>>>>>>>>> the pci-root-port. The guest OS ignores it.
>>>>>>>>>
>>>>>>>>> yes but if you later add a device with ARI and with next field
>>>>>>>>> pointing
>>>>>>>>> slot 2 guest will suddently find both.
>>>>>>>>
>>>>>>>> Hmm, I tried this:
>>>>>>>>
>>>>>>>> -device pcie-root-port,id=p \
>>>>>>>> -device igb,bus=p,addr=0x2.0x0 \
>>>>>>>> -device igb,bus=p,addr=0x0.0x0 \
>>>>>>>>
>>>>>>>> The guest only found the second igb device not the first. You can try
>>>>>>>> too.
>>>>>>>
>>>>>>> Because next parameter in pcie_ari_init does not match.
>>>>>>
>>>>>> OK send me a command line that I can test it with. I can’t come up with
>>>>>> a case that actually works in practice.
>>>>>
>>>>> I don't think there is one because the code for PCI multifunction does not
>>>>> care ARI. In my opinion, we need yet another check to make non-SR-IOV
>>>>> multifunction and ARI capability mutually exclusive; if a function has the
>>>>> ARI capability and it is not a VF, an attempt to assign non-zero function
>>>>> number for it should fail.
>> is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually
>> exclusive?
>>>>
>>>> Why is that? My understanding is that ARI capable devices should also
>>>> set the multifunction bit in the header. It's not terribly clear from
>>>> the spec though.
>>>
>>> Something like the following will not work properly with ARI-capable
>>> device (think of a as an ARI-capable device):
>>> -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1
>> (I had a crazy idea, to use it like that so we could put more devices
>> on port without resorting to adding extra bridges)
>> Can you elaborate some more why it won't work?
>
> It won't work because the ARI next function number field is fixed. In this
> case, the field of the Function at 0x1.0x0 should point to 0x1.0x1, but it
> doesn’t.
Where does it point to in this case then? 0x1.0x2 becasue of the stride?
> As the result, the Function at 0x1.0x1 won't be recognized.
>
> It's more problematic if some of the Functions are ARI-capable but others are
> not. In my understanding, all Functions in a ARI-capable device need to have
> ARI capability, but that's not enforced.
>
>>> This is because the next function numbers advertised with ARI are not
>>> updated with the multifunction configuration, but they are hardcoded in
>>> the device implementation. In this sense, the traditional (non-SR/IOV)
>>> multifunction mechanism QEMU has will not work with ARI-capable devices.
>>>
>>>>
>>>>> But it should be a distinct check as it will need to check the function
>>>>> number bits.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>>> no?
>>>>>>>>>>>
>>>>>>>>>>> I am quite worried about all this work going into blocking
>>>>>>>>>>> what we think is disallowed configurations. We should have
>>>>>>>>>>> maybe blocked them originally, but now that we didn't
>>>>>>>>>>> there's a non zero chance of regressions,
>>>>>>>>>>
>>>>>>>>>> Sigh,
>>>>>>>>>
>>>>>>>>> There's value in patches 1-4 I think - the last patch helped you find
>>>>>>>>> these. so there's value in this work.
>>>>>>>>>
>>>>>>>>>> no medals here for being brave :-)
>>>>>>>>>
>>>>>>>>> Try removing support for a 3.5mm jack next. Oh wait ...
>>>>>>>>
>>>>>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that
>>>>>>>> the jack is gone (and they were bold enough to do it while Samsung and
>>>>>>>> others still carry the useless port ) :-)
>>>>>
>>>>> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack
>>>>> with
>>>>> a 100-yen earphone. Even people who ported Linux to this machine spent
>>>>> efforts to get the jack to work on Linux ;)
>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> and the benefit
>>>>>>>>>>> is not guaranteed.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> MST