On 24/09/18 13:18, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich [mailto:[email protected]]
>> Sent: 24 September 2018 13:16
>> To: Paul Durrant <[email protected]>
>> Cc: Brian Woods <[email protected]>; Suravee Suthikulpanit
>> <[email protected]>; Andrew Cooper
>> <[email protected]>; Roger Pau Monne <[email protected]>; Wei
>> Liu <[email protected]>; Xen-devel <[email protected]>
>> Subject: RE: [PATCH 1/2] x86/SVM-IOMMU: Don't opencode memcpy() in
>> queue_iommu_command()
>>
>>>>> On 24.09.18 at 14:09, <[email protected]> wrote:
>>>>  -----Original Message-----
>>>> From: Andrew Cooper
>>>> Sent: 24 September 2018 13:06
>>>> To: Paul Durrant <[email protected]>; Xen-devel <xen-
>>>> [email protected]>
>>>> Cc: Jan Beulich <[email protected]>; Wei Liu <[email protected]>;
>> Roger
>>>> Pau Monne <[email protected]>; Suravee Suthikulpanit
>>>> <[email protected]>; Brian Woods <[email protected]>
>>>> Subject: Re: [PATCH 1/2] x86/SVM-IOMMU: Don't opencode memcpy() in
>>>> queue_iommu_command()
>>>>
>>>> On 24/09/18 12:59, Paul Durrant wrote:
>>>>>> -----Original Message-----
>>>>>> From: Andrew Cooper [mailto:[email protected]]
>>>>>> Sent: 24 September 2018 11:55
>>>>>> To: Xen-devel <[email protected]>
>>>>>> Cc: Andrew Cooper <[email protected]>; Jan Beulich
>>>>>> <[email protected]>; Wei Liu <[email protected]>; Roger Pau Monne
>>>>>> <[email protected]>; Paul Durrant <[email protected]>;
>> Suravee
>>>>>> Suthikulpanit <[email protected]>; Brian Woods
>>>>>> <[email protected]>
>>>>>> Subject: [PATCH 1/2] x86/SVM-IOMMU: Don't opencode memcpy() in
>>>>>> queue_iommu_command()
>>>>>>
>>>>>> In practice, this allows the compiler to replace the loop with a
>> pair
>>>> of
>>>>>> movs.
>>>>>>
>>>>>> No functional change.
>>>>> Well there is a potential functional change...
>>>>>
>>>>>> Signed-off-by: Andrew Cooper <[email protected]>
>>>>>> ---
>>>>>> CC: Jan Beulich <[email protected]>
>>>>>> CC: Wei Liu <[email protected]>
>>>>>> CC: Roger Pau MonnĂ© <[email protected]>
>>>>>> CC: Paul Durrant <[email protected]>
>>>>>> CC: Suravee Suthikulpanit <[email protected]>
>>>>>> CC: Brian Woods <[email protected]>
>>>>>> ---
>>>>>>  xen/drivers/passthrough/amd/iommu_cmd.c      | 12 ++++--------
>>>>>>  xen/include/asm-x86/hvm/svm/amd-iommu-defs.h |  1 -
>>>>>>  2 files changed, 4 insertions(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/xen/drivers/passthrough/amd/iommu_cmd.c
>>>>>> b/xen/drivers/passthrough/amd/iommu_cmd.c
>>>>>> index 08247fa..c6c0b4f 100644
>>>>>> --- a/xen/drivers/passthrough/amd/iommu_cmd.c
>>>>>> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c
>>>>>> @@ -24,8 +24,7 @@
>>>>>>
>>>>>>  static int queue_iommu_command(struct amd_iommu *iommu, u32 cmd[])
>>>>>>  {
>>>>>> -    u32 tail, head, *cmd_buffer;
>>>>>> -    int i;
>>>>>> +    uint32_t tail, head;
>>>>>>
>>>>>>      tail = iommu->cmd_buffer.tail;
>>>>>>      if ( ++tail == iommu->cmd_buffer.entries )
>>>>>> @@ -35,12 +34,9 @@ static int queue_iommu_command(struct amd_iommu
>>>> *iommu,
>>>>>> u32 cmd[])
>>>>>>
>> IOMMU_CMD_BUFFER_HEAD_OFFSET));
>>>>>>      if ( head != tail )
>>>>>>      {
>>>>>> -        cmd_buffer = (u32 *)(iommu->cmd_buffer.buffer +
>>>>>> -                             (iommu->cmd_buffer.tail *
>>>>>> -                             IOMMU_CMD_BUFFER_ENTRY_SIZE));
>>>>>> -
>>>>>> -        for ( i = 0; i < IOMMU_CMD_BUFFER_U32_PER_ENTRY; i++ )
>>>>>> -            cmd_buffer[i] = cmd[i];
>>>>>> +        memcpy(iommu->cmd_buffer.buffer +
>>>>>> +               (iommu->cmd_buffer.tail *
>> IOMMU_CMD_BUFFER_ENTRY_SIZE),
>>>>>> +               cmd, IOMMU_CMD_BUFFER_ENTRY_SIZE);
>>>>> ...since the built-in memcpy may not guarantee to the copy in 4 byte
>>>> chunks in ascending order.
>>>>
>>>> "No functional change" != "The binary is identical".
>>>>
>>>> The functionality of queue_iommu_command() does not change, even if
>> it's
>>>> code generation does.  It is just copying bytes into a shared ring,
>>>> bounded later by updating the tail pointer.
>>> Yes, my point is that the ring is shared and so DMA by the h/w may race.
>>> This is clearly not a good situation but the fact that the code
>> generation
>>> may change may have side effects.
>> All writes to the ring occur strictly before the update of the tail
>> pointer
> ...unless the compiler decides to re-order. There's no barrier.

/sigh - You're right, but the code was equally buggy beforehand.

readl()/writel() aren't suitable for how they are used by this code.

~Andrew

_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to