On 24/09/18 13:18, Paul Durrant wrote: >> -----Original Message----- >> From: Jan Beulich [mailto:[email protected]] >> Sent: 24 September 2018 13:16 >> To: Paul Durrant <[email protected]> >> Cc: Brian Woods <[email protected]>; Suravee Suthikulpanit >> <[email protected]>; Andrew Cooper >> <[email protected]>; Roger Pau Monne <[email protected]>; Wei >> Liu <[email protected]>; Xen-devel <[email protected]> >> Subject: RE: [PATCH 1/2] x86/SVM-IOMMU: Don't opencode memcpy() in >> queue_iommu_command() >> >>>>> On 24.09.18 at 14:09, <[email protected]> wrote: >>>> -----Original Message----- >>>> From: Andrew Cooper >>>> Sent: 24 September 2018 13:06 >>>> To: Paul Durrant <[email protected]>; Xen-devel <xen- >>>> [email protected]> >>>> Cc: Jan Beulich <[email protected]>; Wei Liu <[email protected]>; >> Roger >>>> Pau Monne <[email protected]>; Suravee Suthikulpanit >>>> <[email protected]>; Brian Woods <[email protected]> >>>> Subject: Re: [PATCH 1/2] x86/SVM-IOMMU: Don't opencode memcpy() in >>>> queue_iommu_command() >>>> >>>> On 24/09/18 12:59, Paul Durrant wrote: >>>>>> -----Original Message----- >>>>>> From: Andrew Cooper [mailto:[email protected]] >>>>>> Sent: 24 September 2018 11:55 >>>>>> To: Xen-devel <[email protected]> >>>>>> Cc: Andrew Cooper <[email protected]>; Jan Beulich >>>>>> <[email protected]>; Wei Liu <[email protected]>; Roger Pau Monne >>>>>> <[email protected]>; Paul Durrant <[email protected]>; >> Suravee >>>>>> Suthikulpanit <[email protected]>; Brian Woods >>>>>> <[email protected]> >>>>>> Subject: [PATCH 1/2] x86/SVM-IOMMU: Don't opencode memcpy() in >>>>>> queue_iommu_command() >>>>>> >>>>>> In practice, this allows the compiler to replace the loop with a >> pair >>>> of >>>>>> movs. >>>>>> >>>>>> No functional change. >>>>> Well there is a potential functional change... >>>>> >>>>>> Signed-off-by: Andrew Cooper <[email protected]> >>>>>> --- >>>>>> CC: Jan Beulich <[email protected]> >>>>>> CC: Wei Liu <[email protected]> >>>>>> CC: Roger Pau Monné <[email protected]> >>>>>> CC: Paul Durrant <[email protected]> >>>>>> CC: Suravee Suthikulpanit <[email protected]> >>>>>> CC: Brian Woods <[email protected]> >>>>>> --- >>>>>> xen/drivers/passthrough/amd/iommu_cmd.c | 12 ++++-------- >>>>>> xen/include/asm-x86/hvm/svm/amd-iommu-defs.h | 1 - >>>>>> 2 files changed, 4 insertions(+), 9 deletions(-) >>>>>> >>>>>> diff --git a/xen/drivers/passthrough/amd/iommu_cmd.c >>>>>> b/xen/drivers/passthrough/amd/iommu_cmd.c >>>>>> index 08247fa..c6c0b4f 100644 >>>>>> --- a/xen/drivers/passthrough/amd/iommu_cmd.c >>>>>> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c >>>>>> @@ -24,8 +24,7 @@ >>>>>> >>>>>> static int queue_iommu_command(struct amd_iommu *iommu, u32 cmd[]) >>>>>> { >>>>>> - u32 tail, head, *cmd_buffer; >>>>>> - int i; >>>>>> + uint32_t tail, head; >>>>>> >>>>>> tail = iommu->cmd_buffer.tail; >>>>>> if ( ++tail == iommu->cmd_buffer.entries ) >>>>>> @@ -35,12 +34,9 @@ static int queue_iommu_command(struct amd_iommu >>>> *iommu, >>>>>> u32 cmd[]) >>>>>> >> IOMMU_CMD_BUFFER_HEAD_OFFSET)); >>>>>> if ( head != tail ) >>>>>> { >>>>>> - cmd_buffer = (u32 *)(iommu->cmd_buffer.buffer + >>>>>> - (iommu->cmd_buffer.tail * >>>>>> - IOMMU_CMD_BUFFER_ENTRY_SIZE)); >>>>>> - >>>>>> - for ( i = 0; i < IOMMU_CMD_BUFFER_U32_PER_ENTRY; i++ ) >>>>>> - cmd_buffer[i] = cmd[i]; >>>>>> + memcpy(iommu->cmd_buffer.buffer + >>>>>> + (iommu->cmd_buffer.tail * >> IOMMU_CMD_BUFFER_ENTRY_SIZE), >>>>>> + cmd, IOMMU_CMD_BUFFER_ENTRY_SIZE); >>>>> ...since the built-in memcpy may not guarantee to the copy in 4 byte >>>> chunks in ascending order. >>>> >>>> "No functional change" != "The binary is identical". >>>> >>>> The functionality of queue_iommu_command() does not change, even if >> it's >>>> code generation does. It is just copying bytes into a shared ring, >>>> bounded later by updating the tail pointer. >>> Yes, my point is that the ring is shared and so DMA by the h/w may race. >>> This is clearly not a good situation but the fact that the code >> generation >>> may change may have side effects. >> All writes to the ring occur strictly before the update of the tail >> pointer > ...unless the compiler decides to re-order. There's no barrier.
/sigh - You're right, but the code was equally buggy beforehand. readl()/writel() aren't suitable for how they are used by this code. ~Andrew _______________________________________________ Xen-devel mailing list [email protected] https://lists.xenproject.org/mailman/listinfo/xen-devel
