On Mon, Jan 23, 2017 at 1:44 AM, David Laight <david.lai...@aculab.com> wrote:
> Alexander Duyck
>> Sent: 19 January 2017 15:55
> ...
>> >> The Relaxed Ordering attribute doesn't get applied across the board.
>> >> It ends up being limited to a subset of the transactions if I recall
>> >> correctly.  In this case it is the Tx descriptor write back, and the
>> >> Rx data write back.  We don't apply the RO bit to any other
>> >> transactions.
>> >>
>> >> In the case of Tx descriptor there is no harm in allowing it to be
>> >> reordered because we only really read the DD bit so we don't care
>> >> about the ordering of the write back.  In the case of the Rx data the
>> >> Rx descriptor essentially acts as a flush since it is sent without the
>> >> RO bit set.  So all the writes before it must be completed before the
>> >> Rx descriptor write back.
>> >
>> > In which case why not set it unconditionally for all architectures?
>> >
>> > I'm surprised (I often am) that allowing those re-orderings makes
>> > any significant difference.
>> > Unfortunately you need a PCIe analyser to see what is really happening
>> > and they don't come cheap.
>> >
>> > What I do vaguely remember is that some hosts don't always implement
>> > the 'normal' re-ordering of reads and read completions.
>> > Re-ordering of reads allows descriptor reads to overtake transmit
>> > traffic which is likely to make a difference.
>>
>> I think part of the issue, at least in the case of SPARC, is that the
>> handling of the memory writes in the PCIe root complex is impacted by
>> the RO attribute.  On the bus itself it doesn't matter much, but at
>> the root complex it can become expensive to have to wait on a partial
>> write to complete while there are other writes pending.  This is why
>> the IOMMU for SPARC now has a WEAK_ORDERING attribute you can add so
>> that it can write the data in whatever order it wants in relation to
>> other writes in that region.
>
> I hope the IOMMU only ever reorders writes that have the RO bit set.

I'm assuming it only applies to DMA regions mapped with
DMA_ATTR_WEAK_ORDERING.  Since drivers have to request that attribute
explicitly, it is likely to apply only to DMA regions whose writes
could have the RO bit set.
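
To make the ordering argument from earlier in the thread concrete, here
is a toy model of it.  This is only a sketch, not driver or IOMMU code:
the Write type and the helper names are made up for illustration, and
the single rule it encodes (a later write may pass an earlier one only
if the later write carries the RO bit) is my simplified reading of the
posted-write ordering rules, not a statement about what any particular
root complex does.

```python
from itertools import permutations
from typing import NamedTuple


class Write(NamedTuple):
    """Hypothetical stand-in for one posted memory write."""
    name: str
    relaxed: bool  # True if the RO attribute is set on this write


def is_legal_order(issued, completed):
    """Return True if `completed` is an allowed completion order.

    Simplified rule: a write may complete ahead of a write issued
    before it only if the later-issued write has RO set.
    """
    pos = {w.name: i for i, w in enumerate(completed)}
    for i, earlier in enumerate(issued):
        for later in issued[i + 1:]:
            passed = pos[later.name] < pos[earlier.name]
            if passed and not later.relaxed:
                return False
    return True


def legal_orders(issued):
    """Enumerate every completion order the simplified rule allows."""
    return [p for p in permutations(issued) if is_legal_order(issued, p)]


# Rx path as described above: data writes carry RO, the descriptor
# write back does not.
rx = [Write("data0", True), Write("data1", True), Write("desc", False)]

for order in legal_orders(rx):
    print(" -> ".join(w.name for w in order))
```

In this model the two data writes can land in either order, but the
descriptor write back is always last, which is the "acts as a flush"
behaviour.  If the descriptor write also had RO set, the model would
allow it to land before the data it describes.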

> Has anyone tried cache invalidates on the rx buffers?
> Might make the writes less expensive.
> Or is the issue with NUMA rather than cache?

I don't know.  This is all very SPARC-specific and I haven't done any
of the work on it.  You might try checking with the people who
introduced DMA_ATTR_WEAK_ORDERING for the SPARC architecture.

- Alex
