Re: [PATCH v1 net-next 00/15] tc-taprio offload for SJA1105 DSA

Vladimir Oltean Sun, 08 Sep 2019 04:08:08 -0700

Hi Andrew, David,

On Sep 7, 2019, at 3:46 PM, Andrew Lunn <[email protected]> wrote:
>
> On Fri, Sep 06, 2019 at 02:54:03PM +0200, David Miller wrote:
>>
>>  From: Vladimir Oltean <[email protected]>
>>  Date: Mon,  2 Sep 2019 19:25:29 +0300
>>
>>>
>>>  This is the first attempt to submit the tc-taprio offload model for
>>>  inclusion in the net tree.
>>
>>
>>  Someone really needs to review this.
>
> Hi Vladimir
>
> You might have more chance getting this reviewed if you split it up
> into a number of smaller series. Richard could probably review the
> plain PTP changes. Who else has worked on tc-taprio recently? A series
> purely about tc-taprio might be more likely reviewed by a tc-taprio
> person, if it does not contain PTP changes.
>
>     Andrew


I think Richard has been there when the taprio, etf qdiscs, SO_TXTIME
were first defined and developed:
https://patchwork.ozlabs.org/cover/808504/
I expect he is capable of delivering a competent review of the entire
series, possibly way more competent than my patch set itself.

The reason why I'm not splitting it up is because I lose around 10 ns
of synchronization offset when using the hardware-corrected PTPCLKVAL
clock for timestamping rather than the PTPTSCLK free-running counter.
This is mostly due to the fact that SPI interaction is reduced to a
minimum when correcting the switch's PHC in software - OTOH when that
correction translates into SPI writes to PTPCLKADD/PTPCLKVAL and
PTPCLKRATE, that's when things go a bit downhill with the precision.
Now the compromise is fully acceptable if the PTP clock is to be used
as the trigger source for the time-aware scheduler, but the conversion
would be quite pointless with no user to really require the hardware
clock.

Additionally, the 802.1AS PTP profile even calls for switches and
end-stations to use timestamping counters that are free-running, and
scale&rate-correct those in software - due to a perceived "double
feedback loop", or "changing the ruler while measuring with it". Now
I'm no expert at all, but it would be interesting if we went on with
the discussion in the direction of what Linux is currently
understanding by a "free-running" PTP counter. On one hand there's the
timecounter/cyclecounter in the kernel which makes for a
software-corrected PHC, and on the other there's the free_running
option in linuxptp which makes for a "nowhere-corrected" PHC that is
only being used in the E2E_TC and P2P_TC profiles. But user space
otherwise has no insight into the PHC implementation from the kernel,
and "free_running" from ptp4l can't really be used to implement the
synchronization mechanism required by 802.1AS.

To me, the most striking aspect is that this particular recommendation
from 802.1AS is at direct odds with 802.1Qbv (time-based egress) /
802.1Qci (time-based ingress policing) which clearly require a PTP
counter in the NIC that ticks to the wall clock, and not to a random
free-running time since boot up. I simply can't seem to reconcile the
two.
What this particular switch does is that it permits RX and TX
timestamps to be taken in either corrected or uncorrected timebases
(but unfortunately not both at the same time). I think the hardware
designers' idea was to take timestamps off the uncorrected clock
(PTPTSCLK) and then do a sort of phc2sys-to-itself: write the
software-corrected value of the timecounter/cyclecounter into the
PTPCLKVAL hardware registers which get used for Qbv/Qci.
Actually I hate to use those terms when talking about SJA1105 hardware
support, since it's more "in the style of" IEEE rather than strict
compliance (timing of the design vs the standard might have played a
role as well).

But let's leave 802.1AS aside for a second - that's not what the patch
set is about, but rather a bit of background on why there are 2 PTP
clocks in this switch, and why I'm switching from one to the other.
Richard didn't really warm up to the phc2sys-to-itself idea in the
past, and opted for simplicity: just use the hardware-corrected
PTPCLKVAL for everything, which is exactly what I'm doing as of now.

The only people whom I know are working on TSN stuff are mostly
entrenched in papers, standards and generally in the hardware-only
mentality. There is obviously a lot to be done for Linux to be a
proper TSN endpoint, and RT is a big one. For a switch in particular,
things are a bit easier due to the fact that it just needs to ensure
the real-time guarantees of a frame that was supposedly already
delivered in-band with the schedule. And there's no other way to do
that rather than through a hardware offload - otherwise the software
tc-taprio would only shape the frames egressed by the management CPU
of the switch. The tc-taprio offload for a switch only makes sense
when taken together with the bridging offload, if you will.

I "dared" to submit this for merging maybe because I don't see the
subtleties that prevent it from going in, at least for a switch - it
just works and does the job. I would have loved to see this in 5.4
just so I would have to lug around a bit less patches when finally
starting to evaluate the endpoint side of things with the 5.4-rt
patch. But nonetheless, there's no hurry and getting a healthy
discussion going is surely more important than the patches themselves
are. On the other hand there needs to be a balance, and just talking
with no code is no good either - fixes, improvements, rework can
always come later once we commit to the basic offload model.

I happen to be around at Plumbers during the following days to learn
what else is going on in the Linux community, and develop a more
complete mental model for myself for how TSN fits in with all of that.
If anybody happens to also be around, I'd be more than happy to talk.

Regards,
-Vladimir

Re: [PATCH v1 net-next 00/15] tc-taprio offload for SJA1105 DSA

Reply via email to