On Sun, 15 Sep 2019 at 04:53, Vladimir Oltean <olte...@gmail.com> wrote: > > This is the second attempt to submit the tc-taprio offload model for > inclusion in the net tree. The sja1105 switch driver will provide the > first implementation of the offload. Only the bare minimum is added: > > - The offload model and a DSA pass-through > - The hardware implementation > - The interaction with the netdev queues in the tagger code > - Documentation > > What has been removed from the first attempt is support for > PTP-as-clocksource in sja1105. This will be added as soon as the > offload model is settled. > > Vinicius Costa Gomes (1): > taprio: Add support for hardware offloading > > Vladimir Oltean (6): > net: dsa: Pass ndo_setup_tc slave callback to drivers > net: dsa: sja1105: Add static config tables for scheduling > net: dsa: sja1105: Advertise the 8 TX queues > net: dsa: sja1105: Make HOSTPRIO a kernel config > net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio > offload > docs: net: dsa: sja1105: Add info about the time-aware scheduler > > Documentation/networking/dsa/sja1105.rst | 90 ++++ > drivers/net/dsa/sja1105/Kconfig | 17 + > drivers/net/dsa/sja1105/Makefile | 4 + > drivers/net/dsa/sja1105/sja1105.h | 6 + > .../net/dsa/sja1105/sja1105_dynamic_config.c | 8 + > drivers/net/dsa/sja1105/sja1105_main.c | 28 +- > .../net/dsa/sja1105/sja1105_static_config.c | 167 +++++++ > .../net/dsa/sja1105/sja1105_static_config.h | 48 +- > drivers/net/dsa/sja1105/sja1105_tas.c | 427 ++++++++++++++++++ > drivers/net/dsa/sja1105/sja1105_tas.h | 42 ++ > include/linux/netdevice.h | 1 + > include/net/dsa.h | 2 + > include/net/pkt_sched.h | 23 + > include/uapi/linux/pkt_sched.h | 3 +- > net/dsa/slave.c | 12 +- > net/dsa/tag_sja1105.c | 3 +- > net/sched/sch_taprio.c | 409 +++++++++++++++-- > 17 files changed, 1237 insertions(+), 53 deletions(-) > create mode 100644 drivers/net/dsa/sja1105/sja1105_tas.c > create mode 100644 drivers/net/dsa/sja1105/sja1105_tas.h > > -- > > For those who want to follow along with the hardware implementation, the > manual is here: > https://www.nxp.com/docs/en/user-guide/UM10944.pdf > > Notable changes in v2: > - Made the series independent from PTP (which is temporarily removed) > - Changed the meaning of the gate_mask - it is now acting on traffic > classes even in the view exposed by taprio to drivers. > - Removed the next_sched hrtimer. > - Summarized one of the responses given to Vinicius into a new > documentation section. > > The first version of the net-next patch series can be found here: > > https://www.spinics.net/lists/netdev/msg597214.html > > Changes in the first version of the net-next series compared to RFC v2: > - Made "flags 1" and "flags 2" mutually exclusive in the taprio qdisc > - Moved taprio_enable_offload and taprio_disable_offload out of atomic > context - spin_lock_bh(qdisc_lock(sch)). This allows drivers that > implement the ndo_setup_tc to sleep and for taprio memory to be > allocated with GFP_KERNEL. The only thing that was kept under the > spinlock is the assignment of the q->dequeue and q->peek pointers. > - Finally making proper use of own API - added a taprio_alloc helper to > avoid passing stack memory to drivers. > > The second version of the RFC is at: > https://www.spinics.net/lists/netdev/msg596663.html > > Changes in v2 of the RFC since v1: > - Adapted the taprio offload patch to work by specifying "flags 2" to > the iproute2-next tc. At the moment I don't clearly understand whether > the full offload and the txtime assist ("flags 1") are mutually > exclusive or not (i.e. whether a "flags 3" mode should be rejected, > which it currently isn't). > - Added reference counting to the taprio offload structure. Maybe the > function names and placement could have been better though. As for the > other complaint (cycle time calculation) it got fixed in the taprio > parser in the meantime. > - Converted sja1105 to use the hardware PTP registers, and save/restore > the PTP time across resets. > - Made the DSA callback for ndo_setup_tc a bit more generic, but I don't > know whether it fulfills expectations. Drivers still can't do blocking > operations in its execution context. > - Added a state machine for starting/stopping the scheduler based on the > last command run on the PTP clock. > > The first RFC from July can be seen at: > https://lists.openwall.net/netdev/2019/07/07/81 > > Original cover letter: > > Using Vinicius Costa Gomes' configuration interface for 802.1Qbv (later > resent by Voon Weifeng for the stmmac driver), I am submitting for > review a draft implementation of this offload for a DSA switch. > > I don't want to insist too much on the hardware specifics of SJA1105 > which isn't otherwise very compliant to the IEEE spec. > > In order to be able to test with Vedang Patel's iproute2 patch for > taprio offload (https://www.spinics.net/lists/netdev/msg573072.html) > I had to actually revert the txtime-assist branch as it had changed the > iproute2 interface. > > In terms of impact for DSA drivers, I would like to point out that: > > - Maybe somebody should pre-populate qopt->cycle_time in case the user > does not provide one. Otherwise each driver needs to iterate over the > GCL once, just to set the cycle time (right now stmmac does as well). > > - Configuring the switch over SPI cannot apparently be done from this > ndo_setup_tc callback because it runs in atomic context. I also have > some downstream patches to offload tc clsact matchall with mirred > action, but in that case it looks like the atomic context restriction > does not apply. > > - I had to copy the struct tc_taprio_qopt_offload to driver private > memory because a static config needs to be constructed every time a > change takes place, and there are up to 4 switch ports that may take a > TAS configuration. I have created a private > tc_taprio_qopt_offload_copy() helper for this - I don't know whether > it's of any help in the general case. > > There is more to be done however. The TAS needs to be integrated with > the PTP driver. This is because with a PTP clock source, the base time > is written dynamically to the PTPSCHTM (PTP schedule time) register and > must be a time in the future. Then the "real" base time of each port's > TAS config can be offset by at most ~50 ms (the DELTA field from the > Schedule Entry Points Table) relative to PTPSCHTM. > Because base times in the past are completely ignored by this hardware, > we need to decide if it's ok behaviorally for a driver to "roll" a past > base time into the immediate future by incrementally adding the cycle > time (so the phase doesn't change). If it is, then decide by how long in > the future it is ok to do so. Or alternatively, is it preferable if the > driver errors out if the user-supplied base time is in the past and the > hardware doesn't like it? But even then, there might be fringe cases > when the base time becomes a past PTP time right as the driver tries to > apply the config. > Also applying a tc-taprio offload to a second SJA1105 switch port will > inevitably need to roll the first port's (now past) base time into an > equivalent future time. > All of this is going to be complicated even further by the fact that > resetting the switch (to apply the tc-taprio offload) makes it reset its > PTP time. > > 2.17.1 >
I'm sorry, please disregard this series, it's not the version I intended to send. -Vladimir