Kurt Kanzenbach <k...@linutronix.de> writes:

> When using short intervals e.g. below one millisecond, large packets won't be
> transmitted at all. The software implementation checks whether the packet can
> be fit into the remaining interval. Therefore, it takes the packet length and
> the transmission speed into account. That is correct.
>
> However, for large packets it may be that the transmission time will be larger
> than the interval resulting in no packet transmission. The same situation works
> fine with hardware offloading applied.
>
> The problem has been observed with the following schedule and iperf3:
>
> |tc qdisc replace dev lan1 parent root handle 100 taprio \
> |   num_tc 8 \
> |   map 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 \
> |   queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
> |   base-time $base \
> |   sched-entry S 0x40 500000 \
> |   sched-entry S 0xbf 500000 \
> |   clockid CLOCK_TAI \
> |   flags 0x00
>
> [...]
>
> |root@tsn:~# iperf3 -c 192.168.2.105
> |Connecting to host 192.168.2.105, port 5201
> |[  5] local 192.168.2.121 port 52610 connected to 192.168.2.105 port 5201
> |[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> |[  5]   0.00-1.00   sec  45.2 KBytes   370 Kbits/sec    0   1.41 KBytes
> |[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
>
> After debugging, it seems that the packet length stored in the SKB is about
> 7000-8000 bytes. Using a 100 Mbit/s link the transmission time is about 600us
> which is larger than the interval of 500us.
>
> Therefore, segment the SKB into smaller chunks if the packet is too big. This
> yields similar results to the hardware offload:
>
> |root@tsn:~# iperf3 -c 192.168.2.105
> |Connecting to host 192.168.2.105, port 5201
> |- - - - - - - - - - - - - - - - - - - - - - - - -
> |[ ID] Interval           Transfer     Bitrate         Retr
> |[  5]   0.00-10.00  sec  48.9 MBytes  41.0 Mbits/sec    0             sender
> |[  5]   0.00-10.02  sec  48.7 MBytes  40.7 Mbits/sec                  receiver
>
> Signed-off-by: Kurt Kanzenbach <k...@linutronix.de>
> ---
>  net/sched/sch_taprio.c | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 8287894541e3..8434e87f79f7 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -411,6 +411,42 @@ static long get_packet_txtime(struct sk_buff *skb, struct Qdisc *sch)
>  	return txtime;
>  }
>
> +/* Similar to net/sched/sch_tbf.c::tbf_segment */
> +static int taprio_segment(struct sk_buff *skb, struct Qdisc *sch,
> +			  struct Qdisc *child, struct sk_buff **to_free)
> +{
> +	netdev_features_t features = netif_skb_features(skb);
> +	unsigned int len = 0, prev_len = qdisc_pkt_len(skb);
> +	struct sk_buff *segs, *nskb;
> +	int ret, nb;
> +
> +	segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
> +
> +	if (IS_ERR_OR_NULL(segs))
> +		return qdisc_drop(skb, sch, to_free);
> +
> +	nb = 0;
> +	skb_list_walk_safe(segs, segs, nskb) {
> +		skb_mark_not_on_list(segs);
> +		qdisc_skb_cb(segs)->pkt_len = segs->len;
> +		len += segs->len;
> +		ret = qdisc_enqueue(segs, child, to_free);
> +		if (ret != NET_XMIT_SUCCESS) {
> +			if (net_xmit_drop_count(ret))
> +				qdisc_qstats_drop(sch);
> +		} else {
> +			nb++;
> +		}
> +	}
> +
> +	sch->q.qlen += nb;
> +	if (nb > 1)
> +		qdisc_tree_reduce_backlog(sch, 1 - nb, prev_len - len);
> +	consume_skb(skb);
> +
> +	return nb > 0 ? NET_XMIT_SUCCESS : NET_XMIT_DROP;
> +}
> +
>  static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>  			  struct sk_buff **to_free)
>  {
> @@ -433,6 +469,9 @@ static int taprio_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>  			return qdisc_drop(skb, sch, to_free);
>  	}
>
> +	if (skb_is_gso(skb))
> +		return taprio_segment(skb, sch, child, to_free);
> +
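
For reference, the arithmetic behind the "about 600us" figure in the commit
message can be sketched in a few lines of standalone userspace C. This is only
an illustration, not the kernel's implementation: tx_time_ns() and
fits_in_interval() are hypothetical helper names, 7500 bytes is picked as a
midpoint of the reported 7000-8000 byte range, and Ethernet framing overhead
(preamble, inter-frame gap, FCS) is ignored, which the real accounting may
treat differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: time in nanoseconds needed to put `bytes` on a link
 * running at `mbps` Mbit/s. Assumption: framing overhead is ignored.
 */
static uint64_t tx_time_ns(uint64_t bytes, uint64_t mbps)
{
	assert(mbps > 0);
	/* bits / (mbps * 10^6 bit/s), expressed in ns: bits * 1000 / mbps */
	return bytes * 8 * 1000 / mbps;
}

/* Hypothetical helper: does the packet fit into one schedule entry? */
static bool fits_in_interval(uint64_t bytes, uint64_t mbps,
			     uint64_t interval_ns)
{
	return tx_time_ns(bytes, mbps) <= interval_ns;
}
```

Under these assumptions, tx_time_ns(7500, 100) gives 600000 ns, which does not
fit a 500000 ns sched-entry, whereas a 1500-byte segment takes 120000 ns and
does. That matches the behaviour described: the unsegmented GSO skb can never
be sent in software mode, while its segments can.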

My first worry was whether the segments had the same tstamp as their
parent, and it seems that they do, so everything should just work with
etf or the txtime-assisted mode.

I just want to play with this patch a bit and see how it works in
practice. But it looks good.


Cheers,
-- 
Vinicius