On Fri, 14 Jun 2019 19:08:08 -0700 (PDT), David Miller wrote:
> From: Jakub Kicinski <jakub.kicin...@netronome.com>
> Date: Wed, 12 Jun 2019 11:51:21 -0700
>
> > Brendan reports that the use of netem's packet corruption capability
> > leads to strange crashes. This seems to be caused by
> > commit d66280b12bd7 ("net: netem: use a list in addition to rbtree")
> > which uses skb->next pointer to construct a fast-path queue of
> > in-order skbs.
> >
> > Packet corruption code has to invoke skb_gso_segment() in case
> > of skbs in need of GSO. skb_gso_segment() returns a list of
> > skbs. If next pointers of the skbs on that list do not get cleared
> > fast path list goes into the weeds and tries to access the next
> > segment skb multiple times.
> >
> > Reported-by: Brendan Galloway <brendan.gallo...@netronome.com>
> > Fixes: d66280b12bd7 ("net: netem: use a list in addition to rbtree")
> > Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> > Reviewed-by: Dirk van der Merwe <dirk.vanderme...@netronome.com>
>
> Please rework the commit message a bit to make things cleared, your
> ascii diagrams would be great. :)

In the process of rewriting the commit message I found a memory leak,
and the backlog accounting is also buggy in the segmentation path:

qdisc netem 8001: root refcnt 64 limit 100 delay 19us corrupt 1%
 Sent 30237896 bytes 19895 pkt (dropped 1885, overlimits 0 requeues 287)
 backlog 0b 99p requeues 287
         ^^^^^^
         99 packets but 0 bytes

I need an internal review, and will repost soon.

I need to stop looking for bugs here 🙈
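
For anyone following along, here is a rough userspace sketch of the stale next-pointer problem described in the quoted commit message. It is not the netem code and not the eventual fix: struct pkt, struct queue, take_first() and reachable_after_first() are all hypothetical names made up for illustration, and the only thing modelled is a list of segments whose first element is put on a ->next-linked fast-path list.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the ->next link matters. */
struct pkt {
	int id;
	struct pkt *next;
};

/* Hypothetical fast-path list that chains packets through ->next. */
struct queue {
	struct pkt *head;
	struct pkt *tail;
};

/* Append p at the tail. Note that p->next is deliberately left untouched. */
static void enqueue(struct queue *q, struct pkt *p)
{
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Count the packets reachable from the head, capped to avoid runaway walks. */
static int queue_len(const struct queue *q, int cap)
{
	int n = 0;

	for (const struct pkt *p = q->head; p && n < cap; p = p->next)
		n++;
	return n;
}

/*
 * Put only the first segment on the fast-path list and hand back the rest,
 * which the caller is assumed to queue separately. With clear_next == 0 the
 * first segment keeps pointing at its siblings.
 */
static struct pkt *take_first(struct queue *q, struct pkt *segs, int clear_next)
{
	struct pkt *rest = segs->next;

	if (clear_next)
		segs->next = NULL;
	enqueue(q, segs);
	return rest;
}

/* Build a three-segment list, queue the first segment, report list length. */
static int reachable_after_first(int clear_next)
{
	struct pkt c = { 3, NULL };
	struct pkt b = { 2, &c };
	struct pkt a = { 1, &b };
	struct queue q = { NULL, NULL };

	take_first(&q, &a, clear_next);
	return queue_len(&q, 8);
}
```

Only one packet was queued in both cases. Without clearing the link, walking the list also reaches the two sibling segments, which are owned elsewhere; with the link cleared, the walk stops after the one queued packet.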