From: Sheng Lan <lansh...@huawei.com>
Date: Thu, 28 Feb 2019 18:47:58 +0800

> From: Sheng Lan <lansh...@huawei.com>
>
> It can be reproduced by following steps:
> 1. virtio_net NIC is configured with gso/tso on
> 2. configure nginx as http server with an index file bigger than 1M bytes
> 3. use tc netem to produce duplicate packets and delay:
>    tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90%
> 4. continually curl the nginx http server to get index file on client
> 5. BUG_ON is seen quickly
 ...
> In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's
> linear data size and nonlinear data size, thus BUG_ON triggered.
> Because the skb is cloned and a part of nonlinear data is split off.
>
> Duplicate packet is cloned in netem_enqueue() and may be delayed
> some time in qdisc. When qdisc len reached the limit and returns
> NET_XMIT_DROP, the skb will be retransmit later in write queue.
> the skb will be fragmented by tso_fragment(), the limit size
> that depends on cwnd and mss decrease, the skb's nonlinear
> data will be split off. The length of the skb cloned by netem
> will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(),
> the BUG_ON trigger.
>
> To fix it, netem returns NET_XMIT_SUCCESS to upper stack
> when it clones a duplicate packet.
>
> Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()")
> Signed-off-by: Sheng Lan <lansh...@huawei.com>
> Reported-by: Qin Ji <jiqin...@huawei.com>
> Suggested-by: Eric Dumazet <eric.duma...@gmail.com>

Applied and queued up for -stable, thanks.
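
For readers following the thread, the length mismatch described in the quoted changelog can be sketched with a toy userspace model. This is not kernel code: the toy_* structs and functions are hypothetical stand-ins for sk_buff, skb_shared_info, skb_clone(), the split done under tso_fragment(), and the walk in __skb_to_sgvec(). It assumes only what the changelog states: a clone keeps its own length while sharing the nonlinear data with the original.

```c
#include <assert.h>

#define TOY_MAX_FRAGS 4

/* Hypothetical stand-in for skb_shared_info: fragments shared by all clones. */
struct toy_shinfo {
    unsigned int nr_frags;
    unsigned int frag_size[TOY_MAX_FRAGS];
};

/* Hypothetical stand-in for sk_buff: lengths are per clone, frags are shared. */
struct toy_skb {
    unsigned int len;           /* total bytes: linear + nonlinear */
    unsigned int data_len;      /* nonlinear bytes only */
    struct toy_shinfo *shinfo;  /* shared with every clone */
};

/* Model of cloning: copy the length fields, share the fragment array. */
struct toy_skb toy_clone(const struct toy_skb *skb)
{
    return *skb;
}

/*
 * Model of splitting the tail off an skb: keep the first `keep` bytes.
 * Assumes keep is at least the linear size. The shared fragment array
 * is trimmed, but only this skb's own len and data_len are updated.
 */
void toy_split(struct toy_skb *skb, unsigned int keep)
{
    unsigned int headlen = skb->len - skb->data_len;
    unsigned int pos = headlen;
    unsigned int kept = 0;
    unsigned int i;

    for (i = 0; i < skb->shinfo->nr_frags; i++) {
        unsigned int size = skb->shinfo->frag_size[i];

        if (pos >= keep)
            break;                  /* fragment lies entirely in the tail */
        if (pos + size > keep)
            size = keep - pos;      /* fragment straddles the split point */
        skb->shinfo->frag_size[i] = size;
        pos += size;
        kept++;
    }
    skb->shinfo->nr_frags = kept;
    skb->data_len = pos - headlen;
    skb->len = pos;
}

/*
 * Model of the scatterlist walk: consume the linear area, then each
 * fragment, and return the bytes of skb->len left unaccounted for.
 * A nonzero result corresponds to the condition the BUG_ON catches.
 */
unsigned int toy_unmapped_bytes(const struct toy_skb *skb)
{
    unsigned int remaining = skb->len;
    unsigned int i;

    remaining -= skb->len - skb->data_len;  /* linear area */
    for (i = 0; i < skb->shinfo->nr_frags; i++) {
        unsigned int size = skb->shinfo->frag_size[i];

        if (size > remaining)
            size = remaining;
        remaining -= size;
    }
    return remaining;
}
```

In this model, take an skb with a 100-byte linear area and two 1000-byte fragments, clone it, and split the original at 1100 bytes. The original stays consistent, while the clone still claims 2100 bytes and toy_unmapped_bytes() reports 1000 bytes it cannot find.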