Re: [PATCH/RFC] tun: defer skb_orphan() to tun_do_read()

Jason Wang Tue, 05 Mar 2019 19:31:36 -0800


On 2019/3/6 5:33 AM, Willem de Bruijn wrote:

On Tue, Mar 5, 2019 at 4:03 PM Arthur Kepner <arthur.kep...@riverbed.com> wrote:


The attachment contains an UNTESTED patch (though a similar patch was
tested with a 3.10 kernel).

We've been chasing a bug where packet corruption is seen on a tap
device. We have a PACKET_MMAP socket which is bound to a tap interface.
When throughput goes above a threshold, we begin to see that packets
received on the tap device are truncated, or otherwise corrupted. We
found that when packets are enqueued to the tap device, they are fine,
but by the time they are read, they can be corrupted.

And we found that simply deferring the call to skb_orphan() (where the
destructor, tpacket_destruct_skb(), marks the frame as
TP_STATUS_AVAILABLE) fixes the problem.

Maybe there's a better fix, but this worked for us. Thoughts? (Please CC
me on replies - I'm not subscribed.)

skb_orphan() calls tpacket_destruct_skb(), which updates the entry in
the packet ring to TP_STATUS_AVAILABLE.

Thanks for the report and suggested fix. Delaying the call to
skb_orphan reduces the race condition between release and read, but
does not fully remove it.
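
To make the hazard concrete, here is a minimal user-space sketch in C.
It is only a toy model, not kernel code: struct toy_frame, struct
toy_skb, toy_orphan() and the FRAME_* states are hypothetical stand-ins
for a packet ring frame, an skb that borrows the frame's memory,
skb_orphan() running the destructor, and the TP_STATUS_* values. It
shows that once the frame is marked available while the packet is still
queued, the sender may overwrite the bytes the reader has yet to see.

```c
#include <assert.h>
#include <string.h>

#define FRAME_SIZE 32

/* Hypothetical stand-ins for the TP_STATUS_* frame states. */
enum frame_status { FRAME_AVAILABLE, FRAME_SEND_REQUEST };

struct toy_frame {
    enum frame_status status;
    char data[FRAME_SIZE];
};

/* Toy "skb" that borrows the frame's memory instead of copying it. */
struct toy_skb {
    struct toy_frame *owner;   /* frame to release on orphan, or NULL */
    const char *payload;       /* points into owner->data (zero copy) */
};

/* User space writes a packet into a frame it currently owns. */
static void user_fill_frame(struct toy_frame *f, const char *msg)
{
    assert(f->status == FRAME_AVAILABLE);
    assert(strlen(msg) < FRAME_SIZE);
    strcpy(f->data, msg);
    f->status = FRAME_SEND_REQUEST;
}

/* "Transmit": build a packet that references the frame in place. */
static struct toy_skb toy_send(struct toy_frame *f)
{
    struct toy_skb skb = { .owner = f, .payload = f->data };
    return skb;
}

/* Models the destructor run at orphan time: the frame is handed back. */
static void toy_orphan(struct toy_skb *skb)
{
    if (skb->owner) {
        skb->owner->status = FRAME_AVAILABLE;
        skb->owner = NULL;
    }
}

/* Returns 1 if the queued packet no longer matches what was sent. */
static int queued_packet_corrupted(void)
{
    struct toy_frame frame = { .status = FRAME_AVAILABLE };
    struct toy_skb skb;

    user_fill_frame(&frame, "packet-1");
    skb = toy_send(&frame);
    toy_orphan(&skb);                     /* released while still queued */
    user_fill_frame(&frame, "packet-2");  /* sender reuses the frame */

    /* The reader now finds packet-2 where packet-1 was enqueued. */
    return strcmp(skb.payload, "packet-1") != 0;
}
```

In this model, moving toy_orphan() later only shortens the interval in
which the frame can be reused; any point between the release and the
end of the read is still exposed.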

As of commit 5cd8d46ea156 ("packet: copy user buffers before orphan
or clone") in 4.20 this should no longer be an issue.

That commit also reuses the msg_zerocopy infrastructure for packet ring
packets with shared memory, and creates a private copy whenever these
may be looped to a local destination that may queue indefinitely, like
tun.
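
The copy-before-release idea can be sketched the same way in user-space
C. Again this is a toy model under stated assumptions, not the code from
that commit: struct ring_slot, struct priv_skb, priv_copy_payload() and
priv_orphan() are hypothetical names. The only point it illustrates is
the ordering: the payload is copied into private memory first, and only
then is the shared slot handed back to the sender.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_SIZE 32

/* Hypothetical shared ring slot; "available" means the sender may reuse it. */
struct ring_slot {
    int available;
    char data[SLOT_SIZE];
};

/* Toy packet that starts out borrowing the slot's memory. */
struct priv_skb {
    struct ring_slot *owner;  /* slot to release, or NULL once released */
    char *payload;            /* shared slot data or a private copy */
    int owns_payload;         /* 1 if payload was allocated by us */
};

/* Replace the borrowed payload with a private copy. Returns 0 or -1. */
static int priv_copy_payload(struct priv_skb *skb)
{
    char *copy;

    if (skb->owns_payload)
        return 0;
    copy = malloc(SLOT_SIZE);
    if (!copy)
        return -1;
    memcpy(copy, skb->payload, SLOT_SIZE);
    skb->payload = copy;
    skb->owns_payload = 1;
    return 0;
}

/* Copy first, then release the slot; never the other way round. */
static int priv_orphan(struct priv_skb *skb)
{
    if (priv_copy_payload(skb) != 0)
        return -1;            /* on failure the slot stays owned */
    if (skb->owner) {
        skb->owner->available = 1;
        skb->owner = NULL;
    }
    return 0;
}

/* Returns 1 if the queued packet is intact after the slot is reused. */
static int private_copy_survives_reuse(void)
{
    struct ring_slot slot = { .available = 0 };
    struct priv_skb skb = { .owner = &slot, .payload = slot.data,
                            .owns_payload = 0 };
    int intact;

    strcpy(slot.data, "packet-1");
    if (priv_orphan(&skb) != 0)
        return 0;
    strcpy(slot.data, "packet-2");   /* sender reuses the slot */

    intact = strcmp(skb.payload, "packet-1") == 0;
    free(skb.payload);
    return intact;
}
```

The cost in this model is one extra copy per packet, which is why it
makes sense to pay it only for destinations that may hold the packet
for an unbounded time.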

+1, and if possible the commit should be backported to 3.10. (Or try
vhost + TAP instead of packet mmap.)


Thanks
