2019-03-14, 10:51:49 -0700, Eric Dumazet wrote:
>
>
> On 03/14/2019 10:40 AM, Sabrina Dubroca wrote:
> > 2019-03-14, 07:56:10 -0700, Eric Dumazet wrote:
> >>
> >>
> >> On 03/14/2019 07:15 AM, Sabrina Dubroca wrote:
> >>> 2019-03-14, 05:58:03 -0700, Eric Dumazet wrote:
> >>>>
> >>>>
> >>>> On 03/14/2019 03:15 AM, Sabrina Dubroca wrote:
> >>>>> Commit 745e20f1b626 ("net: add a recursion limit in xmit path")
> >>>>> introduced a recursion limit, but it only applies to devices without a
> >>>>> queue. Virtual devices with a queue (either because they don't have
> >>>>> the IFF_NO_QUEUE flag, or because the administrator added one) can
> >>>>> still cause an unbounded recursion, via __dev_queue_xmit ->
> >>>>> __dev_xmit_skb -> qdisc_run -> __qdisc_run -> qdisc_restart ->
> >>>>> sch_direct_xmit -> dev_hard_start_xmit . Jianlin reported this in a
> >>>>> setup with 16 gretap devices stacked on top of one another.
> >>>>>
> >>>>> This patch prevents the stack overflow by incrementing xmit_recursion in
> >>>>> code paths that can call dev_hard_start_xmit() (like commit 745e20f1b626
> >>>>> did). If the recursion limit is exceeded, the packet is enqueued and the
> >>>>> qdisc is scheduled.
> >>>>>
> >>>>> Reported-by: Jianlin Shi <ji...@redhat.com>
> >>>>> Signed-off-by: Sabrina Dubroca <s...@queasysnail.net>
> >>>>> Reviewed-by: Stefano Brivio <sbri...@redhat.com>
> >>>>
> >>>> Hi Sabrina, thanks for the patch.
> >>>>
> >>>> Can't we detect this in the control path instead ?
> >>>
> >>> I don't see how. You could have a perfectly reasonable set of gretap
> >>> devices that trigger this situation from simply reshuffling the IP
> >>> addresses:
> >>>
> >>> gretap$x remote 1.1.$((x-1)).{1,2}
> >>> (all those addresses set on a single veth device)
> >>>
> >>> Then you move those addresses to the corresponding device
> >>> (1.1.${x}.{1,2} on gretap$x), and your machine crashes.
> >>>
> >>
> >> If this only can be done with gretap, why gretap cant implement the
> >> protection,
> >> outside of the fast path ?
> >
> > It's not just gretap. VXLAN will do the same as long as you add a
> > qdisc. I expect other types of tunnels to behave like that.
> >
>
> It might make sense to add a helper using dev_queue_xmit()
> for tunnel users.
>
> Then remove the xmit recursion stuff out of the dev_queue_xmit()
>
> Lets make the fast path fast again.

Ok. I'm traveling next week, so I'll work on this when I get back.

Thanks for the comments.

-- 
Sabrina
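
For readers following the thread, here is a rough user-space sketch of the
idea in the quoted patch description: count the nesting depth on the
transmit path and, once a limit is reached, enqueue the packet and run the
queue later from a flat context instead of recursing further. This is not
kernel code and not the patch itself. All model_* names, the backlog array,
and the limit value of 10 are assumptions made up for illustration; only
the 16-device stack depth comes from the report above.

```c
#include <assert.h>

#define NUM_DEVICES     16   /* depth of the reported gretap stack */
#define RECURSION_LIMIT 10   /* assumed value, for illustration only */
#define BACKLOG_SIZE    64   /* hypothetical capacity of the deferred queue */

/* Hypothetical stand-in for a network device; index 0 is the bottom one. */
struct model_dev {
    int index;
};

static struct model_dev devices[NUM_DEVICES];
static int xmit_recursion;   /* models the per-CPU xmit_recursion counter */
static int max_depth_seen;   /* deepest nesting observed during a send */
static int delivered;        /* packets that reached the bottom device */

/* Deferred work: devices whose queue must be run later at depth 0. */
static int backlog[BACKLOG_SIZE];
static int backlog_len;

static void model_queue_xmit(struct model_dev *dev);

/* A stacked device "transmits" by handing the packet to the one below. */
static void model_hard_start_xmit(struct model_dev *dev)
{
    if (dev->index == 0) {
        delivered++;
        return;
    }
    model_queue_xmit(&devices[dev->index - 1]);
}

static void model_queue_xmit(struct model_dev *dev)
{
    if (xmit_recursion >= RECURSION_LIMIT) {
        /* Limit reached: enqueue and defer instead of recursing. */
        assert(backlog_len < BACKLOG_SIZE);
        backlog[backlog_len++] = dev->index;
        return;
    }
    xmit_recursion++;
    if (xmit_recursion > max_depth_seen)
        max_depth_seen = xmit_recursion;
    model_hard_start_xmit(dev);
    xmit_recursion--;
}

/* Models the scheduled queue run: each entry restarts from depth 0. */
static void model_run_backlog(void)
{
    while (backlog_len > 0) {
        int idx = backlog[--backlog_len];
        model_queue_xmit(&devices[idx]);
    }
}

/* Sends one packet from the top of the stack; returns packets delivered. */
static int model_send_from_top(void)
{
    for (int i = 0; i < NUM_DEVICES; i++)
        devices[i].index = i;
    xmit_recursion = 0;
    max_depth_seen = 0;
    delivered = 0;
    backlog_len = 0;

    model_queue_xmit(&devices[NUM_DEVICES - 1]);
    model_run_backlog();
    return delivered;
}
```

Calling model_send_from_top() pushes one packet through all 16 modelled
devices. The nesting depth stops at RECURSION_LIMIT, the remainder of the
path is completed from the backlog, and the packet still reaches the bottom
device. Whether this check belongs in dev_queue_xmit() itself or in a
separate helper for tunnel users is the open question in the thread; the
sketch takes no position on that.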