On 22 January 2016 at 17:22, Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Fri, 2016-01-22 at 15:49 -0800, Joe Stringer wrote:
>> Later parts of the stack (including fragmentation) expect that there is
>> never a socket attached to frag in a frag_list, however this invariant
>> was not enforced on all defrag paths. This could lead to the
>> BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
>> end of this commit message.
>>
>> While the call could be added to openvswitch to fix this particular
>> error, the head and tail of the frags list are already orphaned
>> indirectly inside ip_defrag(), so it seems like the remaining fragments
>> should all be orphaned in all circumstances.
>
> Yes, it looks we have a problem, and even IP early demux apparently does
> not check if incoming packet is a fragment.
>
> Your patch could also remove some socket leaks in this respect.
>
> I guess we also could add a safety check (ipv4 only, but ipv6 needs care
> as well)
>
> diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
> index b1209b63381f..99513c829213 100644
> --- a/net/ipv4/ip_input.c
> +++ b/net/ipv4/ip_input.c
> @@ -316,7 +316,9 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
>         const struct iphdr *iph = ip_hdr(skb);
>         struct rtable *rt;
>
> -       if (sysctl_ip_early_demux && !skb_dst(skb) && !skb->sk) {
> +       if (sysctl_ip_early_demux &&
> +           !skb_dst(skb) && !skb->sk &&
> +           !ip_is_fragment(iph)) {
>                 const struct net_protocol *ipprot;
>                 int protocol = iph->protocol;

Thanks, I can roll this into a v2 (or keep it as a separate patch?). I got
sidetracked on the IPv6 side; some other issues are blocking me there, but
I intend to keep following up on that as well.
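
For anyone following along, here is a rough userspace sketch of what the extra !ip_is_fragment(iph) test in the quoted diff guards against. It is not the kernel code: the SKETCH_* constants and sketch_ip_is_fragment() are hypothetical names made up for illustration, and the function takes the raw 16-bit frag_off field (network byte order) rather than a struct iphdr. The idea, as I understand it, is that a packet counts as a fragment if either the "more fragments" flag or a non-zero fragment offset is set:

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IPv4 header bits (RFC 791):
 * bit 13 is "more fragments", the low 13 bits are the fragment offset. */
#define SKETCH_IP_MF     0x2000
#define SKETCH_IP_OFFSET 0x1FFF

/* frag_off_be is the frag_off field exactly as it appears on the wire
 * (network byte order). Returns true for any fragment: first, middle
 * or last. The "don't fragment" bit (0x4000) is deliberately ignored. */
static bool sketch_ip_is_fragment(uint16_t frag_off_be)
{
    return (frag_off_be & htons(SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}
```

With that test in place, early demux would be skipped for fragments, so no socket would be attached to them before reassembly.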