On Fri, 2016-01-22 at 15:49 -0800, Joe Stringer wrote:
> Later parts of the stack (including fragmentation) expect that there is
> never a socket attached to frag in a frag_list, however this invariant
> was not enforced on all defrag paths. This could lead to the
> BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
> end of this commit message.
>
> While the call could be added to openvswitch to fix this particular
> error, the head and tail of the frags list are already orphaned
> indirectly inside ip_defrag(), so it seems like the remaining fragments
> should all be orphaned in all circumstances.

Yes, it looks like we have a problem, and even IP early demux apparently
does not check whether the incoming packet is a fragment.

Your patch could also remove some socket leaks in this respect.

I guess we could also add a safety check (IPv4 only here, but IPv6 needs
care as well):

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index b1209b63381f..99513c829213 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -316,7 +316,9 @@ static int ip_rcv_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
 	const struct iphdr *iph = ip_hdr(skb);
 	struct rtable *rt;
 
-	if (sysctl_ip_early_demux && !skb_dst(skb) && !skb->sk) {
+	if (sysctl_ip_early_demux &&
+	    !skb_dst(skb) && !skb->sk &&
+	    !ip_is_fragment(iph)) {
 		const struct net_protocol *ipprot;
 		int protocol = iph->protocol;
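
For anyone who wants to see what the extra condition tests, here is a rough user-space sketch modeled on the kernel's ip_is_fragment() helper. The struct and the sketch_* / SKETCH_* names are made up for illustration and are not kernel API; the mask values are the standard IPv4 "more fragments" flag and 13-bit fragment offset.

```c
#include <arpa/inet.h>   /* htons() */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values follow the IPv4 header layout. */
#define SKETCH_IP_MF     0x2000  /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1FFF  /* 13-bit fragment offset mask */

/* Illustrative stand-in for struct iphdr: only the field we need. */
struct sketch_iphdr {
	uint16_t frag_off;       /* flags + offset, network byte order */
};

/*
 * A packet is a fragment if MF is set (more pieces follow) or the
 * offset is non-zero (this is not the first piece). The DF flag
 * (0x4000) is deliberately not part of the mask.
 */
static bool sketch_ip_is_fragment(const struct sketch_iphdr *iph)
{
	return (iph->frag_off & htons(SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}
```

With this test, a first fragment (MF set, offset 0) and a last fragment (MF clear, offset non-zero) both count as fragments, while a packet carrying only DF does not, so only unfragmented packets would reach the early demux lookup.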