From: Yonghong Song <[email protected]>
Date: Wed, 21 Mar 2018 16:31:02 -0700
> One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
> function skb_segment(), line 3667. The bpf program attaches to
> clsact ingress, calls bpf_skb_change_proto to change protocol
> from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
> to send the changed packet out.
> ...
> 3665     while (pos < offset + len) {
> 3666         if (i >= nfrags) {
> 3667             BUG_ON(skb_headlen(list_skb));
> ...
>
> The triggering input skb has the following properties:
> list_skb = skb->frag_list;
> skb->nfrags != NULL && skb_headlen(list_skb) != 0
> and skb_segment() is not able to handle a frag_list skb
> if its headlen (list_skb->len - list_skb->data_len) is not 0.
>
> Patch #1 provides a simple solution to avoid BUG_ON. If
> list_skb->head_frag is true, its page-backed frag will
> be processed before the list_skb->frags.
> Patch #2 provides a test case in test_bpf module which
> constructs a skb and calls skb_segment() directly. The test
> case is able to trigger the BUG_ON without Patch #1.
>
> The patch has been tested in the following setup:
> ipv6_host <-> nat_server <-> ipv4_host
> where nat_server has a bpf program doing ipv4<->ipv6
> translation and forwarding through clsact hook
> bpf_skb_change_proto.
Series applied; however, I'm still not 100% convinced that allowing this
kind of GRO packet, with its protocol and MSS size mucked with, is a
good idea.
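
For readers following the thread, the triggering condition described in
the quoted cover letter can be modeled in user space. This is only a
rough sketch and not kernel code: struct toy_skb, toy_headlen() and
toy_hits_bug_condition() are hypothetical names, and frag_list is placed
directly in the struct for simplicity (in the kernel it lives in the
skb's shared info). The real BUG_ON is only evaluated inside the
skb_segment() loop once the frags of the current skb are exhausted,
which this sketch does not model.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff. Only the fields
 * needed for the head-length calculation are modeled. */
struct toy_skb {
    unsigned int len;          /* total bytes: linear head + non-linear data */
    unsigned int data_len;     /* bytes held outside the linear head */
    struct toy_skb *frag_list; /* chained skbs; NULL if there are none */
};

/* Same arithmetic the cover letter gives for the head length:
 * headlen = len - data_len. */
static unsigned int toy_headlen(const struct toy_skb *skb)
{
    return skb->len - skb->data_len;
}

/* Returns 1 when the skb has the shape described as triggering the
 * BUG_ON: a frag_list whose first skb has a non-empty linear head. */
static int toy_hits_bug_condition(const struct toy_skb *skb)
{
    const struct toy_skb *list_skb = skb->frag_list;

    return list_skb != NULL && toy_headlen(list_skb) != 0;
}
```

With a frag_list skb of len 100 and data_len 60, the head length is 40
and the check reports the problematic shape; with data_len equal to len,
the head is empty and the check does not fire.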