Historically, support for frag_list packets entering skb_segment() was limited to frag_list members terminating on exact same gso_size boundaries. This is verified with a BUG_ON since commit 89319d3801d1 ("net: Add frag_list support to skb_segment"), quote:
As such we require all frag_list members terminate on exact MSS boundaries. This is checked using BUG_ON. As there should only be one producer in the kernel of such packets, namely GRO, this requirement should not be difficult to maintain. However, since commit 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper"), the "exact MSS boundaries" assumption no longer holds: An eBPF program using bpf_skb_change_proto() DOES modify 'gso_size', but leaves the frag_list members as originally merged by GRO with the original 'gso_size'. Example of such programs are bpf-based NAT46 or NAT64. This lead to a kernel BUG_ON for flows involving: - GRO generating a frag_list skb - bpf program performing bpf_skb_change_proto() or bpf_skb_adjust_room() - skb_segment() of the skb See example BUG_ON reports in [0]. In commit 13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb"), skb_segment() was modified to support the "gso_size mangling" case of a frag_list GRO'ed skb, but *only* for frag_list members having head_frag==true (having a page-fragment head). Alas, GRO packets having frag_list members with a linear kmalloced head (head_frag==false) still hit the BUG_ON. This commit adds support to skb_segment() for a 'head_skb' packet having a frag_list whose members are *non* head_frag, with gso_size mangled, by disabling SG and thus falling-back to copying the data from the given 'head_skb' into the generated segmented skbs - as suggested by Willem de Bruijn [1]. Since this approach involves the penalty of skb_copy_and_csum_bits() when building the segments, care was taken in order to enable this solution only when required: - untrusted gso_size, by testing SKB_GSO_DODGY is set (SKB_GSO_DODGY is set by any gso_size mangling functions in net/core/filter.c) - the frag_list is non empty, its item is a non head_frag, *and* the headlen of the given 'head_skb' does not match the gso_size. [0] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/ https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff18...@fb.com/ [1] https://lore.kernel.org/netdev/CA+FuTSfVsgNDi7c=guu8nmg2hwxf2sjcnlxethevpdnxaw5...@mail.gmail.com/ Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Suggested-by: Willem de Bruijn <willemdebruijn.ker...@gmail.com> Cc: Daniel Borkmann <dan...@iogearbox.net> Cc: Eric Dumazet <eric.duma...@gmail.com> Cc: Alexander Duyck <alexander.du...@gmail.com> Signed-off-by: Shmulik Ladkani <shmulik.ladk...@gmail.com> --- net/core/skbuff.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index ea8e8d332d85..c4bd1881acff 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3678,6 +3678,24 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, sg = !!(features & NETIF_F_SG); csum = !!can_checksum_protocol(features, proto); + if (mss != GSO_BY_FRAGS && + (skb_shinfo(head_skb)->gso_type & SKB_GSO_DODGY)) { + /* gso_size is untrusted. + * + * If head_skb has a frag_list with a linear non head_frag + * item, and head_skb's headlen does not fit requested + * gso_size, fall back to copying the skbs - by disabling sg. + * + * We assume checking the first frag suffices, i.e if either of + * the frags have non head_frag data, then the first frag is + * too. + */ + if (list_skb && skb_headlen(list_skb) && !list_skb->head_frag && + (mss != skb_headlen(head_skb) - doffset)) { + sg = false; + } + } + if (sg && csum && (mss != GSO_BY_FRAGS)) { if (!(features & NETIF_F_GSO_PARTIAL)) { struct sk_buff *iter; -- 2.19.1