On 16.10.2020 18:55, Willem de Bruijn wrote:
On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <o...@yandex-team.ru> wrote:
ip6_tnl_encap assigns to proto the transport protocol that
encapsulates the inner packet, but we must pass the protocol of
that inner packet to set_inner_ipproto.
Calling set_inner_ipproto after ip6_tnl_encap might break GSO.
For example, when an IPv6 packet is encapsulated in a fou6 packet, inner_ipproto
would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
an incorrect calling sequence of GSO functions:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment ->
udp6_ufo_fragment
instead of:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment ->
ip6ip6_gso_segment
Signed-off-by: Alexander Ovechkin <o...@yandex-team.ru>
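To illustrate the two call chains above, here is a toy model of the lookup that picks the next GSO handler from inner_ipproto. It is only a sketch, not kernel code: the toy_* names are hypothetical, and the mapping just restates the two handlers named in the commit message.

```c
#include <stddef.h>

/* IANA protocol numbers; the kernel's IPPROTO_UDP and IPPROTO_IPV6
 * have the same values. The TOY_ prefix marks them as illustrative. */
enum {
	TOY_IPPROTO_UDP  = 17,
	TOY_IPPROTO_IPV6 = 41,
};

/*
 * Hypothetical stand-in for the per-protocol offload lookup done for
 * the inner packet of a UDP tunnel. Returns the name of the handler
 * that would segment the inner packet, or NULL if none is modeled.
 */
const char *toy_inner_gso_handler(unsigned char inner_ipproto)
{
	switch (inner_ipproto) {
	case TOY_IPPROTO_UDP:
		/* Wrong for fou6 carrying IPv6: the inner packet is
		 * treated as UDP again. */
		return "udp6_ufo_fragment";
	case TOY_IPPROTO_IPV6:
		/* Expected for fou6 carrying IPv6. */
		return "ip6ip6_gso_segment";
	default:
		return NULL;
	}
}
```

With inner_ipproto left at the UDP value the model returns udp6_ufo_fragment, which matches the broken chain; with the IPv6 value it returns ip6ip6_gso_segment.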
Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.
It makes sense that this likely broke the behavior for UDP (L4) tunnels.
But the call was moved on purpose to avoid setting the inner protocol to
IPPROTO_MPLS. MPLS needs skb->inner_protocol for further
segmentation.
I suspect we need to set this conditionally, before or after, to avoid
breaking that use case.
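A rough sketch of why the MPLS case is sensitive, assuming (as in struct sk_buff) that inner_protocol and inner_ipproto share storage in a union and that a type flag records which one is valid. The toy_* names and the struct are hypothetical simplifications, and the ethertype is stored in host byte order here for brevity.

```c
#include <stdint.h>

enum toy_encap_type {
	TOY_ENCAP_TYPE_ETHER,	/* inner_protocol (an ethertype) is valid */
	TOY_ENCAP_TYPE_IPPROTO,	/* inner_ipproto (an IP protocol) is valid */
};

/* Simplified model of the relevant sk_buff fields (C11 anonymous union). */
struct toy_skb {
	union {
		uint16_t inner_protocol;	/* e.g. 0x8847 for MPLS unicast */
		uint8_t  inner_ipproto;		/* e.g. 41 for IPv6 */
	};
	enum toy_encap_type inner_protocol_type;
};

/* Modeled on skb_set_inner_protocol(): record an ethertype. */
void toy_set_inner_protocol(struct toy_skb *skb, uint16_t ethertype)
{
	skb->inner_protocol = ethertype;
	skb->inner_protocol_type = TOY_ENCAP_TYPE_ETHER;
}

/* Modeled on skb_set_inner_ipproto(): record an IP protocol number.
 * Because of the union, this overwrites part of inner_protocol. */
void toy_set_inner_ipproto(struct toy_skb *skb, uint8_t ipproto)
{
	skb->inner_ipproto = ipproto;
	skb->inner_protocol_type = TOY_ENCAP_TYPE_IPPROTO;
}
```

In this model, calling toy_set_inner_ipproto() after toy_set_inner_protocol() both flips the type flag and clobbers the stored ethertype, so the order and the condition under which the setter is called both matter.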
I hope it can be fixed with something like this:
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a0217e5..87368b0 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1121,6 +1121,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 	bool use_cache = false;
 	u8 hop_limit;
 	int err = -1;
+	__u8 pproto = proto;
 	if (t->parms.collect_md) {
 		hop_limit = skb_tunnel_info(skb)->key.ttl;
@@ -1280,7 +1281,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 		ipv6_push_frag_opts(skb, &opt.ops, &proto);
 	}
-	skb_set_inner_ipproto(skb, proto);
+	skb_set_inner_ipproto(skb, pproto == IPPROTO_MPLS ? proto : pproto);
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);