On 8/15/19 8:09 AM, Hangbin Liu wrote:
> When we send a packet larger than PMTU, we need to reply with
> icmp_send(ICMP_FRAG_NEEDED) or icmpv6_send(ICMPV6_PKT_TOOBIG).
> 
> But in collect_md mode, kernel will crash while accessing the dst dev
> as __metadata_dst_init() init dst->dev to NULL by default. Here is what
> the code path looks like, for GRE:
> 
> - ip6gre_tunnel_xmit
>   - ip6gre_xmit_ipv4
>     - __gre6_xmit
>       - ip6_tnl_xmit
>         - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
>     - icmp_send
>       - net = dev_net(rt->dst.dev); <-- here
>   - ip6gre_xmit_ipv6
>     - __gre6_xmit
>       - ip6_tnl_xmit
>         - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE
>     - icmpv6_send
>       ...
>       - decode_session4
>         - oif = skb_dst(skb)->dev->ifindex; <-- here
>       - decode_session6
>         - oif = skb_dst(skb)->dev->ifindex; <-- here
> 
> Fix it by updating the dst dev if not set.
> 
> The reproducer is easy:
> 
> ovs-vsctl add-br br0
> ip link set br0 up
> ovs-vsctl add-port br0 gre0 -- \
>         set interface gre0 type=gre options:remote_ip=$dst_addr
> ip link set gre0 up
> ip addr add ${local_gre6}/64 dev br0
> ping6 $remote_gre6 -s 1500
> 
> Fixes: c8b34e680a09 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
> Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
> Reviewed-by: Stefano Brivio <sbri...@redhat.com>
> Signed-off-by: Hangbin Liu <liuhang...@gmail.com>
> ---
>  net/ipv4/ip_tunnel.c  |  3 +++
>  net/ipv6/ip6_tunnel.c | 13 +++++++++----
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 38c02bb62e2c..c6713c7287df 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -597,6 +597,9 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
>               goto tx_error;
>       }
>  
> +     if (skb_dst(skb) && !skb_dst(skb)->dev)
> +             skb_dst(skb)->dev = rt->dst.dev;
> +


IMO this looks wrong.
This dst seems to be shared.
Once dev is set, will we keep reusing that same dev for every later packet?

If that is intended, why not do this in __metadata_dst_init() instead of in
the fast path?
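
A userspace sketch of the concern, assuming the metadata dst really is
shared between packets. This is not kernel code: the two structs are
stripped-down stand-ins for the real ones, and set_dev_if_unset() and
shared_dst_demo() are hypothetical names that only mirror the "set dev if
it is NULL" check from the hunk above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; NOT the real definitions. */
struct net_device { int ifindex; };
struct dst_entry  { struct net_device *dev; };

/* Mirrors the check in the hunk: fill in dev only while it is still NULL. */
static void set_dev_if_unset(struct dst_entry *md_dst,
                             struct net_device *route_dev)
{
	if (md_dst && !md_dst->dev)
		md_dst->dev = route_dev;
}

/*
 * Two packets share one metadata dst but are routed via different
 * devices. Returns the ifindex the shared dst ends up pointing at.
 */
static int shared_dst_demo(void)
{
	struct net_device eth0 = { .ifindex = 2 };
	struct net_device eth1 = { .ifindex = 3 };
	struct dst_entry shared = { .dev = NULL };

	set_dev_if_unset(&shared, &eth0); /* first packet: dev becomes eth0 */
	set_dev_if_unset(&shared, &eth1); /* second packet: no-op, dev stays eth0 */

	assert(shared.dev != NULL);
	return shared.dev->ifindex;
}
```

In this model shared_dst_demo() reports the first device even though the
second packet was routed through another one, which is the reuse being
questioned.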

>       if (key->tun_flags & TUNNEL_DONT_FRAGMENT)
>               df = htons(IP_DF);
>       if (tnl_update_pmtu(dev, skb, rt, df, inner_iph, tunnel_hlen,
