[ adding netdev so others know ]

On 10/9/18 3:38 AM, Preethi Ramachandra wrote:
> Hi David,
>
> I tested your fix, Linux is updating PMTU successfully.

ok, I'll send a formal patch

>
> Thanks,
> Preethi
>
> On 10/7/18, 8:59 AM, "David Ahern" <dsah...@gmail.com> wrote:
>
>     The correct mailing list is netdev@vger.kernel.org (added)
>
>     non-text emails will be rejected.
>
>     On 10/3/18 10:15 PM, Preethi Ramachandra wrote:
>     > Hi,
>     >
>     > While testing the PMTU discovery for UDP/raw applications, Linux is not
>     > doing PMTU discovery if the UDP server socket is not bound to a device.
>     > In the scenario we are testing there could be multiple VRF devices
>     > created and an application like UDP/RAW can use a common socket for all
>     > vrf devices. While sending packet IP_PKTINFO socket option can be used
>     > to specify the vrf interface through which packet will be sent out. In
>     > this case, when packet too big icmp6 error message comes back to Linux
>     > on a vrf device, a route lookup is done on default routing-table(0) for
>     > src/dst address which case, the route will not be found and packet is
>     > dropped. If the route lookup happened with proper VRF device (packet’s
>     > incoming index), the route lookup succeeds, PMTU discovery is successful.
>     >
>     > This might need a fix, please take a look.
>     >
>     > *Linux version *
>     >
>     > Linux 4.8.24
>     >
>     > *Code flow *
>     >
>     > Linux code where it expects socket’s bound device in order for PMTU
>     > discovery to happen.
>     >
>     > *void ip6_sk_update_pmtu*(struct sk_buff *skb, struct sock *sk, __be32 mtu)
>     > {
>     >         struct dst_entry *dst;
>     >
>     >         ip6_update_pmtu(skb, sock_net(sk), mtu,
>     >                         sk->sk_bound_dev_if, sk->sk_mark, sk->sk_uid);*<<<<< This is the point
>     > where it expects socket’s sk_bound_dev_if to be set. In our testing this
>     > is actually 0, since the socket is not really bound to a vrf device.*
>
>     Try this based on top of tree for 4.19-next (whitespace damaged on paste
>     so you'll need to manually apply and handle differences with 4.8):
>
>     diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>     index 6c1d817151ca..50b95b48b911 100644
>     --- a/net/ipv6/route.c
>     +++ b/net/ipv6/route.c
>     @@ -2360,10 +2360,13 @@ EXPORT_SYMBOL_GPL(ip6_update_pmtu);
>
>      void ip6_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, __be32 mtu)
>      {
>     +        int oif = sk->sk_bound_dev_if;
>              struct dst_entry *dst;
>
>     -        ip6_update_pmtu(skb, sock_net(sk), mtu,
>     -                        sk->sk_bound_dev_if, sk->sk_mark, sk->sk_uid);
>     +        if (!oif && skb->dev)
>     +                oif = l3mdev_master_ifindex(skb->dev);
>     +
>     +        ip6_update_pmtu(skb, sock_net(sk), mtu, oif, sk->sk_mark, sk->sk_uid);
>
>              dst = __sk_dst_get(sk);
>              if (!dst || !dst->obsolete ||
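
For anyone following along, the oif selection in the proposed change can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: struct model_dev, model_l3mdev_master_ifindex() and pmtu_lookup_oif() are hypothetical stand-ins and not kernel APIs. The model assumes l3mdev_master_ifindex() yields the VRF's ifindex for a device enslaved to a VRF (or for the VRF device itself) and 0 otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; only what the model needs. */
struct model_dev {
    int ifindex;        /* index of this device */
    int is_l3_master;   /* nonzero if this device is itself a VRF */
    int master_ifindex; /* index of the enslaving VRF, 0 if none */
};

/* Assumed behaviour of l3mdev_master_ifindex(); not the kernel code. */
static int model_l3mdev_master_ifindex(const struct model_dev *dev)
{
    if (dev->is_l3_master)
        return dev->ifindex;
    return dev->master_ifindex;
}

/*
 * Mirrors the patched logic: prefer the socket's bound device, and only
 * when the socket is unbound fall back to the VRF of the device the
 * ICMPv6 error arrived on.
 */
static int pmtu_lookup_oif(int sk_bound_dev_if, const struct model_dev *skb_dev)
{
    int oif = sk_bound_dev_if;

    if (!oif && skb_dev)
        oif = model_l3mdev_master_ifindex(skb_dev);

    return oif;
}

/* Usage: the cases discussed in the thread, with made-up ifindex values. */
static void check_pmtu_lookup_oif(void)
{
    struct model_dev in_vrf = { .ifindex = 3, .master_ifindex = 10 };
    struct model_dev no_vrf = { .ifindex = 2 };

    assert(pmtu_lookup_oif(0, &in_vrf) == 10); /* unbound socket: VRF is used */
    assert(pmtu_lookup_oif(7, &in_vrf) == 7);  /* bound socket: unchanged */
    assert(pmtu_lookup_oif(0, &no_vrf) == 0);  /* no VRF: default table as before */
    assert(pmtu_lookup_oif(0, NULL) == 0);     /* no device on the skb */
}
```

The point of the fallback is that an unbound socket still gets its PMTU update looked up in the VRF's table, while bound sockets and non-VRF setups behave as before.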