On 2/20/19 12:18 PM, Paolo Abeni wrote:
> When a netdevice is unregistered, we flush the relevant exception
> via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
>
> Finally, we end-up calling rt6_remove_exception(), where we release
> the relevant dst, while we keep the references to the related fib6_info and
> dev. Such references should be released later when the dst will be
> destroyed.
>
> There are a number of caches that can keep the exception around for an
> unlimited amount of time - namely dst_cache, possibly even socket cache.
> As a result device registration may hang, as demonstrated by this script:
>
> ip netns add cl
> ip netns add rt
> ip netns add srv
> ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
>
> ip link add name cl_veth type veth peer name cl_rt_veth
> ip link set dev cl_veth netns cl
> ip -n cl link set dev cl_veth up
> ip -n cl addr add dev cl_veth 2001::2/64
> ip -n cl route add default via 2001::1
>
> ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1
> hoplimit 64 dev cl_veth
> ip -n cl link set tunv6 up
> ip -n cl addr add 2013::2/64 dev tunv6
>
> ip link set dev cl_rt_veth netns rt
> ip -n rt link set dev cl_rt_veth up
> ip -n rt addr add dev cl_rt_veth 2001::1/64
>
> ip link add name rt_srv_veth type veth peer name srv_veth
> ip link set dev srv_veth netns srv
> ip -n srv link set dev srv_veth up
> ip -n srv addr add dev srv_veth 2002::1/64
> ip -n srv route add default via 2002::2
>
> ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2
> hoplimit 64 dev srv_veth
> ip -n srv link set tunv6 up
> ip -n srv addr add 2013::1/64 dev tunv6
>
> ip link set dev rt_srv_veth netns rt
> ip -n rt link set dev rt_srv_veth up
> ip -n rt addr add dev rt_srv_veth 2002::2/64
>
> ip netns exec srv netserver & sleep 0.1
> ip netns exec cl ping6 -c 4 2013::1
> ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
> ip -n rt link set dev rt_srv_veth mtu 1400
> wait %2
>
> ip -n cl link del cl_veth
>
> This commit addresses the issue purging all the references held by the
> exception at time, as we currently do for e.g. ipv6 pcpu dst entries.
>
> v1 -> v2:
> - re-order the code to avoid accessing dst and net after dst_dev_put()
>
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst
> based routes")
> Signed-off-by: Paolo Abeni <pab...@redhat.com>
> ---
> net/ipv6/route.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>

I am surprised this was not found by the existing pmtu script, which
creates exceptions. Please add this test case to selftests to capture
this specific set of events.

Reviewed-by: David Ahern <dsah...@gmail.com>

Thanks for resolving this.
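
The reference chain described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code in net/ipv6/route.c: every `toy_*` name and `dev_refs_after_removal()` is hypothetical, the fib6_info reference is left out, and the early drop is only loosely modeled on the idea behind dst_dev_put() (the real helper differs in detail). It shows why a device stays pinned when a cache still holds the exception's dst, and why purging the device reference at removal time avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's real structures. */
struct toy_dev {
	int refcnt;
};

struct toy_dst {
	int refcnt;           /* holders: exception table, caches */
	struct toy_dev *dev;  /* pinned device, NULL once dropped */
};

static void toy_dev_hold(struct toy_dev *dev)
{
	dev->refcnt++;
}

static void toy_dev_put(struct toy_dev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}

/* Drop the dst's device reference early (hypothetical helper). */
static void toy_dst_drop_dev(struct toy_dst *dst)
{
	if (dst->dev) {
		toy_dev_put(dst->dev);
		dst->dev = NULL;
	}
}

/* Release one holder; the last holder releases whatever is still pinned. */
static void toy_dst_release(struct toy_dst *dst)
{
	assert(dst->refcnt > 0);
	if (--dst->refcnt == 0)
		toy_dst_drop_dev(dst);
}

/* Remove the exception, optionally purging its device reference first. */
static void toy_remove_exception(struct toy_dst *dst, bool purge)
{
	if (purge)
		toy_dst_drop_dev(dst);
	toy_dst_release(dst); /* the exception table's reference */
}

/*
 * Returns how many references still pin the device after the exception
 * is removed while one cache-like holder keeps the dst alive.
 */
int dev_refs_after_removal(bool purge)
{
	struct toy_dev dev = { .refcnt = 1 }; /* registration's own reference */
	struct toy_dst dst = { .refcnt = 2, .dev = &dev }; /* table + cache */

	toy_dev_hold(&dev); /* the dst pins the device */
	toy_remove_exception(&dst, purge);

	return dev.refcnt - 1; /* references beyond the registration's own */
}
```

In this model, `dev_refs_after_removal(false)` leaves one stray reference on the device, which is the situation in which unregistration would keep waiting, while `dev_refs_after_removal(true)` leaves none even though the cache-like holder still owns the dst.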