> From: "Lance Richardson" <lrich...@redhat.com>
> To: netdev@vger.kernel.org, pab...@redhat.com
> Sent: Monday, 29 May, 2017 1:25:57 PM
> Subject: [PATCH net] vxlan: eliminate cached dst leak
>
> After commit 0c1d70af924b ("net: use dst_cache for vxlan device"),
> cached dst entries could be leaked when more than one remote was
> present for a given vxlan_fdb entry, causing subsequent netns
> operations to block indefinitely and "unregister_netdevice: waiting
> for lo to become free." messages to appear in the kernel log.
>
> Fix by properly releasing cached dst and freeing resources in this
> case.
>
> Fixes: commit 0c1d70af924b ("net: use dst_cache for vxlan device")
> Signed-off-by: Lance Richardson <lrich...@redhat.com>
> ---

This problem was originally debugged and the patch tested in an OpenStack
(devstack) test environment. Here's a small(-ish) reproducer script that
was cooked up after posting:

ip netns add ns0
ip netns add ns1
ip netns add ns2

ip link add p0 type veth peer name p1
ip link add p2 type veth peer name p3
ip link add p4 type veth peer name p5

ip link add name br0 type bridge
ip link set br0 up

ip link set p0 master br0 up
ip link set p1 netns ns0
ip link set p2 master br0 up
ip link set p3 netns ns1
ip link set p4 master br0 up
ip link set p5 netns ns2

ip netns exec ns0 ip addr add "1.1.1.1/24" dev p1
ip netns exec ns0 ip link set dev p1 up
ip netns exec ns1 ip addr add "1.1.1.2/24" dev p3
ip netns exec ns1 ip link set dev p3 up
ip netns exec ns2 ip addr add "1.1.1.3/24" dev p5
ip netns exec ns2 ip link set dev p5 up

ip netns exec ns0 ip link add vxlan0 type vxlan dstport 4789 id 10 dev p1
ip netns exec ns0 ip addr add "4.1.1.1/24" dev vxlan0
ip netns exec ns0 ip link set dev vxlan0 up mtu 1450

ip netns exec ns1 ip link add vxlan1 type vxlan dstport 4789 id 10 dev p3
ip netns exec ns1 ip addr add "4.1.1.2/24" dev vxlan1
ip netns exec ns1 ip link set dev vxlan1 up mtu 1450

ip netns exec ns2 ip link add vxlan2 type vxlan dstport 4789 id 10 dev p5
ip netns exec ns2 ip addr add "4.1.1.3/24" dev vxlan2
ip netns exec ns2 ip link set dev vxlan2 up mtu 1450

# Create a vxlan default fdb entry with two remotes in the list
ip netns exec ns0 bridge fdb append to 00:00:00:00:00:00 dst 1.1.1.2 dev vxlan0
ip netns exec ns0 bridge fdb append to 00:00:00:00:00:00 dst 1.1.1.3 dev vxlan0

# Forward some packets to populate dst cache for default fdb
ip netns exec ns0 ping -c 2 4.1.1.2
ip netns exec ns0 ping -c 2 4.1.1.3

# delete one of the entries in the fdb remotes list to trigger the bug
ip netns exec ns0 bridge fdb del to 00:00:00:00:00:00 dst 1.1.1.3 dev vxlan0

ip netns del ns2
ip netns del ns1
ip netns del ns0

# If bug is triggered, kernel messages similar to this should be logged:
#
# kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
#
# Netns commands like "ip netns add ns3" will hang indefinitely.
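
For anyone scripting the reproducer, the symptom described above can be checked for mechanically rather than by eye. The following is only a rough sketch, not part of the patch: the helper name has_dst_leak_symptom is made up for illustration, and it assumes the kernel log text is fed to it on stdin (for example from dmesg, which may require root).

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): reads kernel log text on
# stdin and returns 0 if the "waiting for <dev> to become free" message
# quoted in the reproducer comments is present, non-zero otherwise.
has_dst_leak_symptom() {
    grep -q 'unregister_netdevice: waiting for .* to become free'
}

# Example use after running the reproducer (left commented out because
# reading the kernel log usually needs elevated privileges):
#
#   sleep 15    # assumption: give the kernel time to emit the message
#   if dmesg | has_dst_leak_symptom; then
#       echo "leak symptom present"
#   else
#       echo "no leak symptom seen"
#   fi
```

The check only looks for the log message; it does not detect the hang in later netns commands, so a clean result is weaker evidence than a positive one.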