On 14.06.2018 20:11, David Ahern wrote:
> On 6/14/18 6:38 AM, Kirill Tkhai wrote:
>> The following script makes kernel to crash since it can't obtain
>> a name for a device, when the name is occupied by another device:
>>
>> #!/bin/bash
>> ifconfig eth0 down
>> ifconfig eth1 down
>> index=`cat /sys/class/net/eth1/ifindex`
>> ip link set eth1 name dev$index
>> unshare -n sleep 1h &
>> pid=$!
>> while [[ "`readlink /proc/self/ns/net`" == "`readlink /proc/$pid/ns/net`" ]]; do continue; done
>> ip link set dev$index netns $pid
>> ip link set eth0 name dev$index
>> kill -9 $pid
>>
>> Kernel messages:
>>
>> virtio_net virtio1 dev3: renamed from eth1
>> virtio_net virtio0 dev3: renamed from eth0
>> default_device_exit: failed to move dev3 to init_net: -17
>> ------------[ cut here ]------------
>> kernel BUG at net/core/dev.c:8978!
>> invalid opcode: 0000 [#1] PREEMPT SMP
>> CPU: 1 PID: 276 Comm: kworker/u8:3 Not tainted 4.17.0+ #292
>> Workqueue: netns cleanup_net
>> RIP: 0010:default_device_exit+0x9c/0xb0
>> [stack trace snipped]
>>
>> This patch gives more variability during choosing new name
>> of device and fixes the problem.
>>
>> Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com>
>> ---
>>  net/core/dev.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 6e18242a1cae..6c9b9303ded6 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -8959,7 +8959,6 @@ static void __net_exit default_device_exit(struct net *net)
>>  	rtnl_lock();
>>  	for_each_netdev_safe(net, dev, aux) {
>>  		int err;
>> -		char fb_name[IFNAMSIZ];
>>
>>  		/* Ignore unmoveable devices (i.e. loopback) */
>>  		if (dev->features & NETIF_F_NETNS_LOCAL)
>> @@ -8970,8 +8969,7 @@ static void __net_exit default_device_exit(struct net *net)
>>  			continue;
>>
>>  		/* Push remaining network devices to init_net */
>> -		snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
>> -		err = dev_change_net_namespace(dev, &init_net, fb_name);
>> +		err = dev_change_net_namespace(dev, &init_net, "dev%d");
>>  		if (err) {
>>  			pr_emerg("%s: failed to move %s to init_net: %d\n",
>>  				 __func__, dev->name, err);
>>
>
> This could cause repeated looping over __dev_alloc_name. If init_net has
> a large number of devices, it is going to be a performance bottleneck.

Hm, but is this a likely case: a real device is moved to a net ns, so it
has to be moved back to init_net? It seems most devices moved to
!init_net are virtual, and they are just destroyed in
default_device_exit_batch(). Or do we have more devices to care about here?

I don't much want to insert something like the below here:

	if (__dev_get_by_name(&init_net, dev->name))
		snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
	err = dev_change_net_namespace(dev, &init_net, "dev%d");

because dev_change_net_namespace() is a generic interface that is used
not only here, and this would clutter the code with corner-case handling.

Maybe you have better ideas about this?

Kirill
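
To make the cost David mentions concrete, here is a simplified user-space model of resolving a "dev%d" template. It is only a sketch, not the kernel's __dev_alloc_name(): alloc_name_from_template(), MAX_UNITS and the names[] array are made up for illustration, and the real code works on the namespace's device list under RTNL. The point it shows is that each call needs one pass over every name already present in the target namespace before it can pick the lowest free unit.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ  16    /* same value as the kernel constant */
#define MAX_UNITS 1024  /* arbitrary cap chosen for this sketch */

/*
 * Hypothetical helper: pick the lowest free unit number for a template
 * such as "dev%d". names[] holds the n names already taken in the target
 * namespace. Returns the chosen unit and writes the resulting name to
 * out, or returns -1 if every unit below MAX_UNITS is taken.
 * Cost: one full pass over all n existing names per call.
 */
static int alloc_name_from_template(const char *const names[], size_t n,
                                    const char *tmpl, char out[IFNAMSIZ])
{
    bool used[MAX_UNITS] = { false };
    char buf[IFNAMSIZ];

    for (size_t i = 0; i < n; i++) {
        int unit;

        /* Skip names that do not match the template at all. */
        if (sscanf(names[i], tmpl, &unit) != 1)
            continue;
        if (unit < 0 || unit >= MAX_UNITS)
            continue;

        /* Round-trip check rejects near misses such as "dev03" or "dev3x". */
        snprintf(buf, sizeof(buf), tmpl, unit);
        if (strcmp(buf, names[i]) == 0)
            used[unit] = true;
    }

    /* Take the lowest unit that no existing name occupies. */
    for (int unit = 0; unit < MAX_UNITS; unit++) {
        if (!used[unit]) {
            snprintf(out, IFNAMSIZ, tmpl, unit);
            return unit;
        }
    }
    return -1;
}
```

In this model, a namespace that already contains dev3 (as init_net does after the script above renames eth0) would hand out some other free devN instead of failing with -EEXIST, at the price of scanning all existing names once per moved device.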