From: Cong Wang
> Sent: 01 February 2017 17:20
> On Tue, Jan 31, 2017 at 9:57 AM, David Laight <david.lai...@aculab.com> wrote:
> > From: Cong Wang
> >> Sent: 31 January 2017 17:38
> >> On Tue, Jan 31, 2017 at 7:41 AM, David Laight <david.lai...@aculab.com> wrote:
> >> > Commit 26abe1437 changed sock_create_kern() so that it stopped
> >> > holding a reference to the network namespace.
> >> > The rational seemed to be 'to allow to stop it' (presumably 'be
> >> > deleted').
> >> > Prior to this change some kernel paths used sk_change_net() (etc) to
> >> > change the namespace after the socket was created.
> >> >
> >> > If the socket doesn't hold a reference to the namespace, what actually
> >> > happens when the namespace is deleted?
> >>
> >> Kernel socket should have the same lifetime with the net namespace,
> >> that is, created in net_init and released in net_exit. Think about it, if it
> >> really held a refcnt to this netns, how could this netns be teared down?
> >
> > That rather depends on what they are being used for.
> > Consider something like an in kernel ftp client, it doesn't really care
> > about namespaces except in as much as the connections it creates must
> > be inside the correct namespace.
> > The namespace shouldn't be torn down while that connection exists any more
> > than it should be torn down while a user process has an open connection.
> > (Listening sockets are likely to be more of a problem.)
>
> If you don't care about netns, why not just use init_net which is never
> torn down and make your kernel socket global so that each netns
> can access it too?

If I create the kernel socket in init_net the connections don't work.
In particular, a connection to 127.0.0.1 fails when the peer is a process
started in a different namespace (one which contains an external ethernet
port).
So I care enough about namespaces to have to create sockets in the right one.
I don't care about namespaces being created or deleted.

The connections do work if I save the net_ns from a 'random' open of the
driver (from a process that happens to be running in the right namespace).
However that just proves the kernel socket needs to be in the right namespace.
It isn't a real solution, and I can't hold a reference count on the
namespace at all (well, I could call sock_create() and hold it via a user
socket).

As a matter of interest, a process can change namespace by doing:
    setns(open("/var/run/netns/namespace",...),...)
How can it select init_net?

        David
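
For reference, the user-space namespace switch mentioned above can be sketched roughly as follows. This is a minimal sketch, not code from this thread: the helper names netns_path() and enter_netns() are hypothetical, the /var/run/netns/ prefix is simply the path quoted in the mail, and the caller is assumed to have CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "/var/run/netns/<name>" into buf.
 * Returns 0 on success, -1 if the name is empty, contains a '/',
 * or the result does not fit in len bytes.
 */
static int netns_path(const char *name, char *buf, size_t len)
{
    if (name[0] == '\0' || strchr(name, '/') != NULL)
        return -1;
    int n = snprintf(buf, len, "/var/run/netns/%s", name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: move the calling thread into the network
 * namespace referred to by path. Returns 0 on success, -1 on error
 * (errno is set by open() or setns()).
 */
static int enter_netns(const char *path)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    /* CLONE_NEWNET makes setns() fail unless fd is a network namespace. */
    int ret = setns(fd, CLONE_NEWNET);
    close(fd);
    return ret;
}
```

A caller would build the path with netns_path("namespace", buf, sizeof buf) and pass it to enter_netns(). On the question of selecting the initial namespace: any /proc/<pid>/ns/net file can be passed to enter_netns() as well, and /proc/1/ns/net usually refers to the initial network namespace, assuming PID 1 has not itself been moved out of it.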