My apologies for not looking at this earlier I had an email hickup so I'm having to recreate the context from email archives, and you didn't copy me.
Have you seen my previous work in this direction? I know I had a much much more complete implementation. The only part I had not completed was iptables support and that was about a days more work. > The following patches create a private "network namespace" for use > within containers. This is intended for use with system containers > like vserver, but might also be useful for restricting individual > applications' access to the network stack. > > These patches isolate traffic inside the network namespace. The > network ressources, the incoming and the outgoing packets are > identified to be related to a namespace. > > It hides network resource not contained in the current namespace, but > still allows administration of the network with normal commands like > ifconfig. > > It applies to the kernel version 2.6.17-rc6-mm1 A couple of comments. > ------------ > - do unshare with the CLONE_NEWNET flag as root > - do echo eth0 > /sys/kernel/debug/net_ns/dev > - use ifconfig or ip command to set a new ip address > > What is missing ? > ----------------- > The routes are not yet isolated, that implies: > > - binding to another container's address is allowed > > - an outgoing packet which has an unset source address can > potentially get another container's address > > - an incoming packet can be routed to the wrong container if there > are several containers listening to the same addr:port I haven't looked at the patches in enough detail to see how the network isolation is being done exactly. But some of these comments and some of the other pieces I have seen don't give me warm fuzzies. In particular I did not see a provision for multiple instance of the loopback device. As a general rule network sockets and network devices should be attached to the network namespaces, which basically preserves all of the existing network stack logic. Basically this means that the only operations that get more expensive are reads of global variables, which take a necessary extra indirection. As a general rule I found that it usually makes sense to add an additional namespace field to hash tables so they can still use the boot time memory allocator. Although if you already have a network device as part of your hash table key that isn't necessary for the network stack. Eric - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html