On Thursday 02 February 2006 00:37, Mitchell Blank Jr wrote:
> Jeff Garzik wrote:
> > Once packets are classified to be delivered to a specific local host
> > socket, what further operations require privs? What received packet
> > data cannot be exposed to userspace?
>
> You just need to make sure that you don't leak data from other people's
> sockets.

There are three basic ways I can see to do this:

- You have really advanced hardware which can potentially manage tens
  of thousands of hardware queues with full classification down to the
  ports. Then everything is great. But who has such hardware? Perhaps
  Leonid will do it, but I expect the majority of Linux users not to
  have access to it in the foreseeable future. Also, even with advanced
  hardware that can handle e.g. 50k sockets, what happens when you need
  100k for some extreme situation?

- You use some high-level, easy classifier to distinguish between
  classical "slower and isolated" streams and "fast and shared by
  everybody" streams. Let's say you use two IP addresses and program
  the NIC's hardware RX queues to distinguish them. Then you end up
  with two receive rings: a standard one managed in the classical way,
  and a netchannel one mapped into all applications running the user
  level TCP stack. This requires moderately advanced hardware (like a
  current XFrame and perhaps Tigon3?), but should be possible.

  One problem is that you will have to preallocate a lot of memory for
  the fast ring, because mapping new memory this way is relatively
  costly (potentially lots of TLB flushes on all CPUs). And of course
  the data will all be shared between all fast users. OK, assuming the
  internet is considered a rogue place these days, with sniffers
  everywhere, I guess that's not too bad: everybody interested in
  privacy should use encryption anyways. Still, maintaining the
  separate IP address as the high-level classification anchor would be
  somewhat of an administrator burden. You could avoid it by just
  putting all data into the fast ring and allowing everybody interested
  to mmap it, but I'm not sure it's a good idea to completely drop all
  backwards compatibility in "secure" stream isolation.

- You do classification to sockets in software in the interrupt handler
  and then copy the data once from the memory in the RX ring into a big
  preallocated buffer per netchannel consumer.
  That would work, but if the user space TCP stack is to emulate a
  standard read() interface, it would likely need to copy again to get
  the data into the place where the application expects it. This means
  you would have added an additional copy over the current stack, which
  is not good. Also, the question is how this classification would
  work, and whether it would really be faster than what we do today.

All the ways I described have severe drawbacks IMHO. Did I miss some
clever additional way?

-Andi
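
To make the third option and its extra copy concrete, here is a minimal user-space C sketch. It is only an illustration under assumptions, not kernel code: the names (`rx_frame`, `netchannel`, `chan_bind`, `chan_classify`, `chan_deliver`, `chan_read`), the sizes, and the port-only linear-scan classifier are all hypothetical. A real classifier would match the full connection tuple and use a hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCHAN      4     /* hypothetical number of netchannel consumers */
#define CHAN_BYTES 4096  /* hypothetical preallocated bytes per consumer */

/* Hypothetical, already-parsed view of one frame sitting in the RX ring. */
struct rx_frame {
    uint16_t dst_port;
    const uint8_t *payload;
    size_t len;
};

/* One preallocated buffer per consumer. Only this buffer would be mapped
 * into the owning process, so no other socket's data is visible to it. */
struct netchannel {
    uint16_t port;            /* 0 means the slot is unused */
    size_t used;              /* bytes currently queued in buf */
    uint8_t buf[CHAN_BYTES];
};

static struct netchannel channels[NCHAN];

/* Bind a local port to the first free channel; returns its index or -1. */
int chan_bind(uint16_t port)
{
    assert(port != 0);        /* 0 is reserved as the "unused" marker */
    for (int i = 0; i < NCHAN; i++) {
        if (channels[i].port == 0) {
            channels[i].port = port;
            channels[i].used = 0;
            return i;
        }
    }
    return -1;
}

/* Software classification: find the channel that owns this frame. */
int chan_classify(const struct rx_frame *f)
{
    for (int i = 0; i < NCHAN; i++)
        if (channels[i].port != 0 && channels[i].port == f->dst_port)
            return i;
    return -1;
}

/* Per-frame work of the interrupt handler in this model.
 * Returns the channel index, -1 if no consumer owns the frame (it would
 * go to the classical stack), or -2 if the consumer's buffer is full. */
int chan_deliver(const struct rx_frame *f)
{
    int i = chan_classify(f);
    if (i < 0)
        return -1;
    struct netchannel *c = &channels[i];
    if (f->len > CHAN_BYTES - c->used)
        return -2;
    memcpy(c->buf + c->used, f->payload, f->len);  /* copy 1: ring -> channel */
    c->used += f->len;
    return i;
}

/* read() emulation in the user-level stack: the second copy. */
size_t chan_read(int i, void *dst, size_t n)
{
    struct netchannel *c = &channels[i];
    if (n > c->used)
        n = c->used;
    memcpy(dst, c->buf, n);                        /* copy 2: channel -> app */
    memmove(c->buf, c->buf + n, c->used - n);      /* compact what is left */
    c->used -= n;
    return n;
}
```

In this model each consumer only ever sees its own buffer, which gives the isolation, but every payload byte is copied twice before the application can use it (`chan_deliver`, then `chan_read`). The cost of `chan_classify` also lands in the interrupt path.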