On Fri, 2016-12-30 at 07:51 +0100, Vincent Bernat wrote:
> The same work is not repeated over and over again. The kernel keeps
> the needed information in a structure to avoid parsing the packet
> several times.
Yes, it does indeed keep the offsets of the headers for the various
protocol layers (link, IP, transport) in the skbuff. And yes, it uses
them where it can - for example, routing knows it will be looking at
the destination IP address, so the route lookup can go straight to
that field. Netfilter (iptables) has a separate block of code for each
test, so it can use them too: each test knows what it is looking for.
Unfortunately that comes at a large cost - sequential execution. In
the traffic control engine the main filter, u32, doesn't have a clue
what you are looking at, so it can't use them. nftables could
conceivably do it, but internally it is structured like u32, so it
doesn't. eBPF could also conceivably do it, but it has even less idea
of what it is looking at than u32, so it doesn't either.

Linux provides what, two(?) APIs for manipulating files - read/write
and memory-mapped I/O. Want to count the number of ways it provides to
dissect and act upon network packets? These aren't the esoteric things
the file system hides under ioctls either (which is arguably how its
main API remains so clean) - all these ways of pulling apart and
manipulating packets are in the fast path.

> When you need to decide how to route the packet, you need to do a
> route lookup. If the route entry you find happens to be a blackhole
> route, you drop the packet. You didn't do any additional work.

I bet the bash authors use that argument when they add another
feature. In reality all code has a cost. Do you remember Van
Jacobson's suggestion for speeding up Linux's network stack? (It's
mentioned on his Wikipedia page.) I was lucky enough to be at
linux.conf.au when he first presented it. He provided working code and
some very impressive benchmarks. It went nowhere because of that cost
thing - the code doesn't have to be executed in order to slow down
development.

I think Linux's networking mob pretty much gave up after the ipchains
to iptables transition. Now stuff doesn't get ripped out and replaced;
instead new ideas are bolted onto the side, creating a bigger ball of
wax that's even harder to move. Which is how we got to having the
ability to drop packets added in so many different places.

> Those benchmarks show huge differences between Linux distributions
> for a kernel-related task. Like all phoronix benchmarks, it should
> not be trusted.

Maybe you trust Facebook's opinion instead (it's weird - that wasn't
easy to write): http://www.theinquirer.net/inquirer/news/2359272/f
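
PS: for anyone following along, here is roughly what that header-offset
bookkeeping looks like. This is a trimmed userspace sketch of the idea,
not the kernel's actual code - the real struct lives in
include/linux/skbuff.h, has dozens more members, and its accessors are
named skb_mac_header(), skb_network_header() and so on:

    #include <stdint.h>

    /* Sketch: the dissector fills these offsets in once, and every
     * later consumer reuses them instead of re-parsing the packet. */
    struct sk_buff_sketch {
            uint8_t  *head;             /* start of the packet buffer       */
            uint16_t  mac_header;       /* offset of the link-layer header  */
            uint16_t  network_header;   /* offset of the IP header          */
            uint16_t  transport_header; /* offset of the TCP/UDP/... header */
    };

    /* The kernel's accessors are essentially one-liners over the
     * cached offsets, roughly: */
    static inline uint8_t *network_header(const struct sk_buff_sketch *skb)
    {
            return skb->head + skb->network_header;
    }

    /* So a consumer that knows it wants the IPv4 destination address
     * can jump straight to it rather than starting from the link
     * layer (daddr sits at offset 16 in an IPv4 header): */
    static inline const uint8_t *ipv4_daddr(const struct sk_buff_sketch *skb)
    {
            return network_header(skb) + 16;
    }

Which is exactly the point above: routing and netfilter can exploit
those offsets because each knows which field it is after, while a
generic byte-matcher like u32 only ever sees "bytes at offset N" and
can't.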