Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings
On Wed, 10 Jun 2015 19:25:58 -0700 Guy Harris wrote:
> ...with some way of preventing infinite loops in the kernel, even if
> it's as crude as "there's a pointer into the packet and if you do a
> backwards jump without moving that pointer forwards and checking to
> make sure you haven't gone beyond the end of the packet, the filter
> program immediately fails". (Yes, that means it's no longer
> Turing-complete, as there's no longer a halting problem. :-))

That's exactly what my LOOP instruction suggestion does.

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
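The rule Guy describes in the message above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the names LoopGuard, FilterRejected and count_iterations are invented for illustration and do not correspond to any real BPF implementation or to the actual LOOP proposal. It shows why the rule bounds the number of backward jumps by the packet length.

```python
from dataclasses import dataclass


class FilterRejected(Exception):
    """Raised when the toy filter breaks the backward-jump rule (hypothetical)."""


@dataclass
class LoopGuard:
    """Toy model of the rule: no backward jump without forward progress."""
    packet_len: int
    cursor: int = 0            # current pointer into the packet
    last_jump_cursor: int = 0  # cursor value seen at the previous backward jump

    def advance(self, nbytes: int) -> None:
        # The pointer may only move forwards, and never past the packet end.
        if nbytes <= 0:
            raise FilterRejected("cursor must move forwards")
        if self.cursor + nbytes > self.packet_len:
            raise FilterRejected("cursor moved past end of packet")
        self.cursor += nbytes

    def backward_jump(self) -> None:
        # A backward jump is only legal if the cursor advanced since the last one.
        if self.cursor <= self.last_jump_cursor:
            raise FilterRejected("backward jump without progress")
        self.last_jump_cursor = self.cursor


def count_iterations(packet_len: int, step: int) -> int:
    """Run a loop that advances `step` bytes per pass until the guard stops it."""
    guard = LoopGuard(packet_len)
    iterations = 0
    try:
        while True:
            guard.advance(step)
            guard.backward_jump()
            iterations += 1
    except FilterRejected:
        return iterations
```

Because the cursor must grow by at least one byte before each backward jump, a program can take at most packet_len backward jumps, so every program terminates.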
Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings
On 11/06/2015 1:08 AM, Paul "LeoNerd" Evans wrote:
> On Wed, 10 Jun 2015 23:17:20 +1000 Darren Reed wrote:
> > BPF & IPv6
> > ----------
> > The problem with IPv6 and BPF is that the transport header (TCP, UDP,
> > etc) can have a number of extension headers between it and the
> > network header that is present for IPv6. There are no hints in the
> > IPv6 header as to how many of these extension headers there are, or
> > how many bytes the extension header(s) take up. This leaves BPF in a
> > precarious situation because it cannot be reliably used to match on
> > layer 4 packets. What's missing is the ability to either find a
> > specific header after the IPv6 network header or just to determine
> > what the last one is.
> > ...
>
> If you're considering extending BPF to better suit IPv6, have you seen
> either of my proposed ideas?
>
> 1) Add a LOOP instruction that allows certain kinds of
>    backward-directed jumps, in order to efficiently implement the IPv6
>    header-chain walking without needing manual loop unrolling, while
>    still giving static guarantees about eventual termination of the
>    program.

I haven't seen much of an appetite for any sort of loop construct in
any of the changes or discussions around BPF. Anywhere. It is often
brought up, but the point about a BPF program being easily verifiable
is always raised in response.

> 2) A few more AD constants added to the Linux "auxdata" area, giving
>    information about the transport layer.

Can you please expand on this?

Darren
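For readers unfamiliar with the header-chain problem discussed above, the walk that a filter would have to perform can be sketched in ordinary code. This is a minimal illustration, not BPF: the function name find_transport_header is made up for this sketch, the packet is assumed to start at the IPv6 header, and only the common extension header types are handled (ESP, for example, is treated as the end of the chain).

```python
from typing import Tuple

# IPv6 extension header types that carry a "next header" byte at offset 0.
HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS = 0, 43, 44, 51, 60
IPV6_FIXED_HEADER_LEN = 40


def find_transport_header(packet: bytes) -> Tuple[int, int]:
    """Return (protocol, offset) of the first non-extension header.

    Raises ValueError if the chain runs past the end of the packet.
    Note: for a non-first fragment the returned offset does not actually
    point at a transport header; this sketch does not check for that.
    """
    if len(packet) < IPV6_FIXED_HEADER_LEN:
        raise ValueError("truncated IPv6 header")
    next_header = packet[6]          # Next Header field of the fixed header
    offset = IPV6_FIXED_HEADER_LEN
    while next_header in (HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS):
        if offset + 8 > len(packet):
            raise ValueError("truncated extension header")
        if next_header == FRAGMENT:
            length = 8                               # fixed size
        elif next_header == AUTH:
            length = (packet[offset + 1] + 2) * 4    # counted in 4-octet units
        else:
            length = (packet[offset + 1] + 1) * 8    # counted in 8-octet units
        next_header = packet[offset]
        offset += length             # always moves forward by at least 8 bytes
    return next_header, offset
```

Every pass through the loop moves the offset forward and checks it against the packet length, which is the same shape of loop that the backward-jump restrictions discussed in this thread are meant to permit.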
Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings
On 11/06/2015 9:31 AM, Mindaugas Rasiukevicius wrote:
> Darren Reed wrote:
> > Extending BPF
> > =============
> > Introduction
> > BPF was originally designed to provide very fast packet matching
> > capabilities for IPv4 but, as a result of its generic nature, is
> > capable of being used for just about any protocol. With IPv6 the
> > limitations of BPF became apparent.
> > ...
>
> Conceptually, I like the idea of an extended BPF instruction set.
> There are several important questions here. First, what is the exact
> problem we want to solve with a new instruction set? Is it just the
> IPv6 handling?
>
> I do find the BPF byte-code useful as a general purpose instruction
> set, i.e. for use as a universal virtual machine. This was also one of
> the driving forces behind the Linux eBPF. They use it beyond packet
> filtering. Note that LLVM has recently gained support for the eBPF
> backend. If that is the objective, then there is a wider spectrum of
> requirements. Specifically, I would like to see:
>
> - Capability to jump backwards. Basically, the general purpose
>   instruction set ought to be Turing-complete. Obviously, with a way
>   to enable/disable this depending on whether the user needs
>   bpf_validate().

Do you have any thoughts on what sort of conditions this should be
allowed in? Guy's suggestion is intriguing and suggests to me something
like this:

  If condition is true, jump (either forward or backward?) AND move the
  start of packet pointer forward by # > 0 bytes.

Rather than try to detect the move of the start of packet pointer,
force it to occur as part of the loop instruction?

> - 32-bit jump offsets. Currently, they are 16-bit which is quite
>   limiting if you have a larger BPF program.

Yes, agreed.

> - Opcode extended to 32-bits. It seems we agree on this, although this
>   can be debatable. The classic BPF byte-code has a simple,
>   minimalistic RISC-like instruction set (with the exception of the
>   BPF_MSH hack). I would be inclined to keep it that way instead of
>   polluting the quite limited instruction space with various arbitrary
>   mechanisms, but this is a somewhat philosophical RISC vs CISC
>   debate. Nevertheless, if the general feeling is to go with complex
>   instructions, then we could at least dedicate a wide range for them.

What are RISC and CISC these days but a layer over microcode?

Consider that when BPF was developed, 32MB of RAM was a lot of memory
and the use of long options with getopt was almost unheard of. Today
the size of the instruction might almost be considered a hindrance to
performance, as accessing the individual bytes is no longer performance
friendly and will often result in the CPU fetching a complete word
anyway. Thus the compact size of the BPF instruction is no longer the
benefit it once was.

> - Support for 64-bit words, but not quite convinced about 128-bit
>   words. Do you want to add them just to accommodate IPv6? Why not
>   leave this for the byte-code generator/compiler?

Considering Michael's comments about concentrating on what opcodes the
compiler should generate, it may be a better idea to focus on 128-bit
words rather than 64-bit words, because it is rather elegant to have
the compiler produce instructions that allow the address to be used in
a single instruction rather than needing several. For example, it is
easier to mentally inspect and verify the following:

    ldq  [32]
    jeqq #0x01 jt 2 jf 3

than to try to follow half a dozen instructions. Now when it comes to
making changes to those values, e.g. to apply a netmask:

    ldq  [32]
    andq #0xfff0
    jeqq #0x200200010a90

How many instructions does BPF need to emit today to do that? Which is
easier to understand?

Not only that, but if you are writing manual instructions to do math
(such as addition or subtraction), it becomes harder again to ensure
that what's happening is correct. Do you really want to do multiple
32-bit instructions with BPF to represent adding together two 128-bit
values? I would rather have instructions with larger operands that are
easier for the parser to generate and let the interpreter (or JIT)
worry about how to execute them.

> - External scratch memory store or a way to initialise it before
>   calling the BPF program. Also, potentially arbitrary (dynamic)
>   BPF_MEMWORDS size rather than hardcoded size. Basically, the
>   user/caller should be able to provide arbitrary data through the
>   memory store.

What's the end goal here?

Thinking on the fly here, what if the BPF program had a data section at
the end of it or at the beginning? Although with JIT this may be
starting to wander down the road of inventing an object file format,
and that might be a bit too far.

> - BPF_COP and BPF_CALL. When I added BPF_COP to NetBSD, I thought
>   about the generic BPF_CALL to invoke *arbitrary* functions. It
>   requires solving some of the above problems first, but there is an
>   important difference: BPF_COP allows a program to invoke a
>   *predetermined* set of func
Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings
On Thu, 11 Jun 2015 20:12:00 +1000 Darren Reed wrote:
> > 2) A few more AD constants added to the Linux "auxdata" area,
> > giving information about the transport layer.
>
> Can you please expand on this?

See the SKF_NET_OFF and SKF_LL_OFF constants.

I wanted to simply add another, SKF_TRANS_OFF

This would give an offset into a virtual view of the "transport" layer;
i.e. the start of the TCP/UDP/whatever header, regardless where it
starts in the packet.

Now, filtering for a given TCP port only needs to compare the value of
SKF_AD_TRANSPORT (which we'd also have to add), and then look at
certain indexes into SKF_TRANS_OFF; it doesn't have to *find* the TCP
header at all, doesn't care if it's IPv4 or IPv6 or whatever...

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS
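To make the proposal above concrete, here is a rough toy model of how negative load offsets select a base within the packet. The values of SKF_AD_OFF, SKF_NET_OFF and SKF_LL_OFF are the ones defined in the Linux header linux/filter.h. Everything else is an assumption for illustration: SKF_TRANS_OFF does not exist in the kernel, its value here is invented, the function names are hypothetical, and the proposed SKF_AD_TRANSPORT protocol check is not modelled at all.

```python
# Values as defined in <linux/filter.h>.
SKF_AD_OFF = -0x1000
SKF_NET_OFF = -0x100000
SKF_LL_OFF = -0x200000
# HYPOTHETICAL: only proposed in this thread; the value is invented here.
SKF_TRANS_OFF = -0x300000


def resolve_offset(k: int, ll_start: int, net_start: int, trans_start: int) -> int:
    """Map a load offset `k` to an absolute index into the frame (toy model)."""
    if k >= 0:
        return k                                  # ordinary packet offset
    if k >= SKF_AD_OFF:
        raise ValueError("ancillary data, not a packet offset")
    if k >= SKF_NET_OFF:
        return net_start + (k - SKF_NET_OFF)      # relative to network header
    if k >= SKF_LL_OFF:
        return ll_start + (k - SKF_LL_OFF)        # relative to link-layer header
    if k >= SKF_TRANS_OFF:
        return trans_start + (k - SKF_TRANS_OFF)  # hypothetical transport view
    raise ValueError("offset out of range")


def load_half(frame: bytes, k: int, ll_start: int, net_start: int,
              trans_start: int) -> int:
    """Load a big-endian 16-bit value at the resolved offset."""
    pos = resolve_offset(k, ll_start, net_start, trans_start)
    if pos < 0 or pos + 2 > len(frame):
        raise IndexError("load outside packet")
    return int.from_bytes(frame[pos:pos + 2], "big")


def tcp_dst_port(frame: bytes, ll_start: int, net_start: int,
                 trans_start: int) -> int:
    """Read the TCP destination port without locating the TCP header."""
    return load_half(frame, SKF_TRANS_OFF + 2, ll_start, net_start, trans_start)
```

In this model the same load, SKF_TRANS_OFF + 2, reads the destination port whether the frame is IPv4 or IPv6 with extension headers, because the caller (standing in for the kernel) supplies trans_start.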
Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings
On Thu, 11 Jun 2015 21:05:20 +1000 Darren Reed wrote:
> I would rather have instructions with larger operands that are easier
> for the parser to generate and let the interpreter (or JIT) worry
> about how to execute them.

+1

BPF is supposed to be a high-level interface to describe some sort of
virtual program that doesn't have to concern itself with the mundane
trivialities of how silicon actually implements it.

--
Paul "LeoNerd" Evans
leon...@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS
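As a rough illustration of what "letting the interpreter worry about it" could mean, the sketch below lowers a 128-bit masked comparison and a 128-bit addition to 32-bit words. It is plain Python rather than BPF, the function names are invented for this example, and it makes no claim about how any real interpreter or JIT would implement the proposed wide instructions.

```python
from typing import List

MASK32 = 0xFFFFFFFF


def split_words(value: int) -> List[int]:
    """Split a 128-bit integer into four 32-bit words, most significant first."""
    return [(value >> shift) & MASK32 for shift in (96, 64, 32, 0)]


def masked_equal_128(field: int, mask: int, expected: int) -> bool:
    """Compute (field & mask) == expected one 32-bit word at a time.

    Each word needs its own load, AND and compare, which is the work a
    single wide load/and/compare sequence would hide from the filter author.
    """
    for f, m, e in zip(split_words(field), split_words(mask), split_words(expected)):
        if (f & m) != e:
            return False
    return True


def add_128(a: int, b: int) -> int:
    """Add two 128-bit values using 32-bit words and an explicit carry."""
    carry = 0
    result = 0
    for shift in (0, 32, 64, 96):        # least significant word first
        total = ((a >> shift) & MASK32) + ((b >> shift) & MASK32) + carry
        result |= (total & MASK32) << shift
        carry = total >> 32
    return result                        # wraps modulo 2**128
```

Written out by hand in a 32-bit instruction set, the comparison alone takes a separate load, mask and test for each of the four words, which is the bookkeeping the larger operands are meant to remove.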
Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings
"Paul \"LeoNerd\" Evans" wrote: >> > 2) A few more AD constants added to the Linux "auxdata" area, >> > giving information about the transport layer. >> >> Can you please expand on this? > See the SKF_NET_OFF and SKF_LL_OFF constants. > I wanted to simply add another, SKF_TRANS_OFF > This would give an offset into a virtual view of the "transport" layer; > i.e. the start of the TCP/UDP/whatever header, regardless where it > starts in the packet. > Now, filtering for a given TCP port only needs to compare the value of > SKF_AD_TRANSPORT (which we'd also have to add), and then look at > certain indexes into SKF_TRANS_OFF; it doesn't have to *find* the TCP > header at all, doesn't care if it's IPv4 or IPv6 or whatever... Is Linux even going to set that if it's for a VLAN or an IP address that is not recognized as local? -- ] Never tell me the odds! | ipv6 mesh networks [ ] Michael Richardson, Sandelman Software Works| network architect [ ] m...@sandelman.ca http://www.sandelman.ca/| ruby on rails[ ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers