Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-11 Thread Paul "LeoNerd" Evans
On Wed, 10 Jun 2015 19:25:58 -0700
Guy Harris  wrote:

> ...with some way of preventing infinite loops in the kernel, even if
> it's as crude as "there's a pointer into the packet and if you do a
> backwards jump without moving that pointer forwards and checking to
> make sure you haven't gone beyond the end of the packet, the filter
> program immediately fails".  (Yes, that means it's no longer
> Turing-complete, as there's no longer a halting problem. :-))

That's exactly what my LOOP instruction suggestion does.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
http://www.leonerd.org.uk/  |  https://metacpan.org/author/PEVANS
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-11 Thread Darren Reed

On 11/06/2015 1:08 AM, Paul "LeoNerd" Evans wrote:

On Wed, 10 Jun 2015 23:17:20 +1000
Darren Reed  wrote:


BPF & IPv6
----------
The problem with IPv6 and BPF is that the transport header (TCP,
UDP, etc) can have a number of extension headers between it and
the network header that is present for IPv6. There are no hints in
the IPv6 header as to how many of these extension headers there
are, or how many bytes the extension header(s) take up. This leaves
BPF in a precarious situation because it cannot be reliably used to
match on layer 4 packets. What's missing is the ability to either
find a specific header after the IPv6 network header or just to
determine what the last one is.
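
To make the header-chain problem concrete, here is a minimal Python sketch of the walk a filter has to perform before it can look at a TCP or UDP port. The function name `find_transport` and the choice of which extension headers to skip are assumptions made for illustration; the length encodings follow the IPv6 and AH header layouts. It ignores the case of a non-first fragment, where no transport header is present at all.

```python
# IPv6 Next Header values for the extension headers this sketch skips.
HOP_BY_HOP = 0
ROUTING = 43
FRAGMENT = 44
AH = 51
DEST_OPTS = 60
EXTENSION_HEADERS = (HOP_BY_HOP, ROUTING, FRAGMENT, AH, DEST_OPTS)

IPV6_HEADER_LEN = 40


def find_transport(packet):
    """Return (protocol, offset) of the first non-extension header.

    `packet` is a bytes object that starts at the fixed IPv6 header.
    Returns None if the chain runs past the end of the packet.
    """
    if len(packet) < IPV6_HEADER_LEN:
        return None
    next_header = packet[6]       # Next Header field of the fixed header
    offset = IPV6_HEADER_LEN
    while next_header in EXTENSION_HEADERS:
        if offset + 2 > len(packet):
            return None           # truncated extension header
        following = packet[offset]
        if next_header == FRAGMENT:
            ext_len = 8                               # fixed size
        elif next_header == AH:
            ext_len = (packet[offset + 1] + 2) * 4    # 4-byte units
        else:
            ext_len = (packet[offset + 1] + 1) * 8    # 8-byte units
        next_header = following
        offset += ext_len         # always advances by at least 8 bytes
    if offset > len(packet):
        return None
    return next_header, offset
```

Every pass through the loop moves `offset` forward, so the walk is bounded by the packet length. Classic BPF cannot express this loop directly, which is why the walk has to be unrolled by hand today.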

...

If you're considering extending BPF to better suit IPv6, have you seen
either of my proposed ideas?

  1) Add a LOOP instruction that allows certain kinds of
 backward-directed jumps, in order to efficiently implement the IPv6
 header-chain walking without needing manual loop unrolling, while
 still giving static guarantees about eventual termination of the
 program.


I haven't seen much of an appetite for any sort of loop construct in any
of the changes or discussions around BPF. Anywhere. It is often brought
up, but the need for a BPF program to be easily verified is always raised
in response.



  2) A few more AD constants added to the Linux "auxdata" area, giving
 information about the transport layer.


Can you please expand on this?


Darren



Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-11 Thread Darren Reed

On 11/06/2015 9:31 AM, Mindaugas Rasiukevicius wrote:

Darren Reed  wrote:

Extending BPF
=============

Introduction

BPF was originally designed to provide very fast packet matching
capabilities for IPv4 but as a result of its generic nature, is
capable of being used for just about any protocol. With IPv6 the
limitations of BPF became apparent.

...

Conceptually, I like the idea of an extended BPF instruction set.  There
are several important questions here.  First, what is the exact problem we
want to solve with a new instruction set?  Is it just the IPv6 handling?

I do find the BPF byte-code useful as a general purpose instruction set
i.e. for the use as a universal virtual machine.  This was also one of the
driving forces behind the Linux eBPF.  They use it beyond packet filtering.
Note that LLVM has recently gained support for the eBPF backend.  If that
is the objective, then there is a wider spectrum of the requirements.

Specifically, I would like to see:

- Capability to jump backwards.  Basically, the general purpose instruction
set ought to be Turing-complete.  Obviously, with a way to enable/disable this
depending on whether the user needs bpf_validate().


Do you have any thoughts on what sort of conditions this should be allowed in?


Guy's suggestion is intriguing and suggests to me something like this:

If condition is true, jump (either forward or backward?) AND
move the start of packet pointer forward by # > 0 bytes.

Rather than try and detect the move of the start of packet pointer,
force it to occur as part of the loop instruction?
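
One way to read that suggestion, sketched as a toy interpreter in Python. The instruction names, the tuple encoding, and the `run` function are all invented for illustration and are not a proposed encoding. The only backward jump is a "loop" instruction that carries an advance count, and the interpreter moves the start-of-packet pointer forward as part of taking the jump. Running off the end of the packet fails the filter, as in Guy's description.

```python
def run(program, packet):
    """Run a toy filter and return its result, or 0 on any fault.

    Instructions are tuples (hypothetical, for illustration only):
      ("ldb", k)              A = byte at base + k
      ("jeq", k, jt, jf)      forward-only relative jump on A == k
      ("loop", k, back, adv)  if A != k: base += adv, jump back `back`
      ("ret", k)              return k
    """
    base = 0   # start-of-packet pointer
    acc = 0    # accumulator (A)
    pc = 0
    while 0 <= pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op == "ldb":
            addr = base + ins[1]
            if addr < 0 or addr >= len(packet):
                return 0
            acc = packet[addr]
            pc += 1
        elif op == "jeq":
            _, k, jt, jf = ins
            if jt < 0 or jf < 0:
                return 0          # plain jumps stay forward-only
            pc += 1 + (jt if acc == k else jf)
        elif op == "loop":
            _, k, back, adv = ins
            if adv <= 0 or back <= 0:
                return 0          # a validator could reject this statically
            if acc != k:
                base += adv       # forced forward progress
                if base >= len(packet):
                    return 0      # went beyond the end of the packet
                pc -= back
            else:
                pc += 1
        elif op == "ret":
            return ins[1]
        else:
            return 0
    return 0


# Scan forward one byte at a time until a 0xFF byte is found.
SCAN_FOR_FF = [("ldb", 0), ("loop", 0xFF, 1, 1), ("ret", 1)]
```

Because each backward jump advances `base` by at least one byte and `base` is bounded by the packet length, the number of backward jumps is at most the packet length, so every program terminates.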



- 32-bit jump offsets.  Currently, they are 16-bit which is quite limiting
if you have a larger BPF program.


Yes, agreed.



- Opcode extended to 32-bits.  It seems we agree on this, although this
can be debatable.  The classic BPF byte-code has a simple, minimalistic
RISC-like instruction set (with the exception of the BPF_MSH hack).  I would
be inclined to keep it that way instead of polluting the, quite limited,
instruction space with various arbitrary mechanisms, but this is a somewhat
philosophical RISC vs CISC debate.  Nevertheless, if the general feeling
is to go with complex instructions, then we could at least dedicate a wide
range for them.


What are RISC and CISC these days but a layer over microcode?

Consider that when BPF was developed, 32MB of RAM was a lot of memory and
the use of long options with getopt was almost unheard of. Today the size
of the instruction might almost be considered to be a hindrance to
performance, as accessing the individual bytes is no longer performance
friendly and will often result in the CPU fetching a complete word anyway.
Thus the compact size of the BPF instruction is no longer the benefit it
once was.




- Support for 64-bit words, but not quite convinced about 128-bit words.
Do you want to add them just to accommodate IPv6?  Why not leave this
for the byte-code generator/compiler?


Considering Michael's comments about concentrating on what opcodes the
compiler should generate, it may be a better idea to focus on 128bit words
rather than 64bit words because it is rather elegant to have the compiler
produce instructions that allow for the address to be used in a single
instruction rather than needing to do several.

As an example, it is easier to mentally inspect and verify the following:

ldq  [32]
jeqq #0x01    jt 2    jf 3

than to try and follow half a dozen instructions. Now when it comes to
making changes to those values, e.g. to apply a netmask:

ldq  [32]
andq #0xfff0
jeqq #0x200200010a90

How many instructions does BPF need to emit today to do that?
Which is easier to understand?

Not only that but if you are writing manual instructions to do math
(such as addition or subtraction), it becomes harder again to ensure
what's happening is correct. Do you really want to do multiple 32bit
instructions with BPF to represent adding together two 128bit values?

I would rather have instructions with larger operands that are easier
for the parser to generate and let the interpreter (or JIT) worry about
how to execute them.
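
As a rough illustration of the difference, the Python sketch below does the same masked address match twice: once as a single 128-bit load, AND and compare, and once 32 bits at a time, which is the shape a classic BPF program is forced into (up to four load/mask/compare groups for one address). The function names, the offset constant and the example prefix are assumptions for this sketch only, and the packet is taken to start at the IPv6 header.

```python
import ipaddress

DST_OFFSET = 24   # IPv6 destination address, relative to the IPv6 header


def match_wide(packet, network):
    """One 128-bit load, one AND, one compare."""
    if len(packet) < DST_OFFSET + 16:
        return False
    value = int.from_bytes(packet[DST_OFFSET:DST_OFFSET + 16], "big")
    return (value & int(network.netmask)) == int(network.network_address)


def match_words(packet, network):
    """The same test done one 32-bit word at a time."""
    if len(packet) < DST_OFFSET + 16:
        return False
    mask = int(network.netmask).to_bytes(16, "big")
    want = int(network.network_address).to_bytes(16, "big")
    for i in range(0, 16, 4):
        m = int.from_bytes(mask[i:i + 4], "big")
        if m == 0:
            break                 # remaining words are fully masked out
        start = DST_OFFSET + i
        word = int.from_bytes(packet[start:start + 4], "big")
        if (word & m) != int.from_bytes(want[i:i + 4], "big"):
            return False
    return True
```

Both functions give the same answer; the second simply has more places to get an offset or a mask wrong, which is the point being made about hand-verifying multi-instruction sequences.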




- External scratch memory store or a way to initialise it before calling
the BPF program.  Also, potentially arbitrary (dynamic) BPF_MEMWORDS size
rather than hardcoded size.  Basically, the user/caller should be able to
provide arbitrary data through the memory store.


What's the end goal here?

Thinking on the fly here, what if the bpf program had a data section on the
end of it or at the beginning? Although with JIT this may be starting to
wander down the road of inventing an object file format and that might be a
bit too far.





- BPF_COP and BPF_CALL.  When I added BPF_COP to NetBSD, I thought about
the generic BPF_CALL to invoke *arbitrary* functions.  It requires solving
some of the above problems first, but there is an important difference:
BPF_COP allows program to invoke a *predetermined* set of func

Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-11 Thread Paul "LeoNerd" Evans
On Thu, 11 Jun 2015 20:12:00 +1000
Darren Reed  wrote:

> >   2) A few more AD constants added to the Linux "auxdata" area,
> > giving information about the transport layer.
> 
> Can you please expand on this?

See the SKF_NET_OFF and SKF_LL_OFF constants.
I wanted to simply add another, SKF_TRANS_OFF

This would give an offset into a virtual view of the "transport" layer;
i.e. the start of the TCP/UDP/whatever header, regardless where it
starts in the packet.

Now, filtering for a given TCP port only needs to compare the value of
SKF_AD_TRANSPORT (which we'd also have to add), and then look at
certain indexes into SKF_TRANS_OFF; it doesn't have to *find* the TCP
header at all, doesn't care if it's IPv4 or IPv6 or whatever...
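
A small Python sketch of the idea, loosely modelled on how negative load offsets relative to SKF_NET_OFF and SKF_LL_OFF are rebased onto the network and link-layer headers. The values of SKF_AD_OFF, SKF_NET_OFF and SKF_LL_OFF are the existing Linux ones; the value given to SKF_TRANS_OFF and the `resolve` helper are hypothetical and exist only for this illustration.

```python
SKF_AD_OFF = -0x1000        # existing Linux constant
SKF_NET_OFF = -0x100000     # existing Linux constant
SKF_LL_OFF = -0x200000      # existing Linux constant
SKF_TRANS_OFF = -0x300000   # hypothetical value, chosen only for this sketch


def resolve(k, ll_start, net_start, trans_start):
    """Map a BPF load offset `k` to an absolute packet offset.

    Returns None when `k` does not refer to packet bytes, or when the
    transport header position is unknown (trans_start is None).
    """
    if k >= 0:
        return k                            # ordinary absolute load
    if k >= SKF_AD_OFF:
        return None                         # ancillary data, not packet bytes
    if k >= SKF_NET_OFF:
        return net_start + (k - SKF_NET_OFF)
    if k >= SKF_LL_OFF:
        return ll_start + (k - SKF_LL_OFF)
    if k >= SKF_TRANS_OFF:
        if trans_start is None:
            return None
        return trans_start + (k - SKF_TRANS_OFF)
    return None
```

With something like this in place, a destination port test becomes a load at SKF_TRANS_OFF + 2 whether the packet is IPv4 or IPv6, and however many extension headers precede the transport header.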

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
http://www.leonerd.org.uk/  |  https://metacpan.org/author/PEVANS


Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-11 Thread Paul "LeoNerd" Evans
On Thu, 11 Jun 2015 21:05:20 +1000
Darren Reed  wrote:

> I would rather have instructions with larger operands that are easier
> for the parser to generate and let the interpreter (or JIT) worry
> about how to execute them.

+1

BPF is supposed to be a high-level interface to describe some sort
of virtual program that doesn't have to concern itself with the mundane
trivialities of how silicon actually implements it.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
http://www.leonerd.org.uk/  |  https://metacpan.org/author/PEVANS


Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-11 Thread Michael Richardson

"Paul \"LeoNerd\" Evans"  wrote:
>> >   2) A few more AD constants added to the Linux "auxdata" area,
>> > giving information about the transport layer.
>>
>> Can you please expand on this?

> See the SKF_NET_OFF and SKF_LL_OFF constants.
> I wanted to simply add another, SKF_TRANS_OFF

> This would give an offset into a virtual view of the "transport" layer;
> i.e. the start of the TCP/UDP/whatever header, regardless where it
> starts in the packet.

> Now, filtering for a given TCP port only needs to compare the value of
> SKF_AD_TRANSPORT (which we'd also have to add), and then look at
> certain indexes into SKF_TRANS_OFF; it doesn't have to *find* the TCP
> header at all, doesn't care if it's IPv4 or IPv6 or whatever...

Is Linux even going to set that if it's for a VLAN or an IP address that
is not recognized as local?

--
]   Never tell me the odds! | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works| network architect  [
] m...@sandelman.ca  http://www.sandelman.ca/|   ruby on rails[


