Re: [tcpdump-workers] [tcpdump] New feature to limit capture file size (#464)

2015-06-10 Thread Darren Reed

On 10/06/2015 5:42 AM, Michael Richardson wrote:

re: https://github.com/the-tcpdump-group/tcpdump/pull/464 Guy writes:
We have the -C option, giving a file size in megabytes (real 
megabytes, i.e. 1,000,000 bytes, not 1,048,576 bytes); once the file 
gets that big, tcpdump switches to a new file. 
This adds another file size option, with a different syntax for the 
size option, and with tcpdump stopping rather than rotating files 
when it reaches that size. 
We also have the -G option, to rotate files based on time rather than 
size. 
We might want to consider cleaning up these options a bit, so that we 
can specify "stop" vs. "rotate" and "file size" rather than "capture 
time" independently. 
thoughts? I'm happy to accept the patch once sane, and then clean it 
up as Guy suggests.



Maybe it is time to work out how this should interact now...

Why is there even a need to have -G and -C?
Aren't both really just the same feature? (file rotation)
But with a different parameter?
Why can't it be "-C 1h,500M"? (rotate after 500M or 1h)
... and so on.
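
To make the combined syntax concrete, here is a minimal sketch in Python of how a spec such as "1h,500M" might be parsed. The function name, the unit letters and the rule that each kind of limit may appear only once are assumptions made for illustration; nothing like this exists in tcpdump. Size units are decimal, following the -C description quoted above.

```python
import re

# Hypothetical unit tables. Lower-case letters are time units, upper-case
# letters are decimal size units (1M = 1,000,000 bytes, as for -C).
TIME_UNITS = {"s": 1, "m": 60, "h": 3600}
SIZE_UNITS = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_rotate_spec(spec):
    """Parse a spec such as "1h,500M" into {"seconds": ..., "bytes": ...}."""
    limits = {}
    for item in spec.split(","):
        match = re.fullmatch(r"(\d+)([smhKMG])", item.strip())
        if match is None:
            raise ValueError(f"bad limit: {item!r}")
        value, unit = int(match.group(1)), match.group(2)
        if unit in TIME_UNITS:
            key, amount = "seconds", value * TIME_UNITS[unit]
        else:
            key, amount = "bytes", value * SIZE_UNITS[unit]
        if key in limits:
            raise ValueError(f"duplicate {key} limit in {spec!r}")
        limits[key] = amount
    return limits
```

Distinguishing "m" (minutes) from "M" (megabytes) by case alone is easy to mistype, which is one of the details that would need more thought.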

I think the "We might want to..." paragraph is right but that more thought
is needed.

Darren

___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


[tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-10 Thread Darren Reed

Extending BPF
=============

Introduction

BPF was originally designed to provide very fast packet matching
capabilities for IPv4 but as a result of its generic nature, is
capable of being used for just about any protocol. With IPv6 the
limitations of BPF became apparent.

BPF & IPv6
----------
The problem with IPv6 and BPF is that a number of extension headers
can sit between the IPv6 network header and the transport header
(TCP, UDP, etc). There are no hints in the IPv6 header as to how
many of these extension headers there are, or how many bytes the
extension header(s) take up. This leaves BPF in a precarious
situation because it cannot be reliably used to match on layer 4
headers. What's missing is the ability to either find a specific
header after the IPv6 network header or just to determine what the
last one is.
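
For illustration, the walk that BPF cannot express today looks roughly like this in Python. This is a sketch under assumptions: the function name is made up, only the listed extension header types are followed, and a fragment with a non-zero offset (which carries no transport header) is not treated specially.

```python
# IANA protocol numbers for the extension headers this sketch follows.
HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS = 0, 43, 44, 51, 60
IPV6_FIXED_LEN = 40


def find_transport(packet):
    """Return (protocol, offset) of the first header that is not one of the
    extension headers above, or None if the packet is truncated.
    `packet` is a bytes object starting at the IPv6 header."""
    if len(packet) < IPV6_FIXED_LEN:
        return None
    next_header = packet[6]              # Next Header field of the fixed header
    offset = IPV6_FIXED_LEN
    while next_header in (HOP_BY_HOP, ROUTING, FRAGMENT, AUTH, DEST_OPTS):
        if offset + 8 > len(packet):
            return None
        following, length_field = packet[offset], packet[offset + 1]
        if next_header == FRAGMENT:
            size = 8                          # fragment header is fixed size
        elif next_header == AUTH:
            size = (length_field + 2) * 4     # AH length is in 32-bit words, minus 2
        else:
            size = (length_field + 1) * 8     # 8-octet units, first unit not counted
        next_header, offset = following, offset + size
    if offset > len(packet):
        return None
    return next_header, offset
```

The loop has no fixed bound on the number of iterations, which is exactly what a forward-jump-only instruction set cannot encode without unrolling.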

There is also the problem that with IPv6 a single BPF instruction
is no longer enough to perform mathematics on the address: IPv6
addresses are 128 bits long and BPF is limited to 32bit operands.
Only the bit operations such as OR, AND and XOR can be easily used,
because they can be applied one 32bit word at a time.
The traditional BPF instruction has 2 bits to represent the size of
operands, allowing for "word" (32bit), "half-word" (16bit) and "byte"
(8 bit). This limitation also means that BPF is not capable of taking
full advantage of 64bit CPUs that are common today.
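
As a sketch of what "one word at a time" means in practice, the following Python models a prefix match on a 128bit address using only 32bit AND and equality tests, roughly the shape a classic BPF program takes. The helper names are hypothetical; the step count shows how many load/mask/compare groups are needed.

```python
import ipaddress


def words32(value):
    """Split a 128-bit integer into four 32-bit words, most significant first."""
    return [(value >> shift) & 0xFFFFFFFF for shift in (96, 64, 32, 0)]


def match_prefix(address, network):
    """Return (matched, steps), comparing word by word with 32-bit masks."""
    addr = words32(int(ipaddress.IPv6Address(address)))
    net = ipaddress.IPv6Network(network)
    want = words32(int(net.network_address))
    mask = words32(int(net.netmask))
    steps = 0
    for a, w, m in zip(addr, want, mask):
        if m == 0:
            break                  # remaining words are "don't care"
        steps += 1
        if (a & m) != w:
            return False, steps
    return True, steps
```

A /32 prefix needs one step, while a full /128 host match needs four, where a 128bit operand would need one.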

Other gaps
----------
One of the more common uses of BPF is to select a packet based on
a number of conditions, such as ports 80, 443, 8000 and 8080. To
do this requires 4 different comparisons when all that is really
required is to be able to do a search amongst a set of values.
This is also a problem when selection is done based on IP address(es).
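
The difference can be sketched in Python as follows. The function names are hypothetical, and the binary search is only one assumed way a "search amongst a set of values" instruction could work internally.

```python
from bisect import bisect_left

PORTS = [80, 443, 8000, 8080]


def match_chain(port):
    """Model of the current approach: one compare-and-jump per candidate.
    Returns (matched, number_of_comparisons)."""
    comparisons = 0
    for candidate in PORTS:
        comparisons += 1
        if port == candidate:
            return True, comparisons
    return False, comparisons


def match_set(port, table=tuple(sorted(PORTS))):
    """Model of a single set-membership instruction backed by a sorted table."""
    index = bisect_left(table, port)
    return index < len(table) and table[index] == port
```

With four values the chain is tolerable; with a few hundred addresses the chain grows linearly in both program size and run time, while the table lookup stays a single instruction.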

The maximum size of a BPF program is effectively limited by the range
of a jump that is stored in an 8bit value. Whilst the current limit
on instructions is 512, correctly coding a program requires that
all blocks of code requiring a jump around are no larger than 255
instructions.
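
To show where the 8bit limit comes from, here is a small Python sketch that packs one instruction in the classic layout (16bit code, 8bit jt, 8bit jf, 32bit k). The opcode values are the ones from net/bpf.h; the function name and the choice of native byte order are assumptions of the sketch.

```python
import struct

# Classic BPF opcode components as defined in <net/bpf.h>.
BPF_JMP, BPF_JEQ, BPF_K = 0x05, 0x10, 0x00


def encode_insn(code, jt, jf, k):
    """Pack one classic BPF instruction into 8 bytes.
    jt and jf are relative forward jump offsets and must fit in 8 bits."""
    for name, offset in (("jt", jt), ("jf", jf)):
        if not 0 <= offset <= 255:
            raise ValueError(f"{name}={offset} does not fit in 8 bits")
    return struct.pack("=HBBI", code, jt, jf, k)
```

A conditional jump that needs to skip 256 or more instructions simply cannot be encoded, so the code generator has to chain jumps or restructure the program.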

Looking at other work
---------------------
eBPF on Linux has evolved a long way from BPF in its original form
of "raw" instructions and has overloaded some bit definitions.
An example is BPF_ALU64 (eBPF), which is the same instruction class
as BPF_MISC. eBPF also introduces the concept of maps, which are
containers of objects to match against with a lookup. Whilst eBPF
adds some double word functionality, it doesn't provide a single
instruction that works on a 128bit address. Maps can be used to
reduce looking for a match in a set of numbers to a single
instruction; however, that isn't implemented as a native operation
but as a function call (BPF_CALL) with parameters to define which
map to look up and for what.

In NetBSD, BPF_COP has been introduced that is somewhat similar
in implementation to eBPF's BPF_CALL whereby a more complex
function that is outside of BPF is available to be called with
some args passed/returned through the A/X registers. This has
the potential to solve the "lookup a value in a set" problem
as well as deal with IPv6 extension headers but does not address
the problem of instruction space crowding or provide for 64bit
native instructions.

Proposed Solution
-----------------
There is no capacity within the BPF instruction set as it exists
today to further expand enough to solve the above limitations with
IPv6 in a meaningful and elegant fashion. Thus any solution is forced
to redesign the instruction format so that operands with 64 and 128
bits are possible, along with new instructions that are capable of
filling the gap with IPv6 plus create room for further growth. In
recognition of the constraints now being put on BPF programs, the
maximum size is increased to 64k. The extra size is supported through
the use of a larger instruction format and larger operands, allowing
for jumps to the end.

Redefining the instruction set also leaves more space for future
growth, such as in the register set. At present I've only defined
3 registers to match those that exist now, but there is room to
grow that, as Linux has.

Note that I haven't designed this with JIT compilers in mind, rather
I have tried to think about what are the common operations that are
required and how do they fit into what an engine that matches packets
would be expected to do natively.

Is supporting something like eBPF's MAP instructions better than
doing a lookup to see if something is in a set of values? Or are
the two not exclusive? Does similar functionality turning up in
eBPF (BPF_CALL) and NetBSD (BPF_COP) suggest that this is a missing
feature from BPF itself? On the one hand extending the instruction
set to do advanced steps such as "find the last header" on an IPv6
packet is a step away from the more simple BPF instructions but on
the other, all behaviour is predefined.

In terms of progress in implementing this, I'm working on the code
to generate the BPF ins

Re: [tcpdump-workers] [tcpdump] New feature to limit capture file size (#464)

2015-06-10 Thread Wesley Shields

> On Jun 10, 2015, at 7:35 AM, Darren Reed wrote:
> 
> On 10/06/2015 5:42 AM, Michael Richardson wrote:
>> re: https://github.com/the-tcpdump-group/tcpdump/pull/464 Guy writes:
>>> We have the -C option, giving a file size in megabytes (real megabytes, 
>>> i.e. 1,000,000 bytes, not 1,048,576 bytes); once the file gets that big, 
>>> tcpdump switches to a new file. This adds another file size option, with a 
>>> different syntax for the size option, and with tcpdump stopping rather than 
>>> rotating files when it reaches that size. We also have the -G option, to 
>>> rotate files based on time rather than size. We might want to consider 
>>> cleaning up these options a bit, so that we can specify "stop" vs. "rotate" 
>>> and "file size" rather than "capture time" independently. 
>> thoughts? I'm happy to accept the patch once sane, and then clean it up as 
>> Guy suggests.
> 
> 
> Maybe it is time to work out how this should interact now...
> 
> Why is there even a need to have -G and -C?
> Aren't both really just the same feature? (file rotation)
> But with a different parameter?
> Why can't it be "-C 1h,500M"? (rotate after 500M or 1h)
> ... and so on.
> 
> I think the "We might want to..." paragraph is right but that more thought
> is needed.

Whatever the decision is, can we be sure that the various options (-G, -C and
-W) are kept for backwards compatibility for a long time? I ask because I know
of some places which are using -G to write packets in time sliced files and
changing the semantics of that flag would cause mass chaos.

What has never been clear to me is how -G, -C and -W work together, so any way
to simplify things there and make it clearer is welcome.

-- WXS


Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-10 Thread Paul "LeoNerd" Evans
On Wed, 10 Jun 2015 23:17:20 +1000
Darren Reed wrote:

> BPF & IPv6
> ----------
> The problem with IPv6 and BPF is that a number of extension headers
> can sit between the IPv6 network header and the transport header
> (TCP, UDP, etc). There are no hints in the IPv6 header as to how
> many of these extension headers there are, or how many bytes the
> extension header(s) take up. This leaves BPF in a precarious
> situation because it cannot be reliably used to match on layer 4
> headers. What's missing is the ability to either find a specific
> header after the IPv6 network header or just to determine what the
> last one is.
...

If you're considering extending BPF to better suit IPv6, have you seen
either of my proposed ideas?

 1) Add a LOOP instruction that allows certain kinds of
backward-directed jumps, in order to efficiently implement the IPv6
header-chain walking without needing manual loop unrolling, while
still giving static guarantees about eventual termination of the
program.

 2) A few more AD constants added to the Linux "auxdata" area, giving
information about the transport layer.

Both of these ideas are ones I've tried to point either Linux or
FreeBSD in the direction of, and received almost total silence on. If
you did want to make some direct impact on making IPv6 easier to
handle, I'd suggest either or both of these would make a great start.

-- 
Paul "LeoNerd" Evans

leon...@leonerd.org.uk
http://www.leonerd.org.uk/  |  https://metacpan.org/author/PEVANS


Re: [tcpdump-workers] BPF_COP support for libpcap

2015-06-10 Thread Mindaugas Rasiukevicius
Darren Reed wrote:
> > What is "vendor private"?  It does not really matter how you label it.
> 
> Yes, it does.
> 
> By defining an instruction to be "something" there is an expectation that
> it will be used for that "something."

Your "something" is rather vague.  BPF_COP is used by NetBSD, standalone
NPF on Linux, and it is supported by standalone bpfjit.  It provides
great flexibility for other use cases.

> > It is worth noting that we might want to support multiple coprocessors.
> > Very much like in MIPS - for memory management, FPU and various hardware
> > accelerators - we might have a standardised coprocessor along with the
> > custom ones.
> 
> So?
> 
> Doesn't it give a vendor more flexibility to have an instruction that
> is reserved for them to use however they wish rather than to have it
> defined as "something"?
> 
> <...>

It is exactly what COP provides for the user.  The BPF coprocessor API is
quite clearly described and implemented in the NetBSD kernel.  If the libpcap
community would like to have full COP support in the libpcap interpreter,
then I am happy to backport it.  However, just the "cop" keyword itself is
a useful feature: users can compile programs using the pcap-filter syntax
*and* custom mechanisms implemented by their coprocessor.

> The limitations of BPF are not confined to what has been developed for
> NetBSD with the BPF_COP addon. As an example, it is not possible to do
> a native 64bit operation with BPF as it is today - that's not a political
> issue, that's technical. The issue of there being no advanced primitives
> is another - the solution used by NetBSD with BPF_COP is just one way of
> dealing with that problem, and it is a workaround for not being able to
> do the right thing within the scope of BPF instructions: since the
> instruction set is limited, people try to avoid using new instructions.

Of course not, because it solves a different problem (specifically, a way
to extend BPF and support complex/external operations).  It is not solving
the inherent limitations of the classic BPF instruction set (32-bit
operations, limited instruction space, short jumps, etc).  Linux eBPF, as an
example, is something that attempts to solve some of these problems.

> The only time it gets "political" (if you will) is when someone says their
> modification should be "the solution", when it doesn't really solve the
> problem of shortcomings in the instruction set and its definition.

You have not provided a single reasonable argument so far.  The reason why
I am not keen to see various specialised instructions is that they pollute
the instruction space which, in classic BPF, is already quite limited.
Moreover, it is a pollution with CISC-like instructions when we currently
have a quite neat RISC-like set.  You know, there is a reason why BPF_MSH
is called "a hack" in the bpf(4) manual page. :)

Having said that, I think it is worth considering new instructions for IPv6
header walking or other functionality (or limitations of BPF byte-code in
general), but this discussion is quite orthogonal to the BPF_COP issue.

-- 
Mindaugas


Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-10 Thread Mindaugas Rasiukevicius
Darren Reed wrote:
> Extending BPF
> =============
> 
> Introduction
> 
> BPF was originally designed to provide very fast packet matching
> capabilities for IPv4 but as a result of its generic nature, is
> capable of being used for just about any protocol. With IPv6 the
> limitations of BPF became apparent.
>
> ...

Conceptually, I like the idea of an extended BPF instruction set.  There
are several important questions here.  First, what is the exact problem we
want to solve with a new instruction set?  Is it just the IPv6 handling?

I do find the BPF byte-code useful as a general purpose instruction set
i.e. for the use as a universal virtual machine.  This was also one of the
driving forces behind the Linux eBPF.  They use it beyond packet filtering.
Note that LLVM has recently gained support for the eBPF backend.  If that
is the objective, then there is a wider spectrum of the requirements.

Specifically, I would like to see:

- Capability to jump backwards.  Basically, the general purpose instruction
set ought to be Turing-complete.  Obviously, with a way to enable/disable
this depending on whether the user needs bpf_validate().

- 32-bit jump offsets.  Currently, they are 16-bit which is quite limiting
if you have a larger BPF program.

- Opcode extended to 32 bits.  It seems we agree on this, although this
can be debatable.  The classic BPF byte-code has a simple, minimalistic
RISC-like instruction set (with the exception of the BPF_MSH hack).  I
would be inclined to keep it that way instead of polluting the quite
limited instruction space with various arbitrary mechanisms, but this is
a somewhat philosophical RISC vs CISC debate.  Nevertheless, if the
general feeling is to go with complex instructions, then we could at
least dedicate a wide range for them.

- Support for 64-bit words, but I am not quite convinced about 128-bit
words.  Do you want to add them just to accommodate IPv6?  Why not leave
this for the byte-code generator/compiler?

- External scratch memory store or a way to initialise it before calling
the BPF program.  Also, potentially an arbitrary (dynamic) BPF_MEMWORDS
size rather than a hardcoded size.  Basically, the user/caller should be
able to provide arbitrary data through the memory store.

- BPF_COP and BPF_CALL.  When I added BPF_COP to NetBSD, I thought about
a generic BPF_CALL to invoke *arbitrary* functions.  It requires solving
some of the above problems first, but there is an important difference:
BPF_COP allows a program to invoke a *predetermined* set of functions,
while the capability to invoke arbitrary functions through BPF_CALL can
have security implications (yet it is more powerful).  I would like to see
both, but with the ability to disable, at least, BPF_CALL.
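
The "predetermined set of functions" idea in the last item can be modelled in a few lines of Python. This is only a sketch of the concept, not the NetBSD API: the two coprocessor functions, the table and exec_cop are hypothetical names, and the real interface passes more context than a packet and the A register.

```python
def payload_length(packet, a):
    """Hypothetical coprocessor function: ignore A, return the packet length."""
    return len(packet)


def byte_at(packet, a):
    """Hypothetical coprocessor function: return the byte at offset A, or 0."""
    return packet[a] if 0 <= a < len(packet) else 0


# The caller fixes this table before the filter runs. The filter program can
# only name an entry by index, never an arbitrary function.
COP_TABLE = (payload_length, byte_at)


def exec_cop(table, index, packet, a):
    """Run coprocessor function `index` and return the new 32-bit value for A,
    or None to signal that the instruction must be rejected."""
    if not 0 <= index < len(table):
        return None
    return table[index](packet, a) & 0xFFFFFFFF
```

Because the index can be range-checked against a table the caller controls, a validator can accept or reject such a program up front, which is harder to do for a call to an arbitrary function.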

Another point to consider, besides the JIT compilation, is API level
backwards compatibility with the classic BPF instruction set.  Having an
at least partially compatible instruction set would ease the migration for
the existing BPF byte-code generators/compilers.

Last, but not least, how does this all fit in the libpcap/tcpdump project?
Are the project goals exclusively limited to capturing network traffic,
or is there a desire to abstract parts of libpcap into some more generic
libbpf?  Also, given that Linux eBPF is gaining momentum, how realistic
is it to push a competing instruction set?

-- 
Mindaugas


Re: [tcpdump-workers] BPF Extended: addressing BPF's shortcomings

2015-06-10 Thread Guy Harris

On Jun 10, 2015, at 4:31 PM, Mindaugas Rasiukevicius wrote:

> Darren Reed wrote:
>> Extending BPF
>> =============
>> 
>> Introduction
>> 
>> BPF was originally designed to provide very fast packet matching
>> capabilities for IPv4 but as a result of its generic nature, is
>> capable of being used for just about any protocol. With IPv6 the
>> limitations of BPF became apparent.
>> 
>> ...
> 
> Conceptually, I like the idea of an extended BPF instruction set.  There
> are several important questions here.  First, what is the exact problem we
> want to solve with a new instruction set?  Is it just the IPv6 handling?

No, we'd like to, at minimum, be able to cope with VLANs better than we do now 
- ideally, it'd be nice to be able to, for example, say "ip" in a filter and 
have it match IP over Ethernet and IP in a VLAN over Ethernet and IP in a VLAN 
in another VLAN over Ethernet and so on.
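
In Python, the behaviour described here amounts to something like the sketch below. The function names are hypothetical; 0x8100 and 0x88a8 are the 802.1Q and 802.1ad tag protocol IDs, and the frame is assumed to start at the destination MAC address.

```python
ETHERTYPE_IP = 0x0800
VLAN_TPIDS = (0x8100, 0x88A8)     # 802.1Q and 802.1ad tag protocol IDs


def ethertype_after_vlans(frame):
    """Return (ethertype, payload_offset) after skipping any number of
    stacked VLAN tags, or None if the frame is too short."""
    offset = 12                       # skip destination and source MAC
    while True:
        if offset + 2 > len(frame):
            return None
        ethertype = int.from_bytes(frame[offset:offset + 2], "big")
        if ethertype not in VLAN_TPIDS:
            return ethertype, offset + 2
        offset += 4                   # 2-byte TPID plus 2-byte tag control info


def is_ip(frame):
    """True for IP over Ethernet with zero or more VLAN tags in between."""
    result = ethertype_after_vlans(frame)
    return result is not None and result[0] == ETHERTYPE_IP
```

As with the IPv6 header chain, the number of iterations depends on the packet, so a forward-only filter program has to pick a maximum nesting depth in advance.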

> Specifically, I would like to see:
> 
> - Capability to jump backwards.  Basically, the general purpose instruction
> set ought to be Turing-complete.  Obviously, with a way to enable/disable
> this depending on whether the user needs bpf_validate().

...with some way of preventing infinite loops in the kernel, even if it's as
crude as "there's a pointer into the packet and if you do a backwards jump
without moving that pointer forwards and checking to make sure you haven't gone
beyond the end of the packet, the filter program immediately fails".  (Yes,
that means it's no longer Turing-complete, as there's no longer a halting
problem. :-))
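
A toy interpreter makes the rule concrete. The Python below is a sketch under assumptions: the four-instruction machine ("jeq", "advance", "back", "ret") is invented for the example and is not a proposed encoding. A backward jump is accepted only if the packet pointer has moved forward since the previous backward jump, so the number of backward jumps is bounded by the packet length.

```python
def run(program, packet):
    """Run a list of (op, arg, skip) tuples; return the filter result, 0 on failure."""
    pc, cursor, last_back_cursor = 0, 0, -1
    while 0 <= pc < len(program):
        op, arg, skip = program[pc]
        if op == "jeq":        # skip `skip` instructions if byte at cursor == arg
            if skip < 0 or cursor >= len(packet):
                return 0
            pc += 1 + (skip if packet[cursor] == arg else 0)
        elif op == "advance":  # move the packet pointer forward by `arg` bytes
            cursor += arg
            if arg <= 0 or cursor > len(packet):
                return 0
            pc += 1
        elif op == "back":     # jump back `arg` instructions, if progress was made
            if not 0 < arg <= pc or cursor <= last_back_cursor:
                return 0
            last_back_cursor = cursor
            pc -= arg
        elif op == "ret":
            return arg
        else:
            raise ValueError(f"unknown op {op!r}")
    return 0


# Scan forward for a zero byte: accept if one is found before the packet ends.
FIND_ZERO = [("jeq", 0, 2), ("advance", 1, 0), ("back", 2, 0), ("ret", 1, 0)]
```

A loop that jumps back without advancing fails on its second iteration, and a loop that does advance runs out of packet eventually, so every program terminates.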

> - Opcode extended to 32 bits.  It seems we agree on this, although this
> can be debatable.  The classic BPF byte-code has a simple, minimalistic
> RISC-like instruction set (with the exception of the BPF_MSH hack).  I
> would be inclined to keep it that way instead of polluting the quite
> limited instruction space with various arbitrary mechanisms, but this is
> a somewhat philosophical RISC vs CISC debate.  Nevertheless, if the
> general feeling is to go with complex instructions, then we could at
> least dedicate a wide range for them.

If the machine language is interpreted, frequently-executed complicated 
instructions might help performance.  If it's translated to machine code and 
executed, it probably wouldn't make much of a difference as long as the JIT 
compiler does a reasonably good job.

> Last, but not least, how does this all fit in the libpcap/tcpdump project?

It fits into tcpdump the same way it fits into Wireshark or Snort or... - you 
supply a filter expression to the application and it hands it to libpcap's 
compiler.

For libpcap:

> Are the project goals exclusively limited to capturing network traffic,
> or is there a desire to abstract parts of libpcap into some more generic
> libbpf?

I wouldn't be opposed to putting the BPF interpreter into a libbpf; whether the
compiler belongs there or not depends on how generic it is - if it's generic
enough that it's used for purposes other than looking at network packets, the
rather network-oriented libpcap filter language might not be appropriate.

> Also, given that Linux eBPF is gaining momentum, how realistic
> is it to push a competing instruction set?

All other things being equal, I'd go for a strategy that increases the chances 
that the new language will be adopted by the OSes whose kernel code supports 
BPF (Linux, *BSD/OS X, Solaris, AIX).  If we can extend eBPF for our purposes, 
that might make it more likely for Linux to pick it up, as long as we can have 
a BSD-licensed interpreter (plus perhaps JIT compilers) for the same machine 
code.


Re: [tcpdump-workers] [tcpdump] New feature to limit capture file size (#464)

2015-06-10 Thread Guy Harris

On Jun 10, 2015, at 4:35 AM, Darren Reed wrote:

> On 10/06/2015 5:42 AM, Michael Richardson wrote:
>> re: https://github.com/the-tcpdump-group/tcpdump/pull/464 Guy writes:
>>> We have the -C option, giving a file size in megabytes (real megabytes, 
>>> i.e. 1,000,000 bytes, not 1,048,576 bytes); once the file gets that big, 
>>> tcpdump switches to a new file. This adds another file size option, with a 
>>> different syntax for the size option, and with tcpdump stopping rather than 
>>> rotating files when it reaches that size. We also have the -G option, to 
>>> rotate files based on time rather than size. We might want to consider 
>>> cleaning up these options a bit, so that we can specify "stop" vs. "rotate" 
>>> and "file size" rather than "capture time" independently. 
>> thoughts? I'm happy to accept the patch once sane, and then clean it up as 
>> Guy suggests.
> 
> 
> Maybe it is time to work out how this should interact now...
> 
> Why is there even a need to have -G and -C?

"Why was there a need to have -G and -C?"  Not clear.

"Why *is* there a need to have -G and -C?"  As Wesley Shields indicated, the 
answer is "for backwards compatibility".

I.e., something cleaner probably *should* have been done, but that's water 
under the bridge; we could do something arguably cleaner, along the lines of 
Wireshark's -a and -b flags, but, if we do that, we still need to keep -G and 
-C around for the benefit of older scripts.

I think the cleaner flags should:

1) allow checks for file size and capture time (and packet count?);

2) allow either stopping the capture, switching capture files (with no 
upper bound on the number of capture files), or rotating capture files (with an 
upper bound on the number of capture files);

with 1) and 2) being orthogonal.

I don't know whether there's a need to, in 1), allow *multiple* criteria, with,
presumably, the capture stopping/switching/rotating when any of the criteria
are met.
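
One way to picture 1) and 2) as orthogonal is the Python sketch below. All names are hypothetical and nothing here reflects an existing tcpdump option; the sketch also assumes that multiple criteria are allowed and that any one of them being met triggers the action, which is the open question above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    STOP = "stop"        # end the capture
    SWITCH = "switch"    # start a new file, no upper bound on file count
    ROTATE = "rotate"    # start a new file, reusing at most max_files names


@dataclass
class CapturePolicy:
    action: Action
    max_bytes: Optional[int] = None
    max_seconds: Optional[float] = None
    max_packets: Optional[int] = None
    max_files: Optional[int] = None      # only meaningful for ROTATE

    def triggered(self, bytes_written, seconds, packets):
        """True as soon as any configured criterion has been reached."""
        checks = (
            (self.max_bytes, bytes_written),
            (self.max_seconds, seconds),
            (self.max_packets, packets),
        )
        return any(limit is not None and seen >= limit for limit, seen in checks)

    def next_file(self, current):
        """Index of the next capture file, or None if the capture stops."""
        if self.action is Action.STOP:
            return None
        if self.action is Action.SWITCH:
            return current + 1
        if not self.max_files:
            raise ValueError("ROTATE needs max_files")
        return (current + 1) % self.max_files
```

Under this model the existing -C, -G and -W behaviours would be particular combinations of a criterion and an action, which is what keeping them for older scripts would require.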