On Jan 29, 2013, at 12:54 PM, Wenfei Wu <wenfe...@cs.wisc.edu> wrote:

>  When using tcpdump capture trace, we can add filter expressions (  in a
> form of  primitive [and/or primitive] ).
>  I want to know how the packets are parsed and matched to this filter
> expression. Is there some intermediate data structure for the filter
> expression?

Yes.

libpcap/WinPcap compiles filter expressions into machine code for an 
accumulator-based pseudo-machine; interpreters (simulators) for that machine 
exist in libpcap/WinPcap, in several UN*Xes in kernel-mode code (*BSD, OS X, 
AIX, Tru64 UNIX, sufficiently recent Linux kernels), and in the WinPcap kernel 
driver.  Running the filter in kernel mode lets the capture mechanism that 
libpcap/WinPcap uses discard "uninteresting" packets before they are copied 
into a kernel-mode buffer or into the user address space.
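
As a rough illustration of what such an interpreter does, here is a minimal Python sketch.  It is not libpcap's code: the tuple instruction format, the name run_filter, and the four-instruction program (a hand-written check for IPv4 over Ethernet) are assumptions made for this example, and only three opcodes are handled.  Jump targets are written as absolute instruction indices, the way tcpdump prints them.

```python
import struct

def run_filter(program, packet):
    """Interpret a tiny subset of BPF-style instructions.

    Returns the number of bytes to keep, or 0 to reject the packet.
    """
    acc = 0   # the accumulator
    pc = 0    # index of the next instruction
    while True:
        op, k, jt, jf = program[pc]
        pc += 1
        if op == "ldh":
            # Load a 2-byte big-endian halfword at absolute offset k.
            # Assumption in this sketch: a load past the end rejects the packet.
            if k + 2 > len(packet):
                return 0
            acc = struct.unpack_from("!H", packet, k)[0]
        elif op == "jeq":
            # Conditional jump on accumulator == k.
            pc = jt if acc == k else jf
        elif op == "ret":
            return k
        else:
            raise ValueError(f"unsupported opcode {op!r}")

# Hypothetical program: accept the packet only if the Ethernet type is IPv4.
IP_ONLY = [
    ("ldh", 12, 0, 0),       # load the Ethernet type field
    ("jeq", 0x0800, 2, 3),   # IPv4? go to 2, else go to 3
    ("ret", 65535, 0, 0),    # accept
    ("ret", 0, 0, 0),        # reject
]

ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)   # zeroed MACs, type 0x0800
arp_frame = bytes(12) + b"\x08\x06" + bytes(28)    # zeroed MACs, type 0x0806

print(run_filter(IP_ONLY, ipv4_frame))
print(run_filter(IP_ONLY, arp_frame))
```

The filter expression is gone by the time a packet arrives; only the instruction list is consulted, which is why the same program can be run in user mode or in the kernel.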

> Is the filter used as it is parsed on each layer of the headers
> or used once after the packet is parsed completely?

The filter is compiled into a single program in BPF pseudo-machine code; the 
program does all the checks at all layers.  For example, a filter such as "tcp 
port 80" compiles, for Ethernet packets, into a program such as (with comments 
added by me):

(000) ldh      [12]                             # load Ethernet type - 2 byte "h"alfword at an offset of 12
(001) jeq      #0x86dd          jt 2    jf 8    # if equal to 0x86dd for IPv6, go to 2, else go to 8
(002) ldb      [20]                             # load IPv6 "next header" value - 1 "b"yte at an offset of 20
(003) jeq      #0x6             jt 4    jf 19   # if equal to 6 for TCP, go to 4, else go to 19
(004) ldh      [54]                             # load TCP source port value - 2 byte halfword at an offset of 54
(005) jeq      #0x50            jt 18   jf 6    # if equal to 0x50 = 80, go to 18, else go to 6
(006) ldh      [56]                             # load TCP dest port value - 2 byte halfword at an offset of 56
(007) jeq      #0x50            jt 18   jf 19   # if equal to 0x50 = 80, go to 18, else go to 19

                                                # we got here from (001), so the accumulator has the Ethernet type
(008) jeq      #0x800           jt 9    jf 19   # if equal to 0x0800 for IPv4, go to 9, else go to 19
(009) ldb      [23]                             # load IPv4 protocol value - 1 byte at an offset of 23
(010) jeq      #0x6             jt 11   jf 19   # if equal to 6 for TCP, go to 11, else go to 19
(011) ldh      [20]                             # load fragment offset and flags from IPv4 header (2 bytes at 20)
(012) jset     #0x1fff          jt 19   jf 13   # if fragment offset is non-zero, go to 19, else go to 13
(013) ldxb     4*([14]&0xf)                     # get offset of TCP header, based on IPv4 header length
(014) ldh      [x + 14]                         # load TCP source port value
(015) jeq      #0x50            jt 18   jf 16   # if equal to 0x50 = 80, go to 18, else go to 16
(016) ldh      [x + 16]                         # load TCP destination port value
(017) jeq      #0x50            jt 18   jf 19   # if equal to 0x50 = 80, go to 18, else go to 19

(018) ret      #65535                           # success - return 65535, so we get up to 65535 bytes of packet

(019) ret      #0                               # failure - return 0, meaning "ignore this packet"

This is output from the OS X 10.8 tcpdump and libpcap; newer versions of libpcap 
generate IPv6 code that also checks for fragments other than the first fragment, 
just as is done for IPv4.  Only the first fragment carries the TCP header, so 
the TCP ports can't be checked in later fragments.
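
The listing above can be traced by hand, or with a small simulator.  The following Python sketch transcribes the twenty instructions and runs them against hand-built frames.  Everything besides the instruction values is an assumption of this example: the tuple encoding, the names run_filter, ipv4_tcp_frame and ipv6_tcp_frame, the label "ldh_x" (invented here for the indexed load "ldh [x + k]"), and the choice to reject a packet when a load runs past its end.  The frames are mostly zero bytes, with only the fields the filter reads filled in.

```python
import struct

# (opcode, operand, jump-if-true, jump-if-false); jump targets are absolute
# instruction numbers, matching the listing.
TCP_PORT_80 = [
    ("ldh", 12, 0, 0),         # 000 Ethernet type
    ("jeq", 0x86DD, 2, 8),     # 001 IPv6?
    ("ldb", 20, 0, 0),         # 002 IPv6 next header
    ("jeq", 0x06, 4, 19),      # 003 TCP?
    ("ldh", 54, 0, 0),         # 004 TCP source port
    ("jeq", 0x50, 18, 6),      # 005
    ("ldh", 56, 0, 0),         # 006 TCP destination port
    ("jeq", 0x50, 18, 19),     # 007
    ("jeq", 0x0800, 9, 19),    # 008 IPv4?
    ("ldb", 23, 0, 0),         # 009 IPv4 protocol
    ("jeq", 0x06, 11, 19),     # 010 TCP?
    ("ldh", 20, 0, 0),         # 011 IPv4 flags and fragment offset
    ("jset", 0x1FFF, 19, 13),  # 012 non-zero fragment offset? reject
    ("ldxb", 14, 0, 0),        # 013 x = 4 * (byte at 14 & 0xf)
    ("ldh_x", 14, 0, 0),       # 014 TCP source port at x + 14
    ("jeq", 0x50, 18, 16),     # 015
    ("ldh_x", 16, 0, 0),       # 016 TCP destination port at x + 16
    ("jeq", 0x50, 18, 19),     # 017
    ("ret", 65535, 0, 0),      # 018 accept
    ("ret", 0, 0, 0),          # 019 reject
]

def run_filter(program, pkt):
    """Run the program on one packet; return the accept length or 0."""
    acc = x = pc = 0
    while True:
        op, k, jt, jf = program[pc]
        pc += 1
        if op in ("ldh", "ldh_x", "ldb", "ldxb"):
            off = k + x if op == "ldh_x" else k
            size = 2 if op in ("ldh", "ldh_x") else 1
            if off + size > len(pkt):
                return 0                   # assumption: short packet is rejected
            value = int.from_bytes(pkt[off:off + size], "big")
            if op == "ldxb":
                x = 4 * (value & 0x0F)     # IPv4 header length in bytes
            else:
                acc = value
        elif op == "jeq":
            pc = jt if acc == k else jf
        elif op == "jset":
            pc = jt if acc & k else jf     # jump if any masked bit is set
        elif op == "ret":
            return k
        else:
            raise ValueError(f"unsupported opcode {op!r}")

def ipv4_tcp_frame(sport, dport, frag_field=0):
    """Ethernet + 20-byte IPv4 header + 20 bytes starting with two ports."""
    eth = bytes(12) + struct.pack("!H", 0x0800)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, frag_field,
                     64, 6, 0, bytes(4), bytes(4))
    return eth + ip + struct.pack("!HH", sport, dport) + bytes(16)

def ipv6_tcp_frame(sport, dport):
    """Ethernet + 40-byte IPv6 header (next header = TCP) + 20 bytes."""
    eth = bytes(12) + struct.pack("!H", 0x86DD)
    ip6 = struct.pack("!IHBB16s16s", 6 << 28, 20, 6, 64, bytes(16), bytes(16))
    return eth + ip6 + struct.pack("!HH", sport, dport) + bytes(16)

print(run_filter(TCP_PORT_80, ipv4_tcp_frame(12345, 80)))   # destination port 80
print(run_filter(TCP_PORT_80, ipv4_tcp_frame(12345, 443)))  # neither port is 80
print(run_filter(TCP_PORT_80, ipv6_tcp_frame(80, 50000)))   # source port 80
# Later fragment (offset 185) whose payload bytes happen to look like port 80:
print(run_filter(TCP_PORT_80, ipv4_tcp_frame(80, 80, frag_field=185)))
```

The last call shows why instruction (012) is there: in a later fragment the bytes at x + 14 and x + 16 are payload data rather than TCP ports, so without the fragment-offset test the filter could match on arbitrary data.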

> Is there some material about this?

Here's the paper on the Berkeley Packet Filter (BPF) mechanism, as used in *BSD 
and OS X (and, perhaps with some changes, in AIX and, I think, Solaris 11), 
which includes the machine-code interpreter:

        http://www.tcpdump.org/papers/bpf-usenix93.pdf

A lot of that only applies to *BSD and OS X, and some might also apply to AIX 
and/or Solaris 11.  The BPF filter language, however, applies to all of them, 
as well as to Tru64 UNIX, Linux (in kernel versions that have the "socket 
filter" mechanism), and WinPcap.
_______________________________________________
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
