On Fri, Feb 1, 2013 at 3:50 PM, Paul Pearce <pea...@cs.berkeley.edu> wrote:
> I'd like to point out that vlan filtering in general is completely
> broken under Linux 3x (as discussed several times on this list).
>
> In Linux 3x they began stripping the vlan headers off of RX packets
> and setting BPF ancillary flags, but not doing the same on TX packets.
> Since the vlan tags are missing when RX packets reach the kernel filter it
> means that stock libpcap plus any linux 3x kernel can only see TX
> vlan tagged packets.
>
> A recent (3.8 I believe) patch added the ability to use BPF to poke at
> the vlan ancillary fields, and Ani RFC'd a patch to on this list to
> shift vlan filtering to using the ancillary fields rather than offsetting into
> the header. But even with that patch since RX and TX paths are
> different, it's still not fixed.
>
> You could imagine extending Ani's patch to check for the vlan
> ancillary fields and if not set then look at the headers

That was my proposal to Ani, since the kernel guys seemed to insist
that asymmetry was a virtue.

> but that
> would mean the filter:
>
> vlan X or vlan Y
>
> would have different behavior on RX vs TX packets because of the
> pointer into the header advancing when it encounters a vlan tag
> on TX, but not RX.

Well, that filter is broken anyway in the current world, since it
matches 'a packet on vlan X' or 'a double-tagged packet with inner
vlan Y' (or, a packet that happens to have the same bit pattern as a
double-tagged packet with inner vlan Y).

This is the kind of thing that would be fixed by making the vlan
modifications associative - the 'or' in that expression would
effectively reset the offset.

> In my humble (uneducated) opinion the correct fix is to get linux to
> move to setting the vlan ancillary fields on TX packets as they do now
> on RX packets, which would simplify things a lot for libpcap. But that
> idea got a lot of pushback on the net-dev list. I didn't fully understand
> their distinction as to why it was ok on RX vs TX, and they never
> answered when I asked.

We're on the same page on that topic.

  Bill

> -Paul
>
> On Fri, Feb 1, 2013 at 8:51 AM, Gianluca Varenni
> <gianluca.vare...@riverbed.com> wrote:
>> The problem is that if you change the behavior of the vlan keyword, you 
>> potentially break a lot of applications that are based on the old buggy 
>> behavior :-(
>>
>> -----Original Message-----
>> From: fen...@gmail.com [mailto:fen...@gmail.com] On Behalf Of Bill Fenner
>> Sent: Friday, February 01, 2013 4:49 AM
>> To: Gianluca Varenni
>> Cc: Ani Sinha; tcpdump-workers@lists.tcpdump.org; Michael Richardson; 
>> Francesco Ruggeri
>> Subject: Re: [tcpdump-workers] "not vlan" filter expression broken 
>> catastrophically!
>>
>> On Thu, Jan 31, 2013 at 7:20 PM, Gianluca Varenni 
>> <gianluca.vare...@riverbed.com> wrote:
>>> To be totally honest, I think the whole way in which vlans are managed
>>> in the filters is quite nonsense. The underlying problem is that
>>> normally a BPF filter is an "or" or "and" combination of disjoint
>>> filters, so if I write "filterA" or "filterB" I assume that the two
>>> filters are disjoints, so
>>>
>>> "filterA or filterB" should be equivalent to "filterB or filterA"
>>>
>>> This is not true when using the "vlan" keyword. Vlan sticks globally and 
>>> increments the offset of the L3 header unconditionally of two bytes, no 
>>> turning back.
>>>
>>> For example "ip or vlan 14" is different than "vlan 14 or ip"
>>
>> We have wanted to fix the vlan support ever since it was added.  If I 
>> remember right we even talked about not adding it and waiting to do it 
>> right.  It's definitely a hack, the vlan offset info should be associative 
>> and only apply to anything that is "and"ed with the vlan keyword.  Sadly, 
>> the current structure of the parser / code generator do not lend themselves 
>> to that.
>>
>> The global nature of the vlan offset is something that nobody is happy with. 
>>  All it will take to fix it is to rewrite the grammar parser and filter 
>> generation code.
>>
>>   Bill
>>
>>
>>> -----Original Message-----
>>> From: tcpdump-workers-boun...@lists.tcpdump.org
>>> [mailto:tcpdump-workers-boun...@lists.tcpdump.org] On Behalf Of Ani
>>> Sinha
>>> Sent: Thursday, January 31, 2013 3:42 PM
>>> To: tcpdump-workers@lists.tcpdump.org
>>> Cc: Bill Fenner; Michael Richardson; Francesco Ruggeri
>>> Subject: [tcpdump-workers] "not vlan" filter expression broken 
>>> catastrophically!
>>>
>>> hello folks :
>>>
>>> As you guys have been aware, I am hacking libpcap for a while. Me and Bill 
>>> noticed something seriously broken for any filter expression that has a 
>>> "not vlan" in it. For example, take a look at the filter code generated by 
>>> libpcap with an expression like "not vlan and tcp port 80" :
>>>
>>> BpfExpression '(not vlan and tcp port 80)'
>>>       { 0x28,  0,  0, 0x0000000c }, //(000) ldh  [12]
>>>       { 0x15, 19,  0, 0x00008100 }, //(001) jeq  #0x8100     jt 21      jf 2
>>>       { 0x28,  0,  0, 0x00000010 }, //(002) ldh  [16]
>>>       { 0x15,  0,  6, 0x000086dd }, //(003) jeq  #0x86dd     jt 4       jf 
>>> 10
>>>       { 0x30,  0,  0, 0x00000018 }, //(004) ldb  [24]
>>>       { 0x15,  0, 15, 0x00000006 }, //(005) jeq  #0x6        jt 6       jf 
>>> 21
>>>       { 0x28,  0,  0, 0x0000003a }, //(006) ldh  [58]
>>>       { 0x15, 12,  0, 0x00000050 }, //(007) jeq  #0x50       jt 20      jf 8
>>>       { 0x28,  0,  0, 0x0000003c }, //(008) ldh  [60]
>>>       { 0x15, 10, 11, 0x00000050 }, //(009) jeq  #0x50       jt 20      jf 
>>> 21
>>>       { 0x15,  0, 10, 0x00000800 }, //(010) jeq  #0x800      jt 11      jf 
>>> 21
>>>       { 0x30,  0,  0, 0x0000001b }, //(011) ldb  [27]
>>>       { 0x15,  0,  8, 0x00000006 }, //(012) jeq  #0x6        jt 13      jf 
>>> 21
>>>       { 0x28,  0,  0, 0x00000018 }, //(013) ldh  [24]
>>>       { 0x45,  6,  0, 0x00001fff }, //(014) jset #0x1fff     jt 21      jf 
>>> 15
>>>       { 0xb1,  0,  0, 0x00000012 }, //(015) ldxb 4*([18]&0xf)
>>>       { 0x48,  0,  0, 0x00000012 }, //(016) ldh  [x + 18]
>>>       { 0x15,  2,  0, 0x00000050 }, //(017) jeq  #0x50       jt 20      jf 
>>> 18
>>>       { 0x48,  0,  0, 0x00000014 }, //(018) ldh  [x + 20]
>>>       { 0x15,  0,  1, 0x00000050 }, //(019) jeq  #0x50       jt 20      jf 
>>> 21
>>>       {  0x6,  0,  0, 0x0000ffff }, //(020) ret  #65535
>>>       {  0x6,  0,  0, 0x00000000 }, //(021) ret  #0
>>>
>>>
>>> As you can see, it loads offset 12 (ethertype). For vlan packets, it jumps 
>>> to #21 and returns false right away. However, for packets that are not vlan 
>>> tagged, it goes to #2 which loads offset 16 in the packet. Notice that this 
>>> is wrong! The offsets should be incremented by 4 only for vlan tagged 
>>> packets and not for non-vlan packets. The problem is that in gencode.c, the 
>>> off_linktype increments by 4 unconditionally whether or not the packet 
>>> actually contains a vlan tag. We do not want to increment this offset if 
>>> "not vlan" is true. So the above filter code is generated wrong.
>>>
>>> I just wanted to point this out to folks who wishes to dig in and fix it. I 
>>> do not have time right now to think of a proper solution. It would seem 
>>> using unconditional increments of offsets like off_linktype below the 
>>> parser is not going to work. How do you know if the parser is going to take 
>>> your code generated from the "vlan" expression and just negate it? Or may 
>>> be we can hack another rule in grammar.y. I don't know.
>>>
>>> cheers,
>>> ani
>>> _______________________________________________
>>> tcpdump-workers mailing list
>>> tcpdump-workers@lists.tcpdump.org
>>> https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
>>> _______________________________________________
>>> tcpdump-workers mailing list
>>> tcpdump-workers@lists.tcpdump.org
>>> https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
>> _______________________________________________
>> tcpdump-workers mailing list
>> tcpdump-workers@lists.tcpdump.org
>> https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
> _______________________________________________
> tcpdump-workers mailing list
> tcpdump-workers@lists.tcpdump.org
> https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
_______________________________________________
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers

Reply via email to