[tcpdump-workers] vlan tagged packets and libpcap breakage

2012-10-31 Thread Ani Sinha
hello folks :

Before I go into describing the real problem, I'd like to give some
background information for the context. Please bear with me.

As you guys might be aware, the way the Linux kernel handles vlan
tagged packets have changed. Please see the following commit in
mainline Torvalds' tree :

commit bcc6d47903612c3861201cc3a866fb604f26b8b2
Author: Jiri Pirko 
Date:   Thu Apr 7 19:48:33 2011 +

net: vlan: make non-hw-accel rx path similar to hw-accel

Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is
enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into
vlan code in __netif_receive_skb - vlan_hwaccel_do_receive.

For non-rx-vlan-hw-accel however, tagged skb goes thru whole
__netif_receive_skb, it's untagged in ptype_base hander and reinjected

This incosistency is fixed by this patch. Vlan untagging happens early in
__netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers)
see the skb like it was untagged by hw.

Signed-off-by: Jiri Pirko 

v1->v2:
remove "inline" from vlan_core.c functions
Signed-off-by: David S. Miller 


As a result of this patch, with or without HW acceleration support in
the kernel driver, the vlan tag information is pulled out from the
packet and is inserted in the skb metadata. Thereafter, the vlan tag
information for a 802.1q packet can be obtained my applications
running in the user space by using the auxdata and cmsg structure.

More recently, two more patches were committed to net-next that enable
the kernel BPF filter to filter packets using their vlan tags :

http://www.spinics.net/lists/netdev/msg214758.html
http://www.spinics.net/lists/netdev/msg214759.html


Now to the real problem. The filter that is currently generated by
libpcap (gencode.c) uses offsets into the packet to obtain the vlan
tag for a packet and then compare it to the one provided by the user
in the filter expression (gen_vlan()). This will no longer work if the
tag has been pulled off from the packet itself. One has to use the
negative offsets (SKF_AD_VLAN_TAG) as used by the kernel BPF filter to
pass down the attribute that we need to compare against.  Now here's a
bigger problem. How do we avoid breakage in case we run libpcap on top
of an older kernel that does not extract the tags from the packet? How
do we handle cases of parsing and filtering against captured pcap
files that already have the vlan tags reinserted into the packet data?
There might be other corner cases that I am missing and not
understanding here.

Are you guys aware of the problem? Is there any thoughts as to how we
may be able to address the problem or not at all?

Thanks for the patience,
Ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-10-31 Thread Ani Sinha
cc'ing netdev.


On Wed, Oct 31, 2012 at 2:01 PM, Michael Richardson  wrote:
>
> Thanks for this email.
>
>>>>>> "Ani" == Ani Sinha  writes:
> Ani> remove "inline" from vlan_core.c functions
> Ani> Signed-off-by: David S. Miller 
>
> Ani> As a result of this patch, with or without HW acceleration support in
> Ani> the kernel driver, the vlan tag information is pulled out from the
> Ani> packet and is inserted in the skb metadata. Thereafter, the vlan tag
> Ani> information for a 802.1q packet can be obtained my applications
> Ani> running in the user space by using the auxdata and cmsg
> Ani> structure.
>
> Do you think that the existance of this behaviour could be exposed in
> sysctl, /proc/net or /sys equivalent (we still don't have /sys/net...)?
> As a read only file that had a 0/1 in it?

yes, we definitely need a run time check. Whether this could be in the
form of a socket option or a /proc entry I don't know.

>
> Should be one stanza to net/dev/sysctl_net_core, I think.
> If we are fast, maybe no kernel will ship without it.
>
> (An aside, is this auxdata/cmsg info available on UDP sockets too?)
>
> Ani> More recently, two more patches were committed to net-next that 
> enable
> Ani> the kernel BPF filter to filter packets using their vlan tags :
>
> Ani> http://www.spinics.net/lists/netdev/msg214758.html
> Ani> http://www.spinics.net/lists/netdev/msg214759.html
>
> okay...
>
> Ani> Now to the real problem. The filter that is currently generated by
> Ani> libpcap (gencode.c) uses offsets into the packet to obtain the vlan
> Ani> tag for a packet and then compare it to the one provided by the user
> Ani> in the filter expression (gen_vlan()). This will no longer work if 
> the
> Ani> tag has been pulled off from the packet itself. One has to use the
> Ani> negative offsets (SKF_AD_VLAN_TAG) as used by the kernel BPF filter 
> to
> Ani> pass down the attribute that we need to compare against.  Now here's 
> a
> Ani> bigger problem. How do we avoid breakage in case we run libpcap on 
> top
> Ani> of an older kernel that does not extract the tags from the packet? 
> How
> Ani> do we handle cases of parsing and filtering against captured pcap
> Ani> files that already have the vlan tags reinserted into the packet 
> data?
> Ani> There might be other corner cases that I am missing and not
> Ani> understanding here.
>
> AFAIK, We will have to modify pcap-linux to produce different filters.
>
> Ani> Are you guys aware of the problem? Is there any thoughts as to how we
> Ani> may be able to address the problem or not at all?
>
> Nobody gave us a heads up, no.

Glad that I brought this up. Actually Bill Fenner who also works for
us helped me a lot in understanding the problem and potentially how it
can be addressed.


>
> It sounds like it's impossible to capture all traffic now, and do vlan
> filtering later on?

pcap files that already have the tags reinsrted should work with
current filter code. However for live traffic, one has to get the tags
from CMSG() and then reinsert it back to the packet for the current
filter to work. This is slow.

ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-10-31 Thread Ani Sinha
On Wed, Oct 31, 2012 at 3:20 PM, Guy Harris  wrote:
>
> On Oct 31, 2012, at 2:50 PM, Ani Sinha  wrote:
>
>> pcap files that already have the tags reinsrted should work with
>> current filter code. However for live traffic, one has to get the tags
>> from CMSG() and then reinsert it back to the packet for the current
>> filter to work.
>
> *Somebody* has to do that, at least to packets that pass the filter,


yes but if the packet is passed to the filter within libpcap (when we
are not using the kernel filter) before the reinsertion, then the
filter has to be taught to look into the metadata for tag information
and not in the packet itself.

before they're handed to a libpcap-based application, for programs
that expect to see packets as they arrived from/were transmitted to
the wire to work.
>
> I.e., the tags *should* be reinserted by libpcap, and, as I understand it, 
> that's what the
>
> #if defined(HAVE_PACKET_AUXDATA) && 
> defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI)
> ...
> #endif
>
> blocks of code in pcap-linux.c in libpcap are doing.

yes, I agree.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-10-31 Thread Ani Sinha
On Wed, Oct 31, 2012 at 5:50 PM, Guy Harris  wrote:
>
> On Oct 31, 2012, at 3:35 PM, Ani Sinha  wrote:
>
>> yes but if the packet is passed to the filter within libpcap (when we
>> are not using the kernel filter) before the reinsertion,
>
> ...that would be a bug.
>
> Currently, that bug doesn't exist in the recvfrom() code path, but *does* 
> appear to exist in the tpacket code path - and that code path also runs the 
> filter before the SLL header is constructed.  That should be fixed.


yes, agreed.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-11-13 Thread Ani Sinha
Folks,

I'm wondering if there has been any progress on this. Are there any
thoughts on what Bill wrote in his email?

Thanks
ani



On Fri, Nov 2, 2012 at 9:13 AM, Bill Fenner  wrote:

> On Wed, Oct 31, 2012 at 6:20 PM, Guy Harris  wrote:
> >
> > On Oct 31, 2012, at 2:50 PM, Ani Sinha  wrote:
> >
> >> pcap files that already have the tags reinsrted should work with
> >> current filter code. However for live traffic, one has to get the tags
> >> from CMSG() and then reinsert it back to the packet for the current
> >> filter to work.
> >
> > *Somebody* has to do that, at least to packets that pass the filter,
> before they're handed to a libpcap-based application, for programs that
> expect to see packets as they arrived from/were transmitted to the wire to
> work.
> >
> > I.e., the tags *should* be reinserted by libpcap, and, as I understand
> it, that's what the
> >
> > #if defined(HAVE_PACKET_AUXDATA) &&
> defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI)
> > ...
> > #endif
> >
> > blocks of code in pcap-linux.c in libpcap are doing.
> >
> > Now, if filtering is being done in the *kernel*, and the tags aren't
> being reinserted by the kernel, then filter code stuffed into the kernel
> would need to differ from filter code run in userland.  There's already
> precedent for that on Linux, with the "cooked mode" headers; those are
> synthesized by libpcap from the metadata returned for PF_PACKET sockets,
> and the code that attempts to hand the kernel a filter goes through the
> filter code, which was generated under the assumption that the packet
> begins with a "cooked mode" header, and modifies (a copy of) the code to,
> instead, use the special Linux-BPF-interpreter offsets to access the
> metadata.
> >
> > The right thing to do here would be to, if possible, do the same, so
> that the kernel doesn't have to reinsert VLAN tags for packets that aren't
> going to be handed to userland.
>
> In this case, it would be incredibly complicated to do this just
> postprocessing a set of bpf instructions.  The problem is that when
> running the filter in the kernel, the IP header, etc. are not offset,
> so "off_macpl" and "off_linktype" would be zero, not 4, while
> generating the rest of the expression.  We would also have to insert
> code when comparing the ethertype to 0x8100 to instead load the
> vlan-tagged metadata, so all jumps crossing that point would have to
> be adjusted, and if the "if-false" instruction was also testing the
> ethertype, then the ethertype would have to be reloaded (again
> inserting another instruction).
>
> Basically, take a look at the output of "tcpdump -d tcp port 22 or
> (vlan and tcp port 22)".  Are the IPv4 tcp ports at x+14/x+16, or at
> x+18/x+20?  If we're filtering in the kernel, they're at x+14/x+16
> whether the packet is vlan tagged or not.  If we're filtering on the
> actual packet contents (from a savefile, for example), they're at
> x+18/x+20 if the packet is vlan tagged.
>
> Also, an expression such as 'tcp port 22' would have to have some
> instructions added at the beginning, for "vlan-tagged == false", or it
> would match both tagged and untagged packets.
>
> This would be much more straightforward to deal with in the code
> generation phase, except until now the code generation phase hasn't
> known whether the filter is headed for the kernel or not.
>
>   Bill
>
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-11-13 Thread Ani Sinha
Folks,

I'm wondering if there has been any progress on this. Are there any
thoughts on what Bill wrote in his email?

Thanks

On Tue, Nov 13, 2012 at 2:41 PM, Ani Sinha  wrote:
> Folks,
>
> I'm wondering if there has been any progress on this. Are there any thoughts
> on what Bill wrote in his email?
>
> Thanks
> ani
>
>
>
> On Fri, Nov 2, 2012 at 9:13 AM, Bill Fenner  wrote:
>>
>> On Wed, Oct 31, 2012 at 6:20 PM, Guy Harris  wrote:
>> >
>> > On Oct 31, 2012, at 2:50 PM, Ani Sinha  wrote:
>> >
>> >> pcap files that already have the tags reinsrted should work with
>> >> current filter code. However for live traffic, one has to get the tags
>> >> from CMSG() and then reinsert it back to the packet for the current
>> >> filter to work.
>> >
>> > *Somebody* has to do that, at least to packets that pass the filter,
>> > before they're handed to a libpcap-based application, for programs that
>> > expect to see packets as they arrived from/were transmitted to the wire to
>> > work.
>> >
>> > I.e., the tags *should* be reinserted by libpcap, and, as I understand
>> > it, that's what the
>> >
>> > #if defined(HAVE_PACKET_AUXDATA) &&
>> > defined(HAVE_LINUX_TPACKET_AUXDATA_TP_VLAN_TCI)
>> > ...
>> > #endif
>> >
>> > blocks of code in pcap-linux.c in libpcap are doing.
>> >
>> > Now, if filtering is being done in the *kernel*, and the tags aren't
>> > being reinserted by the kernel, then filter code stuffed into the kernel
>> > would need to differ from filter code run in userland.  There's already
>> > precedent for that on Linux, with the "cooked mode" headers; those are
>> > synthesized by libpcap from the metadata returned for PF_PACKET sockets, 
>> > and
>> > the code that attempts to hand the kernel a filter goes through the filter
>> > code, which was generated under the assumption that the packet begins with 
>> > a
>> > "cooked mode" header, and modifies (a copy of) the code to, instead, use 
>> > the
>> > special Linux-BPF-interpreter offsets to access the metadata.
>> >
>> > The right thing to do here would be to, if possible, do the same, so
>> > that the kernel doesn't have to reinsert VLAN tags for packets that aren't
>> > going to be handed to userland.
>>
>> In this case, it would be incredibly complicated to do this just
>> postprocessing a set of bpf instructions.  The problem is that when
>> running the filter in the kernel, the IP header, etc. are not offset,
>> so "off_macpl" and "off_linktype" would be zero, not 4, while
>> generating the rest of the expression.  We would also have to insert
>> code when comparing the ethertype to 0x8100 to instead load the
>> vlan-tagged metadata, so all jumps crossing that point would have to
>> be adjusted, and if the "if-false" instruction was also testing the
>> ethertype, then the ethertype would have to be reloaded (again
>> inserting another instruction).
>>
>> Basically, take a look at the output of "tcpdump -d tcp port 22 or
>> (vlan and tcp port 22)".  Are the IPv4 tcp ports at x+14/x+16, or at
>> x+18/x+20?  If we're filtering in the kernel, they're at x+14/x+16
>> whether the packet is vlan tagged or not.  If we're filtering on the
>> actual packet contents (from a savefile, for example), they're at
>> x+18/x+20 if the packet is vlan tagged.
>>
>> Also, an expression such as 'tcp port 22' would have to have some
>> instructions added at the beginning, for "vlan-tagged == false", or it
>> would match both tagged and untagged packets.
>>
>> This would be much more straightforward to deal with in the code
>> generation phase, except until now the code generation phase hasn't
>> known whether the filter is headed for the kernel or not.
>>
>>   Bill
>
>
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-06 Thread Ani Sinha
On Wed, Oct 31, 2012 at 5:50 PM, Guy Harris  wrote:
>
> On Oct 31, 2012, at 3:35 PM, Ani Sinha  wrote:
>
>> yes but if the packet is passed to the filter within libpcap (when we
>> are not using the kernel filter) before the reinsertion,
>
> ...that would be a bug.
>
> Currently, that bug doesn't exist in the recvfrom() code path, but *does* 
> appear to exist in the tpacket code path - and that code path also runs the 
> filter before the SLL header is constructed.  That should be fixed.

Something like this?

Index: libpcap-1.1.1/pcap-linux.c
===
--- libpcap-1.1.1.orig/pcap-linux.c
+++ libpcap-1.1.1/pcap-linux.c
@@ -132,6 +132,7 @@ static const char rcsid[] _U_ =
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -3469,23 +3476,6 @@ pcap_read_linux_mmap(pcap_t *handle, int
  return -1;
  }

- /* run filter on received packet
- * If the kernel filtering is enabled we need to run the
- * filter until all the frames present into the ring
- * at filter creation time are processed.
- * In such case md.use_bpf is used as a counter for the
- * packet we need to filter.
- * Note: alternatively it could be possible to stop applying
- * the filter when the ring became empty, but it can possibly
- * happen a lot later... */
- bp = (unsigned char*)h.raw + tp_mac;
- run_bpf = (!handle->md.use_bpf) ||
- ((handle->md.use_bpf>1) && handle->md.use_bpf--);
- if (run_bpf && handle->fcode.bf_insns &&
- (bpf_filter(handle->fcode.bf_insns, bp,
- tp_len, tp_snaplen) == 0))
- goto skip;
-
  /*
  * Do checks based on packet direction.
  */
@@ -3582,6 +3576,23 @@ pcap_read_linux_mmap(pcap_t *handle, int
  }
 #endif

+ /* run filter on received packet
+ * If the kernel filtering is enabled we need to run the
+ * filter until all the frames present into the ring
+ * at filter creation time are processed.
+ * In such case md.use_bpf is used as a counter for the
+ * packet we need to filter.
+ * Note: alternatively it could be possible to stop applying
+ * the filter when the ring became empty, but it can possibly
+ * happen a lot later... */
+ bp = (unsigned char*)h.raw + tp_mac;
+ run_bpf = (!handle->md.use_bpf) ||
+ ((handle->md.use_bpf>1) && handle->md.use_bpf--);
+ if (run_bpf && handle->fcode.bf_insns &&
+ (bpf_filter(handle->fcode.bf_insns, bp,
+ tp_len, tp_snaplen) == 0))
+ goto skip;
+
  /*
  * The only way to tell the kernel to cut off the
  * packet at a snapshot length is with a filter program;
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-06 Thread Ani Sinha
On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman
 wrote:
> the vlan header in packets as we receive them.
>
> The code is correct except for the case of packets in vlan 0.  Currently
> the packet reconstruction is ambiguous.  The most recent kernels have
> a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was
> in vlan 0 or if there was no vlan at all.  libpcap probably should be
> taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0
> handling correct.
>

May be this?

Index: libpcap-1.1.1/pcap-linux.c
===
--- libpcap-1.1.1.orig/pcap-linux.c
+++ libpcap-1.1.1/pcap-linux.c
@@ -132,6 +132,7 @@ static const char rcsid[] _U_ =
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1486,7 +1487,13 @@ pcap_read_packet(pcap_t *handle, pcap_ha
  continue;

  aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
- if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+if (!(aux->tp_vlan_tci & TP_STATUS_VLAN_VALID))
+#else
+ if (aux->tp_vlan_tci == 0) /* this is ambigious but without the
+
TP_STATUS_VLAN_VALID flag, there is
+  nothing that we can do */
+#endif
  continue;

  len = packet_len > iov.iov_len ? iov.iov_len : packet_len;
@@ -3565,7 +3555,11 @@ pcap_read_linux_mmap(pcap_t *handle, int
  }

 #ifdef HAVE_TPACKET2
- if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+#if defined(TP_STATUS_VLAN_VALID)
+if (handle->md.tp_version == TPACKET_V2 &&
(h.h2->tp_vlan_tci & TP_STATUS_VLAN_VALID) &&
+#else
+if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+#endif
 handle->md.vlan_offset != -1 &&
 tp_snaplen >= (unsigned int) handle->md.vlan_offset) {
  struct vlan_tag *tag;
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-06 Thread Ani Sinha
On Thu, Dec 6, 2012 at 2:19 PM, Eric W. Biederman  wrote:
> Ani Sinha  writes:
>
>> On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman
>>  wrote:
>>> the vlan header in packets as we receive them.
>>>
>>> The code is correct except for the case of packets in vlan 0.  Currently
>>> the packet reconstruction is ambiguous.  The most recent kernels have
>>> a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was
>>> in vlan 0 or if there was no vlan at all.  libpcap probably should be
>>> taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0
>>> handling correct.
>>>
>>
>> May be this?
>
> Two things.
>
> - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field.
> - To work on older kernels with binaries compiled with newer headers you
>   first want to test for tp_vlan_tci == 0 and then look at the status field 
> for
>   TP_STATUS_VALID.

yes you are right Eric on both counts. I will resend the patch again in a bit.

ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-06 Thread Ani Sinha
On Thu, Dec 6, 2012 at 2:19 PM, Eric W. Biederman  wrote:
> Ani Sinha  writes:
>>>
>>
>> May be this?
>
> Two things.
>
> - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field.
> - To work on older kernels with binaries compiled with newer headers you
>   first want to test for tp_vlan_tci == 0 and then look at the status field 
> for
>   TP_STATUS_VALID.


trying again :

 pcap-linux.c |   50 +++---
 1 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/pcap-linux.c b/pcap-linux.c
index a42c3ac..8e355d3 100644
--- a/pcap-linux.c
+++ b/pcap-linux.c
@@ -133,6 +133,7 @@ static const char rcsid[] _U_ =
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler
callback, u_char *userdata)
  continue;

  aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
- if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+if ((aux->tp_vlan_tci == 0) ||
!(aux->tp_status & TP_STATUS_VLAN_VALID))
+#else
+ if (aux->tp_vlan_tci == 0) /* this is ambigious but without the
+
TP_STATUS_VLAN_VALID flag, there is
+  nothing that we can do */
+#endif
  continue;

  len = packet_len > iov.iov_len ? iov.iov_len : packet_len;
@@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int
max_packets, pcap_handler callback,
  }

 #ifdef HAVE_TPACKET2
- if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+if ((handle->md.tp_version == TPACKET_V2) &&
+#if defined(TP_STATUS_VLAN_VALID)
+(h.h2->tp_vlan_tci || (h.h2->tp_status &
TP_STATUS_VLAN_VALID)) &&
+#else
+h.h2->tp_vlan_tci &&
+#endif
 handle->md.vlan_offset != -1 &&
 tp_snaplen >= (unsigned int) handle->md.vlan_offset) {
  struct vlan_tag *tag;
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-06 Thread Ani Sinha
>
> The patch is whitespace damaged.  And one of your test is using ||
> instead of &&
>

OK, using alpine now :-)
>
> The test should be && not ||.
>

aargh! I am retarded! Fixed. Hopefully 3rd time is a charm :-)


ani
 pcap-linux.c |   50 +++---
 1 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/pcap-linux.c b/pcap-linux.c
index a42c3ac..b2c1a08 100644
--- a/pcap-linux.c
+++ b/pcap-linux.c
@@ -133,6 +133,7 @@ static const char rcsid[] _U_ =
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, 
u_char *userdata)
continue;
 
aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
-   if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+if ((aux->tp_vlan_tci == 0) && !(aux->tp_status & 
TP_STATUS_VLAN_VALID))
+#else
+   if (aux->tp_vlan_tci == 0) /* this is ambigious but 
without the
+  TP_STATUS_VLAN_VALID 
flag, there is
+  nothing that we can do */
+#endif
continue;
 
len = packet_len > iov.iov_len ? iov.iov_len : 
packet_len;
@@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, 
pcap_handler callback,
}
 
 #ifdef HAVE_TPACKET2
-   if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+if ((handle->md.tp_version == TPACKET_V2) &&
+#if defined(TP_STATUS_VLAN_VALID)
+(h.h2->tp_vlan_tci || (h.h2->tp_status & 
TP_STATUS_VLAN_VALID)) &&
+#else
+h.h2->tp_vlan_tci &&
+#endif
handle->md.vlan_offset != -1 &&
tp_snaplen >= (unsigned int) handle->md.vlan_offset) {
struct vlan_tag *tag;

___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-06 Thread Ani Sinha
>
> The patch is whitespace damaged.  And one of your test is using ||
> instead of &&

OK, using alpine now :-)
>
> The test should be && not ||.

aargh! I am retarded! Fixed. Hopefully 3rd time is a charm :-)

ani


 pcap-linux.c |   50 +++---
 1 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/pcap-linux.c b/pcap-linux.c
index a42c3ac..b2c1a08 100644
--- a/pcap-linux.c
+++ b/pcap-linux.c
@@ -133,6 +133,7 @@ static const char rcsid[] _U_ =
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, 
u_char *userdata)
continue;

aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
-   if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+if ((aux->tp_vlan_tci == 0) && !(aux->tp_status & 
TP_STATUS_VLAN_VALID))
+#else
+   if (aux->tp_vlan_tci == 0) /* this is ambigious but 
without the
+  TP_STATUS_VLAN_VALID 
flag, there is
+  nothing that we can do */
+#endif
continue;

len = packet_len > iov.iov_len ? iov.iov_len : 
packet_len;
@@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, 
pcap_handler callback,
}

 #ifdef HAVE_TPACKET2
-   if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+if ((handle->md.tp_version == TPACKET_V2) &&
+#if defined(TP_STATUS_VLAN_VALID)
+(h.h2->tp_vlan_tci || (h.h2->tp_status & 
TP_STATUS_VLAN_VALID)) &&
+#else
+h.h2->tp_vlan_tci &&
+#endif
handle->md.vlan_offset != -1 &&
tp_snaplen >= (unsigned int) handle->md.vlan_offset) {
struct vlan_tag *tag;

___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-10 Thread Ani Sinha


On Thu, 6 Dec 2012, Michael Richardson wrote:

> Date: Thu, 6 Dec 2012 17:59:13
> From: Michael Richardson 
> To: Eric W. Biederman 
> Cc: a...@aristanetworks.com, tcpdump-workers@lists.tcpdump.org,
> net...@vger.kernel.org, Francesco Ruggeri 
> Subject: Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
>
>
> >>>>> "Eric" == Eric W Biederman  writes:
> Eric> It is a bit odd that you are indenting with spaces instead of tabs
> Eric> in a file that is indented with tab.  Again libpcap isn't my code 
> so I
> Eric> don't care but someone else might.

Here's an updated patch. I have converted spaces to tabs consistent with
rest of the file. I have also set up github and sent a pull request

 :

commit b977ebd9d9608bb8a8e4927e7a6286bdfd6b94a8
Author: Ani Sinha 
Date:   Mon Dec 10 14:58:15 2012 -0800

Linux kernel 3.0 uses TP_STATUS_VLAN_VALID flag in packet
auxilliary data aux->tp_status to indicate a packet tagged
with a valid vlan ID 0 with another packet that is not
vlan tagged. Use this flag to check for the presence of
a vlan tagged packet. Also be careful not to cause any
breakage when libpcap is compiled against a newer kernel
(>=3.0) and used on top of an older kernel that does not
use this flag.

Signed-off-by: Ani Sinha 

diff --git a/pcap-linux.c b/pcap-linux.c
index a42c3ac..ffcabdf 100644
--- a/pcap-linux.c
+++ b/pcap-linux.c
@@ -133,6 +133,7 @@ static const char rcsid[] _U_ =
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler callback, 
u_char *userdata)
continue;

aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
-   if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+   if ((aux->tp_vlan_tci == 0) && !(aux->tp_status & 
TP_STATUS_VLAN_VALID))
+#else
+   if (aux->tp_vlan_tci == 0) /* this is ambigious but 
without the
+   TP_STATUS_VLAN_VALID flag, 
there is
+   nothing that we can do */
+#endif
continue;

len = packet_len > iov.iov_len ? iov.iov_len : 
packet_len;
@@ -3936,7 +3943,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, 
pcap_handler callback,
}

 #ifdef HAVE_TPACKET2
-   if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+   if ((handle->md.tp_version == TPACKET_V2) &&
+#if defined(TP_STATUS_VLAN_VALID)
+   (h.h2->tp_vlan_tci || (h.h2->tp_status & TP_STATUS_VLAN_VALID)) 
&&
+#else
+   h.h2->tp_vlan_tci &&
+#endif
handle->md.vlan_offset != -1 &&
tp_snaplen >= (unsigned int) handle->md.vlan_offset) {
struct vlan_tag *tag;
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-11 Thread Ani Sinha
>
> It is possible to test for the presence of support of the new vlan bpf
> extensions by attempting to load a filter that uses them.  As only valid
> filters can be loaded, old kernels that do not support filtering of vlan
> tags will fail to load the a test filter with uses them.

Unfortunately I do not see this. The sk_chk_filter() does not have a
default in the case statement and the check will not detect an unknown
instruction. It will fail when the filter is run and as far as I can see,
the packet will be dropped. Something like this might help?

diff --git a/net/core/filter.c b/net/core/filter.c
index c23543c..96338aa 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -548,6 +548,8 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int 
flen)
return -EINVAL;
/* Some instructions need special checks */
switch (code) {
+   /* for unknown instruction, return EINVAL */
+   default : return -EINVAL;
case BPF_S_ALU_DIV_K:
/* check for division by zero */
if (ftest->k == 0)
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-11 Thread Ani Sinha
On Tue, Dec 11, 2012 at 3:04 PM, Eric Dumazet  wrote:
> On Tue, 2012-12-11 at 14:36 -0800, Ani Sinha wrote:
>> >
>> > It is possible to test for the presence of support of the new vlan bpf
>> > extensions by attempting to load a filter that uses them.  As only valid
>> > filters can be loaded, old kernels that do not support filtering of vlan
>> > tags will fail to load the a test filter with uses them.
>>
>> Unfortunately I do not see this. The sk_chk_filter() does not have a
>> default in the case statement and the check will not detect an unknown
>> instruction. It will fail when the filter is run and as far as I can see,
>> the packet will be dropped. Something like this might help?
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index c23543c..96338aa 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -548,6 +548,8 @@ int sk_chk_filter(struct sock_filter *filter, unsigned 
>> int flen)
>>   return -EINVAL;
>>   /* Some instructions need special checks */
>>   switch (code) {
>> + /* for unknown instruction, return EINVAL */
>> + default : return -EINVAL;
>>   case BPF_S_ALU_DIV_K:
>>   /* check for division by zero */
>>   if (ftest->k == 0)
>
> This patch is wrong.


yes I generated this patch wrong.

>
> Check lines 546, 547, 548 where we do the check for unknown instructions
>
> code = codes[code];
> if (!code)
> return -EINVAL;

yepph it's OK here.

>
> If you want to test ANCILLARY possible values, its already too late, as
> old kernels wont use any patch anyway.

yepph, I was looking at possible ancilliary values. Basically this
case statement :

#define ANCILLARY(CODE) case SKF_AD_OFF + SKF_AD_##CODE:\
code = BPF_S_ANC_##CODE;\
break
switch (ftest->k) {
ANCILLARY(PROTOCOL);
ANCILLARY(PKTTYPE);
ANCILLARY(IFINDEX);
ANCILLARY(NLATTR);
ANCILLARY(NLATTR_NEST);
ANCILLARY(MARK);
ANCILLARY(QUEUE);
ANCILLARY(HATYPE);
ANCILLARY(RXHASH);
ANCILLARY(CPU);
ANCILLARY(ALU_XOR_X);
ANCILLARY(VLAN_TAG);
ANCILLARY(VLAN_TAG_PRESENT);
}
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-11 Thread Ani Sinha
On Tue, Dec 11, 2012 at 3:04 PM, Eric Dumazet  wrote:
> On Tue, 2012-12-11 at 14:36 -0800, Ani Sinha wrote:
>> >
>> > It is possible to test for the presence of support of the new vlan bpf

> If you want to test ANCILLARY possible values, its already too late, as
> old kernels wont use any patch anyway.
>

So basically this means that if we generate a filter with these
special negative offset values and expect that the kernel will
complain if it does not recognize the newer values then we would be
wrong. And you are right. Old kernels never knew about them and the
code wasn't written in a way to return EINVAL if it didn't recognize a
special negative anciliary offset value.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-12 Thread Ani Sinha

>
> unsigned int netdev_8021q_inskb = 1;
>
> ...
>   {
>   .ctl_name   = NET_CORE_8021q_INSKB,
>   .procname   = "netdev_8021q_inskb",
>   .data   = &netdev_8021q_inskb,
>   .maxlen = sizeof(int),
>   .mode   = 0444,
>   .proc_handler   = proc_dointvec
>   },
>
> would seem to do it to me.
> Then pcap can fopen("/proc/sys/net/core/netdev_8021q_inskb") and if it
> finds it, and it is >0, then do the cmsg thing.
>

Does this work? This is just an experimental patch and by no means final.
I just want to have an idea what everyone thought about it. Once we debate
and discusss, I can cook up a final patch that would be worth commiting.

Also instead of having this /proc interface, we can perhaps check for a
specific
kernel version that :

(a) has the vlan tag info in the skb metadata (as opposed to in the packet
itself)
(b) has the following patch that adds the capability to generate a filter
based on the tag value :

commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1
Author: Eric Dumazet 
Date:   Sat Oct 27 02:26:17 2012 +

net: filter: add vlan tag access

WE need both of the above two things for the userland to generate a filter
code that compares vlan tag values in the skb metadata. For kernels that
has the vlan tag in
the skb metadata but does not have the above commit (b), there is nothing
that can be done. For older kernels that had the vlan tag info in the
packet itself, the filter code can be generated differently to look at
specific offsets within the packet (something that libpcap does
currently).

We have already ruled out the idea of generating a filter and trying to
load and see if that fails (see previous emails on this thread).

Hope this makes sense.


diff --git a/include/linux/filter.h b/include/linux/filter.h
index c45eabc..91e2ba3 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct 
sk_filter *fp)
return fp->len * sizeof(struct sock_filter) + sizeof(*fp);
 }

+extern bool sysctl_8021q_inskb;
 extern int sk_filter(struct sock *sk, struct sk_buff *skb);
 extern unsigned int sk_run_filter(const struct sk_buff *skb,
  const struct sock_filter *filter);
diff --git a/net/core/filter.c b/net/core/filter.c
index c23543c..4f5a657 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -41,6 +41,8 @@
 #include 
 #include 

+bool sysctl_8021q_inskb = 1;
+
 /* No hurry in this branch
  *
  * Exported for the bpf jit load helper.
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d1b0804..f9a3700 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = {
.mode   = 0644,
.proc_handler   = proc_dointvec
},
+   {
+   .procname   = "8021q_inskb",
+   .data   = &sysctl_8021q_inskb,
+   .maxlen = sizeof(bool),
+   .mode   = 0444,
+   .proc_handler   = proc_dointvec
+   },
{ }
 };

___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-12 Thread Ani Sinha
+ Eric B.


On Wed, Dec 12, 2012 at 1:53 PM, Ani Sinha  wrote:
>
>>
>> unsigned int netdev_8021q_inskb = 1;
>>
>> ...
>>   {
>>   .ctl_name   = NET_CORE_8021q_INSKB,
>>   .procname   = "netdev_8021q_inskb",
>>   .data   = &netdev_8021q_inskb,
>>   .maxlen = sizeof(int),
>>   .mode   = 0444,
>>   .proc_handler   = proc_dointvec
>>   },
>>
>> would seem to do it to me.
>> Then pcap can fopen("/proc/sys/net/core/netdev_8021q_inskb") and if it
>> finds it, and it is >0, then do the cmsg thing.
>>
>
> Does this work? This is just an experimental patch and by no means final.
> I just want to have an idea what everyone thought about it. Once we debate
> and discusss, I can cook up a final patch that would be worth commiting.
>
> Also instead of having this /proc interface, we can perhaps check for a
> specific
> kernel version that :
>
> (a) has the vlan tag info in the skb metadata (as opposed to in the packet
> itself)
> (b) has the following patch that adds the capability to generate a filter
> based on the tag value :
>
> commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1
> Author: Eric Dumazet 
> Date:   Sat Oct 27 02:26:17 2012 +
>
> net: filter: add vlan tag access
>
> WE need both of the above two things for the userland to generate a filter
> code that compares vlan tag values in the skb metadata. For kernels that
> has the vlan tag in
> the skb metadata but does not have the above commit (b), there is nothing
> that can be done. For older kernels that had the vlan tag info in the
> packet itself, the filter code can be generated differently to look at
> specific offsets within the packet (something that libpcap does
> currently).
>
> We have already ruled out the idea of generating a filter and trying to
> load and see if that fails (see previous emails on this thread).
>
> Hope this makes sense.
>
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index c45eabc..91e2ba3 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct 
> sk_filter *fp)
> return fp->len * sizeof(struct sock_filter) + sizeof(*fp);
>  }
>
> +extern bool sysctl_8021q_inskb;
>  extern int sk_filter(struct sock *sk, struct sk_buff *skb);
>  extern unsigned int sk_run_filter(const struct sk_buff *skb,
>   const struct sock_filter *filter);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c23543c..4f5a657 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -41,6 +41,8 @@
>  #include 
>  #include 
>
> +bool sysctl_8021q_inskb = 1;
> +
>  /* No hurry in this branch
>   *
>   * Exported for the bpf jit load helper.
> diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> index d1b0804..f9a3700 100644
> --- a/net/core/sysctl_net_core.c
> +++ b/net/core/sysctl_net_core.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = {
> .mode   = 0644,
> .proc_handler   = proc_dointvec
> },
> +   {
> +   .procname   = "8021q_inskb",
> +   .data   = &sysctl_8021q_inskb,
> +   .maxlen = sizeof(bool),
> +   .mode   = 0444,
> +   .proc_handler   = proc_dointvec
> +   },
> { }
>  };
>
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-13 Thread Ani Sinha
On Thu, Dec 13, 2012 at 12:35 AM, Daniel Borkmann  wrote:
> On 12/12/2012 10:53 PM, Ani Sinha wrote:
>>>
>>> unsigned int netdev_8021q_inskb = 1;
>>>
>>> ...
>>> {
>>> .ctl_name   = NET_CORE_8021q_INSKB,
>>> .procname   = "netdev_8021q_inskb",
>>> .data   = &netdev_8021q_inskb,
>>> .maxlen = sizeof(int),
>>> .mode   = 0444,
>>> .proc_handler   = proc_dointvec
>>> },
>>>
>>> would seem to do it to me.
>>> Then pcap can fopen("/proc/sys/net/core/netdev_8021q_inskb") and if it
>>> finds it, and it is >0, then do the cmsg thing.
>>>
>>
>
> I think it doesn't. Because then you are obviously considering adding one
> procfs file into /proc/sys/net/core/ *for each* feature that is added into
> the ancillary ops which cannot be the right way ...

We had already brought up this topic previously in the same thread. A
suggestion was made to add that proc entry and no one from netdev
responded to it saying that it did not make any sense. Therefore
before I went ahead and made the fixes in libpcap, I wanted to run
this by your guys again to make sure we are still on the same page.

I do agree that instead of a /proc entry, we should check for a kenrel
version >= X where X is the upstream version that first started
supporting all the features needed by libpcap for vlan filtering. This
is not a compile time check but a run time one. Does anyone see any
issues with this? Is there any long term implications of this, like if
you backport patches to an older long term supported kernel? Are there
other better ways to do this, like may be returning feature bits from
an ioctl call? This is something we need to deal with on a continuous
basis as we keep supporting newer AUX fields and libpcap and other
user land code needs to make use of it. At the same time, they need to
handle backward compatibility issues with older kernels.

Thanks
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-13 Thread Ani Sinha
On Thu, Dec 13, 2012 at 1:49 PM, Daniel Borkmann
 wrote:

> Well, not everyone on netdev might be following this thread (resp.
> following fully). The best way to get responses for a patch is to go
> through the normal patch submission process on netdev, and if you like
> to request for comments, then mark it as RFC in the subject. This way,
> people will know and likely comment on it if it makes sense or not.

OK good to know.

> As Eric mentioned earlier, for now there seems not to be a reliable
> way to get to know which ops are present and which not. It's not
> really nice, but if you want to make use of those new (ANC*) features,
> probably checking kernel version might be the only way if I'm not
> missing something. Now net-next is closed, but if it reopens, I'll
> submit a version 2 of my patch where you've been CC'd to. If it gets
> in, then at least it's for sure that since kernel  this kind of
> feature test is present.

thanks, yes, I believe we do need some sort of validation on the
ancilliary features.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] vlan tagged packets and libpcap breakage

2012-12-17 Thread Ani Sinha
On Mon, Dec 17, 2012 at 2:35 AM, Guy Harris  wrote:
>
> On Dec 17, 2012, at 1:50 AM, "David Laight"  wrote:
>
>> How are you going to tell whether a feature is present in a non-Linux
>> kernel ?
>
> The Linux memory-mapped capture mechanism is not present in a non-Linux 
> kernel, so all the libpcap work involved here would, if necessary on other 
> platforms, have to be done >differently on those platforms.  Those platforms 
> would have to have their own mechanisms to indicate whether any changes to 
> filter code, processing of VLAN tags supplied out of >band, etc. would need 
> to be done.

Actually lib-pcap has these pcap-.c files that are kind of
like platform specific drivers that plug into platform independent
code like gencode.c or bpf_filter.c. These platform specific drivers
are responsible for getting packets from the kernel and running
filters (kernel or userland) on it. So all linux specific code to get
a packet and packet metadata from the kernel can neatly reside in
pcap-linux.c.

Unfortunately though, in this specific problem involving filtering
with vlan tags, both code generation (gentags.c) and code running the
filter (bpf_filter.c) will have to be aware of linux specific
semantics. Due to the issues that Bill had explained earlier in the
thread, we can not rely on post processing before installing the
kernel filter. Therefore, we need to generate a filter that can be
directly installed in the kernel. For the same reason, bpf_filter()
code also needs to change - be aware of linux specific semantics.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


[tcpdump-workers] [PATCH][RFC] fix BPF filter for VLAN tagged packets

2013-01-06 Thread Ani Sinha
Hi folks :

This is the first cut of my patch that tries to fix the BPF filter code
generation issue. Please note that this is the first time I have hacked
libpcap, so I might be off in terms of code philosophy/organization of
code etc. This change has been tested by me and independently by another
grad student at UC Berkeley (CC'd here) although none of us have yet done
a very thorough testing of all the aspects. I am sending it out to the
community so that others too can give it a try and report fixes. I will
also give it more testing in the coming weeks.

The description below summarizes the patch. I'd appreciate all the
comments that I can get on this issue.

Thanks,
ani


Fix BPF filter for VLAN tagged packets

 - Linux kernel no longer puts (outermost) vlan tags within the packet but in 
packet metadata.
   Use special ANC negative offsets when generating the filter code to check if 
it's a vlan tagged packet
   and whether the vlan ID matches that of the one specified in the filter 
expression.
 - When executing the filter code in userland, take care of these special ANC 
values - use the vlan tag extracted from
   the packet metadata for comparison.
 - For saving the captured packets, re-insert the vlan tags back into the 
packet. That way, both old and new pcap files
   will have vlan tags inserted into the packet itself in the pcap files. The 
behaviour would not change between old and new
   kernels.
 - For older Linux kernels that had the vlan tags as a part of the packet and 
those kernels that did not have the special ANC opcodes
   for vlan tag access, use the old mechanism - generate filter code that looks 
into the packet offset for vlan tags.
 - For all platforms other than Linux, do not change the current behaviour.

What this patch does not handle :

When 'vlan' keyword is not specified in the filter expression, the filter code 
generator does not by default add a 'not filter'
expresion so that only untagged packets pass the filter. So for example, an 
expression like 'tcp port 80' will also include
vlan tagged packets. We need to handle this in a future patch.

Signed-off-by: Ani Sinha 

diff --git a/Makefile.in b/Makefile.in
index 772cc7d..c030a7e 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -82,7 +82,7 @@ YACC = @V_YACC@
@rm -f $@
$(CC) $(FULL_CFLAGS) -c $(srcdir)/$*.c

-PSRC = pcap-@V_PCAP@.c @USB_SRC@ @BT_SRC@ @CAN_SRC@ @NETFILTER_SRC@ 
@CANUSB_SRC@
+PSRC = pcap-@V_PCAP@.c @USB_SRC@ @BT_SRC@ @CAN_SRC@ @NETFILTER_SRC@ 
@CANUSB_SRC@ @BPF_FILTER_P@
 FSRC =  fad-@V_FINDALLDEVS@.c
 SSRC =  @SSRC@
 CSRC = pcap.c inet.c gencode.c optimize.c nametoaddr.c etherent.c \
@@ -301,6 +301,7 @@ EXTRA_DIST = \
pcap-int.h \
pcap-libdlpi.c \
pcap-linux.c \
+   bpf_filter_linux.c \
pcap-namedb.h \
pcap-netfilter-linux.c \
pcap-netfilter-linux.h \
diff --git a/bpf_filter_linux.c b/bpf_filter_linux.c
new file mode 100644
index 000..1d5d70f
--- /dev/null
+++ b/bpf_filter_linux.c
@@ -0,0 +1,571 @@
+/*-
+ * Copyright (c) 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997
+ * The Regents of the University of California.  All rights reserved.
+ *
+ * This code is derived from the Stanford/CMU enet packet filter,
+ * (net/enet.c) distributed as part of 4.3BSD, and code contributed
+ * to Berkeley by Steven McCanne and Van Jacobson both of Lawrence
+ * Berkeley Laboratory.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. All advertising materials mentioning features or use of this software
+ *must display the following acknowledgement:
+ * This product includes software developed by the University of
+ * California, Berkeley and its contributors.
+ * 4. Neither the name of the University nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND 

Re: [tcpdump-workers] PROBLEM: Software injected vlan tagged packets are unable to be identified using recent BPF modifications

2013-01-07 Thread Ani Sinha
On Mon, Jan 7, 2013 at 4:05 PM, Paul Pearce  wrote:
> Hello folks,
>
> PROBLEM:
>
> vlan tagged packets that are injected via software are not picked up
> by filters using recent (kernel commit
> f3335031b9452baebfe49b8b5e55d3fe0c4677d1)
> BPF vlan modifications. I suspect this is a problem with the Linux
> kernel.
>
> linux-netdev and tcpdump-workers are both cc'd.
>

Just to be clear, up until now we did not see this issue because the
BPF filter code generated by libpcap would always look into packet
offsets for vlan tag information. With the patch that I submitted to
tcpdump-workers a day ago, it no longer looks into the packet but in
the skb meta data (which is the right thing to do going forward). This
breaks raw packets. We will have to handle this in the kernel to fix
it.

ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] [PATCH net 1/2] net: dev_queue_xmit_nit: fix skb->vlan_tci field value

2013-01-08 Thread Ani Sinha
+tcpdump-workers


On Tue, Jan 8, 2013 at 10:27 PM, Eric Dumazet  wrote:
> On Tue, 2013-01-08 at 22:06 -0800, Ani Sinha wrote:
>
>> The proposed patch tries to fix the issue that arose after the
>> following commit :
>>
>> commit b40863c667c16b7a73d4f034a8eab67029b5b15a
>> Author: Eric Dumazet 
>> Date:   Tue Sep 18 20:44:49 2012 +
>>
>> net: more accurate network taps in transmit path
>>
>>
>> I do not believe 3.6.11 kernel has this change. 3.6.11 should not need
>> the patch.
>
> Thats irrelevant. This only shows that user land was depending on a
> prior undocumented behavior.
>
> It seems a libpcap issue to me. Kernel side provides all needed bits.
>
> When I want "tcpdump src port 2030", filter is :
>
> (000) ldh  [12]
> (001) jeq  #0x86dd  jt 2jf 8
> (002) ldb  [20]
> (003) jeq  #0x84jt 6jf 4
> (004) jeq  #0x6 jt 6jf 5
> (005) jeq  #0x11jt 6jf 19
> (006) ldh  [54]
> (007) jeq  #0x7ee   jt 18   jf 19
> (008) jeq  #0x800   jt 9jf 19
> (009) ldb  [23]
> (010) jeq  #0x84jt 13   jf 11
> (011) jeq  #0x6 jt 13   jf 12
> (012) jeq  #0x11jt 13   jf 19
> (013) ldh  [20]
> (014) jset #0x1fff  jt 19   jf 15
> (015) ldxb 4*([14]&0xf)
> (016) ldh  [x + 14]
> (017) jeq  #0x7ee   jt 18   jf 19
> (018) ret  #96
> (019) ret  #0
>
> See how it handles both IPv4 and IPv6, and various protocols
> automatically ?
>
> If I only wanted "udp and src port 2030" it would give :
>
> (000) ldh  [12]
> (001) jeq  #0x86dd  jt 2jf 6
> (002) ldb  [20]
> (003) jeq  #0x11jt 4jf 15
> (004) ldh  [54]
> (005) jeq  #0x7ee   jt 14   jf 15
> (006) jeq  #0x800   jt 7jf 15
> (007) ldb  [23]
> (008) jeq  #0x11jt 9jf 15
> (009) ldh  [20]
> (010) jset #0x1fff  jt 15   jf 11
> (011) ldxb 4*([14]&0xf)
> (012) ldh  [x + 14]
> (013) jeq  #0x7ee   jt 14   jf 15
> (014) ret  #96
> (015) ret  #0
>
>
>
> So when I want "tcpdump vlan 100" it generates :
>
> (000) ldh  [12]
> (001) jeq  #0x8100  jt 2jf 6
> (002) ldh  [14]
> (003) and  #0xfff
> (004) jeq  #0x64jt 5jf 6
> (005) ret  #96
> (006) ret  #0
>
> What's wrong instructing libpcap to extend the filter to be able to
> get the correct result, vlan id being in skb->vlan_id (vlan accel on),
> or in the packet itself (vlan accel off)
>
> This way, you could chose if you want to get only accelerated vlan,
> or non accelerated vlan, or both. And you need no kernel hacking.
>
>
>
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] [PATCH net 1/2] net: dev_queue_xmit_nit: fix skb->vlan_tci field value

2013-01-09 Thread Ani Sinha
>> Thats irrelevant. This only shows that user land was depending on a
>> prior undocumented behavior.

Why do you say that? The following patch from Pirko ensured that on
both RX and TX regardless whether the driver/hw supported vlan
acceleration, the outermost vlan tags will always be extracted out of
the packet and put in skb aux data :

commit bcc6d47903612c3861201cc3a866fb604f26b8b2
Author: Jiri Pirko 
Date:   Thu Apr 7 19:48:33 2011 +

net: vlan: make non-hw-accel rx path similar to hw-accel

Now this meant that the filter code should always look into the aux
data for vlan tagging, not in the packet, regardless of hw
acceleration availability. Your patch
b40863c667c16b7a73d4f034a8eab67029b5b15a broke this symmetric
semantics - now on TX on the network tap, we do not have the vlan tags
in the skb aux data.

In my opinion, for a given kernel, the filter code should either look
into the packet offset or in the packet aux data for vlan tags but not
both. Otherwise the filter code becomes incredibly complex since an
inline vlan tag in the packet changes offsets of all headers coming
afterwords and I don't even know if filter code can be correctly
generated in this case. tcpdump-workers folks are CC's here and they
clearly have more experience with libpcap filter code that I do. Hence
I leave it up to them to provide inputs here.

>> What's wrong instructing libpcap to extend the filter to be able to
>> get the correct result, vlan id being in skb->vlan_id (vlan accel on),
>> or in the packet itself (vlan accel off)
>>
>> This way, you could chose if you want to get only accelerated vlan,
>> or non accelerated vlan, or both. And you need no kernel hacking.

This is wrong. Accelerated or not, the kernel code was organized to
have the tags in the packet aux data. So I think this is how user land
should be coded as well.

ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] [PATCH net 1/2] net: dev_queue_xmit_nit: fix skb->vlan_tci field value

2013-01-09 Thread Ani Sinha
On Wed, Jan 9, 2013 at 11:51 AM, Eric Dumazet  wrote:
> On Wed, 2013-01-09 at 11:27 -0800, Ani Sinha wrote:
>
>> This is wrong. Accelerated or not, the kernel code was organized to
>> have the tags in the packet aux data. So I think this is how user land
>> should be coded as well.
>
> You have your opinion, thats good.
>
> My opinion as a kernel developer is that the network tap is here to have
> a copy of the exact frame given to the _device_.
>

It is fine by me if that is how you see it. In that case. the
behaviour can me made symmetric on both TX and RX. Tap processing in
__netif_receive_skb() can be done before vlan_untag() so that taps see
the exact frame received from the _device_ as you put it.

ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] [PATCH net 1/2] net: dev_queue_xmit_nit: fix skb->vlan_tci field value

2013-01-09 Thread Ani Sinha
On Wed, Jan 9, 2013 at 12:01 PM, Ani Sinha  wrote:
> On Wed, Jan 9, 2013 at 11:51 AM, Eric Dumazet  wrote:
>> On Wed, 2013-01-09 at 11:27 -0800, Ani Sinha wrote:
>>
>>> This is wrong. Accelerated or not, the kernel code was organized to
>>> have the tags in the packet aux data. So I think this is how user land
>>> should be coded as well.
>>
>> You have your opinion, thats good.
>>
>> My opinion as a kernel developer is that the network tap is here to have
>> a copy of the exact frame given to the _device_.
>>
>
> It is fine by me if that is how you see it. In that case. the
> behaviour can me made symmetric on both TX and RX. Tap processing in
> __netif_receive_skb() can be done before vlan_untag() so that taps see
> the exact frame received from the _device_ as you put it.

Although for accelerated vlan tags, it will be in the meta data
anyways. All I am asking is, let's have the same behaviour on both TX
and RX. If  the tag in the packet let's have it that way in both ways
in what the tap sees.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


[tcpdump-workers] "not vlan" filter expression broken catastrophically!

2013-01-31 Thread Ani Sinha
hello folks :

As you guys have been aware, I am hacking libpcap for a while. Me and
Bill noticed something seriously broken for any filter expression that
has a "not vlan" in it. For example, take a look at the filter code
generated by libpcap with an expression like "not vlan and tcp port
80" :

BpfExpression '(not vlan and tcp port 80)'
  { 0x28,  0,  0, 0x000c }, //(000) ldh  [12]
  { 0x15, 19,  0, 0x8100 }, //(001) jeq  #0x8100 jt 21  jf 2
  { 0x28,  0,  0, 0x0010 }, //(002) ldh  [16]
  { 0x15,  0,  6, 0x86dd }, //(003) jeq  #0x86dd jt 4   jf 10
  { 0x30,  0,  0, 0x0018 }, //(004) ldb  [24]
  { 0x15,  0, 15, 0x0006 }, //(005) jeq  #0x6jt 6   jf 21
  { 0x28,  0,  0, 0x003a }, //(006) ldh  [58]
  { 0x15, 12,  0, 0x0050 }, //(007) jeq  #0x50   jt 20  jf 8
  { 0x28,  0,  0, 0x003c }, //(008) ldh  [60]
  { 0x15, 10, 11, 0x0050 }, //(009) jeq  #0x50   jt 20  jf 21
  { 0x15,  0, 10, 0x0800 }, //(010) jeq  #0x800  jt 11  jf 21
  { 0x30,  0,  0, 0x001b }, //(011) ldb  [27]
  { 0x15,  0,  8, 0x0006 }, //(012) jeq  #0x6jt 13  jf 21
  { 0x28,  0,  0, 0x0018 }, //(013) ldh  [24]
  { 0x45,  6,  0, 0x1fff }, //(014) jset #0x1fff jt 21  jf 15
  { 0xb1,  0,  0, 0x0012 }, //(015) ldxb 4*([18]&0xf)
  { 0x48,  0,  0, 0x0012 }, //(016) ldh  [x + 18]
  { 0x15,  2,  0, 0x0050 }, //(017) jeq  #0x50   jt 20  jf 18
  { 0x48,  0,  0, 0x0014 }, //(018) ldh  [x + 20]
  { 0x15,  0,  1, 0x0050 }, //(019) jeq  #0x50   jt 20  jf 21
  {  0x6,  0,  0, 0x }, //(020) ret  #65535
  {  0x6,  0,  0, 0x }, //(021) ret  #0


As you can see, it loads offset 12 (ethertype). For vlan packets, it
jumps to #21 and returns false right away. However, for packets that
are not vlan tagged, it goes to #2 which loads offset 16 in the
packet. Notice that this is wrong! The offsets should be incremented
by 4 only for vlan tagged packets and not for non-vlan packets. The
problem is that in gencode.c, the off_linktype increments by 4
unconditionally whether or not the packet actually contains a vlan
tag. We do not want to increment this offset if "not vlan" is true. So
the above filter code is generated wrong.

I just wanted to point this out to folks who wishes to dig in and fix
it. I do not have time right now to think of a proper solution. It
would seem using unconditional increments of offsets like off_linktype
below the parser is not going to work. How do you know if the parser
is going to take your code generated from the "vlan" expression and
just negate it? Or may be we can hack another rule in grammar.y. I
don't know.

cheers,
ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] "not vlan" filter expression broken catastrophically!

2013-02-01 Thread Ani Sinha
In this case, I have used a tool written by Bill Fenner called
BpfExpression for Arista. Not sure if the code is GNU licensed. He's
in the mailing list and I will let him comment.

ani


On Fri, Feb 1, 2013 at 5:07 PM, Michael Richardson  wrote:
>
>>>>>> "Ani" == Ani Sinha  writes:
> Ani> hello folks :
>
> Ani> As you guys have been aware, I am hacking libpcap for a
> Ani> while. Me and Bill noticed something seriously broken for any
> Ani> filter expression that has a "not vlan" in it. For example,
> Ani> take a look at the filter code generated by libpcap with an
> Ani> expression like "not vlan and tcp port 80" :
>
> Ani> BpfExpression '(not vlan and tcp port 80)' { 0x28, 0, 0,
>
> Do we have any way to test libpcap expression outputs other than -d
> options to tcpdump?  I'm thinking regression tests here.
>
> --
> ]   Never tell me the odds! | ipv6 mesh networks [
> ]   Michael Richardson, Sandelman Software Works| network architect  [
> ] m...@sandelman.ca  http://www.sandelman.ca/|   ruby on rails
> [
>
> ___
> tcpdump-workers mailing list
> tcpdump-workers@lists.tcpdump.org
> https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] "not vlan" filter expression broken catastrophically!

2013-02-01 Thread Ani Sinha
On Fri, Feb 1, 2013 at 1:13 PM, Bill Fenner  wrote:
> On Fri, Feb 1, 2013 at 3:50 PM, Paul Pearce  wrote:

>
> That was my proposal to Ani, since the kernel guys seemed to insist
> that asymmetry was a virtue.
>

yes, I think ultimately we will have to do something like what Bill
suggested to me. I am still not an expert with that filter generation
code, so I am trying to handle things in steps, keeping in mind my
company's immediate needs. Too many stuff is broken :-)
The funny thing is, up until that commit in last October, the kernel
was symmetric in how it handled vlan tags :-)

ani
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] "not vlan" filter expression brokencatastrophically!

2013-02-04 Thread Ani Sinha
On Mon, Feb 4, 2013 at 1:29 AM, David Laight  wrote:
>> I agree. Honestly I think a perfectly reasonable stance to take is
>> requesting that the filters get packets *as seen on the wire/nic*. I
>> think that's the mental model everyone uses, and any deviation from
>> that model is prone to bugs in the kernel, libpcap, and for the enduser.
>
> TX and RX segmentation offload also confuse matters here.
> I think Linux can give libpcap large TCP fragments even when the
> hardware isn't doing segmentation offload.
> This also breaks the mental model.
>

It would be nice if someone can initiate a discussion/debate on netdev
on the points that we are raising here. I tried one time with vlan
issues and it met with lot of resistance.  May be another person will
have better success than me :-)
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers