Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha writes: > cc'ing netdev. > > > On Wed, Oct 31, 2012 at 2:01 PM, Michael Richardson wrote: >> >> Thanks for this email. >> >>> "Ani" == Ani Sinha writes: >> Ani> remove "inline" from vlan_core.c functions >> Ani> Signed-off-by: David S. Miller >> >> Ani> As a result of this patch, with or without HW acceleration support >> in >> Ani> the kernel driver, the vlan tag information is pulled out from the >> Ani> packet and is inserted in the skb metadata. Thereafter, the vlan tag >> Ani> information for a 802.1q packet can be obtained my applications >> Ani> running in the user space by using the auxdata and cmsg >> Ani> structure. >> >> Do you think that the existance of this behaviour could be exposed in >> sysctl, /proc/net or /sys equivalent (we still don't have /sys/net...)? >> As a read only file that had a 0/1 in it? > > yes, we definitely need a run time check. Whether this could be in the > form of a socket option or a /proc entry I don't know. I don't see any need to add any kernel code to allow checking if vlan tags are stripped. Vlan headers are stripped on all kernel interfaces today. Vlan headers have been stripped on all but a handful of software interfaces for 6+ years. For all kernels if the vlan header is stripped it is reported in the auxdata, upon packet reception. Careful code should also look for TP_STATUS_VLAN_VALID which allows for distinguishing a striped vlan header of 0 from no vlan header. The safe assumption then is that testing for vlan headers and vlan values in bpf filters is not possible without the new bpf extentions. It is possible to test for the presence of support of the new vlan bpf extensions by attempting to load a filter that uses them. As only valid filters can be loaded, old kernels that do not support filtering of vlan tags will fail to load the a test filter with uses them. For old kernels that do not support the new extensions it is possible to generate code that looks at the ethernet header and sees if the ethertype is 0x8100 and then does things with it, but that will only work on a small handful of software only interfaces. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Michael Richardson writes: > Thank you for this reply. > >>>>>> "Eric" == Eric W Biederman writes: > Eric> I don't see any need to add any kernel code to allow checking > Eric> if vlan tags are stripped. Vlan headers are stripped on all > Eric> kernel interfaces today. Vlan headers have been stripped on > Eric> all but a handful of software interfaces for 6+ years. For > Eric> all kernels if the vlan header is stripped it is reported in > Eric> the auxdata, upon packet reception. Careful code should also > Eric> look for TP_STATUS_VLAN_VALID which allows for distinguishing > Eric> a striped vlan header of 0 from no vlan header. > > I can regularly see vlan tags on raw dumps from the untagged ("eth0") > today, running 3.2 (debian stable): > > obiwan-[~] mcr 4848 %sudo tcpdump -i eth0 -n -p -e | grep -i vlan > listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes > 17:05:15.404909 38:60:77:38:e6:47 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q > (0x8100), length 46: vlan 3800, p 0, ethertype ARP, Request who-has > 172.30.42.1 tell 172.30.42.11, length 28 > > So, I'm curious about the statement that vlan tags have been stripped > for some time, because I don't see them stripped today. My desktop has > an Intel 82579V NIC in it... So I just took a quick look at libpcap 1.1.1.1. In pcap_read_packet in pcap-linux.c the code to readd the vlan header is protected by HAVE_PACKET_AUXDATA. In pcap_read_linux_mmap in pcap-linux.c the code to readd the vlan header is protected by HAVE_TPACKET2. That code is shifting the the ethernet header up 4 bytes and inserting the vlan header in packets as we receive them. The code is correct except for the case of packets in vlan 0. Currently the packet reconstruction is ambiguous. The most recent kernels have a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was in vlan 0 or if there was no vlan at all. libpcap probably should be taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0 handling correct. > Eric> For old kernels that do not support the new extensions it is > Eric> possible to generate code that looks at the ethernet header > Eric> and sees if the ethertype is 0x8100 and then does things with > Eric> it, but that will only work on a small handful of software > Eric> only interfaces. > > at tcpdump.org, our concern is to release code that works on both new, > and what for kernel.org folks would be considered "ancient" systems, > such as Centos5/RHEL5 machines which are regularly still in production > in the field (sadly...), but often need the latest diagnostics. > What I hear you saying is that our existing code will work on older > kernels, and that once we have new code to use the VLAN tag extensions, > we should simply attempt to load it, and either it loads, or we get an > error, and we go back to the code we had before. That's great news. The existing code will work as well as anything if you don't have the VLAN tag extensions. If you are not using the VLAN tag extensions there is no reliable way to filter for vlan tagged packets in the kernel as on most network devices (all for the last year) the vlan header has been stripped at the time the bpf filter is run. So yes. Just falling back to the existing code seems as good as it is going to get. Which isn't very good but it took 5+ years before people cared enough to get this fixed. :( > The major concern is that if the 802.1q header is gone, although we can > retrieve it somehow (I'm not sure how the AUXDATA is presented on the > MMAP PF_PACKET interface...) we will have to reconstruct it in order to > save it properly to a savefile. Many entities, including most major > Network Telescopes (http://en.wikipedia.org/wiki/Network_telescope) use > libpcap (and often tcpdump itself) to capture packets on probe points, > and having good performance matters here... Yes. I wasn't looking at the mmap path. The vlan information is present in the tpacket2_hdr and tpacket3_hdr structures that accompany packets read with the mmap interface. The old tpacket1_hdr doesn't support the vlan_tci information so that won't work. Not having the vlan header on the packet when the bpf filter is run is justified by performance, and likewise reading vlan tagged packets to userspace with the vlan header stripped is justified by performance. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Daniel Borkmann writes: > Speaking of netsniff-ng where we don't reconstruct VLAN headers, users > have reported that depending on the NIC/driver resp. ethtool setting, > they can come in stripped or not (in the pf_packet's rx_ring buffer). > However, I assume VLAN AUXDATA is always consistent (and so the > BPF/BPF JIT filtering). Yes it was a mess before we added software stripping of the vlan headers a year ago. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha writes: > On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman > wrote: >> the vlan header in packets as we receive them. >> >> The code is correct except for the case of packets in vlan 0. Currently >> the packet reconstruction is ambiguous. The most recent kernels have >> a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was >> in vlan 0 or if there was no vlan at all. libpcap probably should be >> taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0 >> handling correct. >> > > May be this? Two things. - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field. - To work on older kernels with binaries compiled with newer headers you first want to test for tp_vlan_tci == 0 and then look at the status field for TP_STATUS_VALID. Which means the tests need to look something like: - if (aux->tp_vlan_tci == 0) +#if defined(TP_STATUS_VLAN_VALID) + if ((aux->tp_vlan_tci == 0) && !(aux->tp_status & TP_STATUS_VLAN_VALID)) +#else + if (aux->tp_vlan_tci == 0) /* this is ambigious but without the + TP_STATUS_VLAN_VALID flag, there is + nothing that we can do */ +#endif #ifdef HAVE_TPACKET2 - if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci && + if (handle->md.tp_version == TPACKET_V2 && +#if defined(TP_STATUS_VLAN_VALID) + (h.h2->tp_vlan_tci || (h.h2->tp_status & TP_STATUS_VALID)) && +#else + h.h2->tp_vlan_tci && +#endif Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha writes: > On Thu, Dec 6, 2012 at 2:19 PM, Eric W. Biederman > wrote: >> Ani Sinha writes: >>>> >>> >>> May be this? >> >> Two things. >> >> - TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci >> field. >> - To work on older kernels with binaries compiled with newer headers you >> first want to test for tp_vlan_tci == 0 and then look at the status field >> for >> TP_STATUS_VALID. > > > trying again : The patch is whitespace damaged. And one of your test is using || instead of && Eric > pcap-linux.c | 50 +++--- > 1 files changed, 31 insertions(+), 19 deletions(-) > > diff --git a/pcap-linux.c b/pcap-linux.c > index a42c3ac..8e355d3 100644 > --- a/pcap-linux.c > +++ b/pcap-linux.c > @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = > #include > #include > #include > +#include > #include > #include > #include > @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler > callback, u_char *userdata) > continue; > > aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); > - if (aux->tp_vlan_tci == 0) > +#if defined(TP_STATUS_VLAN_VALID) > +if ((aux->tp_vlan_tci == 0) || > !(aux->tp_status & TP_STATUS_VLAN_VALID)) The test should be && not ||. > +#else > + if (aux->tp_vlan_tci == 0) /* this is ambigious but without the > + > TP_STATUS_VLAN_VALID flag, there is > + nothing that we can do > */ > +#endif > continue; > > len = packet_len > iov.iov_len ? iov.iov_len : packet_len; > @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int > max_packets, pcap_handler callback, > } > > #ifdef HAVE_TPACKET2 > - if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci && > +if ((handle->md.tp_version == TPACKET_V2) && > +#if defined(TP_STATUS_VLAN_VALID) > +(h.h2->tp_vlan_tci || (h.h2->tp_status & > TP_STATUS_VLAN_VALID)) && > +#else > +h.h2->tp_vlan_tci && > +#endif > handle->md.vlan_offset != -1 && > tp_snaplen >= (unsigned int) handle->md.vlan_offset) { > struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
Ani Sinha writes: >> >> The patch is whitespace damaged. And one of your test is using || >> instead of && > > OK, using alpine now :-) >> >> The test should be && not ||. > > aargh! I am retarded! Fixed. Hopefully 3rd time is a charm :-) Acked-by: "Eric W. Biederman" The code looks ok here. I don't know what format the tcpdump folks want patches in. The typically format is an email with [PATCH] in the subject. You indentation now comes through clear. It is a bit odd that you are indenting with spaces instead of tabs in a file that is indented with tab. Again libpcap isn't my code so I don't care but someone else might. Eric > ani > > > pcap-linux.c | 50 +++--- > 1 files changed, 31 insertions(+), 19 deletions(-) > > diff --git a/pcap-linux.c b/pcap-linux.c > index a42c3ac..b2c1a08 100644 > --- a/pcap-linux.c > +++ b/pcap-linux.c > @@ -133,6 +133,7 @@ static const char rcsid[] _U_ = > #include > #include > #include > +#include > #include > #include > #include > @@ -1543,7 +1544,13 @@ pcap_read_packet(pcap_t *handle, pcap_handler > callback, u_char *userdata) > continue; > > aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg); > - if (aux->tp_vlan_tci == 0) > +#if defined(TP_STATUS_VLAN_VALID) > +if ((aux->tp_vlan_tci == 0) && !(aux->tp_status & > TP_STATUS_VLAN_VALID)) > +#else > + if (aux->tp_vlan_tci == 0) /* this is ambigious but > without the > + TP_STATUS_VLAN_VALID > flag, there is > + nothing that we can do > */ > +#endif > continue; > > len = packet_len > iov.iov_len ? iov.iov_len : > packet_len; > @@ -3936,7 +3926,12 @@ pcap_read_linux_mmap(pcap_t *handle, int max_packets, > pcap_handler callback, > } > > #ifdef HAVE_TPACKET2 > - if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci && > +if ((handle->md.tp_version == TPACKET_V2) && > +#if defined(TP_STATUS_VLAN_VALID) > +(h.h2->tp_vlan_tci || (h.h2->tp_status & > TP_STATUS_VLAN_VALID)) && > +#else > +h.h2->tp_vlan_tci && > +#endif > handle->md.vlan_offset != -1 && > tp_snaplen >= (unsigned int) handle->md.vlan_offset) { > struct vlan_tag *tag; ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] PROBLEM: Software injected vlan tagged packets are unable to be identified using recent BPF modifications
Jiri Pirko writes: > Tue, Jan 08, 2013 at 01:05:39AM CET, pea...@cs.berkeley.edu wrote: >>Hello folks, >> >>PROBLEM: >> >>vlan tagged packets that are injected via software are not picked up >>by filters using recent (kernel commit >>f3335031b9452baebfe49b8b5e55d3fe0c4677d1) >>BPF vlan modifications. I suspect this is a problem with the Linux >>kernel. >> >>linux-netdev and tcpdump-workers are both cc'd. >> >>BACKGROUND: >> >>Kernel commit bcc6d47903612c3861201cc3a866fb604f26b8b2 (Jiri >>Pirko/David S. Miller) removed vlan headers on rx packets prior to >>them reaching the packet filters. This broke BPF/libpcap's ability to >>do kernel-level packet filtering based on vlan tag information (the >>'vlan' keyword). >> >>Kernel commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1 (Eric >>Dumazet/David S. Miller, just merged into Linus's tree >>http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=f3335031b9452baebfe49b8b5e55d3fe0c4677d1) >>added the ability to use BPF to once again filter based on vlan >>tags. Related bpf jit commit: >>http://www.spinics.net/lists/netdev/msg214759.html >> >>libpcap (Ani Sinha) recently RFC'd a patch to use Eric/David's BPF >>modifications to restore vlan filtering to libpcap. >>http://www.mail-archive.com/tcpdump-workers@lists.tcpdump.org/msg06810.html >>I'm using this patch and it works. >> >>DETAILS: >> >>Under these patches vlan tagged packets received from mediam (actual >>packets from the wire) can be identified based on vlan tag information >>using the new BPF functionality.This is good. >> >>However, raw vlan tagged packets that are *injected* into the >>interface using libpcap's pcap_inject() (which is just a fancy wrapper >>for the send() syscall) are not identified by filters using the recent >>BPF modifications. >> >>The bug manifests itself if you attempt to use the new BPF >>modifications to filter vlan tagged packets on a live interface. All >>packets from the medium show up, but all injected packets are dropped. >> >>Prior to commit bcc6d47 both medium and injected packets could both be >>identified using BPFs. >> >>These injected packets can however still be identified using the >>previous, now incorrect "offset into the header" technique. Given >>this, I suspect what's going on is the kernel code path for these >>injected packets is not setting skb->vlan_tci correctly (at all?). >>Since the vlan tag is not in the skb data structure the new BPF >>modifications don't identify the packets as having a vlan tag, >>despite it being in the packet header. > > > You are right. skb->vlan_tci is not set. Looking at packet_snd() function > in net/packet/af_packet.c I guess that something like following patch > would be needed: > > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c > index e639645..2238559 100644 > --- a/net/packet/af_packet.c > +++ b/net/packet/af_packet.c > @@ -2292,6 +2292,12 @@ static int packet_snd(struct socket *sock, > if (unlikely(extra_len == 4)) > skb->no_fcs = 1; > > + if (skb->protocol == cpu_to_be16(ETH_P_8021Q)) { > + skb = vlan_untag(skb); > + if (unlikely(!skb)) > + goto out_unlock; > + } > + > /* >* Now send it >*/ > > Thoughts? That sounds about right. I don't know how much NIC drivers care but it looks like af_packet is the only path through the code that can generate a vlan tagged packet that we will transmit where a vlan tagged packet can be generated without setting vlan_tci. So it should make the code safer. Certainly we want this to look to the vlan driver like a proper case of nested vlans and not something weird. But note we need to handle tpacket_snd as well as packet_snd. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Re: [tcpdump-workers] [PATCH net 1/2] net: dev_queue_xmit_nit: fix skb->vlan_tci field value
Paul Pearce writes: >>> My opinion as a kernel developer is that the network tap is here to have >>> a copy of the exact frame given to the _device_. > >> Good: as someone who spends lots of time with tcpdump doing both network >> and protocol diagnostics, it's really important to see exactly there. >> If that means turning off some hardware offload in order to get the >> intact 1p header, then that may be fine for many situations. >> (At 10G, on a live router... well...) > > I agree as well. > > But I think Ani's point was that for RX packets, as of commit > bcc6d47903612c3861201cc3a866fb604f26b8b2, the filters are not > getting exactly what's "on the wire." Independent of hardware > acceleration the vlan headers are being stripped off and skb->vlan_tci > is being set. That's was the origin of this whole mess. The mess goes back much farther than that. That commit just flushed a lot of the mess out into the open, and made it apparent the kernel had insufficient facilities for dealing with packets whose vlan tags had been stripped and that libpcap had not been handling stripped vlan tags. > The msg from that commit reads in part: >> Vlan untagging happens early in __netif_receive_skb so the rest of >> code (ptype_all handlers, rx_handlers) see the skb like it was >> untagged by hw. > > His confusion (which I share) is why it's acceptable to have this > behavior of removing headers and setting skb->vlan_tci (regardless of > hardware acceleration) on the RX path but not also set skb->vlan_tci > on the TX path. On all paths the kernel will now set a flag VLAN_TAG_PRESENT if the vlan_tci is stripped off and used. So there is no pressing need for a kernel change. recvmsg and BPF filters have all of the information they need to figure out what is going on. So at this point this is a libpcap problem not a kernel problem. On the RX path always stripping the header allowed the vlan processing code to be simplified and some bugs to be fixed. Just reading through the code a bit more it looks like stripping the vlan headers on TX if the network device does not support vlan header accelleration is a performance loss. There are other cases besides AF_PACKET in particular vlan_dev_hard_header that will insert the vlan header on a packet before the packet is transmitted. > Indepdent of proposed userspace or PACKET_AUXDATA solutions, > clarification on the RX skb->vlan_tci behavior would be appreciated. There are two variables now available in AUXDATA and in the BPF filters for packets. VLAN_TAG_PRESENT and VLAN_TAG. Packets that have their vlan tags stripped have VLAN_TAG_PRESENT set and the tag is available in VLAN_TAG. > My knowledge of this code is quite limited so it's entirely possible > I'm off base here. If so please tell me. Eric ___ tcpdump-workers mailing list tcpdump-workers@lists.tcpdump.org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers