[tcpdump-workers] RFC: adding netmap support to libpcap ?
Hi,
i have recently made an update to the netmap I/O framework

    http://info.iet.unipi.it/~luigi/netmap/

that should make it easier to add netmap support to libpcap.
So I was wondering if there is any interest to implement this
and how we can go for it.

In short (see the webpage for details) netmap is a kernel module
(native on FreeBSD, external in Linux) that supports extremely high
tx/rx packet rates (15-20Mpps per core, at least for the raw I/O;
of course any processing will reduce your actual packet rate).
You can find the most recent sources in the git repository at

    http://code.google.com/p/netmap/

In the past I have implemented a subset of the pcap library that
lets programs run on top of netmap by just pointing LD_LIBRARY_PATH
to the netmap-based library. This has some limitations though,
and I'd rather see native netmap support in libpcap so we can
e.g. reuse filters etc.

A basic implementation of the equivalent of pcap_open_live(),
pcap_close(), pcap_dispatch() and pcap_next() is in the
header file (it is so small and simple that there is really no need
for a user library). pcap_inject() is similarly simple.

Of course they should be integrated with libpcap and support
the full set of methods, so I think we need to figure out the following:

PORTING ISSUES

+ interface naming
  netmap provides an alternate method to access standard network
  interfaces, so the technique I am currently using in applications
  is to use the interface name to discriminate between standard
  (bpf, PF_PACKET or other) and netmap mode.
  This way applications do not need changes, and commandline
  arguments can be used to select the operating mode.

  "netmap:*" refers to interfaces in netmap mode, "valeXX:YY" refers
  to ports of VALE virtual switches (basically dynamically created
  ethernet bridges; VALE is part of the netmap module), and other
  names would just fall back to the regular pcap methods.

  Does this make sense ?

+ template for source
  I suppose the way to go is to pick the simplest pcap-*.c backend
  and use it as a reference for the implementation -- so which one
  should I use ?

+ receive side
  netmap natively uses shared memory, so pcap_dispatch() and
  pcap_next() are trivial to implement and very cheap.

+ transmit side
  pcap_inject() can be implemented easily by copying user data
  into the (preallocated) buffers supplied by netmap.

+ zerocopy
  As mentioned the receive side is already zerocopy,
  while for the transmit side i don't know if there is a pcap
  method that supports a transmit callback -- i.e. an equivalent
  of pcap_dispatch() where pcap supplies the buffer and the
  callback puts in its data.

  Also one thing to remember is that the address(es) returned by
  pcap_next() are only good until the next invocation of
  select()/poll(). Again I have no idea if the pcap API gives
  some control to the user for that.

+ multiqueue
  netmap supports multiqueue NICs both on tx and rx.
  There are two operating modes:
  1) one file descriptor binds to all queues;
  2) one file descriptor per queue.
  This is selected at open time. Supporting this feature could be
  as simple as adding a suffix to the interface name.

Comments anyone ?

cheers
luigi
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
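The interface naming scheme proposed in this message can be sketched in C as a prefix check on the device name. This is only an illustration and not code from the netmap sources: the enum and the function name classify_device() are made up here, and the only things taken from the message are the "netmap:" and "vale" prefixes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical backend tags; these names are not part of libpcap. */
enum backend { BACKEND_DEFAULT, BACKEND_NETMAP, BACKEND_VALE };

/* Pick a backend from the device name alone, as the message proposes. */
static enum backend
classify_device(const char *device)
{
    assert(device != NULL);
    if (strncmp(device, "netmap:", 7) == 0)
        return BACKEND_NETMAP;  /* e.g. "netmap:eth0" */
    if (strncmp(device, "vale", 4) == 0)
        return BACKEND_VALE;    /* e.g. "vale0:1" */
    return BACKEND_DEFAULT;     /* bpf, PF_PACKET or other */
}
```

With this kind of check an application keeps passing whatever name it got on the command line, and the choice of capture method stays inside the library.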
[tcpdump-workers] netmap support for libpcap now available
since there were no takers i went ahead and did it.

The repo at

    http://code.google.com/p/netmap/

has been updated with a newer version of netmap, and the extra/
directory contains a netmap backend for libpcap

    https://code.google.com/p/netmap/source/browse/extra/libpcap-netmap.diff

which works against a recent (Jan11) libpcap version from github

    https://github.com/the-tcpdump-group/libpcap/commits/master

    commit fd04c4ff9f9a6b50fccec7afb18af64433730a2b
    Author: Guy Harris
    Date:   Sat Jan 11 20:38:02 2014 -0800

For some quick testing (linux; FreeBSD is similar):

- download the netmap code from code.google.com/p/netmap/

- compile just the netmap module and the examples

    (cd LINUX; make NODRIVERS=1 )
    (cd examples; make)

- install the module

    (cd LINUX; sudo insmod ./netmap_lin.ko)

  (either change access privs to /dev/netmap, or run netmap clients
  with root privs)

- fetch the pcap code

- patch with the netmap support files

    cd pcap-base-code; patch -p1 < /wherever/is/the/netmap-base-dir/extra/libpcap-netmap.diff

- make sure the netmap headers are accessible and rebuild libpcap

    export CFLAGS=-I/wherever/is/the/netmap-base-dir/sys
    ./configure
    make

- create a link so ld will find it

    ln -s libpcap.so.1.6.0-PRE-GIT libpcap.so.0.8

- and now you can run tcpdump using the current library
  (depending on the OS you may need to tell apparmor to allow the
  library override for tcpdump, or make a non suid copy of tcpdump)

    LD_LIBRARY_PATH=. tcpdump -ni valexx:yy

  while in another window you run a netmap traffic generator

    /wherever/is/the/netmap-base-dir/pkt-gen -i valexx:zz -f tx

You can access an ordinary interface in emulated netmap mode by
prefixing the name with netmap: , but BEWARE:

*** in this mode the interface is only used for capture,
*** and goes back to regular mode when you exit the tcpdump

    LD_LIBRARY_PATH=. tcpdump -ni netmap:eth0

cheers
luigi
[tcpdump-workers] code available: netmap support for libpcap
netmap])
+    NETMAP_SRC=pcap-netmap.c
+    AC_MSG_NOTICE(netmap is supported)],
+    AC_MSG_NOTICE(netmap is not supported),
+    AC_INCLUDES_DEFAULT
+    )
+    AC_SUBST(PCAP_SUPPORT_NETMAP)
+    AC_SUBST(NETMAP_SRC)
+fi
+
 AC_ARG_ENABLE([bluetooth],
 [AC_HELP_STRING([--enable-bluetooth],[enable Bluetooth support @<:@default=yes, if support available@:>@])],
     [],
diff --git a/inet.c b/inet.c
index c699658..d132507 100644
--- a/inet.c
+++ b/inet.c
@@ -883,6 +883,10 @@ pcap_lookupnet(device, netp, maskp, errbuf)
 #ifdef PCAP_SUPPORT_USB
         || strstr(device, "usbmon") != NULL
 #endif
+#ifdef PCAP_SUPPORT_NETMAP
+        || !strncmp(device, "netmap:", 7)
+        || !strncmp(device, "vale", 4)
+#endif
 #ifdef HAVE_SNF_API
         || strstr(device, "snf") != NULL
 #endif
diff --git a/pcap-netmap.c b/pcap-netmap.c
new file mode 100644
index 000..df2d01c
--- /dev/null
+++ b/pcap-netmap.c
@@ -0,0 +1,263 @@
+/*
+ * Copyright (C) 2014 Luigi Rizzo. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define NETMAP_WITH_LIBS
+#include
+
+#include "pcap-int.h"
+
+/*
+ * This code is meant to build also on older versions of libpcap.
+ *
+ * older libpcap miss p->priv, use p->md.device instead (and allocate).
+ * Also opt.timeout was in md.timeout before.
+ * Use #define PCAP_IF_UP to discriminate
+ */
+#ifdef PCAP_IF_UP
+#define NM_PRIV(p) ((struct pcap_netmap *)(p->priv))
+#define the_timeout opt.timeout
+#else
+#define HAVE_NO_PRIV
+#define NM_PRIV(p) ((struct pcap_netmap *)(p->md.device))
+#define SET_PRIV(p, x) p->md.device = (void *)x
+#define the_timeout md.timeout
+#endif
+
+#if defined (linux)
+/* On FreeBSD we use IFF_PPROMISC which is in ifr_flagshigh.
+ * remap to IFF_PROMISC on linux
+ */
+#define IFF_PPROMISC IFF_PROMISC
+#define ifr_flagshigh ifr_flags
+#endif /* linux */
+
+struct pcap_netmap {
+    struct nm_desc *d;      /* pointer returned by nm_open() */
+    pcap_handler cb;        /* callback and argument */
+    u_char *cb_arg;
+    int must_clear_promisc; /* flag */
+    uint64_t rx_pkts;       /* # of pkts received before the filter */
+};
+
+static int
+pcap_netmap_stats(pcap_t *p, struct pcap_stat *ps)
+{
+    struct pcap_netmap *pn = NM_PRIV(p);
+
+    ps->ps_recv = pn->rx_pkts;
+    ps->ps_drop = 0;
+    ps->ps_ifdrop = 0;
+    return 0;
+}
+
+static void
+pcap_netmap_filter(u_char *arg, struct pcap_pkthdr *h, const u_char *buf)
+{
+    pcap_t *p = (pcap_t *)arg;
+    struct pcap_netmap *pn = NM_PRIV(p);
+
+    ++pn->rx_pkts;
+    if (bpf_filter(p->fcode.bf_insns, buf, h->len, h->caplen))
+        pn->cb(pn->cb_arg, h, buf);
+}
+
+static int
+pcap_netmap_dispatch(pcap_t *p, int cnt, pcap_handler cb, u_char *user)
+{
+    int ret;
+    struct pcap_netmap *pn = NM_PRIV(p);
+    struct nm_desc *d = pn->d;
+    struct pollfd pfd = { .fd = p->fd, .events = POLLIN, .revents = 0 };
+
+    pn->cb = cb;
+    pn->cb_arg = user;
+
+    for (;;) {
+        if (p->break_loop) {
+            p->break_loop = 0;
+            return PCAP_ERROR_BREAK;
+        }
+        /* nm_dispatch won't run forever */
+        ret = nm_dispatch((void *)d, cnt, (void *)pcap_netmap_filter, (void *)p);
+        if (ret != 0)
+
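The wrapper pattern used by pcap_netmap_filter() in the patch (count the packet, run the filter, then call the user's callback) can be tried in isolation with the sketch below. It uses neither libpcap nor netmap: struct capture_state, pkt_handler, pkt_filter and the two example functions are hypothetical stand-ins, and a plain C predicate takes the place of bpf_filter().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for pcap_handler and the BPF program. */
typedef void (*pkt_handler)(void *user, const unsigned char *buf, size_t len);
typedef int (*pkt_filter)(const unsigned char *buf, size_t len);

struct capture_state {
    pkt_handler cb;     /* user callback saved by the dispatch routine */
    void *cb_arg;       /* argument passed back to the user callback */
    pkt_filter accept;  /* nonzero return means "deliver this packet" */
    uint64_t rx_pkts;   /* packets seen before filtering */
};

/* Trampoline: count every packet, deliver only those the filter accepts. */
static void
filter_and_deliver(struct capture_state *st, const unsigned char *buf, size_t len)
{
    assert(st != NULL && st->cb != NULL && st->accept != NULL);
    ++st->rx_pkts;
    if (st->accept(buf, len))
        st->cb(st->cb_arg, buf, len);
}

/* Example filter: accept frames of at least 60 bytes. */
static int
accept_min_60(const unsigned char *buf, size_t len)
{
    (void)buf;
    return len >= 60;
}

/* Example callback: bump a counter passed through the user pointer. */
static void
count_delivered(void *user, const unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    ++*(unsigned *)user;
}
```

Counting before the filter runs is what lets a stats routine report received packets independently of how many the filter lets through.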
Re: [tcpdump-workers] code available: netmap support for libpcap
On Sat, Feb 15, 2014 at 1:15 PM, Michael Richardson wrote:

>
> So, basically if we use a device name like "netmap:" or "vale",
> then we would get support for it. Are there dependencies that would
> piss off distros that we should worry about? You say that we need netmap,
> but I don't see where in the build it references some new library.
>

There isn't any new library, which eases distributing binaries.
./configure checks for the presence of the netmap headers, and
if so compiles the extra file.

At runtime, netmap only uses open(), ioctl(), mmap() and poll().
Data structures are simple enough that a few macros or inline functions
in netmap_user.h are all that is needed to access the port and do I/O.

cheers
luigi
Re: [tcpdump-workers] code available: netmap support for libpcap
On Sat, Feb 15, 2014 at 1:37 PM, Guy Harris wrote:

>
> On Feb 15, 2014, at 1:24 PM, Luigi Rizzo wrote:
>
> > At runtime, netmap only uses open(), ioctl(), mmap() and poll().
>
> ...and nm_dispatch(). Is that an inline function defined in the headers?
>

yes, same as nm_open() and a few others: this is what i meant when I said

    Data structures are simple enough that a few macros or inline functions
    in netmap_user.h are all that is needed to access the port and do I/O.

cheers
luigi

-- 
-----------------------------------------+-------------------------------
 Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
 TEL      +39-050-2211611               . via Diotisalvi 2
 Mobile   +39-338-6809875               . 56122 PISA (Italy)
-----------------------------------------+-------------------------------
Re: [tcpdump-workers] code available: netmap support for libpcap
On Sat, Feb 15, 2014 at 01:41:41PM -0800, Guy Harris wrote:
>
> On Feb 15, 2014, at 12:17 PM, Luigi Rizzo wrote:
>
> > +    p->linktype = DLT_EN10MB;
>
> So this either
>
>     1) only works on Ethernet devices and devices that supply Ethernet
>        headers
>
> or
>
>     2) generates Ethernet headers that replace the native link-layer
>        headers for devices that don't supply Ethernet headers?

it is #1.

> > @@ -307,6 +311,9 @@ struct capture_source_type {
> >      int (*findalldevs_op)(pcap_if_t **, char *);
> >      pcap_t *(*create_op)(const char *, char *, int *);
> >  } capture_source_types[] = {
> > +#ifdef PCAP_SUPPORT_NETMAP
> > +    { NULL, pcap_netmap_create },
> > +#endif
> >  #ifdef HAVE_DAG_API
> >      { dag_findalldevs, dag_create },
> >  #endif
>
> This means that "tcpdump -D/tshark -D" and the Wireshark GUI won't show
> netmap or vale devices; for command-line tools, this means you have to enter
> those devices manually, but it might make it impossible to capture on those
> devices in the Wireshark GUI.
>
> Can you enumerate the netmap and vale devices? If so, you should have a
> findalldevs routine.

Netmap works at least on any interface visible to the OS
(in native or emulated mode, the latter with some limitations,
e.g. not when the interface is bound to a switch), but ports
of VALE switches and netmap pipes are dynamically created,
so any name that starts with netmap: or vale results in
a valid netmap port.

Also, when a port is in netmap mode it is temporarily disconnected
from the host stack, so you want to be careful about where you use it.
The monitoring folks (bro, suricata...) will probably love
this feature but for others it might be more problematic.

I did have a findalldevs routine in earlier versions of the code
(mostly copying the one in pcap-bpf; perhaps i could even hook
on those), but removed it because it can only return a partial
list of ports and i thought it would not be very useful.

cheers
luigi
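The trade-off discussed in this message can be modelled in a few lines: a table entry without an enumeration routine can still accept devices by name, but adds nothing to a device listing. The sketch below is a simplified model and not libpcap's actual table or lookup code: struct source_type, list_all(), find_backend() and the "fixed" backend are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified backend table entry. */
struct source_type {
    int (*list_devs)(const char **names, int max);  /* may be NULL */
    int (*can_open)(const char *name);              /* nonzero if the name is ours */
};

static int
netmap_can_open(const char *name)
{
    return strncmp(name, "netmap:", 7) == 0 || strncmp(name, "vale", 4) == 0;
}

static int
fixed_list_devs(const char **names, int max)
{
    if (max < 1)
        return 0;
    names[0] = "eth0";
    return 1;
}

static int
fixed_can_open(const char *name)
{
    return strcmp(name, "eth0") == 0;
}

static const struct source_type sources[] = {
    { NULL, netmap_can_open },          /* no enumerator: names are open-ended */
    { fixed_list_devs, fixed_can_open },
};
#define NSOURCES (sizeof(sources) / sizeof(sources[0]))

/* Collect names from every backend that has an enumerator. */
static int
list_all(const char **names, int max)
{
    size_t i;
    int n = 0;

    assert(names != NULL && max >= 0);
    for (i = 0; i < NSOURCES; i++)
        if (sources[i].list_devs != NULL)
            n += sources[i].list_devs(names + n, max - n);
    return n;
}

/* Return the index of the first backend that accepts the name, or -1. */
static int
find_backend(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < NSOURCES; i++)
        if (sources[i].can_open(name))
            return (int)i;
    return -1;
}
```

In this model "vale0:1" opens fine even though list_all() never reports it, which is the situation a GUI that only offers listed devices would run into.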
Re: [tcpdump-workers] code available: netmap support for libpcap
On Sat, Feb 15, 2014 at 11:24:28PM +0100, Luigi Rizzo wrote:
...
> I think what Michael means is that if we include net/netmap.h and
> net/netmap_user.h in the libpcap distribution, we can have the support
> always compiled in and postpone the decision at compile time.
                                                  ^^^

clearly i meant "run" time,

cheers
luigi
Re: [tcpdump-workers] code available: netmap support for libpcap
On Sat, Feb 15, 2014 at 01:59:48PM -0800, Guy Harris wrote:
>
> On Feb 15, 2014, at 1:44 PM, Michael Richardson wrote:
>
> > where do those headers come from? Would it make sense to just include
> > those headers with libpcap? That way netmap would always be available.
>
> There's "netmap", which is available only if the kernel includes netmap
> support; as long as all systems with a kernel with netmap also provide the
> headers (at least if you have a "developer package" for the OS installed if
> necessary), the headers aren't an issue for the availability of netmap.

first of all, thanks all for the feedback.

I think what Michael means is that if we include net/netmap.h and
net/netmap_user.h in the libpcap distribution, we can have the support
always compiled in and postpone the decision at compile time.

This seems a very interesting idea actually. We can make the build
prefer the system headers if available (in case something changes)
and fall back to the ones included in the libpcap distribution otherwise.

> There's also "netmap support in libpcap", which would only be available if
> the headers are available on the system on which libpcap is built; that's
> also the case for some other OS features libpcap can use. If the OS kernel
> doesn't include netmap support by default, and we want the user to be able to
> add it to the kernel *and* have libpcap automatically be able to use it
> without having to rebuild libpcap, the headers *are* an issue.
>
> > Are there any issues if someone makes tcpdump (or wireshark, or some other
> > libpcap using program) setuid? (I don't see any call to popen()...)
>
> (I.e., is there any code in the netmap support that could be tricked into
> doing Bad Things, including handing off privileges to arbitrary programs if
> the program using libpcap is privileged?)

apart from bugs, the nm_* functions in the headers only call
open/ioctl/mmap, nothing else. Auditing the headers will certainly
help figure out if there are bugs.

The netmap module gives access to raw packets, and can potentially
disconnect a NIC from the system, so normally access is reserved
to those who have access to /dev/netmap (which defaults to
-rw-- root root on linux, and something similar on FreeBSD).

So in this respect things are not much different from what happens
with bpf or equivalent: if you make tcpdump setuid, hopefully there
are other restrictions in place that limit who can run tcpdump and
see everyone's traffic.

cheers
luigi
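One way to express "prefer the system headers, fall back to the bundled copy" at the preprocessor level is __has_include, which recent GCC and Clang provide and which is standard in C23. The patch in this thread uses a ./configure check instead, so this is an alternative sketch under stated assumptions: the macro NETMAP_HEADER_SOURCE and the function netmap_header_source() are hypothetical, and the actual #include lines are left out so that the fragment builds whether or not netmap is installed.

```c
#include <assert.h>

/*
 * Select the system netmap header when the compiler can see it,
 * otherwise select a copy shipped with the library. A real build
 * would put the matching #include in each branch; here we only
 * record the choice.
 */
#if defined(__has_include)
# if __has_include(<net/netmap_user.h>)
#  define NETMAP_HEADER_SOURCE "system"
# endif
#endif
#ifndef NETMAP_HEADER_SOURCE
# define NETMAP_HEADER_SOURCE "bundled"
#endif

/* Report which copy of the header this build selected. */
static const char *
netmap_header_source(void)
{
    const char *src = NETMAP_HEADER_SOURCE;

    assert(src[0] != '\0');  /* one of the branches above must have set it */
    return src;
}
```

Either way the choice is still made when libpcap is built; whether a netmap port can actually be opened is only known at run time, when /dev/netmap is or is not there.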
Re: [tcpdump-workers] code available: netmap support for libpcap
On Thu, Feb 27, 2014 at 11:24 AM, Guy Harris wrote:

>
> On Feb 15, 2014, at 2:10 PM, Luigi Rizzo wrote:
>
> > Netmap works at least on any interface visible to the OS
> > (in native or emulated mode, the latter with some limitations
> > e.g not when the interface is bound to a switch),
>
> So if I want to capture on eth0 in netmap mode, what interface name do I
> use?
>

netmap:eth0
Re: [tcpdump-workers] code available: netmap support for libpcap
On Thu, Feb 27, 2014 at 1:05 PM, Guy Harris wrote:

>
> On Feb 27, 2014, at 12:57 PM, Luigi Rizzo wrote:
>
> > this can be used to plumb things together.
>
> If you want to plumb things together, do you need libpcap?
>

the plumbing is done by netmap/vale/netmap pipes.
libpcap is "only" a shim layer that can be used by tools that only
speak libpcap so you do not need to recompile them.
But it is a crucially important shim layer that gives you
a lot of flexibility.

> > Say you want to interconnect two VMs,
>
> Why would I use libpcap for that?
>
> > or a traffic generator and a firewall/ids/monitor
> > that you want to test for performance, etc.
>
> But wouldn't I create a netmap pipe using something other than libpcap,
> and only use libpcap if I want to watch the traffic on that pipe?
>
> I.e., what would be lost if, for example, libpcap only supported capturing
> on existing netmap devices, and didn't support creating new ones on the fly?
>

Well you would lose the ability to connect to a VALE switch or
a pipe (which only support dynamically created endpoints).
Most importantly, you would need additional code to disable the
functionality, because if you look at the pcap-netmap.c
everything is handled in the nm_open() call.

cheers
luigi