[tcpdump-workers] RFC: adding netmap support to libpcap ?

2013-12-04 Thread Luigi Rizzo
Hi,
I have recently made an update to the netmap I/O framework

http://info.iet.unipi.it/~luigi/netmap/

that should make it easier to add netmap support to libpcap.
So I was wondering whether there is any interest in implementing this,
and how we should go about it.

In short (see the webpage for details) netmap is a kernel module
(native on FreeBSD, external in Linux) that supports extremely high
tx/rx packet rates (15-20Mpps per core, at least for the raw I/O;
of course any processing will reduce your actual packet rate).
You can find the most recent sources in the git repository at

http://code.google.com/p/netmap/

In the past I have implemented a subset of the pcap library that
lets programs run on top of netmap by just pointing LD_LIBRARY_PATH
to the netmap-based library.  This has some limitations though,
and I'd rather see native netmap support in libpcap so we can
e.g. reuse filters etc.


A basic implementation of the equivalent of pcap_open_live(),
pcap_close(), pcap_dispatch() and pcap_next() is in the 
header file (it is so small and simple that there is really no need
for a user library). pcap_inject() is similarly simple.

Of course they should be integrated with libpcap and support
the full set of methods, so I think we need to figure out the following:

PORTING ISSUES

+ interface naming
  netmap provides an alternate method to access standard network
  interfaces, so the technique I am currently using in applications
  is to use the interface name to discriminate between standard
  (bpf, PF_PACKET or other) and netmap mode.
  This way applications do not need changes, and commandline
  arguments can be used to select the operating mode.

  "netmap:*" refers to interfaces in netmap mode, "valeXX:YY" refers
  to ports of VALE virtual switches (basically dynamically created
  ethernet bridges; VALE is part of the netmap module), and other
  names would just fall back to the regular pcap methods.

  Does this make sense ?
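
  As a rough illustration of the naming scheme above, the prefix test
  could look like the sketch below. The enum and the function name
  classify_device() are hypothetical and only meant to show the idea;
  the prefixes themselves ("netmap:" and "vale") are the ones described
  above.

```c
#include <string.h>

/* Hypothetical classification of a capture device name. */
enum capture_backend {
    BACKEND_NATIVE,   /* regular pcap path (bpf, PF_PACKET or other) */
    BACKEND_NETMAP,   /* "netmap:<ifname>": interface in netmap mode */
    BACKEND_VALE      /* "valeXX:YY": port of a VALE virtual switch */
};

/* Pick a backend by looking only at the prefix of the device name. */
enum capture_backend
classify_device(const char *device)
{
    if (strncmp(device, "netmap:", 7) == 0)
        return BACKEND_NETMAP;
    if (strncmp(device, "vale", 4) == 0)
        return BACKEND_VALE;
    return BACKEND_NATIVE;    /* fall back to the regular pcap methods */
}
```

  With this, classify_device("netmap:eth0") selects netmap mode,
  classify_device("vale0:1") selects a VALE port, and a plain "eth0"
  keeps using the regular backend.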

+ template for source
  I suppose the way to go is to pick the simplest pcap-*.c backend
  and use it as a reference for the implementation -- so which one
  should I use ?

+ receive side
  netmap natively uses shared memory, so pcap_dispatch() and
  pcap_next() are trivial to implement and very cheap.

+ transmit side
  pcap_inject() can be implemented easily by copying user data
  into the (preallocated) buffers supplied by netmap.
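
  To make the copy-into-preallocated-buffer idea concrete, here is a
  self-contained sketch. It does NOT use the real netmap structures:
  struct tx_ring, struct tx_slot, the sizes and tx_inject() are all
  hypothetical stand-ins for a ring of preallocated buffers.

```c
#include <stddef.h>
#include <string.h>

#define TX_SLOTS    4       /* hypothetical ring size */
#define TX_BUF_SIZE 2048    /* hypothetical per-slot buffer size */

struct tx_slot {
    size_t len;                       /* bytes queued in this slot */
    unsigned char buf[TX_BUF_SIZE];   /* preallocated packet buffer */
};

struct tx_ring {
    unsigned int head;                /* next slot to fill */
    unsigned int avail;               /* free slots remaining */
    struct tx_slot slot[TX_SLOTS];
};

void
tx_ring_init(struct tx_ring *r)
{
    r->head = 0;
    r->avail = TX_SLOTS;
}

/*
 * Copy one packet into the next free slot.
 * Returns the number of bytes queued, or -1 if the ring is full
 * or the packet does not fit in a slot.
 */
int
tx_inject(struct tx_ring *r, const void *data, size_t len)
{
    struct tx_slot *s;

    if (r->avail == 0 || len > TX_BUF_SIZE)
        return -1;
    s = &r->slot[r->head];
    memcpy(s->buf, data, len);        /* the only copy on the tx path */
    s->len = len;
    r->head = (r->head + 1) % TX_SLOTS;
    r->avail--;
    return (int)len;
}
```

  In a real backend the slots would be released again once the
  packets have been sent; that part is left out here.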

+ zerocopy
  As mentioned, the receive side is already zerocopy,
  while for the transmit side I don't know if there is a pcap
  method that supports a transmit callback -- i.e. an equivalent
  of pcap_dispatch() where pcap supplies the buffer and the
  callback puts in its data.
  Also, one thing to remember is that the address(es) returned by
  pcap_next() are only good until the next invocation of
  select()/poll(). Again, I have no idea if the pcap API
  gives the user some control over that.
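
  One consequence of the limited buffer lifetime is that a caller who
  wants to keep a packet has to copy it before polling again. A minimal
  sketch of that, assuming the caller has the data pointer and captured
  length in hand (retain_packet() is a hypothetical helper, not part of
  pcap or netmap):

```c
#include <stdlib.h>
#include <string.h>

/*
 * Make a private copy of a packet whose buffer will be reused.
 * Returns a malloc()ed copy that the caller must free(), or NULL
 * if the allocation fails.
 */
unsigned char *
retain_packet(const unsigned char *data, size_t caplen)
{
    unsigned char *copy = malloc(caplen > 0 ? caplen : 1);

    if (copy != NULL && caplen > 0)
        memcpy(copy, data, caplen);
    return copy;
}
```

  Programs that process each packet inside the callback and never keep
  the pointer do not need this copy, and stay zerocopy on receive.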

+ multiqueue
  netmap supports multiqueue NICs both on tx and rx.
  There are two operating modes: 1) one file descriptor
  binds to all queues; 2) one file descriptor per queue.
  This is selected at open time.
  Supporting this feature could be as simple as adding
  a suffix to the interface name.
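
  As a sketch of the suffix idea: the syntax below ("name-N" selects
  queue N, no suffix means all queues) is only an assumption made for
  illustration, and split_queue_suffix() and MAX_QUEUE are hypothetical
  names.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_QUEUE 4095  /* hypothetical upper bound on a queue index */

/*
 * Split "base-N" into the base name and the queue number N.
 * If there is no numeric suffix, the whole name is the base and
 * *queue is set to -1 (meaning "bind to all queues").
 * Returns 0 on success, -1 if the base is empty or does not fit.
 */
int
split_queue_suffix(const char *device, char *base, size_t base_len, int *queue)
{
    size_t n = strlen(device);
    const char *dash = strrchr(device, '-');

    *queue = -1;
    if (dash != NULL && isdigit((unsigned char)dash[1])) {
        char *end;
        long q = strtol(dash + 1, &end, 10);

        /* accept the suffix only if it is entirely a small number */
        if (*end == '\0' && q <= MAX_QUEUE) {
            *queue = (int)q;
            n = (size_t)(dash - device);
        }
    }
    if (n == 0 || n >= base_len)
        return -1;
    memcpy(base, device, n);
    base[n] = '\0';
    return 0;
}
```

  Under this assumed syntax, "netmap:eth0-2" yields base "netmap:eth0"
  and queue 2, while "netmap:eth0" yields queue -1. Names that merely
  contain a dash, such as "br-lan", are left untouched.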


Comments anyone ?

cheers
luigi
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


[tcpdump-workers] netmap support for libpcap now available

2014-01-14 Thread Luigi Rizzo
Since there were no takers, I went ahead and did it.

The repo at  http://code.google.com/p/netmap/ has been
updated with a newer version of netmap, and the extra/
directory contains a netmap backend for libpcap
https://code.google.com/p/netmap/source/browse/extra/libpcap-netmap.diff

which works against a recent (Jan 11) libpcap version from github

https://github.com/the-tcpdump-group/libpcap/commits/master

commit fd04c4ff9f9a6b50fccec7afb18af64433730a2b
Author: Guy Harris 
Date:   Sat Jan 11 20:38:02 2014 -0800

For some quick testing (linux; FreeBSD is similar):

- download the netmap code from code.google.com/p/netmap/

- compile just the netmap module and the examples

   (cd LINUX; make NODRIVERS=1 )
   (cd examples; make)

- install the module
   (cd LINUX; sudo insmod ./netmap_lin.ko)
  (either change access privs to /dev/netmap, or
  run netmap clients with root privs)

- fetch the pcap code

- patch with the netmap support files

   cd pcap-base-code
   patch -p1 < /wherever/is/the/netmap-base-dir/extra/libpcap-netmap.diff

- make sure the netmap headers are accessible and rebuild libpcap

   export CFLAGS=-I/wherever/is/the/netmap-base-dir/sys
   ./configure
   make

- create a link so ld will find it
   ln -s libpcap.so.1.6.0-PRE-GIT libpcap.so.0.8

- and now you can run tcpdump using the current library
  (depending on the OS you may need to tell apparmor to
  allow the library override for tcpdump, or make a non suid
  copy of tcpdump)

   LD_LIBRARY_PATH=. tcpdump -ni valexx:yy

   while in another window you run a netmap traffic generator

   /wherever/is/the/netmap-base-dir/pkt-gen -i valexx:zz -f tx

You can access an ordinary interface in emulated netmap mode by
prefixing the name with netmap: , but BEWARE:
*** in this mode the interface is only used for capture,
*** and goes back to regular mode when you exit tcpdump

   LD_LIBRARY_PATH=. tcpdump -ni netmap:eth0

cheers
luigi


[tcpdump-workers] code available: netmap support for libpcap

2014-02-15 Thread Luigi Rizzo
netmap])
+   NETMAP_SRC=pcap-netmap.c
+   AC_MSG_NOTICE(netmap is supported)],
+   AC_MSG_NOTICE(netmap is not supported),
+   AC_INCLUDES_DEFAULT
+  )
+   AC_SUBST(PCAP_SUPPORT_NETMAP)
+   AC_SUBST(NETMAP_SRC)
+fi
+
 AC_ARG_ENABLE([bluetooth],
 [AC_HELP_STRING([--enable-bluetooth],[enable Bluetooth support @<:@default=yes, if support available@:>@])],
 [],
diff --git a/inet.c b/inet.c
index c699658..d132507 100644
--- a/inet.c
+++ b/inet.c
@@ -883,6 +883,10 @@ pcap_lookupnet(device, netp, maskp, errbuf)
 #ifdef PCAP_SUPPORT_USB
     || strstr(device, "usbmon") != NULL
 #endif
+#ifdef PCAP_SUPPORT_NETMAP
+   || !strncmp(device, "netmap:", 7)
+   || !strncmp(device, "vale", 4)
+#endif
 #ifdef HAVE_SNF_API
     || strstr(device, "snf") != NULL
 #endif
diff --git a/pcap-netmap.c b/pcap-netmap.c
new file mode 100644
index 0000000..df2d01c
--- /dev/null
+++ b/pcap-netmap.c
@@ -0,0 +1,263 @@
+/*
+ * Copyright (C) 2014 Luigi Rizzo. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *   1. Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ *   2. Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define NETMAP_WITH_LIBS
+#include 
+
+#include "pcap-int.h"
+
+/*
+ * This code is meant to build also on older versions of libpcap.
+ *
+ * older libpcap miss p->priv, use p->md.device instead (and allocate).
+ * Also opt.timeout was in md.timeout before.
+ * Use #define PCAP_IF_UP to discriminate
+ */
+#ifdef PCAP_IF_UP
+#define NM_PRIV(p) ((struct pcap_netmap *)(p->priv))
+#define the_timeout opt.timeout
+#else
+#define HAVE_NO_PRIV
+#define NM_PRIV(p)  ((struct pcap_netmap *)(p->md.device))
+#define SET_PRIV(p, x) p->md.device = (void *)x
+#define the_timeout md.timeout
+#endif
+
+#if defined (linux)
+/* On FreeBSD we use IFF_PPROMISC which is in ifr_flagshigh.
+ * remap to IFF_PROMISC on linux
+ */
+#define IFF_PPROMISC   IFF_PROMISC
+#define ifr_flagshigh  ifr_flags
+#endif /* linux */
+
+struct pcap_netmap {
+   struct nm_desc *d;  /* pointer returned by nm_open() */
+   pcap_handler cb;    /* callback and argument */
+   u_char *cb_arg;
+   int must_clear_promisc; /* flag */
+   uint64_t rx_pkts;   /* # of pkts received before the filter */
+};
+
+static int
+pcap_netmap_stats(pcap_t *p, struct pcap_stat *ps)
+{
+   struct pcap_netmap *pn = NM_PRIV(p);
+
+   ps->ps_recv = pn->rx_pkts;
+   ps->ps_drop = 0;
+   ps->ps_ifdrop = 0;
+   return 0;
+}
+
+static void
+pcap_netmap_filter(u_char *arg, struct pcap_pkthdr *h, const u_char *buf)
+{
+   pcap_t *p = (pcap_t *)arg;
+   struct pcap_netmap *pn = NM_PRIV(p);
+
+   ++pn->rx_pkts;
+   if (bpf_filter(p->fcode.bf_insns, buf, h->len, h->caplen))
+   pn->cb(pn->cb_arg, h, buf);
+}
+
+static int
+pcap_netmap_dispatch(pcap_t *p, int cnt, pcap_handler cb, u_char *user)
+{
+   int ret;
+   struct pcap_netmap *pn = NM_PRIV(p);
+   struct nm_desc *d = pn->d;
+   struct pollfd pfd = { .fd = p->fd, .events = POLLIN, .revents = 0 };
+
+   pn->cb = cb;
+   pn->cb_arg = user;
+
+   for (;;) {
+   if (p->break_loop) {
+   p->break_loop = 0;
+   return PCAP_ERROR_BREAK;
+   }
+   /* nm_dispatch won't run forever */
+   ret = nm_dispatch((void *)d, cnt, (void *)pcap_netmap_filter, (void *)p);
+   if (ret != 0)
+ 

Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-15 Thread Luigi Rizzo
On Sat, Feb 15, 2014 at 1:15 PM, Michael Richardson wrote:

>
> So, basically if we use a device name like "netmap:" or "vale",
> then we would get support for it.  Are there dependencies that would
> piss off distros that we should worry about?  You say that we need netmap,
> but I don't see where in the build it references some new library.
>

There isn't any new library, which eases distributing binaries.

./configure checks for the presence of the netmap headers,
and if so compiles the extra file.

At runtime, netmap only uses open(), ioctl(), mmap() and poll().

Data structures are simple enough that a few macros
or inline functions in netmap_user.h are all that is needed
to access the port and do I/O.
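
To illustrate the "only standard system calls" point, waiting for
packets on a netmap file descriptor is an ordinary poll(). The sketch
below works on any file descriptor; wait_readable() is a hypothetical
helper, not something taken from netmap_user.h.

```c
#include <poll.h>

/*
 * Wait until fd becomes readable or timeout_ms expires.
 * Returns 1 if readable, 0 on timeout, -1 on error
 * (including error conditions reported in revents).
 */
int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A capture loop would call something like this on the descriptor
obtained when opening the port, and then walk the receive rings in
shared memory without further system calls.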

cheers
luigi



Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-15 Thread Luigi Rizzo
On Sat, Feb 15, 2014 at 1:37 PM, Guy Harris  wrote:

>
> On Feb 15, 2014, at 1:24 PM, Luigi Rizzo  wrote:
>
> > At runtime, netmap only uses open(), ioctl(), mmap() and poll().
>
> ...and nm_dispatch().  Is that an inline function defined in the headers?
>

yes, same as nm_open() and a few others: this is what
i meant when I said

Data structures are simple enough that a few macros
or inline functions in netmap_user.h are all that is needed
to access the port and do I/O.




cheers
luigi



-- 
-+-------
 Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/. Universita` di Pisa
 TEL  +39-050-2211611   . via Diotisalvi 2
 Mobile   +39-338-6809875   . 56122 PISA (Italy)
-+---


Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-15 Thread Luigi Rizzo
On Sat, Feb 15, 2014 at 01:41:41PM -0800, Guy Harris wrote:
> 
> On Feb 15, 2014, at 12:17 PM, Luigi Rizzo  wrote:
> 
> > +   p->linktype = DLT_EN10MB;
> 
> So this either
> 
>   1) only works on Ethernet devices and devices that supply Ethernet 
> headers
> 
> or
> 
>   2) generates Ethernet headers that replace the native link-layer 
> headers for devices that don't supply Ethernet headers?

it is #1.

> 
> > @@ -307,6 +311,9 @@ struct capture_source_type {
> > int (*findalldevs_op)(pcap_if_t **, char *);
> > pcap_t *(*create_op)(const char *, char *, int *);
> > } capture_source_types[] = {
> > +#ifdef PCAP_SUPPORT_NETMAP
> > +   { NULL, pcap_netmap_create },
> > +#endif
> > #ifdef HAVE_DAG_API
> > { dag_findalldevs, dag_create },
> > #endif
> 
> This means that "tcpdump -D/tshark -D" and the Wireshark GUI won't show 
> netmap or vale devices; for command-line tools, this means you have to enter 
> those devices manually, but it might make it impossible to capture on those 
> devices in the Wireshark GUI.
> 
> Can you enumerate the netmap and vale devices?  If so, you should have a 
> findalldevs routine.

Netmap works at least on any interface visible to the OS
(in native or emulated mode, the latter with some limitations,
e.g. not when the interface is bound to a switch),
but ports of VALE switches and netmap pipes are dynamically created,
so any name that starts with netmap: or vale results in a
valid netmap port.

Also, when a port is in netmap mode it is temporarily disconnected from
the host stack, so you want to be careful about where you use it.
The monitoring folks (bro, suricata...) will probably love this
feature but for others it might be more problematic.

I did have a findalldevs routine in earlier versions of the code
(mostly copied from the one in pcap-bpf; perhaps I could even hook
into that one),
but removed it because it can only return a partial list of ports
and I thought it would not be very useful.

cheers
luigi


Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-15 Thread Luigi Rizzo
On Sat, Feb 15, 2014 at 11:24:28PM +0100, Luigi Rizzo wrote:
...
> I think what Michael means is that if we include net/netmap.h and
> net/netmap_user.h in the libpcap distribution, we can have the support
> always compiled in and postpone the decision at compile time.
  ^^^

clearly i meant "run" time,

cheers
luigi


Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-15 Thread Luigi Rizzo
On Sat, Feb 15, 2014 at 01:59:48PM -0800, Guy Harris wrote:
> 
> On Feb 15, 2014, at 1:44 PM, Michael Richardson  wrote:
> 
> > where do those headers come from?  Would it make sense to just include
> > those headers with libpcap?  That way netmap would always be available.
> 
> There's "netmap", which is available only if the kernel includes netmap 
> support; as long as all systems with a kernel with netmap also provide the 
> headers (at least if you have a "developer package" for the OS installed if 
> necessary), the headers aren't an issue for the availability of netmap.

first of all, thanks all for the feedback.

I think what Michael means is that if we include net/netmap.h and
net/netmap_user.h in the libpcap distribution, we can have the support
always compiled in and postpone the decision at compile time.

This actually seems a very interesting idea.
We can make the build prefer the system headers if available
(in case something changes) and fall back to the ones included
in the libpcap distribution otherwise.
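
One possible way to express that preference directly in the source,
as an alternative to a ./configure check, is the __has_include
preprocessor test offered by recent compilers. This is only a sketch:
the macro NETMAP_HEADERS_FROM_SYSTEM, the function name and the
bundled path mentioned in the comment are hypothetical, and the block
just records which copy would be chosen instead of including it.

```c
/*
 * Prefer the system copy of the netmap headers when the compiler can
 * tell us it exists; otherwise fall back to a bundled copy.
 */
#if defined(__has_include)
#  if __has_include(<net/netmap_user.h>)
#    define NETMAP_HEADERS_FROM_SYSTEM 1
     /* a real build would #include <net/netmap_user.h> here */
#  endif
#endif

#ifndef NETMAP_HEADERS_FROM_SYSTEM
#  define NETMAP_HEADERS_FROM_SYSTEM 0
   /* a real build would include the bundled copy here, e.g. from a
    * hypothetical "netmap/netmap_user.h" shipped with libpcap */
#endif

/* Report which copy of the headers this build selected. */
int
netmap_headers_from_system(void)
{
    return NETMAP_HEADERS_FROM_SYSTEM;
}
```

The nested #if keeps the file compiling on compilers that do not know
__has_include at all; those simply take the bundled branch.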

> There's also "netmap support in libpcap", which would only be available if 
> the headers are available on the system on which libpcap is built; that's 
> also the case for some other OS features libpcap can use.  If the OS kernel 
> doesn't include netmap support by default, and we want the user to be able to 
> add it to the kernel *and* have libpcap automatically be able to use it 
> without having to rebuild libpcap, the headers *are* an issue.
> 
> > Are there any issues if someone makes tcpdump (or wireshark, or some other
> > libpcap using program) setuid?  (I don't see any call to popen()...)
> 
> (I.e., is there any code in the netmap support that could be tricked into 
> doing Bad Things, including handing off privileges to arbitrary programs if 
> the program using libpcap is privileged?)

apart from bugs, the nm_* functions in the headers only call open/ioctl/mmap,
nothing else. Auditing the headers will certainly help figure out if there
are bugs.

The netmap module gives access to raw packets and can potentially
disconnect a NIC from the system, so normally access is reserved to those
who have access to /dev/netmap (which defaults to -rw-- root root on linux,
and something similar on FreeBSD).
So in this respect things are not much different from what happens with
bpf or equivalent: if you make tcpdump setuid, hopefully there are
other restrictions in place that limit who can run tcpdump and
see everyone's traffic.
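
As a small illustration of the permission point, a program can check
up front whether the calling user may open the device node.
can_open_rw() is a hypothetical helper; note that access() checks
against the real user and group IDs rather than the effective ones,
which is the relevant question for a setuid program.

```c
#include <unistd.h>

/*
 * Return 1 if the real user may open path for reading and writing,
 * 0 otherwise (including when the path does not exist).
 */
int
can_open_rw(const char *path)
{
    return access(path, R_OK | W_OK) == 0;
}
```

For example, can_open_rw("/dev/netmap") would normally be false for an
unprivileged user with the default permissions described above.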

cheers
luigi


Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-27 Thread Luigi Rizzo
On Thu, Feb 27, 2014 at 11:24 AM, Guy Harris  wrote:

>
> On Feb 15, 2014, at 2:10 PM, Luigi Rizzo  wrote:
>
> > Netmap works at least on any interface visible to the OS
> > (in native or emulated mode, the latter with some limitations
> > e.g not when the interface is bound to a switch),
>
> So if I want to capture on eth0 in netmap mode, what interface name do I
> use?
>

netmap:eth0





Re: [tcpdump-workers] code available: netmap support for libpcap

2014-02-27 Thread Luigi Rizzo
On Thu, Feb 27, 2014 at 1:05 PM, Guy Harris  wrote:

>
> On Feb 27, 2014, at 12:57 PM, Luigi Rizzo  wrote:
>
> > this can be used to plumb things together.
>
> If you want to plumb things together, do you need libpcap?
>

the plumbing is done by netmap/vale/netmap pipes.

libpcap is "only" a shim layer that can be used by
tools that only speak libpcap so you do not need
to recompile them.

But it is a crucially important shim layer that
gives you a lot of flexibility.



> > Say you want to interconnect two VMs,
>
> Why would I use libpcap for that?
>
> > or a traffic generator and a firewall/ids/monitor
> > that you want to test for performance, etc.
>
> But wouldn't I create a netmap pipe using something other than libpcap,
> and only use libpcap if I want to watch the traffic on that pipe?
>
> I.e., what would be lost if, for example, libpcap only supported capturing
> on existing netmap devices, and didn't support creating new ones on the fly?
>

Well, you would lose the ability to connect to a
VALE switch or a pipe (which only support dynamically
created endpoints).

Most importantly, you would need additional code to
disable the functionality, because if you look
at pcap-netmap.c, everything is handled in the
nm_open() call.

cheers
luigi


