[tcpdump-workers] User-space bridge on Solaris?

2008-09-19 Thread Ben Greear

I noticed that pcap_setdirection doesn't appear to work on Solaris.

Anyone know if it would be possible to get this functionality implemented?

Without this, it is very difficult (and inefficient even where possible) to write
software that bridges two interfaces on Solaris.
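
For reference, the call in question is just the following (with 'p' standing for an
already-opened capture handle):

   #include <pcap.h>

   // Only deliver packets received by the host, not packets it transmits.
   pcap_setdirection(p, PCAP_D_IN);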

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] User-space bridge on Solaris?

2008-09-20 Thread Ben Greear

Guy Harris wrote:


On Sep 19, 2008, at 8:16 PM, Ben Greear wrote:


I noticed that pcap_setdirection doesn't appear to work on Solaris.

Anyone know if it would be possible to get this functionality 
implemented?


Libpcap runs atop DLPI in Solaris.  In my experience with at least one 
version of Solaris, if you don't enable promiscuous mode, only packets 
received by the host, not packets sent by the host, are delivered.  
The downside is that, if you just want to capture traffic to and from 
your machine, you can't do that by just turning promiscuous mode off 
(and, at least on older versions of Solaris, you have to be root to 
turn promiscuous mode on, even if you have permission to open the DLPI 
device).  The upside is that, if your application doesn't need 
promiscuous mode, and your application doesn't want to see outgoing 
packets, you get that for free by not requesting promiscuous mode.

To be a bridge, you have to receive all traffic, so disabling PROMISC
isn't really an option as far as I can tell.


I implemented a work-around where I keep a list of transmitted packets, walk that
list on receive, and discard any received packet that matches an entry in the tx-list.

But that causes a lot of extra work, and in certain degenerate cases it could cause a
legitimately received packet to be ignored.  I got it running at around 100Mbps with
only sporadic drops, but I can easily push several times that with Linux on the same
hardware (using the very easy to use raw-packet socket API, not pcap).
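
Roughly, the idea is something like this (a minimal sketch, not the actual code; the
container choice and the size bound are illustrative):

   #include <cstring>
   #include <deque>
   #include <vector>

   // Frames we transmit are remembered for a while; anything received that
   // matches a remembered frame byte-for-byte is treated as our own echo.
   typedef std::vector<unsigned char> Frame;
   static std::deque<Frame> tx_list;
   static const size_t TX_LIST_MAX = 256;   // bound the extra work

   void note_transmit(const unsigned char* pkt, size_t len) {
      tx_list.push_back(Frame(pkt, pkt + len));
      if (tx_list.size() > TX_LIST_MAX)
         tx_list.pop_front();               // oldest entries age out
   }

   // True if 'pkt' should be dropped because we sent it ourselves.  Note the
   // degenerate case: an identical frame arriving from the far side would be
   // wrongly dropped too.
   bool is_own_echo(const unsigned char* pkt, size_t len) {
      for (std::deque<Frame>::iterator it = tx_list.begin();
           it != tx_list.end(); ++it) {
         if (it->size() == len && std::memcmp(&(*it)[0], pkt, len) == 0) {
            tx_list.erase(it);              // each echo matches at most once
            return true;
         }
      }
      return false;
   }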

Another thing:  I tried setting the time-out to 1ms in pcap_open_live, and the fd
never became active for read, as far as select was concerned, until I set the system
clock tick to 1ms:

(echo set hires_ticks=1 >> /etc/system)
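
For reference, the open call in question looks roughly like this (the device name and
snaplen are placeholders, not my real values):

   #include <pcap.h>

   char errbuf[PCAP_ERRBUF_SIZE];
   // to_ms = 1: ask for at most 1ms of buffering before a read is satisfied.
   // The behaviour above suggests the timeout is tied to the system clock tick,
   // hence the hires_ticks tweak.
   pcap_t* p = pcap_open_live("bge0", 65535, 1 /* promisc */, 1 /* to_ms */, errbuf);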

Thanks,
Ben






--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap & poll()

2008-11-13 Thread Ben Greear

Aaron Turner wrote:

I've been told by an end user that, under Linux 2.6.x at least, he's
seeing very high CPU utilization with tcpbridge, which uses
libpcap to read packets.  It sounds like the cause of the issue is that
I'm using poll() to determine when to read from libpcap.  I'm using
poll() because my code listens on multiple interfaces, hence I need a
way to look at multiple pcap handles.

Questions basically boil down to:
1) Is this expected?
2) Is there a better way?


poll is good generally.

Are you reading multiple packets when poll says the descriptor
is readable?

If you set the descriptor to non-blocking mode, you can read as
many as are available each loop...
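
In pcap terms, that setup is roughly the following (a sketch; error handling is
mostly omitted and the helper name is made up):

   #include <pcap.h>
   #include <poll.h>

   // Prepare one already-opened handle for use with poll().
   int make_pollable(pcap_t* p, struct pollfd* pfd) {
      char errbuf[PCAP_ERRBUF_SIZE];
      if (pcap_setnonblock(p, 1, errbuf) < 0)  // non-blocking so we can drain it
         return -1;
      pfd->fd = pcap_get_selectable_fd(p);     // fd to hand to poll()/select()
      if (pfd->fd < 0)
         return -1;                            // no selectable fd on this platform
      pfd->events = POLLIN;
      return 0;
   }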

Thanks,
Ben



Thanks,
Aaron




--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap & poll()

2008-11-13 Thread Ben Greear

Aaron Turner wrote:

On Thu, Nov 13, 2008 at 1:34 PM, Ben Greear <[EMAIL PROTECTED]> wrote:

Aaron Turner wrote:

I've been told by an end user that, under Linux 2.6.x at least, he's
seeing very high CPU utilization with tcpbridge, which uses
libpcap to read packets.  It sounds like the cause of the issue is that
I'm using poll() to determine when to read from libpcap.  I'm using
poll() because my code listens on multiple interfaces, hence I need a
way to look at multiple pcap handles.

Questions basically boil down to:
1) Is this expected?
2) Is there a better way?

poll is good generally.

Are you reading multiple packets when poll says the descriptor
is readable?


yes, I'm passing -1 as the count to pcap_dispatch() on the active pcap
handle(s).


If you set the descriptor to non-blocking mode, you can read as
many as are available each loop...


So basically instead of using poll(), use pcap_setnonblock() on both
handles, and then use pcap_dispatch() in a loop, processing both
handles?


I run poll on all of the pcap descriptors I have open.
All descriptors are set to nonblocking.

Then, when poll returns, I run this for each descriptor
to read up to 30 packets at a time:

  struct pcap_pkthdr* header;
  const u_char* pkt_data;

  int cnt = 0;
  while (cnt < 30) {
     cnt++;

     errno = 0;
     int res = pcap_next_ex(pcap_dev, &header, &pkt_data);
     if (res < 0) {
        VLOG << "pcap_next_ex failed, possible error:  " << strerror(errno)
             << endl;
        stopTest(false, true, "pcap_next_ex failed");
     }
     else if (res == 0) {
        // No more to read right now.
        break;
     }
     else {
        // Got one; process the packet (header/pkt_data are valid until the next call).
     }
  } // while
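
For context, the outer loop around that looks something like this (a sketch, not the
real code; the drain function stands in for the 30-packet loop above):

   #include <errno.h>
   #include <poll.h>
   #include <pcap.h>

   // Wait on all pcap fds at once, then drain each one that is ready.
   void bridge_loop(pcap_t* devs[], struct pollfd pfds[], int num_devs,
                    void (*drain_descriptor)(pcap_t*)) {
      for (;;) {
         int n = poll(pfds, num_devs, 1000 /* ms */);
         if (n < 0) {
            if (errno == EINTR)
               continue;                        // interrupted by a signal: retry
            break;                              // real error: bail out
         }
         for (int i = 0; i < num_devs; i++) {
            if (pfds[i].revents & (POLLIN | POLLERR))
               drain_descriptor(devs[i]);       // read up to ~30 packets
         }
      }
   }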



Is this going to be portable & consistent?  I remember there being
some issues with something like this under Solaris with timeouts; I
think the timeout didn't start counting down until the first packet
arrived, or something like that.  I would have assumed that using
poll() would be more efficient, since the code effectively blocks until
a packet arrives rather than looping constantly even when idle.


The code above works on Solaris, but does not work on Windows, since
there is nothing to poll() on Windows.

On Linux, I just use raw sockets, which are faster and easier to deal
with than pcap...but my app is probably different in nature from yours.
I see no reason why this would NOT work on Linux.
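
For what it's worth, the raw-socket setup on Linux is roughly the following (a
sketch; the function name is made up and error cleanup is omitted for brevity):

   #include <arpa/inet.h>        // htons
   #include <linux/if_ether.h>   // ETH_P_ALL
   #include <linux/if_packet.h>  // sockaddr_ll
   #include <net/if.h>           // if_nametoindex
   #include <string.h>
   #include <sys/socket.h>

   // Open a PF_PACKET socket bound to a single interface.
   int open_raw(const char* ifname) {
      int fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
      if (fd < 0)
         return -1;
      struct sockaddr_ll sll;
      memset(&sll, 0, sizeof(sll));
      sll.sll_family = AF_PACKET;
      sll.sll_protocol = htons(ETH_P_ALL);
      sll.sll_ifindex = if_nametoindex(ifname);
      if (bind(fd, (struct sockaddr*)&sll, sizeof(sll)) < 0)
         return -1;
      return fd;   // read()/write() whole frames directly on this fd
   }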

Thanks,
Ben


--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap & poll()

2008-11-13 Thread Ben Greear

Eloy Paris wrote:

Hi Ben,

On Thu, Nov 13, 2008 at 03:13:05PM -0800, Ben Greear wrote:

[...]


The code above works on Solaris, but does not work on Windows since
there is nothing to poll() on windows.


Windows has select() but it is my understanding that you can't use it on
a packet capture descriptor. At least pcap_get_selectable_fd() is not
available on Windows, according to the pcap man page.


I ended up with a thread per pcap descriptor that reads the Windows pcap descriptor
in a blocking manner, then sends the packet over a local socket to my application
(which runs select on that socket in its main loop).

This is a total pain, and performs like shit, but it does allow things to basically
work without rewriting the core application.
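
In outline, it's something like this (heavily simplified; written POSIX-style here
just for illustration, since the real Windows build uses Win32 threads and a loopback
socket rather than socketpair()):

   #include <pcap.h>
   #include <sys/socket.h>

   struct forwarder {
      pcap_t* handle;   // blocking pcap handle for one interface
      int     sock;     // write end of a local datagram socket pair
   };

   // Reader thread body: block in pcap, push each packet to the main loop's
   // socket as one datagram so packet boundaries are preserved.
   static void* reader_thread(void* arg) {
      struct forwarder* fw = (struct forwarder*)arg;
      struct pcap_pkthdr* hdr;
      const u_char* data;
      int res;
      while ((res = pcap_next_ex(fw->handle, &hdr, &data)) >= 0) {
         if (res == 0)
            continue;                           // read timeout, nothing to forward
         send(fw->sock, data, hdr->caplen, 0);  // main loop select()s on the peer fd
      }
      return 0;
   }

The main loop creates the pair with socketpair(AF_UNIX, SOCK_DGRAM, 0, fds), starts
one such thread per handle, and select()s on the receiving ends along with its other
descriptors.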


This is a major bummer for my application and is the only reason I
haven't tried to attempt a port of my application to Windows. I'd love
to know how to efficiently read from multiple packet capture descriptors
on Linux...


On Linux, I just use raw sockets, which are faster and easier to deal
with than pcap...but my app is probably different in nature from yours.


If portability is not needed, raw sockets are nice. The nice thing about
pcap is that it's portable.


Yep, I use both, and #ifdef my code as needed.

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap & poll()

2008-11-13 Thread Ben Greear

Aaron Turner wrote:

On Thu, Nov 13, 2008 at 1:31 PM, Eloy Paris <[EMAIL PROTECTED]> wrote:
[snip]
  

One possible reason for high CPU when using poll() or select() is
spurious readiness notifications - in this case the program is not
sleeping waiting for data but is instead running, causing high CPU.

Both the poll() and select() manual pages on Linux have the following
comment:

"Under Linux, select() may report a socket file descriptor as "ready
for reading", while nevertheless a subsequent read blocks. This could
for example happen when data has arrived but upon examination has wrong
checksum and is discarded. There may be other circumstances in which a
file descriptor is spuriously reported as ready. Thus it may be safer to
use O_NONBLOCK on sockets that should not block."



Interesting.  Since most of my development work is on OS X, I didn't
notice that.

  

This has never been an issue for me, although you could look into it.
It could only cause high CPU if you call pcap_setnonblock() for the
packet capture descriptors and you experience the problem described
above, though.

Have you recreated the issue that the end user reports? A very basic and
simple troubleshooting technique that has always been very helpful for
me is putting a "printf("select() returned %d\n", retval);" after the
"retval = select(...)" (in your case poll()) call. You could tell your
user to do that to see how often you're coming back from poll().

If the network is pretty busy and there's lots of data to be read from
the packet capture descriptors, then high CPU is obviously expected. Has
the user indicated how busy the network is?



I did ask him about that.  He says traffic is ~400Mbps, but I support
(and he's using) a BPF filter to grab only a very small subset of that
(less than 1Mbps).  He says tcpdump runs with much less CPU load, so I
don't think it's a traffic issue.

One unfortunate issue my application has is that, since it sends &
receives traffic on an interface, every packet I send I usually end up
reading back.  I know some OSes support pcap_setdirection(), which helps,
but last time I checked I don't think Linux is one of them, since
libpcap uses PF_PACKET on the back end.  Still, we're talking about
less than 2Mbps of traffic, which shouldn't cause 100% CPU utilization.
  

I guess you have some way of knowing when you are reading a packet you just wrote,
so that you don't do this in a loop?

I do know that if you write to a PF_PACKET socket, you do not read that packet back
on the same socket.  I'm not sure about using pcap to read/write on Linux, however.

To see the program's behaviour, I'd also 'strace' it.  That will show system calls
and their return values.  It's usually easy to see a busy spin this way...
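
Something along these lines (options illustrative):

   strace -f -tt -e trace=poll,read,recvfrom -p <pid>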

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap & poll()

2008-11-13 Thread Ben Greear

Aaron Turner wrote:

On Thu, Nov 13, 2008 at 8:15 PM, Ben Greear <[EMAIL PROTECTED]> wrote:
  

I guess you have some way of knowing when you are reading a packet you just wrote,
so that you don't do this in a loop?



Yep.  Basically it's a software bridge (two interfaces, copying all
packets from one interface to the other).  I track the source MAC
address so I know which direction a packet should go.
  
I pretty much do the same, but I'm overly paranoid and actually store the entire
packet in a queue and compare against those to stop retransmits on Solaris.  (You
typically immediately read back what you just wrote, so the queue comparison usually
just pops off the top packet.)  On Windows, you can use WinPcap, and it has the
ability to not receive what it sends.
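
For what it's worth, the source-MAC tracking can be as small as a learning table,
roughly like this (a sketch; the names and the two-port layout are illustrative, not
actual code):

   #include <cstring>
   #include <map>

   // Learned station -> port (0 or 1); the two ports are the two capture handles.
   static std::map<unsigned long long, int> mac_to_port;

   static unsigned long long mac_key(const unsigned char* mac) {
      unsigned long long k = 0;
      std::memcpy(&k, mac, 6);                  // 6-byte MAC into the low bytes
      return k;
   }

   // Called for each frame read on 'in_port'.  Returns the port to copy the
   // frame to, or -1 to drop it (most likely our own forwarded frame read back).
   int bridge_decide(const unsigned char* frame, int in_port) {
      const unsigned char* src = frame + 6;     // source MAC follows the destination MAC
      std::map<unsigned long long, int>::iterator it = mac_to_port.find(mac_key(src));
      if (it != mac_to_port.end() && it->second != in_port)
         return -1;                             // source lives on the other side: an echo
      mac_to_port[mac_key(src)] = in_port;      // learn/refresh the station
      return in_port == 0 ? 1 : 0;              // two ports: copy to the other one
   }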


On Linux, as mentioned, I just use raw packet sockets.

I do know that if you write to a PF_PACKET socket, you do not read that packet back
on the same socket.  I'm not sure about using pcap to read/write on Linux, however.



Interesting... Right now I'm using different handles for read & write,
so I see packets I send.  Obviously not ideal, but if I could use the
same handle for read & write, that would help out a lot.
  
Well, in a bridge you have to bind to two interfaces, so you'll read from one and
write to the other.  But you should be able to do this with only two pcap sockets
total.

If you get anything working on Windows, I'm interested to know your throughput.  I
can't get above about 10Mbps full duplex across my WinPcap bridge...

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap & poll()

2008-11-14 Thread Ben Greear

Gianluca Varenni wrote:


If you get anything working on Windows, I'm interested to know your throughput.  I
can't get above about 10Mbps full duplex across my WinPcap bridge...


Is it because of the latency introduced by the bridging process (so for example the
round trip time is higher and the throughput of a bridged TCP connection goes down)
or because of the CPU load bouncing to 100%?

It's not just latency... I have tested driving both UDP and TCP through it, and UDP
won't back off.


I'm not 100% sure where the problem is... likely part of it is that I am running the
reader threads and re-sending each packet to myself on a socket so I can select in
the main loop.

We just tell customers to use something other than Windows for high speed, and that
seems to be good enough these days...

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.


Re: [tcpdump-workers] libpcap capture performance drop

2011-05-27 Thread Ben Greear

On 05/27/2011 11:03 AM, Guy Harris wrote:


On May 27, 2011, at 5:20 AM, ri...@happyleptic.org wrote:


If I understand this code correctly, in the next release of libpcap,
if a client program asks for a capture length bigger than the MTU, then
the size allocated for each frame in the ring buffer will be sized down
to avoid wasting space?


If, for a device that's supplying frames with Ethernet headers, and that's not doing any 
form of offloading that might lead to oversized "packets" being delivered, a 
program asks for a capture length bigger than the current MTU + 18, the size allocated 
for each frame in the ring buffer will be sized down.

That's not currently being done for any other link-layer type, as the "+ 18" 
part is necessary - the frame size needs to be large enough to include not just the 
payload but the link-layer header (and, in case it happens to get delivered, the CRC), 
and the MTU for an interface is the size of the largest chunk of data that can be handed 
to the Ethernet layer to have the header and CRC added, i.e. it's 1500 for 
non-jumbo-frame Ethernet, and that's not big enough for a maximum-sized Ethernet packet.


You might want to add an extra 4 bytes for a possible VLAN header, too.
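
For reference, the tally for plain Ethernet works out as follows (the last line being
the extra 4 bytes suggested above):

   //   1500  MTU (largest payload handed to the Ethernet layer)
   // +   14  Ethernet header (6 dst MAC + 6 src MAC + 2 EtherType)
   // +    4  FCS/CRC, in case it happens to be delivered
   //  -----
   //   1518  = MTU + 18, the size currently checked against
   // +    4  802.1Q VLAN tag, if a tagged frame can show up
   //  -----
   //   1522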

Thanks,
Ben

--
Ben Greear 
Candela Technologies Inc  http://www.candelatech.com

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.