Re: [tcpdump-workers] Multiple interface capture and thread safety status in libpcap
On May 10, 2012, at 7:43 AM, Wiener Schnitzel wrote:

> I need to perform packet sniffing on several interfaces at the same time.

Are you processing packets from each interface independently, so that a packet on interface A is not looked at when processing packets from interface B, or are you processing the packets from all of the interfaces as a single stream, so that you need to see packets from multiple interfaces in order?

> My natural approach would be to open a pcap_t object for each interface and
> place a "select" call - considering Linux - to deal with packet dispatching.
> My only constraint is that I have to treat the received packets in
> chronological order: indeed, I would like to process the data as it gets to
> the interfaces, without introducing any reordering.

Even if you're sniffing on *one* interface, I think that, with at least some versions of the Linux kernel, "as it gets to the interfaces" and "as it gets delivered to the PF_PACKET socket from which libpcap reads" are not necessarily the same thing on multi-core machines.

I seem to remember that some people have seen packets with out-of-order timestamps, and have the impression that the problem is that if two packets are processed on different cores, the packet that arrived second might be queued up on the socket before the packet that arrived first if, for whatever reason, the thread on the second core manages to get its job done faster.

I don't know whether this is still a problem with reasonably recent versions of the kernel.

> If I am not mistaken, it might be possible that a "select" call does not read
> data in temporal order, if multiple FDs are ready at the time the process is
> scheduled for running by the OS. Is that correct?

Well, to be technical, the select() call doesn't read data; the reading is done by the calls to pcap_dispatch() made as a result of select() saying various FDs are ready. But, yes - if pcap_dispatch() processes more than one packet, you'll be processing the currently-available packets from the first interface you find when scanning select()'s results, followed by the currently-available packets from the second interface, and so on, even if the last packet from an earlier interface has a later time stamp than the first packet from a later interface.

Now, if you put all the pcap_t's into non-blocking mode, and pass a count of 1 to pcap_dispatch(), so it processes only one packet, or if you use pcap_next() or pcap_next_ex(), you could try reading from each of the interfaces, process the packet with the lowest time stamp, and, in the next trip through the loop, read another packet from the interface from which the packet you processed came and re-check the packets read previously from the other interfaces. That way you'd process the packets in time stamp order (modulo any out-of-order delivery from the kernel on any single interface).

> A work-around to this problem might be to move the capture on different
> threads: each thread has its own pcap_t object and captures traffic on a
> different interface.

If each thread is processing packets independently, so that you don't have to worry about processing packets from multiple interfaces in the right order for all of those interfaces, then you could do it in one thread - for each call to pcap_dispatch(), do the processing for packets from the interface in question. Doing it in multiple threads would make better use of multiple cores in your application, however.

If that's *not* the case, doing the capture in different threads still requires some scheme to process packets from different interfaces in order.

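The read-one-packet-per-interface loop described above can be sketched in C roughly as follows. This is a minimal sketch, not libpcap code: `struct fake_iface`, `struct pending`, `refill()`, `earliest_slot()` and `merge_by_timestamp()` are hypothetical names, and the arrays of time stamps stand in for what pcap_next_ex() would deliver on a non-blocking pcap_t. It also assumes every queue is fully known up front; a live capture would additionally have to decide how long to wait for an interface that currently has nothing to read.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical per-interface slot: at most one packet that has been
 * read but not yet processed. */
struct pending {
    int has_pkt;            /* nonzero if ts is valid */
    struct timeval ts;      /* time stamp of the held packet */
};

/* Simulated interface: a queue of time stamps standing in for the
 * packets a non-blocking pcap_t would hand back one at a time. */
struct fake_iface {
    const struct timeval *pkts;
    size_t count;
    size_t next;
};

/* Return <0, 0, >0 as a is earlier than, equal to, or later than b. */
int tv_cmp(const struct timeval *a, const struct timeval *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec ? -1 : 1;
    if (a->tv_usec != b->tv_usec)
        return a->tv_usec < b->tv_usec ? -1 : 1;
    return 0;
}

/* Fill an empty slot from its interface, if a packet is available. */
void refill(struct pending *slot, struct fake_iface *ifc)
{
    if (slot->has_pkt || ifc->next >= ifc->count)
        return;
    slot->ts = ifc->pkts[ifc->next++];
    slot->has_pkt = 1;
}

/* Index of the slot holding the earliest packet, or -1 if all are empty. */
int earliest_slot(const struct pending *slots, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!slots[i].has_pkt)
            continue;
        if (best < 0 || tv_cmp(&slots[i].ts, &slots[best].ts) < 0)
            best = (int)i;
    }
    return best;
}

/* Process packets in time stamp order; order[] receives the index of the
 * interface each processed packet came from. Returns the packet count. */
size_t merge_by_timestamp(struct fake_iface *ifcs, struct pending *slots,
                          size_t n, int *order, size_t max_out)
{
    size_t out = 0;

    assert(ifcs != NULL && slots != NULL && order != NULL);
    for (;;) {
        /* Only the slot consumed on the previous trip is actually empty. */
        for (size_t i = 0; i < n; i++)
            refill(&slots[i], &ifcs[i]);
        int best = earliest_slot(slots, n);
        if (best < 0 || out >= max_out)
            break;
        order[out++] = best;        /* "process" the earliest packet */
        slots[best].has_pkt = 0;    /* its slot is refilled next trip */
    }
    return out;
}
```

With two simulated interfaces carrying time stamps {1.000000, 1.000500, 3.000000} and {1.000200, 2.000000}, order[] ends up as 0, 1, 0, 1, 0. Any out-of-order delivery within a single interface is passed through unchanged, as noted above.
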
> In this case, I do not have a clear picture about which parts of libpcap are
> thread-safe and which not (my version of reference is the 1.1.1); I have
> found really old posts about thread-safety issues in pcap_compile and
> pcap_setfilter (which I would need: 1 common filter for each thread) but
> nothing more.

pcap_compile() uses YACC and Lex, or uses replacements thereof in YACC-compatible/Lex-compatible mode, so it's *not* thread-safe - the lexical analyzer and parser have non-thread-safe state. pcap_setfilter(), however, should be thread-safe, and the rest of the APIs are thread-safe as long as any given pcap_t is only being processed in one thread at a time; there's no locking to allow a given pcap_t to be processed in more than one thread simultaneously.

- This is the tcpdump-workers list. Visit https://cod.sandelman.ca/ to unsubscribe.

Re: [tcpdump-workers] Multiple interface capture and thread safety
On 11.05.2012 09:02, Guy Harris wrote:

> On May 10, 2012, at 7:43 AM, Wiener Schnitzel wrote:
>
>> I need to perform packet sniffing on several interfaces at the same time.
>
> Are you processing packets from each interface independently, so that a packet on interface A is not looked at when processing packets from interface B, or are you processing the packets from all of the interfaces as a single stream, so that you need to see packets from multiple interfaces in order?

At a certain point, I'd like to treat the packets as if they came from a single source.

> Even if you're sniffing on *one* interface, I think that, with at least some versions of the Linux kernel, "as it gets to the interfaces" and "as it gets delivered to the PF_PACKET socket from which libpcap reads" are not necessarily the same thing on multi-core machines. I seem to remember that some people have seen packets with out-of-order timestamps, and have the impression that the problem is that if two packets are processed on different cores, the packet that arrived second might be queued up on the socket before the packet that arrived first if, for whatever reason, the thread on the second core manages to get its job done faster. I don't know whether this is still a problem with reasonably recent versions of the kernel.

I'd be very interested in these kinds of details. Do you think it is documented somewhere? Also, does that mean that PCAP timestamps are normally reliable (if the NIC cannot expose its own RX timestamp)?

> Now, if you put all the pcap_t's into non-blocking mode, and pass a count of 1 to pcap_dispatch(), so it processes only one packet, or if you use pcap_next() or pcap_next_ex(), you could try reading from each of the interfaces, process the packet with the lowest time stamp, and, in the next trip through the loop, read another packet from the interface from which the packet you processed came and re-check the packets read previously from the other interfaces, you'd process the packets in time stamp order (modulo any out-of-order delivery from the kernel on any single interface).

Nice suggestion.

> If each thread is processing packets independently, so that you don't have to worry about processing packets from multiple interfaces in the right order for all of those interfaces, then you could do it in one thread - for each call to pcap_dispatch(), do the processing for packets from the interface in question. Doing it in multiple threads would make better use of multiple cores in your application, however. If that's *not* the case, doing the capture in different threads still requires some scheme to process packets from different interfaces in order.

I see. As I said, I might need to merge the data coming from the interfaces, so I need an algorithm to compare the age of packets from different sources.

> pcap_compile() uses YACC and Lex, or uses replacements thereof in YACC-compatible/Lex-compatible mode, so they're *not* thread safe - the lexical analyzer and parser have non-thread-safe state. pcap_setfilter(), however, should be thread-safe, and the rest of the APIs are thread-safe as long as any given pcap_t is only being processed in one thread at a time;

Hence, if each thread wants to compile a different BPF filter, I need an external lock around the function. Otherwise, I can compile a shared BPF filter in the main thread and set it in the sniffing threads without any issues. Am I right?

Thanks

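The "external lock" idea above can be sketched as below, assuming POSIX threads. `fake_compile()` is a hypothetical stand-in for pcap_compile(): like a YACC/Lex-generated parser it updates unsynchronized global state, and it is used here only so that the sketch has no libpcap dependency. `compile_lock`, `compile_locked()` and `run_workers()` are likewise illustrative names; in real code the locked region would contain the pcap_compile() call itself.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16

/* One process-wide lock guarding every call to the non-reentrant function. */
static pthread_mutex_t compile_lock = PTHREAD_MUTEX_INITIALIZER;

/* Global state shared by all callers, as in a non-reentrant parser. */
static long parser_state;

/* Stand-in for pcap_compile(); NOT a libpcap function. */
static void fake_compile(void)
{
    long seen = parser_state;   /* unsynchronized read-modify-write */
    parser_state = seen + 1;
}

/* Every thread goes through this wrapper instead of calling directly. */
void compile_locked(void)
{
    pthread_mutex_lock(&compile_lock);
    fake_compile();
    pthread_mutex_unlock(&compile_lock);
}

static void *worker(void *arg)
{
    int iterations = *(int *)arg;

    for (int i = 0; i < iterations; i++)
        compile_locked();
    return NULL;
}

/* Start nthreads workers, wait for them, and return the final state. */
long run_workers(int nthreads, int iterations)
{
    pthread_t tids[MAX_WORKERS];

    assert(nthreads > 0 && nthreads <= MAX_WORKERS);
    parser_state = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &iterations);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return parser_state;
}
```

Calling run_workers(4, 10000) leaves the shared state at exactly 4 * 10000, because every update happens while the lock is held. The lock only needs to cover the compile step; each thread still uses its own pcap_t for everything else.
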
Re: [tcpdump-workers] Multiple interface capture and thread safety
On 05/11/2012 06:26 AM, Wiener Schnitzel wrote:

> I see. As I said, I might need to merge the data coming from the interfaces,
> so I need an algorithm to compare the age of packets with different sources.

I don't think you will be able to arrive at that goal with perfect accuracy. Can it be like the game of horseshoes and be "close enough?"

In addition to packets from even the same interface taking different paths up the stack, there is also the matter of different interfaces providing notification of packet arrival at the host at different times - mechanisms like interrupt avoidance/coalescing mean that if Packet 1 arrived on NIC A a microsecond before Packet 2 arrived on NIC B, NIC B may still tell the host about Packet 2 before NIC A told the host about Packet 1.

You could, I suppose, disable interrupt coalescing, and perhaps even get NIC HW timestamping going, but even then I suspect there will be some skew between NIC A's concept of time and NIC B's.

I trust there isn't any assumption being made about the relative send times of packets based on their arrival times - certainly in different flows, but depending on the nature of the transport(s) carrying the flows perhaps even within a single flow (e.g. a flow of UDP traffic).

rick jones