Near real-time scenario: suppose I need to process packets as quickly as they arrive. The per-packet processing time can exceed the inter-arrival time, so I want to create a pool of worker threads to process packets in parallel. Within my app, I can serialize at a fine enough granularity that the workers do not end up effectively single-threaded. Just take this as a given.
There are two ways I see to accomplish this.

The classical way is to create a manager thread, which runs pcap_loop(), accepts each packet and dispatches it to one of the ready workers. The limiting cycle time is then the read/dispatch time.

What has been proposed is different: create multiple worker threads with no manager. pcap_open_live() is called once and then the worker threads are created, each of which runs pcap_loop() on the same pcap_t structure inherited from its creator. This appears to work; each worker seems to get a different packet. But this scares me...

(1) It depends on what I see as undocumented behavior of libpcap: that each callback receives 'the next packet' across multiple threads.

(2) Internally, within pcap_t, there is NO serialization. pcap_t is an opaque structure, but if you look at it in pcap_int.h, it's just a bunch of counters, fields and pointers.

(3) The actual read operation is different for EACH environment, through read_op:

    pcap-dag.c:484:    handle->read_op = dag_read;
    pcap-dlpi.c:713:   p->read_op = pcap_read_dlpi;
    pcap-linux.c:406:  handle->read_op = pcap_read_linux;
    pcap-nit.c:288:    p->read_op = pcap_read_nit;
    pcap-pf.c:442:     p->read_op = pcap_read_pf;
    pcap-snit.c:347:   p->read_op = pcap_read_snit;
    pcap-snoop.c:331:  p->read_op = pcap_read_snoop;
    pcap-win32.c:302:  p->read_op = pcap_read_win32;

This seems like it could easily lead to different behavior under different OSes. Linux ends up using recvfrom(), bpf uses read(). Win32 and dag use the memory addresses of their ring buffers directly. Especially under SMP (dual core), it seems possible for multiple threads on the same device to be running and end up inside the packet retrieval code at about the same instant. Whether the same packet is returned from the buffer twice or new packets are obtained then depends on timing (before/after the ring buffer address is updated) and/or the underlying OS behavior. And so it probably depends on the internals of the device driver, the buffering (zero copy) model, etc.

(4) Different behavior under different libpcaps. Between 0.7.x and 0.8.x the read code underwent a major rewrite to make the OS/device specific operations work more transparently: C++ style overriding through a function pointer (read_op) instead of direct calls. It shouldn't FUNCTIONALLY be different, but even so, these areas get a lot of changes, especially with quirks for new OSes.

(5) General thread safety. On the mailing list, at least as of a year ago, there were still some questions as to whether the DAG and DLPI (HP/Solaris) versions were really thread safe: http://www.tcpdump.org/lists/workers/2004/04/msg00179.html

Am I nuts? Or am I right to be scared???

-----Burton

-
This is the tcpdump-workers list. Visit https://lists.sandelman.ca/ to unsubscribe.
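
P.S. For concreteness, here is a rough sketch of the hand-off the classical manager design would need. It is only an illustration under assumptions, not tested against any particular libpcap: the names pkt, pkt_queue, queue_push, queue_pop, queue_close and the QUEUE_CAP bound are all hypothetical, and the libpcap calls themselves are deliberately left out. The idea is that only the manager thread ever touches the pcap_t; its callback copies the packet bytes (the buffer libpcap hands to the callback should not be assumed valid after the callback returns) and queues the copy for the workers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define QUEUE_CAP 1024          /* hypothetical bound; size it for expected bursts */

struct pkt {                    /* hypothetical owned copy of one captured packet */
    size_t len;
    unsigned char *data;
};

struct pkt_queue {              /* fixed-size ring of packet pointers */
    struct pkt *slots[QUEUE_CAP];
    size_t head, tail, count;
    int closed;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
};

static void queue_init(struct pkt_queue *q)
{
    memset(q, 0, sizeof *q);
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Manager side: copy the bytes and enqueue the copy.
 * Returns 0 on success, -1 if the packet was dropped
 * (queue full, queue closed, or out of memory). Never blocks on a full
 * queue, so the capture loop is not stalled by slow workers. */
static int queue_push(struct pkt_queue *q, const unsigned char *bytes, size_t len)
{
    struct pkt *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->data = malloc(len ? len : 1);
    if (p->data == NULL) {
        free(p);
        return -1;
    }
    if (len > 0)
        memcpy(p->data, bytes, len);
    p->len = len;

    pthread_mutex_lock(&q->lock);
    if (q->closed || q->count == QUEUE_CAP) {
        pthread_mutex_unlock(&q->lock);
        free(p->data);
        free(p);
        return -1;
    }
    q->slots[q->tail] = p;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    assert(q->count <= QUEUE_CAP);
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Worker side: block until a packet is available.
 * Returns NULL once the queue is closed and fully drained. */
static struct pkt *queue_pop(struct pkt_queue *q)
{
    struct pkt *p = NULL;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count > 0) {
        p = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
    }
    pthread_mutex_unlock(&q->lock);
    return p;
}

/* Manager side, at shutdown: wake every waiting worker so it can exit. */
static void queue_close(struct pkt_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

static void pkt_free(struct pkt *p)
{
    free(p->data);
    free(p);
}
```

In this sketch the pcap_loop() callback in the manager would do little more than queue_push(q, bytes, h->caplen), and each worker would loop on queue_pop() until it returns NULL, calling pkt_free() when done with a packet. The cost is one copy per packet plus the lock, which is exactly the read/dispatch overhead mentioned above, but all of the pcap_t state stays on a single thread.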