[tcpdump-workers] Re: Flush OS buffer before termination

Garri Djavadyan Sun, 20 Oct 2024 02:57:54 -0700

On Sun, 2024-10-20 at 01:03 -0700, Guy Harris wrote:
> 
> 
> > On Oct 20, 2024, at 12:11 AM, Garri Djavadyan
> > <g.djavad...@gmail.com> wrote:
> > 
> > On Sat, 2024-10-19 at 23:58 -0700, Guy Harris wrote:
> > > On Oct 19, 2024, at 5:01 PM, Garri Djavadyan
> > > <g.djavad...@gmail.com>
> > > wrote:
> > > 
> > > > I am looking for a way to force tcpdump to flush the Linux OS
> > > > buffer before terminating. I have checked the man page and the
> > > > mailing list archives but did not manage to find anything
> > > > related.
> > > > 
> > > > When I terminate the tcpdump process with SIGINT or SIGTERM, the
> > > > process quits immediately, leaving packets in the buffer. I know
> > > > that the signal USR2 forces the buffer to be flushed, but it does
> > > > not stop filling the buffer and the process remains active.
> > > > 
> > > > I have to use a very big buffer with a very slow storage, much
> > > > slower than the rate of incoming packets received by the filter,
> > > > and it is preferred not to lose a single packet after initiating
> > > > termination of the process.
> > > 
> > > OK, so is the buffer to which you're referring the buffer that holds
> > > captured packets for tcpdump to read, i.e. the *input* buffer for
> > > tcpdump, rather than, for example, the standard I/O buffer
> > > containing packet dissection text to be printed or the I/O buffer
> > > containing packets to be written to the file specified by -w, i.e.
> > > an *output* buffer for tcpdump?
> > 
> > Correct. I meant the input buffer, specified with the -B flag.
> 
> OK, so by "flushing" the buffer - which, for an input buffer, usually
> means discarding everything that's in the buffer and, for an output
> buffer, usually means writing the buffer contents to the target file
> - you meant "draining" the buffer, as in "processing all the packets
> in the buffer".


Thank you for the correction. Indeed, I should have used "draining"
here.


> > When I terminate the tcpdump process with SIGINT or SIGTERM, the
> > process quits immediately, leaving packets in the buffer. I know that
> > the signal USR2 forces the buffer to be flushed, but it does not stop
> > filling the buffer and the process remains active.
> 
> No, SIGUSR2 flushes the *output* buffer for the file being written to
> with -w.  The tcpdump man page does not make that clear; I will
> update it to do so.

Hmm. I see. Thank you in advance for updating the man page.


> > I have to use a very big buffer with a very slow storage, much slower
> > than the rate of incoming packets received by the filter, and it is
> > preferred not to lose a single packet after initiating termination of
> > the process.
> 
> What do you mean by "with a very slow storage"?  You can set the size
> with -B, but that just tells the capture mechanism in the kernel how
> big a buffer to allocate.  It's not as if it tells it to be stored in
> some slower form of memory.

Let me show an example. To demonstrate the issue, I am generating a
2MB/s stream of dummy packets:

[src]# pv -L 2M /dev/zero | dd bs=1472 > /dev/udp/192.168.0.1/12345


and dumping them to storage whose write speed is restricted to 1MB/s
via cgroup v2:

[dst]# lsblk /dev/loop0
NAME  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0   7:0    0  3.9G  0 loop /mnt/test

[dst]# cat /sys/fs/cgroup/test/io.max
7:0 rbps=max wbps=1024000 riops=max wiops=max


To temporarily avoid kernel-level drops, I set a 1GB input buffer,
large enough to capture the packets I need before it overflows:

[dst]# tcpdump -i veth0 -w /mnt/test/udp.pcap -B 1024000
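As a rough cross-check of that sizing, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate under stated assumptions: the -B value is taken in KiB as the tcpdump man page describes, pv's "2M" is assumed to mean 2 MiB/s, and link-layer headers, pcap record headers, and per-packet overhead inside the kernel buffer are ignored, so the real time to overflow will be somewhat shorter.

```python
# Back-of-the-envelope estimate of how long the -B buffer can absorb
# the difference between the incoming rate and the storage write rate.
# All figures come from the commands in this message; overheads are ignored.

KIB = 1024
MIB = 1024 * 1024

BUFFER_BYTES = 1024000 * KIB   # tcpdump -B 1024000 (units of KiB)
INGRESS_BPS = 2 * MIB          # pv -L 2M, assuming "M" means MiB
EGRESS_BPS = 1024000           # wbps=1024000 from io.max


def seconds_until_full(buffer_bytes, ingress_bps, egress_bps):
    """Return how many seconds pass before the buffer overflows."""
    net_fill_rate = ingress_bps - egress_bps
    if net_fill_rate <= 0:
        # The consumer keeps up, so the buffer never fills.
        return float("inf")
    return buffer_bytes / net_fill_rate


if __name__ == "__main__":
    t = seconds_until_full(BUFFER_BYTES, INGRESS_BPS, EGRESS_BPS)
    print(f"buffer lasts about {t:.0f} s ({t / 60:.1f} min)")
```

Under these assumptions the buffer lasts on the order of a quarter of an hour, which is why it only postpones kernel-level drops rather than preventing them.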


Now, if I start inspecting tcpdump's stats every second:

[dst]# while true; do killall -10 tcpdump; sleep 1; done


it is clearly seen that the input buffer is being filled at a rate of
about 1MB/s (the difference between the generated traffic rate of 2MB/s
and the write speed of the storage, 1MB/s):

tcpdump: 0 packets captured, 0 packets received by filter, 0 packets
dropped by kernel
tcpdump: 218 packets captured, 715 packets received by filter, 0
packets dropped by kernel
tcpdump: 890 packets captured, 2145 packets received by filter, 0
packets dropped by kernel
tcpdump: 1575 packets captured, 3575 packets received by filter, 0
packets dropped by kernel
tcpdump: 2246 packets captured, 5005 packets received by filter, 0
packets dropped by kernel
tcpdump: 2931 packets captured, 6435 packets received by filter, 0
packets dropped by kernel
tcpdump: 3603 packets captured, 7867 packets received by filter, 0
packets dropped by kernel
tcpdump: 4288 packets captured, 9440 packets received by filter, 0
packets dropped by kernel
tcpdump: 4960 packets captured, 10870 packets received by filter, 0
packets dropped by kernel
tcpdump: 5645 packets captured, 12300 packets received by filter, 0
packets dropped by kernel
tcpdump: 6317 packets captured, 13730 packets received by filter, 0
packets dropped by kernel
tcpdump: 6988 packets captured, 15160 packets received by filter, 0
packets dropped by kernel
tcpdump: 7675 packets captured, 16590 packets received by filter, 0
packets dropped by kernel
tcpdump: 8347 packets captured, 18020 packets received by filter, 0
packets dropped by kernel
tcpdump: 9032 packets captured, 19450 packets received by filter, 0
packets dropped by kernel
tcpdump: 9704 packets captured, 20880 packets received by filter, 0
packets dropped by kernel
tcpdump: 10389 packets captured, 22310 packets received by filter, 0
packets dropped by kernel
tcpdump: 11061 packets captured, 23740 packets received by filter, 0
packets dropped by kernel


If at this point I stop tcpdump, then more than 10k packets will be
lost.
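The "more than 10k" figure can be read off the counters directly. The sketch below parses stats lines in the format shown above and takes "received by filter" minus "captured" as an approximation of the packets still waiting in the input buffer. That interpretation is an assumption that fits this Linux setup; the helper names (parse_stats, backlog) are made up for illustration and are not part of tcpdump.

```python
import re

# Matches one stats line after wrapped lines have been joined.
STAT_RE = re.compile(
    r"(\d+) packets captured, "
    r"(\d+) packets received by filter, "
    r"(\d+) packets dropped by kernel"
)

# The last two samples from the output above, wrapped as in the message.
SAMPLE = """\
tcpdump: 10389 packets captured, 22310 packets received by filter, 0
packets dropped by kernel
tcpdump: 11061 packets captured, 23740 packets received by filter, 0
packets dropped by kernel
"""


def parse_stats(text):
    """Return a list of (captured, received, dropped) tuples."""
    flat = " ".join(text.split())  # undo line wrapping
    return [tuple(int(g) for g in m.groups()) for m in STAT_RE.finditer(flat)]


def backlog(sample):
    """Approximate number of packets not yet processed by tcpdump."""
    captured, received, _dropped = sample
    return received - captured


if __name__ == "__main__":
    samples = parse_stats(SAMPLE)
    print(backlog(samples[-1]))                         # prints 12679
    print(backlog(samples[-1]) - backlog(samples[-2]))  # prints 758
```

So at the last sample roughly 12,679 packets are pending, and the backlog grows by about 758 packets per second.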


> > There are a few options to overcome the problem. For example,
> > by dumping packets to the memory storage first (e.g. /dev/shm)
> 
> Presumably meaning you specified "-w /dev/shm" or something such as
> that?
> 
> If so, how does that make a difference?

I mean that I can first dump packets to fast RAM-backed storage and,
once the capturing part is done, copy the dump to the slow storage.

For example, with a RAM-based destination, no large buffer is needed in
my case:

[dst]# tcpdump -i veth0 -w /dev/shm/udp.pcap
...
tcpdump: 0 packets captured, 0 packets received by filter, 0 packets
dropped by kernel
tcpdump: 328 packets captured, 429 packets received by filter, 0
packets dropped by kernel
tcpdump: 1804 packets captured, 1859 packets received by filter, 0
packets dropped by kernel
tcpdump: 3280 packets captured, 3289 packets received by filter, 0
packets dropped by kernel
tcpdump: 4592 packets captured, 4719 packets received by filter, 0
packets dropped by kernel
tcpdump: 6232 packets captured, 6292 packets received by filter, 0
packets dropped by kernel
tcpdump: 7710 packets captured, 7724 packets received by filter, 0
packets dropped by kernel
tcpdump: 9022 packets captured, 9154 packets received by filter, 0
packets dropped by kernel
tcpdump: 10498 packets captured, 10584 packets received by filter, 0
packets dropped by kernel
tcpdump: 11974 packets captured, 12014 packets received by filter, 0
packets dropped by kernel
tcpdump: 13286 packets captured, 13444 packets received by filter, 0
packets dropped by kernel
tcpdump: 14927 packets captured, 15018 packets received by filter, 0
packets dropped by kernel


The RAM-based storage is fast enough that the difference between the
"captured" and "received by filter" counters stays minimal, so there is
no risk of losing a significant number of packets on termination.
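The second half of that workaround, copying the finished dump from RAM to the slow storage, could be scripted along these lines. This is a minimal sketch, not something tcpdump provides: copy_capture is a hypothetical helper, and the paths in the usage note are simply the ones from the examples above.

```python
import os
import shutil


def copy_capture(src, dst, chunk_size=1024 * 1024):
    """Copy a finished capture file to slow storage and force it to disk.

    Returns the size of the destination file in bytes.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        fout.flush()
        os.fsync(fout.fileno())  # wait until the slow device has the data
    return os.path.getsize(dst)
```

It would be run only after tcpdump has exited, for example as copy_capture("/dev/shm/udp.pcap", "/mnt/test/udp.pcap"). The obvious cost is that the whole capture has to fit in RAM.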


> > Still, I wonder if this can be done by tcpdump itself.
> 
> That would require that tcpdump be able to tell the capture mechanism
> to stop capturing packets; otherwise, tcpdump could continue reading
> packets from the buffer and processing them, but it's not as if the
> capture mechanism will stop adding packets to the buffer, so that
> would behave as if tcpdump continued capturing.
> 
> There is no current mechanism in libpcap by which tcpdump (or any
> other program using libpcap to capture networking traffic, e.g.
> Wireshark) can indicate to libpcap that it doesn't want any *more*
> packets from the network device, but wants to be able to keep reading
> from the packets already *in* the buffer until the last packet has
> been retrieved. That means tcpdump can't be told to do that with any
> existing version of libpcap.

I see. Thank you so much for the explanation.
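To make the missing behaviour concrete, here is a toy model of "stop intake, then drain". It is not libpcap code and does not reflect any existing libpcap API; ToyCaptureBuffer and its methods are hypothetical names used only to illustrate the semantics discussed above.

```python
from collections import deque


class ToyCaptureBuffer:
    """Toy stand-in for a kernel capture buffer (hypothetical, not libpcap)."""

    def __init__(self):
        self._queue = deque()
        self._accepting = True

    def enqueue(self, packet):
        """'Kernel' side: add a packet; refused once intake is stopped."""
        if not self._accepting:
            return False
        self._queue.append(packet)
        return True

    def stop_intake(self):
        """The missing primitive: accept no more packets, keep the old ones."""
        self._accepting = False

    def read(self):
        """'tcpdump' side: take one packet, or None if the buffer is empty."""
        return self._queue.popleft() if self._queue else None

    def drain(self):
        """Yield every packet still buffered, oldest first."""
        while self._queue:
            yield self._queue.popleft()


if __name__ == "__main__":
    buf = ToyCaptureBuffer()
    for n in range(5):
        buf.enqueue(f"pkt{n}")
    buf.read()
    buf.read()                     # the slow consumer got through two packets
    buf.stop_intake()              # termination requested
    print(buf.enqueue("late"))     # prints False
    print(list(buf.drain()))       # prints ['pkt2', 'pkt3', 'pkt4']
```

Without the stop_intake step, a drain loop would never terminate while traffic keeps arriving, which is the point made in the explanation above.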

Do you think this case can justify feature requests for both libpcap
and tcpdump on GitHub?


Thank you.

Regards,
Garri
_______________________________________________
tcpdump-workers mailing list -- tcpdump-workers@lists.tcpdump.org
To unsubscribe send an email to tcpdump-workers-le...@lists.tcpdump.org