On Sep 20, 2019, at 2:02 PM, Michael Richardson wrote:
> Guy Harris wrote:
>> Currently, Wireshark's pcapng reading code imposes a block size limit
>> for all blocks:
>
>> enough that
>>  * the resulting block size would be less than the previous 16 MiB limit.
>>  */
>> #define MAX_BLOCK_SIZE (MIN_EPB_SIZE + WTAP_MAX_PACKET_SIZE_DBUS + 131072)
>
>> WTAP_MAX_PACKET_SIZE_DBUS is 16 MiB.
>
> So, MIN_EPB_SIZE (28?) + 16MiB + 128KiB.
> I think that this is a fine maximum for quite a number of block types.
> I propose to introduce sane maximums for each block type, on a block type
> basis.
Your recent checkin has an SHB maximum size of 1 MiB.
An SHB is 24 bytes of fixed data plus options, so that allows almost 1 MiB of
options.
The size of an EPB is 28 + packet data size (padded to a multiple of 4 bytes)
plus options, so your Wireshark-derived maximum size for an EPB is pretty much
based on a maximum 128 KiB of options.
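To make the bookkeeping concrete, here is a minimal sketch of how the option budget falls out of a per-block maximum. The names are hypothetical (they are not from libpcap or Wireshark), and the 24- and 28-byte fixed sizes in the usage note are simply the figures used in this message.

```c
#include <assert.h>
#include <stdint.h>

#define KIB 1024u
#define MIB (1024u * KIB)

/* Round n up to the next multiple of 4, as pcapng pads packet data. */
static uint32_t pad4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/*
 * Bytes left over for options in a block whose total size is capped at
 * max_block, given the size of its fixed fields and its packet data.
 * (Hypothetical helper, for illustration only.)
 */
static uint32_t option_budget(uint32_t max_block, uint32_t fixed,
                              uint32_t data_len)
{
    uint32_t used = fixed + pad4(data_len);

    assert(used <= max_block);
    return max_block - used;
}
```

With these, option_budget(1 * MIB, 24, 0) gives the SHB case (1 MiB minus 24 bytes), and option_budget(28 + 16 * MIB + 128 * KIB, 28, 16 * MIB) gives exactly 128 KiB for a maximum-sized EPB.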
Is there a reason to have different maximum-bytes-of-options values for
different blocks? If not, I'm OK with a maximum of either 128 KiB or 1 MiB (or
other reasonable values) for the maximum number of option bytes. The maximum
size of an option is 4 plus 65536 (maximum option value size, rounded up to a
multiple of 4), so 128 KiB is slightly under 2 maximum-sized options. 1 MiB
wouldn't be enough to store all of *War and Peace* in a sequence of comment
options (storing it in an English translation; storing it in the original
Russian would be worse, as that's two bytes per letter in UTF-8), but *The
Great Gatsby* would fit. :-)
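The option arithmetic can be checked with a few lines of C. This is only a sketch: the macro and function names are made up for illustration, and it assumes the usual pcapng option layout of a 2-byte code and a 2-byte length followed by the value, padded to a multiple of 4.

```c
#include <assert.h>
#include <stdint.h>

#define OPT_HEADER_LEN     4u      /* 2-byte option code + 2-byte length */
#define OPT_MAX_VALUE_LEN  65535u  /* largest value a 16-bit length allows */

/* On-disk size of one option: header plus value padded to 4 bytes. */
static uint32_t option_size(uint32_t value_len)
{
    assert(value_len <= OPT_MAX_VALUE_LEN);
    return OPT_HEADER_LEN + ((value_len + 3u) & ~3u);
}

/* How many maximum-sized options fit in a given number of option bytes. */
static uint32_t max_sized_options_fitting(uint32_t budget)
{
    return budget / option_size(OPT_MAX_VALUE_LEN);
}
```

A maximum-sized option comes out at 65540 bytes, so a 128 KiB budget (131072 bytes) holds one of them and falls 8 bytes short of a second, while 1 MiB holds 15.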
> I can live with 16MiB as the *maximum* that we will allocate.
> I'd like to put this in the draft: every block should have a *reasonable*
> maximum. I plan to work on a mmap() based reading API,
Note that memory-mapping means that, on a read error, the program will probably
die with a signal (UN*X) or exception (Windows). Disks are pretty reliable, so
you probably won't get many EIOs from the disk (I *did* get them at Sun when
some SMD disk was failing, but that was the mid-to-late 1980's). However:
    if the drive is removable, the user unplugging the drive could cause an
    error;

    if the "drive" is a share mounted from a file server, unless it's an
    uninterruptible NFS hard mount, either ^Cing a hard mount or getting a
    timeout on other mounts could cause an error.
At Apple, at least some software only used mmap() for files on a local,
non-removable drive. fstatfs() might be able to tell you whether the file is
on a local drive (the MNT_LOCAL flag on at least some BSD-flavored OSes;
checking the file system type field against known non-local file systems on
Linux, although the latter is less robust). I don't remember offhand how you
distinguish volumes on removable vs. non-removable media.
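A rough sketch of that check, assuming the BSD-flavored fstatfs() with MNT_LOCAL where it exists and falling back on f_type on Linux. The function names are hypothetical, and the Linux list of network file system magic numbers (taken from statfs(2)) is deliberately short and not exhaustive, which is exactly why that path is less robust.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__)
#include <sys/param.h>
#include <sys/mount.h>   /* fstatfs(), struct statfs, MNT_LOCAL */
#define HAVE_BSD_STATFS 1
#elif defined(__linux__)
#include <sys/vfs.h>     /* fstatfs(), struct statfs */
#define HAVE_LINUX_STATFS 1
#endif

/* f_type values for a few network file systems; not a complete list. */
#define FS_MAGIC_NFS   0x6969UL
#define FS_MAGIC_SMB   0x517BUL
#define FS_MAGIC_CIFS  0xFF534D42UL

/* Returns 1 if local, 0 if known to be non-local, -1 on error/unknown. */
static int fd_is_on_local_fs(int fd)
{
#if defined(HAVE_BSD_STATFS)
    struct statfs sfs;

    if (fstatfs(fd, &sfs) == -1)
        return -1;
    return (sfs.f_flags & MNT_LOCAL) ? 1 : 0;
#elif defined(HAVE_LINUX_STATFS)
    struct statfs sfs;

    if (fstatfs(fd, &sfs) == -1)
        return -1;
    switch ((unsigned long)sfs.f_type & 0xFFFFFFFFUL) {
    case FS_MAGIC_NFS:
    case FS_MAGIC_SMB:
    case FS_MAGIC_CIFS:
        return 0;
    default:
        return 1;   /* not on the list, so presumably local */
    }
#else
    (void)fd;
    return -1;      /* no idea how to ask on this platform */
#endif
}

/* Convenience wrapper: open the path, ask, close. */
static int path_is_on_local_fs(const char *path)
{
    int fd, result;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    result = fd_is_on_local_fs(fd);
    close(fd);
    return result;
}
```

A reader could call fd_is_on_local_fs() on the capture file's descriptor and fall back to plain read() unless it returns 1. This does nothing about removable media.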
> and that shouldn't
> have a problem with block size on 64-bit systems. But maybe on 32-bit
> systems, it should use mmap() in some more creative way.
Map in a region of the file and, if you need something outside that region,
unmap the old region and map the new region.
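The sliding-window idea could look roughly like this. It is a minimal sketch under a few assumptions: the struct and function names are hypothetical, only one window is kept at a time, and it does nothing about the read-error signals mentioned earlier.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct map_window {
    int    fd;
    off_t  file_size;
    off_t  base;     /* file offset where the current mapping starts */
    size_t len;      /* length of the current mapping; 0 means none */
    size_t min_len;  /* preferred window size */
    void  *addr;
};

static int window_open(struct map_window *w, const char *path, size_t min_len)
{
    struct stat st;

    assert(w != NULL && path != NULL && min_len > 0);
    w->fd = open(path, O_RDONLY);
    if (w->fd == -1)
        return -1;
    if (fstat(w->fd, &st) == -1) {
        close(w->fd);
        return -1;
    }
    w->file_size = st.st_size;
    w->base = 0;
    w->len = 0;
    w->min_len = min_len;
    w->addr = NULL;
    return 0;
}

static void window_close(struct map_window *w)
{
    if (w->len != 0)
        munmap(w->addr, w->len);
    w->len = 0;
    w->addr = NULL;
    close(w->fd);
}

/* Return a pointer to `need` bytes at file offset `off`, remapping if the
 * current window doesn't cover them; NULL on error or out-of-range request. */
static const void *window_get(struct map_window *w, off_t off, size_t need)
{
    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t base;
    size_t lead, want;
    void *p;

    if (need == 0 || off < 0 || off >= w->file_size ||
        (uintmax_t)need > (uintmax_t)(w->file_size - off))
        return NULL;

    /* Already covered by the current window? */
    if (w->len != 0 && off >= w->base &&
        (uintmax_t)(off - w->base) + need <= w->len)
        return (const char *)w->addr + (off - w->base);

    /* Unmap the old region... */
    if (w->len != 0) {
        munmap(w->addr, w->len);
        w->len = 0;
        w->addr = NULL;
    }

    /* ...and map a new one, starting on a page boundary at or below off. */
    base = off - (off % page);
    lead = (size_t)(off - base);
    if (need > SIZE_MAX - lead)
        return NULL;
    want = lead + need;
    if (want < w->min_len)
        want = w->min_len;
    if ((uintmax_t)want > (uintmax_t)(w->file_size - base))
        want = (size_t)(w->file_size - base);

    p = mmap(NULL, want, PROT_READ, MAP_PRIVATE, w->fd, base);
    if (p == MAP_FAILED)
        return NULL;
    w->addr = p;
    w->base = base;
    w->len = want;
    return (const char *)p + lead;
}
```

Pointers returned by window_get() are only valid until the next call that remaps, so a caller would have to finish with one block before asking for the next. On a 32-bit system min_len would be chosen to keep the window well under the available address space.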
> I'm not sure here. Are there any good libraries to outsource this problem?
I don't know of any offhand.
> I'd like to do an AIO (libuio)
libuio or libaio:
https://pagure.io/libaio
?
Using POSIX aio_ routines would allow it to work on at least some other UN*Xes
as well. I guess the Windows equivalent is $QIOW^Woverlapped I/O.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers