[tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.
I've built a wireshark dissector for fd.io vpp graph dispatcher pcap traces.
Please see https://fdio-vpp.readthedocs.io/en/latest/ for a description of
the code base / project, etc.

For development purposes, I borrowed one of the USERxxx encap types. Please
allocate a LINKTYPE_/DLT_ type for this file format, so I can upstream the
dissector.

Thanks... Dave Barach
Fd.io vpp PTL

Trace Record format
-------------------

VPP graph dispatch trace record description, in network byte order. Integers
wider than 8 bits are in little endian byte order.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Major Version | Minor Version |   Buffer index high 16 bits   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Buffer index low 16 bits    | Node Name Len | Node name ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ Node name cont'd...          ...              |  NULL octet   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              Primary buffer metadata (64 octets)              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  [Secondary buffer metadata (64 octets, major version > 1)]   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  ASCII trace length 16 bits   |        ASCII trace ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ASCII trace cont'd ......                     |  NULL octet   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Packet data (up to 16K)                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Notes: as of this writing, major version = 1, minor version = 0.

See below for pro forma definitions of the primary buffer metadata and
primary opaque data. Please refer to fd.io vpp source code before you invest,
send money, or write code: "git clone https://gerrit.fd.io/r/vpp"

Trace records are generated by code in
.../src/vlib/main.c:dispatch_pcap_trace(...).

The secondary buffer metadata shown in the diagram above is NOT present in
version 1 traces.

Pro forma structure definitions:

/*
 * BIG FAT WARNING: it's impossible to #include the vpp header files,
 * so this is a private copy of .../src/vnet/buffer.h, with
 * some vpp typedefs thrown in for good measure.
 */

typedef unsigned int u32;
typedef unsigned short int u16;
typedef short int i16;
typedef unsigned char u8;
typedef unsigned long long u64;

/* VLIB buffer representation. */
typedef struct
{
  /* Offset within data[] that we are currently processing.
     If negative current header points into predata area. */
  i16 current_data;
  /**< signed offset in data[], pre_data[] that we are currently
       processing. If negative current header points into predata area. */
  u16 current_length;
  /**< Nbytes between current data and the end of this buffer. */
  u32 flags;
  /**< buffer flags */
  u32 flow_id;
  /**< Generic flow identifier */
  u32 next_buffer;
  /**< Next buffer for this linked-list of buffers.
       Only valid if VLIB_BUFFER_NEXT_PRESENT flag is set. */
  u32 current_config_index;
  /**< Used by feature subgraph arcs to visit enabled feature nodes */
  u16 error;
  /**< Error code for buffers to be enqueued to error handler. */
  u8 n_add_refs;
  /**< Number of additional references to this buffer. */
  u8 buffer_pool_index;
  /**< index of buffer pool this buffer belongs. */
  u32 opaque[10];
  /**< Opaque data used by sub-graphs for their own purposes. See above */
  u32 trace_index;
  /**< Specifies index into trace buffer
       if VLIB_PACKET_IS_TRACED flag is set. */
  u32 recycle_count;
  /**< Used by L2 path recycle code */
  u32 total_length_not_including_first_buffer;
  /**< Only valid for first buffer in chain. Current length plus
       total length given here give total number of bytes in buffer chain. */
  u8 free_list_index;
  /**< only used if VLIB_BUFFER_NON_DEFAULT_FREELIST flag is set */
  u8 align_pad[3];
  /**< available */
  u32 opaque2[12];
  /**< More opaque data, see ../vnet/vnet/buffer.h */

  /* end of second cache line */
  u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE];
  /**< Space for inserting data before buffer start. Packet rewrite strin
Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.
On Nov 26, 2018, at 6:03 AM, Dave Barach (dbarach) wrote:

> I've built a wireshark dissector for fd.io vpp graph dispatcher pcap traces.
> Please see https://fdio-vpp.readthedocs.io/en/latest/ for a description of
> the code base / project, etc.
>
> For development purposes, I borrowed one of the USERxxx encap types. Please
> allocate a LINKTYPE_/DLT_ type for this file format, so I can upstream the
> dissector.
>
> Thanks... Dave Barach
> Fd.io vpp PTL
>
> Trace Record format
> -------------------
>
> VPP graph dispatch trace record description, in network byte order. Integers
> wider than 8 bits are in little endian byte order.

"Byte order" doesn't apply to 8-bit fields; if all fields are in
little-endian byte order, what, if anything, is in network byte order
(big-endian)?

And is everything guaranteed to be in little-endian byte order *even if* the
tracing code is running on, for example, a Power ISA processor running in
big-endian mode, or on a z/Architecture processor (which *only* runs
big-endian)?

>  0                   1                   2                   3
>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | Major Version | Minor Version |   Buffer index high 16 bits   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |   Buffer index low 16 bits    | Node Name Len | Node name ... |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> + Node name cont'd...          ...              |  NULL octet   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |              Primary buffer metadata (64 octets)              |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |  [Secondary buffer metadata (64 octets, major version > 1)]   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |  ASCII trace length 16 bits   |        ASCII trace ...        |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | ASCII trace cont'd ......                     |  NULL octet   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |                    Packet data (up to 16K)                    |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Is there a page at any fd.io or VPP Web site that describes the header, to
which we could point?

> Notes: as of this writing, major version = 1, minor version = 0.

Presumably any code that can read major version M, minor version N will also
be able to read major version M, minor version K, for all values of K <= N.

> See below for pro forma definitions of the primary buffer metadata and
> primary opaque data. Please refer to fd.io vpp source code before you invest,
> send money, or write code: "git clone https://gerrit.fd.io/r/vpp"
>
> Trace records are generated by code in
> .../src/vlib/main.c:dispatch_pcap_trace(...).
>
> The secondary buffer metadata shown in the diagram above is NOT present in
> version 1 traces.

So if some future version 2 of the trace is defined, an update will be sent
to tcpdump-workers, describing the secondary buffer metadata?

For the fields defined in that header:

What is the buffer index?

Does the node name length include the terminating NUL? (Presumably anything
writing those files MUST, in the RFC 2119 sense, null-terminate strings, and
anything reading those files MUST NOT assume that the strings are
null-terminated; a count *and* a terminating NUL is redundant.)

Does the ASCII trace length include the terminating NUL? Is that just an
opaque string to display to the user, or are there any ways in which an
application can parse it?

In an earlier mail on another list you said:

> Packet data can be anything: L2, L3, L4 or above. The vpp dissector knows
> from the node name what to expect. I have a [seriously incomplete as of this
> writing] table of the form:
>
> #define foreach_node_to_dissector_handle \
> _("ip6-lookup", "ipv6", ip6_dissector_handle) \
> _("ip4-input-no-checksum", "ip", ip4_dissector_handle) \
> _("ip4-lookup", "ip", ip4_dissector_handle) \
> _("ip4-local", "ip", ip4_dissector_handle) \
> _("ip4-udp-lookup", "ip", udp_dissector_handle) \
> _("ip4-icmp-error", "ip", ip4_dissector_handle) \
> _("ip4-glean", "ip", ip4_dissector_handle) \
> _("ethernet-input", "eth_maybefcs", eth_dissector_handle)

Presumably, once a node name is used for a particular type of payload, it
will always indicate that particular payload type.

Could new node names be added in the future?

Is there a page at any fd.io or VPP Web site that gives the current list of
node names, showing what payload type each node name indicates?

> Pro forma structure definitions:

So which of those structures describes the primary met
Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.
Inline, see >>>

Thanks... Dave

-----Original Message-----
From: Guy Harris
Sent: Monday, November 26, 2018 3:01 PM
To: Dave Barach (dbarach)
Cc: tcpdump-workers@lists.tcpdump.org
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Nov 26, 2018, at 6:03 AM, Dave Barach (dbarach) wrote:

> I've built a wireshark dissector for fd.io vpp graph dispatcher pcap traces.
> Please see https://fdio-vpp.readthedocs.io/en/latest/ for a description of
> the code base / project, etc.
>
> For development purposes, I borrowed one of the USERxxx encap types. Please
> allocate a LINKTYPE_/DLT_ type for this file format, so I can upstream the
> dissector.
>
> Thanks... Dave Barach
> Fd.io vpp PTL
>
> Trace Record format
> -------------------
>
> VPP graph dispatch trace record description, in network byte order. Integers
> wider than 8 bits are in little endian byte order.

"Byte order" doesn't apply to 8-bit fields; if all fields are in
little-endian byte order, what, if anything, is in network byte order
(big-endian)?

And is everything guaranteed to be in little-endian byte order *even if* the
tracing code is running on, for example, a Power ISA processor running in
big-endian mode, or on a z/Architecture processor (which *only* runs
big-endian)?

>>> Good point. It would be easy to trace the 1 x 32-bit and 1 x 16 bit
>>> quantities in for-real network byte order. I'll just go do that. Frankly,
>>> we haven't run the code base on a PowerPC or other big-endian processor in
>>> years. I'm fairly sure that the dispatch trace code would be the least of
>>> anyone's problems if/when we go there again.

>  0                   1                   2                   3
>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | Major Version | Minor Version |   Buffer index high 16 bits   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |   Buffer index low 16 bits    | Node Name Len | Node name ... |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> + Node name cont'd...          ...              |  NULL octet   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |              Primary buffer metadata (64 octets)              |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |  [Secondary buffer metadata (64 octets, major version > 1)]   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |  ASCII trace length 16 bits   |        ASCII trace ...        |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | ASCII trace cont'd ......                     |  NULL octet   |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |                    Packet data (up to 16K)                    |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Is there a page at any fd.io or VPP Web site that describes the header, to
which we could point?

> Notes: as of this writing, major version = 1, minor version = 0.

Presumably any code that can read major version M, minor version N will also
be able to read major version M, minor version K, for all values of K <= N.

>>> That's the goal, but since the paint is barely dry on v1.0 it would be
>>> slightly rash of me to make that claim...

> See below for pro forma definitions of the primary buffer metadata and
> primary opaque data. Please refer to fd.io vpp source code before you invest,
> send money, or write code: "git clone https://gerrit.fd.io/r/vpp"
>
> Trace records are generated by code in
> .../src/vlib/main.c:dispatch_pcap_trace(...).
>
> The secondary buffer metadata shown in the diagram above is NOT present in
> version 1 traces.

So if some future version 2 of the trace is defined, an update will be sent
to tcpdump-workers, describing the secondary buffer metadata?

For the fields defined in that header:

What is the buffer index?

>>> A 32-bit buffer handle which can be rapidly converted into either a virtual
>>> address or a physical address. It's highly useful as a filter in WS:
>>> since we trace e.g. 100 packets in ethernet-input, then 100 packets in
>>> ip4-input, etc.

Does the node name length include the terminating NUL? (Presumably anything
writing those files MUST, in the RFC 2119 sense, null-terminate strings, and
anything reading those files MUST NOT assume that the strings are
null-terminated; a count *and* a terminating NUL is redundant.)

Does the ASCII trace length include the terminating NUL? Is that just an
opaque string to display to the user, or are there any ways in which an
application can parse it?

>>> Yes, the NULL is included in the count. Yes, it's slightly redundant. Yes,
>>> it keeps people from shooting themselves in the foot when processing the
>>> data.

In an earlier mail on another list you said:

> Packet data c
Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.
On Nov 26, 2018, at 12:43 PM, Dave Barach (dbarach) wrote:

> On November 26, 2018, at 3:01 PM, Guy Harris wrote:
>
>> On Nov 26, 2018, at 6:03 AM, Dave Barach (dbarach) wrote:
>>
>>> VPP graph dispatch trace record description, in network byte order.
>>> Integers wider than 8 bits are in little endian byte order.
>>
>> "Byte order" doesn't apply to 8-bit fields; if all fields are in
>> little-endian byte order, what, if anything, is in network byte order
>> (big-endian)?
>>
>> And is everything guaranteed to be in little-endian byte order *even if* the
>> tracing code is running on, for example, a Power ISA processor running in
>> big-endian mode, or on a z/Architecture processor (which *only* runs
>> big-endian)?
>
> Good point. It would be easy to trace the 1 x 32-bit and 1 x 16 bit
> quantities in for-real network byte order. I'll just go do that. Frankly, we
> haven't run the code base on a PowerPC or other big-endian processor in
> years. I'm fairly sure that the dispatch trace code would be the least of
> anyone's problems if/when we go there again.

So, in other words, you meant "Integers wider than 8 bits are in *the byte
order of the host writing the trace*", not "...are in little-endian byte
order".

Either big-endian or little-endian byte order would work easily, as long as
it's standardized. Host-endian can be made to work, *but* it means that any
code that reads pcap or pcapng files has to byte-swap the VPP header if the
byte order claimed by the pcap file header or the pcapng section header
differs from the native byte order of the host reading the file; we have code
to do that in libpcap and in Wireshark's libwiretap, but we'd really prefer
not to have to introduce that here.

>>> Notes: as of this writing, major version = 1, minor version = 0.
>>
>> Presumably any code that can read major version M, minor version N will also
>> be able to read major version M, minor version K, for all values of K <= N.
>
> That's the goal, but since the paint is barely dry on v1.0 it would be
> slightly rash of me to make that claim...

That shouldn't just be a goal, it should be a definition of how the major and
minor versions work. This is similar to, for example, SunOS 4.x's shared
library version numbering: if you add new capabilities to a library, so that
programs using the new capabilities won't work with older versions of the
library, *but* the capabilities are added in a compatible fashion, so that
programs using only the capabilities of older versions of the library will
work with newer versions of the library, you increase the minor version,
*but* if you make incompatible changes (removing routines, changing the
signature of existing functions, etc.), you increase the major version.

So you should probably specify that's how the major and minor versions are
used.

>> For the fields defined in that header:
>>
>> What is the buffer index?
>
> A 32-bit buffer handle which can be rapidly converted into either a virtual
> address or a physical address. It's highly useful as a filter in WS: since
> we trace e.g. 100 packets in ethernet-input, then 100 packets in ip4-input,
> etc.

So "as a filter" means that if the handle value is equal to some particular
value - either an arbitrary value or the same value as another packet -
that's significant?

>> Does the node name length include the terminating NUL? (Presumably anything
>> writing those files MUST, in the RFC 2119 sense, null-terminate strings, and
>> anything reading those files MUST NOT assume that the strings are
>> null-terminated; a count *and* a terminating NUL is redundant.)
>>
>> Does the ASCII trace length include the terminating NUL? Is that just an
>> opaque string to display to the user, or are there any ways in which an
>> application can parse it?
>
> Yes, the NULL is included in the count.

The spec should indicate that.

> Yes, it's slightly redundant. Yes, it keeps people from shooting themselves
> in the foot when processing the data.

It doesn't prevent code that processes the data from having to check for a
terminating NUL, unless you're in *so* tightly-controlled an environment that
you can guarantee that you will *never* see maliciously-constructed files
that don't have a terminating NUL. Neither tcpdump nor Wireshark, for
example, is always run in environments like that.

>> In an earlier mail on another list you said:
>>
>>> Packet data can be anything: L2, L3, L4 or above. The vpp dissector knows
>>> from the node name what to expect. I have a [seriously incomplete as of
>>> this writing] table of the form:
>>>
>>> #define foreach_node_to_dissector_handle \
>>> _("ip6-lookup", "ipv6", ip6_dissector_handle) \
>>> _("ip4-input-no-checksum", "ip", ip4_dissector_handle) \
>>> _("ip4-lookup", "ip", ip4_dissector_handle) \
>>> _("ip4-local", "ip", ip4_dissector_handle) \
>>>
Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.
On Nov 26, 2018, at 12:43 PM, Dave Barach (dbarach) wrote:

> On November 26, 2018, at 3:01 PM, Guy Harris wrote:
>
>> So which of those structures describes the primary metadata?
>
> vlib_buffer_t. The key fields are flags, current_data, and current_length.

So that's:

> /* VLIB buffer representation. */
> typedef struct
> {
>   /* Offset within data[] that we are currently processing.
>      If negative current header points into predata area. */
>   i16 current_data;
>   /**< signed offset in data[], pre_data[] that we are currently
>        processing. If negative current header points into predata area. */
>   u16 current_length;
>   /**< Nbytes between current data and the end of this buffer. */
>   u32 flags;
>   /**< buffer flags */
>   u32 flow_id;
>   /**< Generic flow identifier */
>   u32 next_buffer;
>   /**< Next buffer for this linked-list of buffers.
>        Only valid if VLIB_BUFFER_NEXT_PRESENT flag is set. */
>   u32 current_config_index;
>   /**< Used by feature subgraph arcs to visit enabled feature nodes */
>   u16 error;
>   /**< Error code for buffers to be enqueued to error handler. */
>   u8 n_add_refs;
>   /**< Number of additional references to this buffer. */
>   u8 buffer_pool_index;
>   /**< index of buffer pool this buffer belongs. */
>   u32 opaque[10];
>   /**< Opaque data used by sub-graphs for their own purposes. See above */
>   u32 trace_index;
>   /**< Specifies index into trace buffer
>        if VLIB_PACKET_IS_TRACED flag is set. */
>   u32 recycle_count;
>   /**< Used by L2 path recycle code */
>   u32 total_length_not_including_first_buffer;
>   /**< Only valid for first buffer in chain. Current length plus
>        total length given here give total number of bytes in buffer chain. */
>   u8 free_list_index;
>   /**< only used if VLIB_BUFFER_NON_DEFAULT_FREELIST flag is set */
>   u8 align_pad[3];
>   /**< available */
>   u32 opaque2[12];
>   /**< More opaque data, see ../vnet/vnet/buffer.h */
>
>   /* end of second cache line */
>   u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE];
>   /**< Space for inserting data before buffer start. Packet rewrite
>        string will be rewritten backwards and may extend back before
>        buffer->data[0]. Must come directly before packet data. */
>
>   u8 data[0];
>   /**< Packet data. Hardware DMA here */
> } vlib_buffer_t; /* Must be a multiple of 64B. */

which is 128 bytes followed by VLIB_BUFFER_PRE_DATA_SIZE bytes of data.

Which 64 of those 128 bytes are the primary metadata?
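
For anyone following the thread, here is a minimal sketch in C of how a reader might parse the start of a version 1 record while respecting the points raised above: every access is bounds-checked, and the terminating NUL is verified rather than trusted. It is an illustration under stated assumptions, not code from vpp, libpcap, or Wireshark. The names `vpp_trace_prefix_t` and `vpp_parse_prefix` are hypothetical. The buffer index is assumed to be big-endian, which is the intent stated in this thread but not something the thread confirms was implemented. The node name length is taken to include the NUL, as stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the start of a version 1 trace record. */
typedef struct {
    uint8_t major;
    uint8_t minor;
    uint32_t buffer_index;
    const char *node_name;  /* points into the caller's buffer */
    size_t node_name_len;   /* length without the terminating NUL */
    size_t consumed;        /* octets parsed; buffer metadata follows */
} vpp_trace_prefix_t;

/* Returns 0 on success, -1 on a truncated or malformed record. */
int vpp_parse_prefix(const uint8_t *p, size_t len, vpp_trace_prefix_t *out)
{
    /* 2 version octets + 4 buffer index octets + 1 name length octet */
    if (len < 7)
        return -1;

    out->major = p[0];
    out->minor = p[1];
    /* This sketch only understands major version 1 (an assumption). */
    if (out->major != 1)
        return -1;

    /* ASSUMPTION: big-endian (network byte order) buffer index. */
    out->buffer_index = ((uint32_t)p[2] << 24) | ((uint32_t)p[3] << 16) |
                        ((uint32_t)p[4] << 8) | (uint32_t)p[5];

    /* The count includes the NUL, so zero is never valid. */
    size_t name_len = p[6];
    if (name_len == 0 || len - 7 < name_len)
        return -1;

    /* Do not trust the writer: reject names that are not NUL-terminated. */
    if (p[7 + name_len - 1] != '\0')
        return -1;

    out->node_name = (const char *)(p + 7);
    /* strlen is safe here because termination was checked above. */
    out->node_name_len = strlen(out->node_name);
    out->consumed = 7 + name_len;
    return 0;
}
```

A caller would pass the captured record and its captured length, then continue at offset `consumed` for the 64 octets of primary buffer metadata. The ASCII trace string would need the same length and NUL checks.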