From: "Wesierski, Dawid" <[email protected]>

Introduce rte_pcapng_copy_ts() alongside the existing rte_pcapng_copy()
so that callers with a hardware PTP or pre-captured timestamp can inject
an exact epoch-ns value directly into the packet record.

Timestamp handling in rte_pcapng_copy_ts():
- ts != 0: caller-supplied nanoseconds since the Unix epoch, stored as-is.
- ts == 0: TSC captured at copy time with bit 63 set as a sentinel.
  rte_pcapng_write_packets() detects the sentinel and converts the TSC
  to epoch ns using the file's calibrated clock.

The TSC will not reach bit 63 for decades of uptime (2^63 cycles is about
97 years at 3 GHz), and epoch-ns values stay below bit 63 until the year
2262 (2^63 ns is about 292 years), so the bit is safe to use as a
disambiguation flag.

rte_pcapng_copy() is retained as a real exported function (not an inline
wrapper) so the stable ABI symbol is preserved. It simply calls
rte_pcapng_copy_ts(..., 0) to capture the current TSC.

rte_pcapng_tsc_to_ns() is added as a new experimental helper (addressing
review requests from Stephen Hemminger and Morten Brørup). It exposes the
same calibrated, drift-compensated, divide-free TSC-to-epoch-ns conversion
used internally by rte_pcapng_write_packets(), allowing callers to convert
a TSC captured at packet arrival time before passing it to
rte_pcapng_copy_ts().

Signed-off-by: Marek Kasiewicz <[email protected]>
Signed-off-by: Dawid Wesierski <[email protected]>
---
Hi Stephen, Morten,

Thank you very much for your review and comments.
I have prepared a v4 patch.

ABI failure:
I have restored rte_pcapng_copy() as a real exported function instead of
a static inline wrapper. This should fix the iol-abi-testing failure.
It now simply calls rte_pcapng_copy_ts(..., 0) internally.

As suggested, I've added a new experimental function:

  uint64_t rte_pcapng_tsc_to_ns(const rte_pcapng_t *self, uint64_t tsc);

It exposes the conversion based on the internal calibrated clock state
maintained by the pcapng handle.

Regards,
Dawid Węsierski
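For reviewers who want to check the bit-63 sentinel scheme described in the commit message in isolation, here is a minimal standalone sketch. It does not use DPDK. TS_TSC_FLAG, fake_tsc_to_ns(), encode_timestamp() and decode_timestamp() are hypothetical names chosen for illustration, and the conversion is a trivial stand-in for the calibrated per-file clock.

```c
#include <stdint.h>

/* Bit 63 marks "this value is a raw TSC count, not epoch nanoseconds". */
#define TS_TSC_FLAG (UINT64_C(1) << 63)

/*
 * Hypothetical stand-in for the calibrated per-file conversion.
 * Assumption: 1 cycle == 1 ns, with a fixed 1000 ns base offset.
 */
static uint64_t
fake_tsc_to_ns(uint64_t tsc)
{
	return UINT64_C(1000) + tsc;
}

/* Copy-time step: keep a caller-supplied epoch-ns value, else tag the TSC. */
static uint64_t
encode_timestamp(uint64_t ts, uint64_t tsc_now)
{
	return ts != 0 ? ts : (tsc_now | TS_TSC_FLAG);
}

/* Write-time step: convert only values that carry the sentinel. */
static uint64_t
decode_timestamp(uint64_t stored)
{
	if (stored & TS_TSC_FLAG)
		return fake_tsc_to_ns(stored & ~TS_TSC_FLAG);
	return stored;
}
```

With these definitions, decode_timestamp(encode_timestamp(ts, tsc)) returns ts unchanged for any nonzero ts below bit 63, and the converted TSC otherwise. A caller-supplied value with bit 63 already set would be treated as a TSC, which is why the scheme depends on the range argument above.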
 .mailmap                |  2 ++
 lib/pcapng/rte_pcapng.c | 71 +++++++++++++++++++++++++++++++++--------
 lib/pcapng/rte_pcapng.h | 64 +++++++++++++++++++++++++++++++++++++
 3 files changed, 124 insertions(+), 13 deletions(-)

diff --git a/.mailmap b/.mailmap
index 4001e5fb0e..a7d97a631e 100644
--- a/.mailmap
+++ b/.mailmap
@@ -366,6 +366,7 @@ David Zeng <[email protected]>
 Davide Caratti <[email protected]>
 Dawid Gorecki <[email protected]>
 Dawid Jurczak <[email protected]>
+Dawid Wesierski <[email protected]> Wesierski, Dawid <[email protected]>
 Dawid Zielinski <[email protected]>
 Dawid Łukwiński <[email protected]>
 Daxue Gao <[email protected]>
@@ -1014,6 +1015,7 @@ Marcin Wilk <[email protected]>
 Marcin Wojtas <[email protected]>
 Marcin Zapolski <[email protected]>
 Marco Varlese <[email protected]>
+Marek Kasiewicz <[email protected]>
 Marek Mical <[email protected]>
 Marek Zalfresso-jundzillo <[email protected]>
 Maria Lingemark <[email protected]>
diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c
index b5d1026891..f583fae995 100644
--- a/lib/pcapng/rte_pcapng.c
+++ b/lib/pcapng/rte_pcapng.c
@@ -546,14 +546,14 @@ pcapng_vlan_insert(struct rte_mbuf *m, uint16_t ether_type, uint16_t tci)
  */
 
 /* Make a copy of original mbuf with pcapng header and options */
-RTE_EXPORT_SYMBOL(rte_pcapng_copy)
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_pcapng_copy_ts, 26.07)
 struct rte_mbuf *
-rte_pcapng_copy(uint16_t port_id, uint32_t queue,
+rte_pcapng_copy_ts(uint16_t port_id, uint32_t queue,
 		const struct rte_mbuf *md,
 		struct rte_mempool *mp,
 		uint32_t length,
 		enum rte_pcapng_direction direction,
-		const char *comment)
+		const char *comment, uint64_t ts)
 {
 	struct pcapng_enhance_packet_block *epb;
 	uint32_t orig_len, pkt_len, padding, flags;
@@ -690,8 +690,20 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	/* Interface index is filled in later during write */
 	mc->port = port_id;
 
-	/* Put timestamp in cycles here - adjust in packet write */
-	timestamp = rte_get_tsc_cycles();
+	/*
+	 * Timestamp handling:
+	 * - If the caller supplied an explicit timestamp (ts != 0), it is
+	 *   already in nanoseconds since the Unix epoch, so store it as-is.
+	 * - If the caller did not (ts == 0), store the current TSC and set
+	 *   the high bit as a sentinel so rte_pcapng_write_packets() knows
+	 *   it must convert TSC -> epoch ns at write time. The TSC counter
+	 *   will not reach bit 63 for decades of uptime, and epoch-ns values
+	 *   stay below bit 63 until the year 2262, so the bit is safe to use.
+	 */
+	if (ts != 0)
+		timestamp = ts;
+	else
+		timestamp = rte_get_tsc_cycles() | (UINT64_C(1) << 63);
 	epb->timestamp_hi = timestamp >> 32;
 	epb->timestamp_lo = (uint32_t)timestamp;
 	epb->capture_length = pkt_len;
@@ -707,6 +719,35 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 	return NULL;
 }
 
+/*
+ * Compatibility wrapper: captures current TSC (converted at write time).
+ * Equivalent to rte_pcapng_copy_ts(..., 0).
+ */
+RTE_EXPORT_SYMBOL(rte_pcapng_copy)
+struct rte_mbuf *
+rte_pcapng_copy(uint16_t port_id, uint32_t queue,
+		const struct rte_mbuf *md,
+		struct rte_mempool *mp,
+		uint32_t length,
+		enum rte_pcapng_direction direction,
+		const char *comment)
+{
+	return rte_pcapng_copy_ts(port_id, queue, md, mp, length, direction,
+				  comment, 0);
+}
+
+/*
+ * Convert a TSC value to nanoseconds since the Unix epoch using the
+ * calibrated clock of the capture file. Uses the same pre-computed
+ * reciprocal multiplier as the internal write path (no integer division).
+ */
+RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_pcapng_tsc_to_ns, 26.07)
+uint64_t
+rte_pcapng_tsc_to_ns(const rte_pcapng_t *self, uint64_t tsc)
+{
+	return tsc_to_ns_epoch(&self->clock, tsc);
+}
+
 /* Write pre-formatted packets to file. */
 RTE_EXPORT_SYMBOL(rte_pcapng_write_packets)
 ssize_t
@@ -720,7 +761,7 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 	for (i = 0; i < nb_pkts; i++) {
 		struct rte_mbuf *m = pkts[i];
 		struct pcapng_enhance_packet_block *epb;
-		uint64_t cycles, timestamp;
+		uint64_t timestamp;
 
 		/* sanity check that is really a pcapng mbuf */
 		epb = rte_pktmbuf_mtod(m, struct pcapng_enhance_packet_block *);
@@ -738,14 +779,18 @@ rte_pcapng_write_packets(rte_pcapng_t *self,
 		}
 
 		/*
-		 * When data is captured by pcapng_copy the current TSC is stored.
-		 * Adjust the value recorded in file to PCAP epoch units.
+		 * If rte_pcapng_copy[_ts]() stored a TSC value (high bit set
+		 * as sentinel), convert it to nanoseconds since the Unix epoch
+		 * using the per-file clock. Otherwise the timestamp is already
+		 * in epoch ns and is written unchanged.
 		 */
-		cycles = (uint64_t)epb->timestamp_hi << 32;
-		cycles += epb->timestamp_lo;
-		timestamp = tsc_to_ns_epoch(&self->clock, cycles);
-		epb->timestamp_hi = timestamp >> 32;
-		epb->timestamp_lo = (uint32_t)timestamp;
+		timestamp = ((uint64_t)epb->timestamp_hi << 32) | epb->timestamp_lo;
+		if (timestamp & (UINT64_C(1) << 63)) {
+			timestamp &= ~(UINT64_C(1) << 63);
+			timestamp = tsc_to_ns_epoch(&self->clock, timestamp);
+			epb->timestamp_hi = timestamp >> 32;
+			epb->timestamp_lo = (uint32_t)timestamp;
+		}
 
 		/*
 		 * Handle case of highly fragmented and large burst size
diff --git a/lib/pcapng/rte_pcapng.h b/lib/pcapng/rte_pcapng.h
index d8d328f710..6eeaeada05 100644
--- a/lib/pcapng/rte_pcapng.h
+++ b/lib/pcapng/rte_pcapng.h
@@ -108,9 +108,50 @@ enum rte_pcapng_direction {
 	RTE_PCAPNG_DIRECTION_OUT = 2,
 };
 
+/**
+ * Format an mbuf with a caller-supplied timestamp for writing to file.
+ *
+ * @param port_id
+ *   The Ethernet port on which packet was received
+ *   or is going to be transmitted.
+ * @param queue
+ *   The queue on the Ethernet port where packet was received
+ *   or is going to be transmitted.
+ * @param mp
+ *   The mempool from which the "clone" mbufs are allocated.
+ * @param m
+ *   The mbuf to copy
+ * @param length
+ *   The upper limit on bytes to copy.  Passing UINT32_MAX
+ *   means all data (after offset).
+ * @param direction
+ *   The direction of the packet: receive, transmit or unknown.
+ * @param comment
+ *   Optional per packet comment.
+ *   Truncated to UINT16_MAX characters.
+ * @param ts
+ *   Packet timestamp in nanoseconds since the Unix epoch. If zero, the
+ *   current TSC is captured and converted to epoch ns by
+ *   rte_pcapng_write_packets() when the packet is written.
+ *
+ * @return
+ *   - The pointer to the new mbuf formatted for pcapng_write
+ *   - NULL on error such as invalid port or out of memory.
+ */
+__rte_experimental
+struct rte_mbuf *
+rte_pcapng_copy_ts(uint16_t port_id, uint32_t queue,
+		   const struct rte_mbuf *m, struct rte_mempool *mp,
+		   uint32_t length,
+		   enum rte_pcapng_direction direction, const char *comment,
+		   uint64_t ts);
+
 /**
  * Format an mbuf for writing to file.
  *
+ * Equivalent to rte_pcapng_copy_ts() with ts=0: the current TSC is
+ * captured at copy time and converted to epoch ns at write time.
+ *
  * @param port_id
  *   The Ethernet port on which packet was received
  *   or is going to be transmitted.
@@ -153,6 +194,29 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue,
 uint32_t
 rte_pcapng_mbuf_size(uint32_t length);
 
+/**
+ * Convert a TSC value to nanoseconds since the Unix epoch.
+ *
+ * Uses the same calibrated clock reference as the capture file so that
+ * the result is consistent with timestamps written by
+ * rte_pcapng_write_packets(). The conversion is drift-compensated and
+ * uses a pre-computed reciprocal multiplier (no integer division).
+ *
+ * Typical use: convert a TSC timestamp captured close to packet arrival
+ * (e.g., from a PMD or hardware register) to an epoch-ns value before
+ * passing it to rte_pcapng_copy_ts().
+ *
+ * @param self
+ *   The handle to the packet capture file.
+ * @param tsc
+ *   TSC value to convert.
+ * @return
+ *   Nanoseconds since the Unix epoch corresponding to @p tsc.
+ */
+__rte_experimental
+uint64_t
+rte_pcapng_tsc_to_ns(const rte_pcapng_t *self, uint64_t tsc);
+
 /**
  * Write packets to the capture file.
  *
-- 
2.47.3
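The rte_pcapng_tsc_to_ns() documentation in the patch mentions a pre-computed reciprocal multiplier that avoids integer division. As a rough illustration of that general technique only, the sketch below uses a fixed-point multiply and shift. It is not the DPDK implementation: struct clock_conv, its fields and both functions are hypothetical, drift compensation is left out, and unsigned __int128 is a GCC/Clang extension.

```c
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* Hypothetical calibration record; field names are illustrative only. */
struct clock_conv {
	uint64_t tsc_base; /* TSC value sampled at calibration time */
	uint64_t ns_base;  /* epoch nanoseconds sampled at the same moment */
	uint64_t mult;     /* round(NSEC_PER_SEC * 2^32 / tsc_hz) */
};

/* One-time setup: the only division in the scheme happens here. */
static void
clock_conv_init(struct clock_conv *c, uint64_t tsc_hz,
		uint64_t tsc_now, uint64_t ns_now)
{
	c->tsc_base = tsc_now;
	c->ns_base = ns_now;
	c->mult = ((NSEC_PER_SEC << 32) + tsc_hz / 2) / tsc_hz;
}

/*
 * Per-packet conversion: multiply and shift, no division.
 * Assumes tsc >= c->tsc_base; the 128-bit product avoids overflow
 * for large cycle deltas.
 */
static uint64_t
clock_conv_tsc_to_ns(const struct clock_conv *c, uint64_t tsc)
{
	uint64_t delta = tsc - c->tsc_base;
	unsigned __int128 prod = (unsigned __int128)delta * c->mult;

	return c->ns_base + (uint64_t)(prod >> 32);
}
```

For a 2 GHz counter the multiplier comes out as exactly 2^31, so a delta of 2000 cycles maps to 1000 ns past the base. Frequencies that do not divide evenly pick up a small rounding error that grows with the delta, which is one reason a real implementation would recalibrate periodically.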

