On Thu, Oct 12, 2017 at 03:48:07PM -0700, Cong Wang wrote:
> We need a real-time notification for tcp retransmission
> for monitoring.
> 
> Of course we could use ftrace to dynamically instrument this
> kernel function too, however we can't retrieve the connection
> information at the same time, for example perf-tools [1] reads
> /proc/net/tcp for socket details, which is slow when we have
> a lot of connections.
> 
> Therefore, this patch adds a tracepoint for tcp_retransmit_skb()
> and exposes src/dst IP addresses and ports of the connection.
> This also makes it easier to integrate into perf.
> 
> Note, I expose both IPv4 and IPv6 addresses at the same time:
> for an IPv4 socket, the v4-mapped address is used as the IPv6 address,
> for an IPv6 socket, LOOPBACK4_IPV6 is already filled in by the kernel.
> Also, add sk and skb pointers as they are useful for BPF.
> 
> 1. https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans
> 
> Cc: Eric Dumazet <eduma...@google.com>
> Cc: Alexei Starovoitov <alexei.starovoi...@gmail.com>
> Cc: Hannes Frederic Sowa <han...@stressinduktion.org>
> Cc: Brendan Gregg <brendan.d.gr...@gmail.com>
> Cc: Neal Cardwell <ncardw...@google.com>
> Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com>
> ---
>  include/trace/events/tcp.h | 68 ++++++++++++++++++++++++++++++++++++++++++++++
>  net/core/net-traces.c      |  1 +
>  net/ipv4/tcp_output.c      |  3 ++
>  3 files changed, 72 insertions(+)
>  create mode 100644 include/trace/events/tcp.h
> 
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> new file mode 100644
> index 000000000000..749f93c542ab
> --- /dev/null
> +++ b/include/trace/events/tcp.h
> @@ -0,0 +1,68 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM tcp
> +
> +#if !defined(_TRACE_TCP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_TCP_H
> +
> +#include <linux/ipv6.h>
> +#include <linux/tcp.h>
> +#include <linux/tracepoint.h>
> +#include <net/ipv6.h>
> +
> +TRACE_EVENT(tcp_retransmit_skb,
> +
> +     TP_PROTO(struct sock *sk, struct sk_buff *skb, int segs),
> +
> +     TP_ARGS(sk, skb, segs),
> +
> +     TP_STRUCT__entry(
> +             __field(void *, skbaddr)
> +             __field(void *, skaddr)
> +             __field(__u16, sport)
> +             __field(__u16, dport)
> +             __array(__u8, saddr, 4)
> +             __array(__u8, daddr, 4)
> +             __array(__u8, saddr_v6, 16)
> +             __array(__u8, daddr_v6, 16)
> +     ),
...
>       if (likely(!err)) {
>               TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
> +             trace_tcp_retransmit_skb(sk, skb, segs);

Looks great to me, but why is 'segs' there?
It's unused.
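
For readers unfamiliar with the v4-mapped form mentioned in the commit
message, here is a minimal userspace sketch of the layout (::ffff:a.b.c.d).
It is not taken from the patch: the helper name v4_to_mapped_v6() is
hypothetical, and the kernel side may well fill these fields differently.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not part of the patch).
 * Builds the v4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4 address
 * that is already in network byte order.
 */
static void v4_to_mapped_v6(const struct in_addr *v4, struct in6_addr *v6)
{
	/* Bytes 0..9 are zero. */
	memset(v6, 0, sizeof(*v6));

	/* Bytes 10..11 are 0xffff, marking the address as v4-mapped. */
	v6->s6_addr[10] = 0xff;
	v6->s6_addr[11] = 0xff;

	/* Bytes 12..15 carry the IPv4 address unchanged. */
	memcpy(&v6->s6_addr[12], &v4->s_addr, sizeof(v4->s_addr));
}
```

A consumer that sees such an address in the 16-byte fields can detect it
with IN6_IS_ADDR_V4MAPPED() and read the IPv4 address from the last four
bytes.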
