On Thu, Jan 31, 2019 at 11:03:56PM -0800, Martin KaFai Lau wrote:
> In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)"
> before accessing the fields in sock.  For example, in __netdev_pick_tx:
>
> static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
>                             struct net_device *sb_dev)
> {
>         /* ... */
>
>         struct sock *sk = skb->sk;
>
>         if (queue_index != new_index && sk &&
>             sk_fullsock(sk) &&
>             rcu_access_pointer(sk->sk_dst_cache))
>                 sk_tx_queue_set(sk, new_index);
>
>         /* ... */
>
>         return queue_index;
> }
>
> This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff"
...
> Some of the fields in "bpf_sock" will not be directly
> accessible through the "__sk_buff->sk" pointer.
...
> The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)"
> can be used to get a sk with all accessible fields in "bpf_sock".
> This helper is added to both cg_skb and sched_(cls|act).
>
> int cg_skb_foo(struct __sk_buff *skb) {
>         struct bpf_sock *sk;
>         __u32 family;
>
>         sk = skb->sk;
>         if (!sk)
>                 return 1;
>
>         sk = bpf_sk_fullsock(sk);
>         if (!sk)
>                 return 1;
>
>         if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP)
>                 return 1;
>
>         /* some_traffic_shaping(); */
>
>         return 1;
> }
>
> (1) The sk is read only
>
> (2) There is no new "struct bpf_sock_common" introduced.
>
> (3) Future kernel sock's members could be added to bpf_sock only
>     instead of repeatedly adding at multiple places like currently
>     in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc.

All,

this patchset sets a direction for how bpf programs of networking types
should access kernel socket data structures.
It makes bpf program access to sk_common, sk, and tcp_sock fields look and
feel like kernel code.
We think it's the most flexible approach, and it fixes the copy-paste issue
of the existing api.
I wish we had thought of it earlier :)
Please review.

For the patch set:
Acked-by: Alexei Starovoitov <a...@kernel.org>
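
For readers following along outside the kernel tree, the two-step check in the quoted cg_skb_foo() example (first a NULL check on skb->sk, then a second NULL check on the result of the fullsock helper) can be modeled in plain userspace C. This is only a sketch under stated assumptions: every mock_* and MOCK_* name below is a hypothetical stand-in invented for illustration, not the kernel's struct sock, the BPF UAPI struct bpf_sock, or the real bpf_sk_fullsock() helper. The only thing it tries to show is why the narrowed pointer has to be re-checked before family and protocol are read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the kernel or BPF UAPI definitions. */
enum mock_sk_kind { MOCK_SK_FULL, MOCK_SK_REQUEST, MOCK_SK_TIMEWAIT };

#define MOCK_AF_INET6    10u
#define MOCK_IPPROTO_TCP 6u

struct mock_sock {
    enum mock_sk_kind kind;
    uint32_t family;   /* treated as readable only on a full socket */
    uint32_t protocol; /* treated as readable only on a full socket */
};

struct mock_skb {
    struct mock_sock *sk; /* may be NULL, like skb->sk */
};

/* Models the helper's contract: return sk if it is a full socket, else NULL. */
static struct mock_sock *mock_sk_fullsock(struct mock_sock *sk)
{
    if (sk == NULL || sk->kind != MOCK_SK_FULL)
        return NULL;
    return sk;
}

/* Returns 1 when the packet belongs to a full IPv6/TCP socket, else 0. */
static int mock_should_shape(const struct mock_skb *skb)
{
    struct mock_sock *sk;

    assert(skb != NULL);

    sk = skb->sk;
    if (!sk)            /* step 1: no socket attached at all */
        return 0;

    sk = mock_sk_fullsock(sk);
    if (!sk)            /* step 2: attached, but not a full socket */
        return 0;

    return sk->family == MOCK_AF_INET6 && sk->protocol == MOCK_IPPROTO_TCP;
}
```

In this model a packet with no socket and a packet whose socket is a request or timewait socket are both rejected before any field is read, which is the same shape as the two early returns in the quoted example.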