On Thu, Jan 31, 2019 at 11:03:56PM -0800, Martin KaFai Lau wrote:
> In kernel, it is common to check "skb->sk && sk_fullsock(skb->sk)"
> before accessing the fields in sock.  For example, in __netdev_pick_tx:
> 
> static u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
>                           struct net_device *sb_dev)
> {
>       /* ... */
> 
>       struct sock *sk = skb->sk;
> 
>               if (queue_index != new_index && sk &&
>                   sk_fullsock(sk) &&
>                   rcu_access_pointer(sk->sk_dst_cache))
>                       sk_tx_queue_set(sk, new_index);
> 
>       /* ... */
> 
>       return queue_index;
> }
> 
> This patch adds a "struct bpf_sock *sk" pointer to the "struct __sk_buff"
...
> Some of the fields in "bpf_sock" will not be directly
> accessible through the "__sk_buff->sk" pointer.
...
> The newly added "struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)"
> can be used to get a sk with all accessible fields in "bpf_sock".
> This helper is added to both cg_skb and sched_(cls|act).
> 
> int cg_skb_foo(struct __sk_buff *skb) {
>       struct bpf_sock *sk;
>       __u32 family;
> 
>       sk = skb->sk;
>       if (!sk)
>               return 1;
> 
>       sk = bpf_sk_fullsock(sk);
>       if (!sk)
>               return 1;
> 
>       if (sk->family != AF_INET6 || sk->protocol != IPPROTO_TCP)
>               return 1;
> 
>       /* some_traffic_shaping(); */
> 
>       return 1;
> }
> 
> (1) The sk is read only
> 
> (2) There is no new "struct bpf_sock_common" introduced.
> 
> (3) Future kernel sock's members could be added to bpf_sock only
>     instead of repeatedly adding at multiple places like currently
>     in bpf_sock_ops_md, bpf_sock_addr_md, sk_reuseport_md...etc.

All,

this patchset sets a direction on how access to kernel socket data structures
should be made from bpf programs of networking types.

It makes bpf program access to sk_common, sk, tcp_sock fields look and feel
like kernel code.
We think it's the most flexible approach and it fixes the copy-paste issue of
the existing api.
I wish we thought of it earlier :)
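
To illustrate the "check sk, then check fullsock, then read fields" pattern
outside the kernel, here is a minimal userspace sketch. Everything prefixed
with mock_ or MOCK_ is a hypothetical stand-in made up for this example; it is
not the kernel's struct bpf_sock, nor the real bpf_sk_fullsock() helper, and
it only models the NULL-on-non-full-socket behaviour described in the quoted
patch description.

```c
/* Userspace mock only: these types and helpers are hypothetical stand-ins,
 * not the kernel's struct bpf_sock or the real bpf_sk_fullsock() helper. */
#include <stddef.h>

#define MOCK_AF_INET6     10
#define MOCK_IPPROTO_TCP  6

struct mock_sock {
	int is_full;            /* 0 models a socket that is not a full sock */
	unsigned int family;
	unsigned int protocol;
};

struct mock_skb {
	struct mock_sock *sk;   /* may be NULL, like skb->sk */
};

/* Returns sk only when it is a full socket, otherwise NULL. */
static struct mock_sock *mock_sk_fullsock(struct mock_sock *sk)
{
	return (sk && sk->is_full) ? sk : NULL;
}

/* Returns 1 if the skb belongs to a full IPv6 TCP socket, otherwise 0. */
static int mock_is_tcp6(const struct mock_skb *skb)
{
	struct mock_sock *sk = skb->sk;

	if (!sk)                        /* no socket attached at all */
		return 0;

	sk = mock_sk_fullsock(sk);
	if (!sk)                        /* attached, but not a full socket */
		return 0;

	/* Only now is it safe to read the full-socket fields. */
	return sk->family == MOCK_AF_INET6 &&
	       sk->protocol == MOCK_IPPROTO_TCP;
}
```

Calling mock_is_tcp6() on an skb with a NULL sk, or with a sk whose is_full
is 0, returns 0 without touching family or protocol, which mirrors the two
early returns in cg_skb_foo() above.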

Please review.

For the patch set:
Acked-by: Alexei Starovoitov <a...@kernel.org>
