On Tue, Oct 16, 2018 at 10:56:05PM -0700, Song Liu wrote:
> BPF programs of BPF_PROG_TYPE_CGROUP_SKB need to access headers in the
> skb. This patch enables direct access of skb for these programs.

The lack of direct packet access in CGROUP_SKB progs was an unpleasant
surprise to me, so thank you for fixing it, but there are a few issues
with the patch. See below.

> In __cgroup_bpf_run_filter_skb(), bpf_compute_data_pointers() is called
> to compute proper data_end for the BPF program.
>
> Signed-off-by: Song Liu <songliubrav...@fb.com>
> ---
>  kernel/bpf/cgroup.c |  4 ++++
>  net/core/filter.c   | 26 +++++++++++++++++++++++++-
>  2 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> index 00f6ed2e4f9a..340d496f35bd 100644
> --- a/kernel/bpf/cgroup.c
> +++ b/kernel/bpf/cgroup.c
> @@ -566,6 +566,10 @@ int __cgroup_bpf_run_filter_skb(struct sock *sk,
>  	save_sk = skb->sk;
>  	skb->sk = sk;
>  	__skb_push(skb, offset);
> +
> +	/* compute pointers for the bpf prog */
> +	bpf_compute_data_pointers(skb);
> +
>  	ret = BPF_PROG_RUN_ARRAY(cgrp->bpf.effective[type], skb,
>  				 bpf_prog_run_save_cb);
>  	__skb_pull(skb, offset);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1a3ac6c46873..8b5a502e241f 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5346,6 +5346,30 @@ static bool sk_filter_is_valid_access(int off, int size,
>  	return bpf_skb_is_valid_access(off, size, type, prog, info);
>  }
>
> +static bool cg_skb_is_valid_access(int off, int size,
> +				   enum bpf_access_type type,
> +				   const struct bpf_prog *prog,
> +				   struct bpf_insn_access_aux *info)
> +{
> +	if (type == BPF_WRITE)
> +		return false;

This disables writes into cb[0..4] that were allowed for cgroup_inet_*
before. One can argue that this may break existing progs, but looking at
the place where BPF_CGROUP_RUN_PROG_INET_INGRESS is called, it seems it's
actually not correct in all cases to access cb there. Just a few lines
down we call bpf_prog_run_save_cb(), which saves and restores these
20 bytes (cb[0..4] is five 32-bit words).
So we have two options: either add save/restore for INET_INGRESS only,
or disable read and write access to cb[0..4] for CGROUP_SKB progs.
I prefer the former.
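
To make the save/restore option concrete, here is a minimal userspace model of the idea, not kernel code. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in introduced for illustration: the 48-byte cb array, the 20-byte saved window, and the program callback type are assumptions of this sketch, and the real bpf_prog_run_save_cb() may differ in detail.

```c
#include <stdint.h>
#include <string.h>

/* Assumed size of the window exposed to the program: cb[0..4], 5 x 32 bits. */
#define MOCK_CB_LEN 20

/* Hypothetical stand-in for the part of an skb that matters here. */
struct mock_skb {
	uint8_t cb[48];
};

/* Hypothetical program type: free to read and overwrite the cb window. */
typedef int (*mock_prog_fn)(struct mock_skb *skb);

/*
 * Save the cb window, hand the program a zeroed one, then put the
 * original bytes back so the caller never sees the program's writes.
 */
static int mock_run_save_cb(mock_prog_fn prog, struct mock_skb *skb)
{
	uint8_t saved[MOCK_CB_LEN];
	int ret;

	memcpy(saved, skb->cb, sizeof(saved));
	memset(skb->cb, 0, sizeof(saved));
	ret = prog(skb);
	memcpy(skb->cb, saved, sizeof(saved));

	return ret;
}

/* Records the first cb byte the program observed. */
static uint8_t mock_seen_cb0;

/* Example program: looks at cb[0], then scribbles over the whole window. */
static int mock_prog_scribble(struct mock_skb *skb)
{
	mock_seen_cb0 = skb->cb[0];
	memset(skb->cb, 0xff, MOCK_CB_LEN);
	return 1;
}
```

In this model the program always starts from a cleared window and its writes never leak back to the caller, which is the property the INET_INGRESS call site would need if cb writes stay allowed.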

> +
> +	switch (off) {
> +	case bpf_ctx_range(struct __sk_buff, len):
> +		break;
> +	case bpf_ctx_range(struct __sk_buff, data):
> +		info->reg_type = PTR_TO_PACKET;
> +		break;
> +	case bpf_ctx_range(struct __sk_buff, data_end):
> +		info->reg_type = PTR_TO_PACKET_END;
> +		break;
> +	default:
> +		return false;
> +	}

This also enables access to the range of fields family..local_port.
That is OK for egress, but not for ingress unless we add code similar
to the bottom of sk_filter_trim_cap() that inits skb->sk.
The above change also allows access to data_meta and flow_keys, which
is not correct.

Considering all that, I'm proposing to fix the INET_INGRESS call site
similarly to the code below it in sk_filter_trim_cap().
In particular, to do:

	struct sock *save_sk = skb->sk;
	skb->sk = sk;
	save and clear cb
	BPF_CGROUP_RUN_PROG_INET_INGRESS
	restore cb
	skb->sk = save_sk;

All of the above can probably go inside the
BPF_CGROUP_RUN_PROG_INET_INGRESS macro.
Then in this cg_skb_is_valid_access() allow access to data/data_end and
to the family..local_port range as well, while disallowing access to
flow_keys and data_meta.
In patch 2 we need tests for all of these fields.

Thoughts?

> +
> +	return bpf_skb_is_valid_access(off, size, type, prog, info);
> +}
> +
>  static bool lwt_is_valid_access(int off, int size,
>  				enum bpf_access_type type,
>  				const struct bpf_prog *prog,
> @@ -7038,7 +7062,7 @@ const struct bpf_prog_ops xdp_prog_ops = {
>
>  const struct bpf_verifier_ops cg_skb_verifier_ops = {
>  	.get_func_proto		= cg_skb_func_proto,
> -	.is_valid_access	= sk_filter_is_valid_access,
> +	.is_valid_access	= cg_skb_is_valid_access,
>  	.convert_ctx_access	= bpf_convert_ctx_access,
>  };
>
> --
> 2.17.1
>
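
The access policy proposed above for cg_skb_is_valid_access() can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: struct mock_ctx is a hypothetical, reduced layout and NOT the real struct __sk_buff, the write rule (writes only into cb) assumes the save/restore option is taken, and the PTR_TO_PACKET / PTR_TO_PACKET_END register typing done by the verifier is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced context layout; NOT the real struct __sk_buff. */
struct mock_ctx {
	uint32_t len;
	uint32_t cb[5];
	uint32_t data;
	uint32_t data_end;
	uint32_t family;
	uint32_t remote_ip4;
	uint32_t local_ip4;
	uint32_t remote_port;
	uint32_t local_port;
	uint32_t data_meta;
	uint64_t flow_keys;
};

enum mock_access_type { MOCK_READ, MOCK_WRITE };

#define MOCK_OFF(f) offsetof(struct mock_ctx, f)
#define MOCK_END(f) (MOCK_OFF(f) + sizeof(((struct mock_ctx *)0)->f))

/* True if [off, off + size) touches any byte of [start, end). */
static bool mock_overlaps(size_t off, size_t size, size_t start, size_t end)
{
	return off < end && off + size > start;
}

static bool mock_cg_skb_is_valid_access(size_t off, size_t size,
					enum mock_access_type type)
{
	/* Generic sanity: non-empty, naturally aligned, inside the context. */
	if (size == 0 || off % size != 0 || off + size > sizeof(struct mock_ctx))
		return false;

	/* data_meta and flow_keys stay off limits for reads and writes. */
	if (mock_overlaps(off, size, MOCK_OFF(data_meta), MOCK_END(data_meta)) ||
	    mock_overlaps(off, size, MOCK_OFF(flow_keys), MOCK_END(flow_keys)))
		return false;

	/* Assumption: writes remain limited to the cb[0..4] window. */
	if (type == MOCK_WRITE)
		return off >= MOCK_OFF(cb) && off + size <= MOCK_END(cb);

	/* Reads of len, cb, data/data_end and family..local_port pass. */
	return true;
}
```

A test for patch 2 could walk every field in the same way: one read and one write per field, each checked against the expected verdict.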