On 03/12/2018 08:23 PM, John Fastabend wrote:
> A single sendmsg or sendfile system call can contain multiple logical
> messages that a BPF program may want to read and apply a verdict. But,
> without an apply_bytes helper any verdict on the data applies to all
> bytes in the sendmsg/sendfile. Alternatively, a BPF program may only
> care to read the first N bytes of a msg. If the payload is large say
> MB or even GB setting up and calling the BPF program repeatedly for
> all bytes, even though the verdict is already known, creates
> unnecessary overhead.
>
> To allow BPF programs to control how many bytes a given verdict
> applies to we implement a bpf_msg_apply_bytes() helper. When called
> from within a BPF program this sets a counter, internal to the
> BPF infrastructure, that applies the last verdict to the next N
> bytes. If the N is smaller than the current data being processed
> from a sendmsg/sendfile call, the first N bytes will be sent and
> the BPF program will be re-run with start_data pointing to the N+1
> byte. If N is larger than the current data being processed the
> BPF verdict will be applied to multiple sendmsg/sendfile calls
> until N bytes are consumed.
>
> Note1 if a socket closes with apply_bytes counter non-zero this
> is not a problem because data is not being buffered for N bytes
> and is sent as its received.
>
> Note2 if this is operating in the sendpage context the data
> pointers may be zeroed after this call if the apply walks beyond
> a msg_pull_data() call specified data range. (helper implemented
> shortly in this series).
>
> Signed-off-by: John Fastabend <john.fastab...@gmail.com>
> ---
>  include/uapi/linux/bpf.h |  3 ++-
>  net/core/filter.c        | 16 ++++++++++++++++
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index b8275f0..e50c61f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -769,7 +769,8 @@ enum bpf_attach_type {
>  	FN(getsockopt),			\
>  	FN(override_return),		\
>  	FN(sock_ops_cb_flags_set),	\
> -	FN(msg_redirect_map),
> +	FN(msg_redirect_map),		\
> +	FN(msg_apply_bytes),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 314c311..df2a8f4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1928,6 +1928,20 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>  	.arg4_type      = ARG_ANYTHING,
>  };
>
> +BPF_CALL_2(bpf_msg_apply_bytes, struct sk_msg_buff *, msg, u64, bytes)
> +{
> +	msg->apply_bytes = bytes;

Here in bpf_msg_apply_bytes(), and also in bpf_msg_cork_bytes(), the helper
signature takes a u64, but the corresponding fields in struct sk_msg_buff and
struct smap_psock are of type int, so a large user-provided u64 value can end
up negative once stored. Is there a reason to allow a negative value here
rather than using u32 everywhere?

> +	return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_msg_apply_bytes_proto = {
> +	.func           = bpf_msg_apply_bytes,
> +	.gpl_only       = false,
> +	.ret_type       = RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_CTX,
> +	.arg2_type      = ARG_ANYTHING,
> +};
> +
>  BPF_CALL_1(bpf_get_cgroup_classid, const struct sk_buff *, skb)
>  {
>  	return task_get_classid(skb);
> @@ -3634,6 +3648,8 @@ static const struct bpf_func_proto *sk_msg_func_proto(enum bpf_func_id func_id)
>  	switch (func_id) {
>  	case BPF_FUNC_msg_redirect_map:
>  		return &bpf_msg_redirect_map_proto;
> +	case BPF_FUNC_msg_apply_bytes:
> +		return &bpf_msg_apply_bytes_proto;
>  	default:
>  		return bpf_base_func_proto(func_id);
>  	}
>
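
To make the u64 vs. int concern about bpf_msg_apply_bytes() concrete, here is
a minimal userspace sketch, not kernel code. struct fake_msg,
set_apply_bytes_int() and demo_truncation() are hypothetical stand-ins that
only mirror the assignment in the patch. The out-of-range conversion is
implementation-defined in C; the values asserted below assume the usual
GCC/Clang behaviour with a 32-bit int (reduction modulo 2^32).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_msg_buff; only the field in question. */
struct fake_msg {
	int apply_bytes;
};

/* Mirrors "msg->apply_bytes = bytes;" with a u64 argument and an int field. */
static void set_apply_bytes_int(struct fake_msg *msg, uint64_t bytes)
{
	msg->apply_bytes = (int)bytes;	/* narrowing conversion: u64 -> int */
}

/* Walks through a few inputs a BPF program could pass to the helper. */
static void demo_truncation(void)
{
	struct fake_msg msg = { 0 };

	set_apply_bytes_int(&msg, 4096);		/* small values survive */
	assert(msg.apply_bytes == 4096);

	set_apply_bytes_int(&msg, 0x80000000ULL);	/* 2^31 becomes INT_MIN */
	assert(msg.apply_bytes == INT_MIN);

	set_apply_bytes_int(&msg, 0xffffffffULL);	/* 2^32 - 1 becomes -1 */
	assert(msg.apply_bytes == -1);

	set_apply_bytes_int(&msg, 0x100000000ULL);	/* 2^32 silently becomes 0 */
	assert(msg.apply_bytes == 0);
}
```

With a u32 argument and u32 fields the negative cases would go away, although
values of 2^32 and above would still need to be rejected or clamped somewhere.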