On Sun, Dec 16, 2018 at 03:47:04PM -0800, John Fastabend wrote: > This adds metadata to sk_msg_md for BPF programs to read the sk_msg > size. > > When the SK_MSG program is running under an application that is using > sendfile the data is not copied into sk_msg buffers by default. Rather > the BPF program uses sk_msg_pull_data to read the bytes in. This > avoids doing the costly memcopy instructions when they are not in > fact needed. However, if we don't know the size of the sk_msg we > have to guess if needed bytes are available by doing a pull request > which may fail. By including the size of the sk_msg BPF programs can > check the size before issuing sk_msg_pull_data requests. > > Additionally, the same applies for sendmsg calls when the application > provides multiple iovs. Here the BPF program needs to pull in data > to update data pointers but its not clear where the data ends without > a size parameter. In many cases "guessing" is not easy to do > and results in multiple calls to pull and without bounded loops > everything gets fairly tricky. > > Clean this up by including a u32 size field. Note, all writes into > sk_msg_md are rejected already from sk_msg_is_valid_access so nothing > additional is needed there. > > Signed-off-by: John Fastabend <john.fastab...@gmail.com> > --- > include/linux/skmsg.h | 3 +++ > include/uapi/linux/bpf.h | 1 + > net/core/filter.c | 6 ++++++ > 3 files changed, 10 insertions(+) > > diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h > index 2a11e9d..eb8f6cb 100644 > --- a/include/linux/skmsg.h > +++ b/include/linux/skmsg.h > @@ -36,6 +36,9 @@ struct sk_msg_sg { > struct scatterlist data[MAX_MSG_FRAGS + 1]; > }; > > +/* UAPI in filter.c depends on struct sk_msg_sg being first element. If > + * this is moved filter.c also must be updated. > + */ > struct sk_msg { > struct sk_msg_sg sg; > void *data; > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 597afdb..498badc 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -2608,6 +2608,7 @@ struct sk_msg_md { > __u32 local_ip6[4]; /* Stored in network byte order */ > __u32 remote_port; /* Stored in network byte order */ > __u32 local_port; /* stored in host byte order */ > + __u32 size; /* Total size of sk_msg */ > }; > > struct sk_reuseport_md { > diff --git a/net/core/filter.c b/net/core/filter.c > index bd0df75..55fd237 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -7513,6 +7513,12 @@ static u32 sk_msg_convert_ctx_access(enum > bpf_access_type type, > *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg, > offsetof(struct sock_common, skc_num)); > break; > + > + case offsetof(struct sk_msg_md, size): > + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_msg_sg, size), > + si->dst_reg, si->src_reg, > + offsetof(struct sk_msg_sg, size)); > + break;
John, this change broke "test_verifier 129" test. Now it's failing. But upon further examination both sk_msg_is_valid_access() and that test were incorrect. Here is the bug: if (off < 0 || off >= sizeof(struct sk_msg_md)) sizeof() includes padding to 8 bytes. So out of bounds access passes sk_msg_is_valid_access(), but rewritten incorrectly by sk_msg_convert_ctx_access() and the test is testing for wrong thing. errstr = "R0 !read_ok" is not the message it should be looking for. After this patch and following adjustment to test_verifier.c that test is now failing while verifier doing the right thing. Please submit two patches to fix this: 1 - fix sk_msg_convert_ctx_access 2 - fix test_verifier