On 2/9/21 4:46 PM, Arjun Roy wrote: > On Tue, Feb 9, 2021 at 8:59 AM Jakub Kicinski <k...@kernel.org> wrote: >> >> On Mon, 8 Feb 2021 20:20:29 -0700 David Ahern wrote: >>> On 2/8/21 7:53 PM, Jakub Kicinski wrote: >>>> On Mon, 8 Feb 2021 19:24:05 -0700 David Ahern wrote: >>>>> That would be the case for new userspace on old kernel. Extending the >>>>> check to the end of the struct would guarantee new userspace can not ask >>>>> for something that the running kernel does not understand. >>>> >>>> Indeed, so we're agreeing that check_zeroed_user() is needed before >>>> original optlen from user space gets truncated? >>> >>> I thought so, but maybe not. To think through this ... >>> >>> If current kernel understands a struct of size N, it can only copy that >>> amount from user to kernel. Anything beyond is ignored in these >>> multiplexed uAPIs, and that is where the new userspace on old kernel falls. >>> >>> Known value checks can only be done up to size N. In this case, the >>> reserved field is at the end of the known struct size, so checking just >>> the field is fine. Going beyond the reserved field has implications for >>> extensions to the API which should be handled when those extensions are >>> added. >> >> Let me try one last time. >> >> There is no check in the kernels that len <= N. User can pass any >> length _already_. check_zeroed_user() forces the values beyond the >> structure length to be known (0) rather than anything. It can only >> avoid breakages in the future. >> >>> So, in short I think the "if (zc.reserved)" is correct as Leon noted. >> >> If it's correct to check some arbitrary part of the buffer is zeroed >> it should be correct to check the entire tail is zeroed. > > So, coming back to the thread, I think the following appears to be the > current thoughts: > > 1. It is requested that, on the kernel as it stands today, fields > beyond zc.msg_flags (including zc.reserved, the only such field as of > this patch) are zero'd out. So a new userspace asking to do specific > things would fail on this old kernel with EINVAL. Old userspace would > work on old or new kernels. New of course works on new kernels. > 2. If it's correct to check some arbitrary field (zc.reserved) to be > 0, then it should be fine to check this for all future fields >= > reserved in the struct. So some advanced userspace down the line > doesn't get confused. > > Strictly speaking, I'm not convinced this is necessary - eg. 64 bytes > struct right now, suppose userspace of the future gives us 96 bytes of > which the last 32 are non-zero for some feature or the other. We, in > the here and now kernel, truncate that length to 64 (as in we only > copy to kernel those first 64 bytes) and set the returned length to > 64. The understanding being, any (future, past or present) userspace > consults the output value; and considers anything byte >= the returned > len to be untouched by the kernel executing the call (ie. garbage, > unacted upon). > > So, how would this work for old+new userspace on old+new kernel? > > A) old+old, new+new: sizes match, no issue > B) new kernel, old userspace: That's not an issue. We have the > switch(len) statement for that. > C) old kernel, new userspace: that's the 96 vs. 64 B example above - > new userspace would see that the kernel only operated on 64 B and > treat the last 32 B as garbage/unacted on. > > In this case, we would not give EINVAL on case C, as we would if we > returned EINVAL on a check_zeroed_user() case for fields past > zc.reserved. We'd do a zerocopy operating on just the features we know > about, and communicate to the user that we only acted on features up > until this byte offset. > > Now, given this is the case, we still have the padding confusion with > zc.reserved and the current struct size, so we have to force it to 0 > as we are doing. But I think we don't need to go beyond this so far. > > Thus, my personal preference is to not have the check_zeroed_user() > check. But if the consensus demands it, then it's an easy enough fix. > What are your thoughts? >
bpf uses check_zeroed_user to make sure extensions to its structs are compatible, so yes, this is required. Also, you need to address legitimate msg_flags as I mentioned in another response.