On 06/06/2019 08:15 PM, Jonathan Lemon wrote: > On 6 Jun 2019, at 9:15, Toke Høiland-Jørgensen wrote: >> Alexei Starovoitov <alexei.starovoi...@gmail.com> writes: >>> On Thu, Jun 6, 2019 at 8:51 AM Daniel Borkmann <dan...@iogearbox.net> wrote: >>>> On 06/06/2019 03:24 PM, Toke Høiland-Jørgensen wrote: >>>>> From: Toke Høiland-Jørgensen <t...@redhat.com> >>>>> >>>>> The bpf_redirect_map() helper used by XDP programs doesn't return any >>>>> indication of whether it can successfully redirect to the map index it was >>>>> given. Instead, BPF programs have to track this themselves, leading to >>>>> programs using duplicate maps to track which entries are populated in the >>>>> devmap. >>>>> >>>>> This patch adds a flag to the XDP version of the bpf_redirect_map() >>>>> helper, >>>>> which makes the helper do a lookup in the map when called, and return >>>>> XDP_PASS if there is no value at the provided index. >>>>> >>>>> With this, a BPF program can check the return code from the helper call >>>>> and >>>>> react if it is XDP_PASS (by, for instance, substituting a different >>>>> redirect). This works for any type of map used for redirect. >>>>> >>>>> Signed-off-by: Toke Høiland-Jørgensen <t...@redhat.com> >>>>> --- >>>>> include/uapi/linux/bpf.h | 8 ++++++++ >>>>> net/core/filter.c | 10 +++++++++- >>>>> 2 files changed, 17 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >>>>> index 7c6aef253173..d57df4f0b837 100644 >>>>> --- a/include/uapi/linux/bpf.h >>>>> +++ b/include/uapi/linux/bpf.h >>>>> @@ -3098,6 +3098,14 @@ enum xdp_action { >>>>> XDP_REDIRECT, >>>>> }; >>>>> >>>>> +/* Flags for bpf_xdp_redirect_map helper */ >>>>> + >>>>> +/* If set, the help will check if the entry exists in the map and return >>>>> + * XDP_PASS if it doesn't. >>>>> + */ >>>>> +#define XDP_REDIRECT_F_PASS_ON_INVALID BIT(0) >>>>> +#define XDP_REDIRECT_ALL_FLAGS XDP_REDIRECT_F_PASS_ON_INVALID >>>>> + >>>>> /* user accessible metadata for XDP packet hook >>>>> * new fields must be added to the end of this structure >>>>> */ >>>>> diff --git a/net/core/filter.c b/net/core/filter.c >>>>> index 55bfc941d17a..2e532a9b2605 100644 >>>>> --- a/net/core/filter.c >>>>> +++ b/net/core/filter.c >>>>> @@ -3755,9 +3755,17 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, >>>>> map, u32, ifindex, >>>>> { >>>>> struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); >>>>> >>>>> - if (unlikely(flags)) >>>>> + if (unlikely(flags & ~XDP_REDIRECT_ALL_FLAGS)) >>>>> return XDP_ABORTED; >>>>> >>>>> + if (flags & XDP_REDIRECT_F_PASS_ON_INVALID) { >>>>> + void *val; >>>>> + >>>>> + val = __xdp_map_lookup_elem(map, ifindex); >>>>> + if (unlikely(!val)) >>>>> + return XDP_PASS; >>>> >>>> Generally looks good to me, also the second part with the flag. Given we >>>> store into >>>> the per-CPU scratch space and function like xdp_do_redirect() pick this up >>>> again, we >>>> could even propagate val onwards and save a second lookup on the /same/ >>>> element (which >>>> also avoids a race if the val was dropped from the map in the meantime). >>>> Given this >>>> should all still be within RCU it should work. Perhaps it even makes sense >>>> to do the >>>> lookup unconditionally inside bpf_xdp_redirect_map() helper iff we manage >>>> to do it >>>> only once anyway? >>> >>> +1 >>> >>> also I don't think we really need a new flag here. >>> Yes, it could be considered an uapi change, but it >>> looks more like bugfix in uapi to me. >>> Since original behavior was so clunky to use. >> >> Hmm, the problem with this is that eBPF programs generally do something >> like: >> >> return bpf_redirect_map(map, idx, 0); >> >> after having already modified the packet headers. This will get them a >> return code of XDP_REDIRECT, and the lookup will then subsequently fail, >> which returns in XDP_ABORTED in the driver, which you can catch with >> tracing. >> >> However, if we just change it to XDP_PASS, the packet will go up the >> stack, but because it has already been modified the stack will drop it, >> more or less invisibly. >> >> So the question becomes, is that behaviour change really OK? > > Another option would be treating the flags (or the lower bits of flags) > as the default xdp action taken if the lookup fails. 0 just happens to > map to XDP_ABORTED, which gives the initial behavior. Then the new behavior > would be: > > return bpf_redirect_map(map, index, XDP_PASS);
Makes sense, that should work, but as default (flags == 0), you'd have to return XDP_REDIRECT to stay consistent with existing behavior. Thanks, Daniel