On Fri, Jan 25, 2019 at 03:42:43PM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 25, 2019 at 10:10:57AM +0100, Peter Zijlstra wrote:
> > Do we want something like (the completely untested) below to avoid
> > having to manually audit this over and over?
> >
> > ---
> > include/linux/filter.h | 2 +-
> > include/linux/kernel.h | 9 +++++++--
> > kernel/sched/core.c | 28 ++++++++++++++++++++++++++++
> > 3 files changed, 36 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > index d531d4250bff..4ab51e78da36 100644
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -513,7 +513,7 @@ struct sk_filter {
> > struct bpf_prog *prog;
> > };
> >
> > -#define BPF_PROG_RUN(filter, ctx) (*(filter)->bpf_func)(ctx, (filter)->insnsi)
> > +#define BPF_PROG_RUN(filter, ctx) ({ cant_sleep(); (*(filter)->bpf_func)(ctx, (filter)->insnsi); })
>
> That looks reasonable and I intend to apply this patch to bpf-next after
> testing.
> Can you please reply with a sob?
Sure; with the caveat that I didn't even hold it near a compiler, and it
probably should grow a comment to explain the interface (similar to
might_sleep):

Suggested-by: Jann Horn <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
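
For anyone following along, the intended semantics of the annotation can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions: every model_* name is hypothetical, the counter is a stand-in
for the real preempt count, and the kernel.h/core.c parts of the patch are
not reproduced here. It uses a plain function instead of the statement
expression in the macro above so that it builds without GNU extensions.

```c
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the kernel implementation. */
static int model_preempt_count;	/* > 0 models "preemption disabled" */
static int model_warnings;	/* number of violations reported so far */

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/*
 * Model of the annotation: the caller asserts that it runs in a context
 * that cannot sleep. The check complains when that assumption does not
 * hold, which is the mirror image of a might_sleep()-style check.
 */
static void model_cant_sleep(const char *file, int line)
{
	if (model_preempt_count > 0)
		return;		/* assumption holds, nothing to report */

	model_warnings++;
	fprintf(stderr, "BUG: assuming atomic context at %s:%d\n", file, line);
}

#define MODEL_CANT_SLEEP() model_cant_sleep(__FILE__, __LINE__)

/* Simplified program type; the real bpf_func also takes the insn array. */
typedef unsigned int (*model_prog_fn)(const void *ctx);

/* Every run goes through the check, so callers are audited automatically. */
static unsigned int model_prog_run(model_prog_fn fn, const void *ctx)
{
	MODEL_CANT_SLEEP();
	return fn(ctx);
}

/* Trivial example program: accept everything. */
static unsigned int model_accept_all(const void *ctx)
{
	(void)ctx;
	return 1;
}
```

Calling model_prog_run() between model_preempt_disable() and
model_preempt_enable() stays silent; calling it outside such a section
still runs the program but bumps model_warnings and prints a complaint.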
> The easiest fix is to add preempt_disable/enable for socket filters.
> There is a concern that such fix will make classic bpf non-preemptable
> and classic bpf can be quite cpu expensive.
> Also on the receive side classic runs in bh, so 4k flow_dissector calls
> in classic have to be dealt with anyway.
Right, and agreed; per that argument the worst case for (legacy) BPF
already occurs with preemption disabled, so making it consistently
non-preemptible should not affect the worst case.