On Tue, May 5, 2020 at 1:30 PM Yonghong Song <[email protected]> wrote:
>
>
>
> On 5/5/20 1:25 PM, Andrii Nakryiko wrote:
> > On Sun, May 3, 2020 at 11:28 PM Yonghong Song <[email protected]> wrote:
> >>
> >> Macro DEFINE_BPF_ITER_FUNC is implemented so that a target
> >> can define an init function to capture the BTF type
> >> which represents the target.
> >>
> >> The bpf_iter_meta is a structure holding metadata common
> >> to all targets in the bpf program.
> >>
> >> Additional marker functions are called before/after the
> >> bpf_seq_read() show() and stop() callback functions
> >> to help calculate a precise seq_num and to decide whether
> >> to call bpf_prog inside stop().
> >>
> >> Two functions, bpf_iter_get_info() and bpf_iter_run_prog(),
> >> are implemented so that a target can get the information it
> >> needs from the bpf_iter infrastructure and run the program.
> >>
> >> Signed-off-by: Yonghong Song <[email protected]>
> >> ---
> >> include/linux/bpf.h | 11 +++++
> >> kernel/bpf/bpf_iter.c | 94 ++++++++++++++++++++++++++++++++++++++++---
> >> 2 files changed, 100 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> >> index 26daf85cba10..70c71c3cd9e8 100644
> >> --- a/include/linux/bpf.h
> >> +++ b/include/linux/bpf.h
> >> @@ -1129,6 +1129,9 @@ int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
> >> int bpf_obj_get_user(const char __user *pathname, int flags);
> >>
> >> #define BPF_ITER_FUNC_PREFIX "__bpf_iter__"
> >> +#define DEFINE_BPF_ITER_FUNC(target, args...) \
> >> + extern int __bpf_iter__ ## target(args); \
> >> + int __init __bpf_iter__ ## target(args) { return 0; }
> >
> > Why is the extern declaration needed here? Doesn't the same macro define
>
> To silence a sparse warning. Apparently, in the kernel, sparse wants a
> declaration for any global function.
Ah.. alright :)
>
> > global function itself? I'm probably missing some C semantics thingy,
> > sorry...
> >
> >>
> >> typedef int (*bpf_iter_init_seq_priv_t)(void *private_data);
> >> typedef void (*bpf_iter_fini_seq_priv_t)(void *private_data);
> >> @@ -1141,11 +1144,19 @@ struct bpf_iter_reg {
> >> u32 seq_priv_size;
> >> };
> >>
> >> +struct bpf_iter_meta {
> >> + __bpf_md_ptr(struct seq_file *, seq);
> >> + u64 session_id;
> >> + u64 seq_num;
> >> +};
> >> +
> >
> > [...]
> >
> >> /* bpf_seq_read, a customized and simpler version for bpf iterator.
> >> * no_llseek is assumed for this file.
> >> * The following are differences from seq_read():
> >> @@ -83,12 +119,15 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
> >> if (!p || IS_ERR(p))
> >> goto Stop;
> >>
> >> + bpf_iter_inc_seq_num(seq);
> >
> > so seq_num is one-based, not zero-based? So on first show() call it
> > will be set to 1, not 0, right?
>
> It is 1-based; we need to document this clearly. I forgot to adjust my
> bpf programs for this. I will adjust them properly in the next revision.
I see. IMO, seq_num starting at 0 is more natural, but whichever way
is fine with me.
> >
> >> err = seq->op->show(seq, p);
> >> if (seq_has_overflowed(seq)) {
> >> + bpf_iter_dec_seq_num(seq);
> >> err = -E2BIG;
> >> goto Error_show;
> >> } else if (err) {
> >> /* < 0: go out, > 0: skip */
> >> + bpf_iter_dec_seq_num(seq);
> >> if (likely(err < 0))
> >> goto Error_show;
> >> seq->count = 0;
> >
> > [...]
> >