On Sat, Jun 20, 2020 at 8:58 AM Roman Gushchin <g...@fb.com> wrote:
>
> On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote:
> > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin <g...@fb.com> wrote:
> > >
> > > On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote:
> > > > I think so, though I'm not familiar with the bfp cgroup code.
> > > >
> > > > > If so, we might wanna fix it in a different way,
> > > > > just checking if (!(css->flags & CSS_NO_REF)) in cgroup_bpf_put()
> > > > > like in cgroup_put(). It feels more reliable to me.
> > > > >
> > > >
> > > > Yeah I also have this idea in my mind.
> > >
> > > I wonder if the following patch will fix the issue?
> >
> > Interesting, AFAIU, this refcnt is for bpf programs attached
> > to the cgroup. By this suggestion, do you mean the root
> > cgroup does not need to refcnt the bpf programs attached
> > to it? This seems odd, as I don't see how root is different
> > from others in terms of bpf programs which can be attached
> > and detached in the same way.
> >
> > I certainly understand the root cgroup is never gone, but this
> > does not mean the bpf programs attached to it too.
> >
> > What am I missing?
>
> It's different because the root cgroup can't be deleted.
>
> All this reference counting is required to automatically detach bpf programs
> from a _deleted_ cgroup (look at cgroup_bpf_offline()). It's required
> because a cgroup can be in dying state for a long time being pinned by a
> pagecache page, for example. Only a user can detach a bpf program from
> an existing cgroup.

Yeah, but users can still detach the bpf programs from root cgroup.
IIUC, after detaching, the pointer in the bpf array will be empty_prog_array
which is just an array of NULL. Then __cgroup_bpf_run_filter_skb() will
deref it without checking NULL (as check_non_null == false).

This matches the 0000000000000010 pointer seen in the bug reports,
the 0x10, that is 16, is the offset of items[] in struct bpf_prog_array.
So looks like we have to add a NULL check there regardless of refcnt.

Also, I am not sure whether your suggested patch makes a difference
for percpu refcnt, as percpu_ref_put() will never call ->release() until
percpu_ref_kill(), which is never called on root cgroup?

Thanks.

Reply via email to