Hi Tejun,
On 08/24/2016 11:54 PM, Tejun Heo wrote:
> On Wed, Aug 24, 2016 at 10:24:19PM +0200, Daniel Mack wrote:
>> +void cgroup_bpf_free(struct cgroup *cgrp)
>> +{
>> + unsigned int type;
>> +
>> + rcu_read_lock();
>> +
>> + for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
>> + if (!cgrp->bpf.prog[type])
>> + continue;
>> +
>> + bpf_prog_put(cgrp->bpf.prog[type]);
>> + static_branch_dec(&cgroup_bpf_enabled_key);
>> + }
>> +
>> + rcu_read_unlock();
>
> These rcu locking seem suspicious to me. RCU locking on writer side
> is usually bogus. We sometimes do it to work around locking
> assertions in accessors but it's a better idea to make the assertions
> better in those cases - e.g. sth like assert_mylock_or_rcu_locked().
Right - in this case it is indeed unnecessary, as bpf.prog[] is not RCU-protected in the first place.
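For v3 I'll simply drop the rcu_read_lock()/rcu_read_unlock() pair here, so the function becomes roughly this (untested sketch):

void cgroup_bpf_free(struct cgroup *cgrp)
{
        unsigned int type;

        for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
                if (!cgrp->bpf.prog[type])
                        continue;

                /* drop the reference taken at attach time */
                bpf_prog_put(cgrp->bpf.prog[type]);
                static_branch_dec(&cgroup_bpf_enabled_key);
        }
}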
>> +void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent)
>> +{
>> + unsigned int type;
>> +
>> + rcu_read_lock();
>
> Ditto.
>
>> + for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++)
>> + rcu_assign_pointer(cgrp->bpf.prog_effective[type],
>> + rcu_dereference(parent->bpf.prog_effective[type]));
Okay, yes. We're holding cgroup_mutex on the write path here, so the
rcu_read_lock() is unnecessary too.
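Since prog_effective[] is still dereferenced by readers under RCU, I'd keep rcu_assign_pointer() on the store and switch the read side to rcu_dereference_protected() to document the locking. Something along these lines (sketch, assuming cgroup_mutex is visible from this file):

void cgroup_bpf_inherit(struct cgroup *cgrp, struct cgroup *parent)
{
        unsigned int type;

        for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
                struct bpf_prog *e;

                /* writers are serialized by cgroup_mutex, no RCU read lock needed */
                e = rcu_dereference_protected(parent->bpf.prog_effective[type],
                                              lockdep_is_held(&cgroup_mutex));
                rcu_assign_pointer(cgrp->bpf.prog_effective[type], e);
        }
}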
>> +void __cgroup_bpf_update(struct cgroup *cgrp,
>> + struct cgroup *parent,
>> + struct bpf_prog *prog,
>> + enum bpf_attach_type type)
>> +{
>> + struct bpf_prog *old_prog, *effective;
>> + struct cgroup_subsys_state *pos;
>> +
>> + rcu_read_lock();
>
> Ditto.
Yes, agreed, as above.
>> + old_prog = xchg(cgrp->bpf.prog + type, prog);
>> + if (old_prog) {
>> + bpf_prog_put(old_prog);
>> + static_branch_dec(&cgroup_bpf_enabled_key);
>> + }
>> +
>> + if (prog)
>> + static_branch_inc(&cgroup_bpf_enabled_key);
>
> Minor but probably better to inc first and then dec so that you can
> avoid unnecessary enabled -> disabled -> enabled sequence.
Good point. Will fix.
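Concretely, I'll do the inc before dropping the old program's reference, so the key can't temporarily fall back to zero when a program is replaced (untested sketch):

        if (prog)
                static_branch_inc(&cgroup_bpf_enabled_key);

        old_prog = xchg(cgrp->bpf.prog + type, prog);
        if (old_prog) {
                bpf_prog_put(old_prog);
                static_branch_dec(&cgroup_bpf_enabled_key);
        }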
>> + rcu_read_unlock();
>> +
>> + css_for_each_descendant_pre(pos, &cgrp->self) {
>
> On the other hand, this walk actually requires rcu read locking unless
> you're holding cgroup_mutex.
I am - this function is always called with cgroup_mutex held through the
wrapper in kernel/cgroup.c.
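For reference, the wrapper does essentially this (simplified sketch, not the exact code):

int cgroup_bpf_update(struct cgroup *cgrp, struct bpf_prog *prog,
                      enum bpf_attach_type type)
{
        struct cgroup *parent = cgroup_parent(cgrp);

        /* serialize updates and the descendant walk against other writers */
        mutex_lock(&cgroup_mutex);
        __cgroup_bpf_update(cgrp, parent, prog, type);
        mutex_unlock(&cgroup_mutex);

        return 0;
}

If it helps, I can also add a lockdep assertion to __cgroup_bpf_update() in v3 to make that requirement explicit.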
Thanks a lot - I'll fold all of those changes into v3.
Daniel