On 08/30/2016 12:42 AM, Daniel Borkmann wrote:
> On 08/26/2016 09:58 PM, Daniel Mack wrote:
>> This patch adds two sets of eBPF program pointers to struct cgroup.
>> One for such that are directly pinned to a cgroup, and one for such
>> that are effective for it.
>>
>> To illustrate the logic behind that, assume the following example
>> cgroup hierarchy.
>>
>>    A - B - C
>>          \ D - E
>>
>> If only B has a program attached, it will be effective for B, C, D
>> and E. If D then attaches a program itself, that will be effective for
>> both D and E, and the program in B will only affect B and C. Only one
>> program of a given type is effective for a cgroup.
>>
>> Attaching and detaching programs will be done through the bpf(2)
>> syscall. For now, ingress and egress inet socket filtering are the
>> only supported use-cases.
>>
>> Signed-off-by: Daniel Mack <dan...@zonque.org>
> [...]
>> +void __cgroup_bpf_update(struct cgroup *cgrp,
>> +                     struct cgroup *parent,
>> +                     struct bpf_prog *prog,
>> +                     enum bpf_attach_type type)
>> +{
>> +    struct bpf_prog *old_prog, *effective;
>> +    struct cgroup_subsys_state *pos;
>> +
>> +    old_prog = xchg(cgrp->bpf.prog + type, prog);
>> +
>> +    if (prog)
>> +            static_branch_inc(&cgroup_bpf_enabled_key);
>> +
>> +    if (old_prog) {
>> +            bpf_prog_put(old_prog);
>> +            static_branch_dec(&cgroup_bpf_enabled_key);
>> +    }
>> +
>> +    effective = (!prog && parent) ?
>> +            rcu_dereference_protected(parent->bpf.effective[type],
>> +                                      lockdep_is_held(&cgroup_mutex)) :
>> +            prog;
>> +
>> +    css_for_each_descendant_pre(pos, &cgrp->self) {
>> +            struct cgroup *desc = container_of(pos, struct cgroup, self);
>> +
>> +            /* skip the subtree if the descendant has its own program */
>> +            if (desc->bpf.prog[type] && desc != cgrp)
>> +                    pos = css_rightmost_descendant(pos);
>> +            else
>> +                    rcu_assign_pointer(desc->bpf.effective[type],
>> +                                       effective);
>> +    }
> 
> Shouldn't the old_prog reference only be released right here at the end
> instead of above (otherwise this could race)?

Yes, that's right. Will change as well. Thanks for spotting!


Daniel

Reply via email to