On 2020-10-15 11:46:53 [+0200], Eelco Chaudron wrote:
> The flow_lookup() function uses per CPU variables, which must not be
> preempted. However, this is fine in the general napi use case where
> the local BH is disabled. But, it's also called in the netlink
> context, which is preemptible. The below patch makes sure that even
> in the netlink path, preemption is disabled.

I would suggest to rephrase it: the term preemption usually means
preempt_disable(). A preempt disabled section can be preempted /
interrupted by hardirq and softirq. The later is mentioned and I think
is confusing.

> In addition, the u64_stats_update_begin() sync point was not protected,
> making the sync point part of the per CPU variable fixed this.

I would rephrase it and mention the key details:
u64_stats_update_begin() requires a lock to ensure one writer which is
not ensured here. Making it per-CPU and disabling NAPI (softirq) ensures
that there is always only one writer.

Regarding the annotation which were mentioned here in the thread.
Basically the this_cpu_ptr() warning worked as expected and got us here.
I don't think it is wise to add annotation distinguished from the actual
problem like assert_the_softirq_is_switched_off() in flow_lookup(). The
assert may become obsolete once the reason is removed and gets overseen
and remains in the code. The commits

        c60c32a577561 ("posix-cpu-timers: Remove 
lockdep_assert_irqs_disabled()")
        f9dae5554aed4 ("dpaa2-eth: Remove preempt_disable() from seed_pool()")

are just two examples which came to mind while writing this.

Instead I would prefer lockdep annotation in u64_stats_update_begin()
which is around also in 64bit kernels and complains if it is seen
without disabled BH if observed in-serving-softirq.
PeterZ, wasn't this mentioned before?

> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -851,9 +852,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table 
> *tbl,
>       struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
>       u32 __always_unused n_mask_hit;
>       u32 __always_unused n_cache_hit;
> +     struct sw_flow *flow;
>       u32 index = 0;
>  
> -     return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> +     /* This function gets called trough the netlink interface and therefore
> +      * is preemptible. However, flow_lookup() function needs to be called
> +      * with preemption disabled due to CPU specific variables.

preemption vs BH.

> +      */
> +     local_bh_disable();
> +     flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> +     local_bh_enable();
> +     return flow;
>  }
>  
>  struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,

Otherwise it looks good.

Sebastian

Reply via email to