Re: [PATCH] net: Revert "net_sched: no need to free qdisc in RCU callback"

Jiri Pirko Thu, 21 Dec 2017 00:39:33 -0800

Thu, Dec 21, 2017 at 12:34:05AM CET, [email protected] wrote:
>On 12/20/2017 03:23 PM, Cong Wang wrote:
>> On Wed, Dec 20, 2017 at 3:05 PM, John Fastabend
>> <[email protected]> wrote:
>>> On 12/20/2017 02:41 PM, Cong Wang wrote:
>>>> On Wed, Dec 20, 2017 at 12:09 PM, John Fastabend
>>>> <[email protected]> wrote:
>>>>> RCU grace period is needed for lockless qdiscs added in the commit
>>>>> c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array").
>>>>>
>>>>> It is needed now that qdiscs may be lockless otherwise we risk
>>>>> free'ing a qdisc that is still in use from datapath. Additionally,
>>>>> push list cleanup into RCU callback. Otherwise we risk the datapath
>>>>> adding skbs during removal.
>>>>
>>>> What about qdisc_graft() -> dev_deactivate() -> synchronize_net() ?
>>>> It doesn't work with your "lockless" patches?
>>>>
>>>
>>> Well this is only in the 'parent == NULL' case otherwise we call
>>> cops->graft(). Most sch_* seem to use qdisc_replace and this uses
>>> sch_tree_lock().
>>>
>>> The only converted qdisc mq and mqprio at this point don't care
>>> though and do their own dev_deactivate/activate. So its not fixing
>>> anything in the above mentioned commit.
>> 
>> Sure, removing a class does not impact the whole device,
>> but removing the root qdisc does.
>> 
>> After your "lockless", skb_array_consume_bh() is called in
>> pfifo_fast_reset() and ptr_ring_cleanup() is called in
>> pfifo_fast_destroy(), assuming skb_array is not buggy, what race
>> do we have here with datapath?
>> 
>
>None at the moment.
>
>> 
>>>
>>> I still think it will need to be done eventually. If it resolves
>>> the miniq case it seems like a good idea. Although per Jakub's comment
>>> perhaps I pulled too much into the RCU handler.
>> 
>> The case Jakub reported is a RCU callback missing a rcu
>> barrier. I don't understand why you keep believing it is RCU
>> readers on datapath.> 
>> Not even to mention ingress is not affected by your "lockless"
>> thing.
>> 
>
>I was thinking about the case where we want a lockless qdisc
>with classes. Doing the qdisc destroy after a grace period would
>solve this. Also we could start to cleanup a lot of the locking
>and extra bits around 'running' qdisc and such by doing a clean
>xchg on the qdisc layer. It seems that a dev_activate/deactivate
>just to install a new qdisc is not needed.
>
>Anyways future work. However if it resolves the miniq issue, as
>Jiri indicated, seems like a clean fix. Although Jakub's issue
>with the patch would need to be addressed. Seems he gets a WARN_ON
>if the offload is not disabled but the device is unitialized.


Why just moving qdisc_free to rcu is not enough? It would resolve this
issue and also avoid using synchronize net. Something like:

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 83a3e47..487288e 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -100,6 +100,7 @@ struct Qdisc {
        refcount_t              refcnt;
 
        spinlock_t              busylock ____cacheline_aligned_in_smp;
+       struct rcu_head         rcu;
 };
 
 static inline void qdisc_refcount_inc(struct Qdisc *qdisc)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cd1b200..9beffd1 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -698,8 +698,10 @@ void qdisc_reset(struct Qdisc *qdisc)
 }
 EXPORT_SYMBOL(qdisc_reset);
 
-static void qdisc_free(struct Qdisc *qdisc)
+static void qdisc_free_rcu(struct rcu_head *rcu)
 {
+       struct Qdisc *qdisc = container_of(rcu, struct Qdisc, rcu);
+
        if (qdisc_is_percpu_stats(qdisc)) {
                free_percpu(qdisc->cpu_bstats);
                free_percpu(qdisc->cpu_qstats);
@@ -732,7 +734,7 @@ void qdisc_destroy(struct Qdisc *qdisc)
 
        kfree_skb_list(qdisc->gso_skb);
        kfree_skb(qdisc->skb_bad_txq);
-       qdisc_free(qdisc);
+       call_rcu(&qdisc->rcu, qdisc_free_rcu);
 }
 EXPORT_SYMBOL(qdisc_destroy);
 
-- 
2.9.5

Re: [PATCH] net: Revert "net_sched: no need to free qdisc in RCU callback"

Reply via email to