On Tue, Oct 24, 2017 at 6:43 PM, David Miller <[email protected]> wrote:
> From: Cong Wang <[email protected]>
> Date: Mon, 23 Oct 2017 15:02:49 -0700
>
>> Recently, the RCU callbacks used in TC filters and TC actions keep
>> drawing my attention, they introduce at least 4 race condition bugs:
>
> Like Eric, I think doing a full RCU sync on every delete is too big
> a pill to swallow. This is a major control plane performance
> regression.
>
> Please find another reasonable way to fix this.
>
Alright... I finally find a way to make everyone happy.
My solution is introducing a workqueue for tc filters
and let each RCU callback defer the work to this
workqueue. I solve the flush_workqueue() deadlock
by queuing another work in the same workqueue
at the end, so the execution order should be as same
as it is now. The ugly part is now tcf_block_put() which
looks like below:
static void tcf_block_put_final(struct work_struct *work)
{
struct tcf_block *block = container_of(work, struct tcf_block, work);
struct tcf_chain *chain, *tmp;
/* At this point, all the chains should have refcnt == 1. */
rtnl_lock();
list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
tcf_chain_put(chain);
rtnl_unlock();
kfree(block);
}
static void tcf_block_put_deferred(struct work_struct *work)
{
struct tcf_block *block = container_of(work, struct tcf_block, work);
struct tcf_chain *chain;
rtnl_lock();
/* Hold a refcnt for all chains, except 0, in case they are gone. */
list_for_each_entry(chain, &block->chain_list, list)
if (chain->index)
tcf_chain_hold(chain);
/* No race on the list, because no chain could be destroyed. */
list_for_each_entry(chain, &block->chain_list, list)
tcf_chain_flush(chain);
INIT_WORK(&block->work, tcf_block_put_final);
/* Wait for RCU callbacks to release the reference count and make
* sure their works have been queued before this.
*/
rcu_barrier();
tcf_queue_work(&block->work);
rtnl_unlock();
}
void tcf_block_put(struct tcf_block *block)
{
if (!block)
return;
INIT_WORK(&block->work, tcf_block_put_deferred);
/* Wait for existing RCU callbacks to cool down, make sure their works
* have been queued before this. We can not flush pending works here
* because we are holding the RTNL lock.
*/
rcu_barrier();
tcf_queue_work(&block->work);
}
Paul, does this make any sense to you? ;)