On Tue, Oct 31, 2017 at 3:40 AM, Jiri Pirko <j...@resnulli.us> wrote: > Mon, Oct 30, 2017 at 07:10:09PM CET, xiyou.wangc...@gmail.com wrote: >>In commit 7aa0045dadb6 ("net_sched: introduce a workqueue for RCU callbacks >>of tc filter") >>I defer tcf_chain_flush() to a workqueue, this causes a use-after-free >>because qdisc is already destroyed after we queue this work. >> >>The tcf_block_put_deferred() is no longer necessary after we get RTNL >>for each tc filter destroy work, no others could jump in at this point. >>Same for tcf_chain_hold(), we are fully serialized now. >> >>This also reduces one indirection therefore makes the code more >>readable. Note this brings back a rcu_barrier(), however comparing >>to the code prior to commit 7aa0045dadb6 we still reduced one >>rcu_barrier(). For net-next, we can consider to refcnt tcf block to >>avoid it. >> >>Fixes: 7aa0045dadb6 ("net_sched: introduce a workqueue for RCU callbacks of >>tc filter") >>Cc: Daniel Borkmann <dan...@iogearbox.net> >>Cc: Jiri Pirko <j...@resnulli.us> >>Cc: John Fastabend <john.fastab...@gmail.com> >>Cc: Jamal Hadi Salim <j...@mojatatu.com> >>Cc: "Paul E. McKenney" <paul...@linux.vnet.ibm.com> >>Cc: Eric Dumazet <eduma...@google.com> >>Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com> >>--- >> net/sched/cls_api.c | 37 ++++++++----------------------------- >> 1 file changed, 8 insertions(+), 29 deletions(-) >> >>diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c >>index 231181c602ed..b2d310745487 100644 >>--- a/net/sched/cls_api.c >>+++ b/net/sched/cls_api.c >>@@ -280,8 +280,8 @@ static void tcf_block_put_final(struct work_struct *work) >> struct tcf_block *block = container_of(work, struct tcf_block, work); >> struct tcf_chain *chain, *tmp; >> >>- /* At this point, all the chains should have refcnt == 1. */ >> rtnl_lock(); >>+ /* Only chain 0 should be still here. */ >> list_for_each_entry_safe(chain, tmp, &block->chain_list, list) >> tcf_chain_put(chain); >> rtnl_unlock(); >>@@ -289,23 +289,17 @@ static void tcf_block_put_final(struct work_struct >>*work) >> } >> >> /* XXX: Standalone actions are not allowed to jump to any chain, and bound >>- * actions should be all removed after flushing. However, filters are >>destroyed >>- * in RCU callbacks, we have to hold the chains first, otherwise we would >>- * always race with RCU callbacks on this list without proper locking. >>+ * actions should be all removed after flushing. However, filters are now >>+ * destroyed in tc filter workqueue with RTNL lock, they can not race here. >> */ >>-static void tcf_block_put_deferred(struct work_struct *work) >>+void tcf_block_put(struct tcf_block *block) >> { >>- struct tcf_block *block = container_of(work, struct tcf_block, work); >>- struct tcf_chain *chain; >>+ struct tcf_chain *chain, *tmp; >> >>- rtnl_lock(); >>- /* Hold a refcnt for all chains, except 0, in case they are gone. */ >>- list_for_each_entry(chain, &block->chain_list, list) >>- if (chain->index) >>- tcf_chain_hold(chain); >>+ if (!block) >>+ return; >> >>- /* No race on the list, because no chain could be destroyed. */ >>- list_for_each_entry(chain, &block->chain_list, list) >>+ list_for_each_entry_safe(chain, tmp, &block->chain_list, list) >> tcf_chain_flush(chain); > > The reason for the hold above was to avoid use after free in this loop. > Consider following example: > > chain1 > 1 filter with action goto_chain 2 > chain2 > empty > > Now in your list_for_each_entry_safe loop, chain1 is flushed, action is > removed and chain is put: > tcf_action_goto_chain_fini->tcf_chain_put(2) > > Given the fact chain2 is empty, this put would lead to tcf_chain_destroy(2) > > Then in another iteration of list_for_each_entry_safe you are using > already freed chain. > > Am I missing something that prevents this?
Actions are destroyed in filter's ->destroy(), and all filters now use RCU callback+workqueue to defer the tcf_action_goto_chain_fini() and with RTNL, so this function won't be executed until we release RTNL. This is what I mean by "fully serialized". Or if there any other case I still miss?