On Fri, Sep 28, 2018 at 10:56:47AM -0700, Cong Wang wrote:
> On Fri, Sep 28, 2018 at 7:59 AM Ido Schimmel <[email protected]> wrote:
> >
> > On Wed, Sep 19, 2018 at 04:37:29PM -0700, Cong Wang wrote:
> > > From: Vlad Buslov <[email protected]>
> > >
> > > From: Vlad Buslov <[email protected]>
> > >
> > > Action API was changed to work with actions and action_idr in concurrency
> > > safe manner, however tcf_del_walker() still uses actions without taking a
> > > reference or idrinfo->lock first, and deletes them directly, disregarding
> > > possible concurrent delete.
> > >
> > > Change tcf_del_walker() to take idrinfo->lock while iterating over actions
> > > and use new tcf_idr_release_unsafe() to release them while holding the
> > > lock.
> > >
> > > And the blocking function fl_hw_destroy_tmplt() could be called when we
> > > put a filter chain, so defer it to a work queue.
> >
> > I'm getting a use-after-free when running tc_chains.sh selftest and I
> > believe it's caused by this patch.
> >
> > To reproduce:
> > # cd tools/testing/selftests/net/forwarding
> > # export TESTS="template_filter_fits"; ./tc_chains.sh veth0 veth1
> >
> > __tcf_chain_put()
> >   tc_chain_tmplt_del()
> >     fl_tmplt_destroy()
> >       tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work)
> >   tcf_chain_destroy()
> >     kfree(chain)
> >
> > Some time later fl_tmplt_destroy_work() starts executing and
> > dereferencing 'chain'.
>
> Oops, forgot to hold the chain...
> I will test this:
>
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 92dd5071a708..cbb68d5515d6 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -1444,6 +1444,7 @@ static void fl_tmplt_destroy_work(struct work_struct *work)
>                         struct fl_flow_tmplt, rwork);
>
>         fl_hw_destroy_tmplt(tmplt->chain, tmplt);
> +       tcf_chain_put(tmplt->chain);
>         kfree(tmplt);
>  }
>
> @@ -1451,6 +1452,7 @@ static void fl_tmplt_destroy(void *tmplt_priv)
>  {
>         struct fl_flow_tmplt *tmplt = tmplt_priv;
>
> +       tcf_chain_hold(tmplt->chain);
>         tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work);
>  }

I don't think this will work, given that the reference count has already
dropped to 0, which is why the template deletion function was invoked in
the first place. I didn't test the patch, but I don't see what would
prevent the chain from being freed.

Thanks for looking into this.
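
To illustrate the concern, here is a minimal userspace sketch. It is not the kernel code: the model_* names are hypothetical stand-ins, and it assumes (as the call trace quoted above suggests) that the put path frees the chain unconditionally once the count has reached zero, without rechecking it after the template destroy callback returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the chain; not the kernel's struct tcf_chain. */
struct model_chain {
	unsigned int refcnt;
	bool freed;        /* stands in for kfree(chain) */
	bool work_pending; /* stands in for the queued destroy work */
};

/* Models tcf_chain_hold(): a plain increment. */
static void model_hold(struct model_chain *c)
{
	c->refcnt++;
}

/* Models fl_tmplt_destroy() with the proposed hold added. */
static void model_tmplt_destroy(struct model_chain *c)
{
	model_hold(c);          /* 0 -> 1, but teardown is already under way */
	c->work_pending = true; /* destroy work is queued for later */
}

/* Models the put path: once the count hits zero, the free is unconditional. */
static void model_put(struct model_chain *c)
{
	if (--c->refcnt == 0) {
		model_tmplt_destroy(c);
		c->freed = true; /* assumption: nothing rechecks refcnt here */
	}
}

/* Models the deferred work; returns true if it would touch a freed chain. */
static bool model_run_work(struct model_chain *c)
{
	bool use_after_free = c->work_pending && c->freed;

	c->work_pending = false;
	return use_after_free;
}

/* Drops the last reference, then runs the deferred work. */
static bool model_scenario(void)
{
	struct model_chain c = { .refcnt = 1 };

	model_put(&c);
	assert(c.refcnt == 1); /* the late hold did raise the count... */
	return model_run_work(&c); /* ...but the chain was freed anyway */
}
```

In this model, model_scenario() returns true: the hold taken inside the destroy callback raises the count again, but only after the decision to free has been made, so the deferred work still sees a freed chain.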
