On Tue, Feb 26, 2019 at 8:10 AM Vlad Buslov <vla...@mellanox.com> wrote:
>
> On Tue 26 Feb 2019 at 00:15, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> > On Mon, Feb 25, 2019 at 7:45 AM Vlad Buslov <vla...@mellanox.com> wrote:
> >>
> >> Function tc_dump_chain() obtains and releases block->lock on each iteration
> >> of its inner loop that dumps all chains on block. Outputting chain template
> >> info is fast operation so locking/unlocking mutex multiple times is an
> >> overhead when lock is highly contested. Modify tc_dump_chain() to only
> >> obtain block->lock once and dump all chains without releasing it.
> >>
> >> Signed-off-by: Vlad Buslov <vla...@mellanox.com>
> >> Suggested-by: Cong Wang <xiyou.wangc...@gmail.com>
> >
> > Thanks for the followup!
> >
> > Isn't it similar for __tcf_get_next_proto() in tcf_chain_dump()?
> > And for tc_dump_tfilter()?
>
> Not really. These two dump all tp filters and not just a template, which
> is O(n) on number of filters and can be slow because it calls hw offload
> API for each of them. Our typical use-case involves periodic filter dump
> (to update stats) while multiple concurrent user-space threads are
> updating filters, so it is important for them to be able to execute in
> parallel.

Hmm, but if these are read-only, you probably don't even need a mutex:
you can just use the RCU read lock to protect the list iteration, and you
can still grab the refcnt in the same way.
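
To make the idea concrete, here is a rough userspace sketch of the pattern, not the actual cls_api code. Every name in it (struct node, get_next(), dump_all(), the *_stub helpers) is hypothetical. The stubs only mark where rcu_read_lock()/rcu_read_unlock() would sit and give no real protection; in the kernel the traversal would presumably use the RCU list helpers, the reference would be taken with something like refcount_inc_not_zero(), and freeing would have to be deferred past a grace period.

```c
/* Userspace sketch; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {                   /* stand-in for a filter (tp) on a chain */
	struct node *next;
	atomic_int refcnt;      /* 0 means the node is already being torn down */
	int id;
};

/* Placeholders marking where rcu_read_lock()/rcu_read_unlock() would go.
 * They only track nesting depth and give no real protection. */
static int read_depth;
static void read_lock_stub(void)   { read_depth++; }
static void read_unlock_stub(void) { assert(read_depth > 0); read_depth--; }

/* Take a reference unless the count has already reached zero. */
static bool node_get_not_zero(struct node *n)
{
	int old = atomic_load(&n->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&n->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

static void node_put(struct node *n)
{
	atomic_fetch_sub(&n->refcnt, 1);    /* freeing at zero is omitted */
}

/* Return the next live node after prev (the first one if prev is NULL)
 * with a reference held, and drop the caller's reference on prev. */
static struct node *get_next(struct node *head, struct node *prev)
{
	struct node *n;

	read_lock_stub();
	for (n = prev ? prev->next : head; n; n = n->next)
		if (node_get_not_zero(n))
			break;
	read_unlock_stub();

	if (prev)
		node_put(prev);
	return n;
}

/* Walk the list; the per-node work runs outside the read-side section. */
static int dump_all(struct node *head)
{
	int sum = 0;

	for (struct node *n = get_next(head, NULL); n; n = get_next(head, n))
		sum += n->id;   /* slow per-filter dump would happen here */
	return sum;
}
```

The read-side section covers only the step to the next element plus the refcnt grab, so the slow per-filter work (the hw offload calls) never runs under it, and writers are not blocked by dumpers at all. A node whose count has already hit zero is simply skipped.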