On Tue 09 Apr 2019 at 20:10, Jakub Kicinski <jakub.kicin...@netronome.com> wrote: > On Tue, 9 Apr 2019 08:23:40 +0000, Vlad Buslov wrote: >> On Tue 09 Apr 2019 at 01:26, Jakub Kicinski <jakub.kicin...@netronome.com> >> wrote: >> > On Fri, 5 Apr 2019 20:56:26 +0300, Vlad Buslov wrote: >> >> John reports: >> >> >> >> Recent refactoring of fl_change aims to use the classifier spinlock to >> >> avoid the need for rtnl lock. In doing so, the fl_hw_replace_filer() >> >> function was moved to before the lock is taken. This can create problems >> >> for drivers if duplicate filters are created (commmon in ovs tc offload >> >> due to filters being triggered by user-space matches). >> >> >> >> Drivers registered for such filters will now receive multiple copies of >> >> the same rule, each with a different cookie value. This means that the >> >> drivers would need to do a full match field lookup to determine >> >> duplicates, repeating work that will happen in flower __fl_lookup(). >> >> Currently, drivers do not expect to receive duplicate filters. >> >> >> >> To fix this, verify that filter with same key is not present in flower >> >> classifier hash table and insert the new filter to the flower hash table >> >> before offloading it to hardware. Implement helper function >> >> fl_ht_insert_unique() to atomically verify/insert a filter. >> >> >> >> This change makes filter visible to fast path at the beginning of >> >> fl_change() function, which means it can no longer be freed directly in >> >> case of error. Refactor fl_change() error handling code to deallocate the >> >> filter with rcu timeout. >> >> >> >> Fixes: 620da4860827 ("net: sched: flower: refactor fl_change") >> >> Reported-by: John Hurley <john.hur...@netronome.com> >> >> Signed-off-by: Vlad Buslov <vla...@mellanox.com> >> > >> > How is re-offload consistency guaranteed? IIUC the code is: >> > >> > insert into HT >> > offload >> > insert into IDR >> > >> > What guarantees re-offload consistency if new callback is added just >> > after offload is requested but before rules ends up in IDR? >> >> Hi Jakub, >> >> At the moment cls hardware offloads API is always called with rtnl lock, >> so rule can't be offloaded while reoffload is in progress. > > Does that somehow imply atomicity of offloading vs inserting into IDR? > Doesn't seem so from a cursory look. Or do you mean rtnl_held is > always true?
Sorry, I forgot that we are discussing shared block for which rtnl is not taken in tc_new_tfilter(). Now I understand the issue and will send my 'reoffload_count' implementation as a fix. > >> For my next patch set that unlocks the offloads API I implemented the >> algorithm to track reoffload count for each tp that works like this: >> >> 1. struct tcf_proto is extended with reoffload_count counter that >> incremented each time reoffload is called on particular tp instance. >> Counter is protected by tp->lock. >> >> 2. struct cls_fl_filter is also extended with reoffload_count counter. >> Its value is set to current tp->reoffload_count when offloading the >> filter. >> >> 3. After offloading the filter, but before inserting it to idr, >> f->reoffload_count is compared with tp->reoffload_count. If values >> don't match, filter is deleted and -EAGAIN is returned. Cls API >> retries filter insertion on -EAGAIN. > > Sounds good for add. Does this solve delete case as well? > > CPU 0 CPU 1 > > __fl_delete > IDR remove > cb unregister > hw delete all flows <- doesn't see the > remove in progress > > hw delete <- doesn't see > the removed cb Thanks for pointing that out! Looks like I need to move call to hw delete in __fl_delete() function to be executed before idr removal.