Hello,

On Wed, 31 May 2017, Sowmini Varadhan wrote:

> On (06/01/17 00:41), Julian Anastasov wrote:
> > 
> >     So, we do not hold reference to neigh while accessing
> > its fields. I suspect we need to move the table lock from
> > neigh_remove_one here, for example:
> 
> good point, let me think over your suggestion carefully (it sounds
> right, I want to make sure I dont miss any other race-windows)
> and post patch v4 tomorrow..

        Another problem is that neigh_update() changes the
state but before we go and unlink the entry another CPU
can reactivate the entry, i.e. NUD_INCOMPLETE entered
in __neigh_event_send(). So, there will be always some
small window where surprises can happen and the entry is
not really deleted.

        Checking for NUD_FAILED may not be needed, the
atomic_read(&n->refcnt) == 1 check under lock ensures
that we are in some state without timer, i.e. not in
NUD_INCOMPLETE, so it is ok to remove the entry.

> >     Another solution to cause faster removal would be
> > to cancel the gc_work and to schedule it after 1 jiffie.
> > It helps when many entries are deleted at once but the
> > work prefers to just sleep when gc_thresh1 is not reached,
> > so such solution is not good enough.
> 
> Right the other drawback of relying on gc for cleanup is
> that it doesn't give higher preference to cleaning up FAILED
> entries first- so it can end up cleaning up other useful entries
> (as I was pointing out to David Ahern)

        I see, ok.

Regards

--
Julian Anastasov <j...@ssi.bg>

Reply via email to