On Thu, Jun 8, 2017 at 1:12 PM, Krister Johansen
<k...@templeofstupid.com> wrote:
> After looking through the list of callbacks that the netdevice notifiers
> invoke in this path, it appears that the dst_dev_event is the most
> interesting.  The dst_ifdown path places a hold on the loopback_dev as
> part of releasing the dev associated with the original dst cache entry.
> Most of our notifier callbacks are straight-forward, but this one a)
> looks complex, and b) places a hold on the network interface in
> question.
>
> I constructed a new bcc script that watches various events in the
> lifetime of a dst cache entry.  Note that dst_ifdown will take a hold on
> the loopback device until the invalidated dst entry gets freed.
>

Yeah, this is what I observed when Kevin (Cc'ed) reported a similar (if not
the same) bug; at the time I thought we had a refcnt leak on dst.

...
> The way this works is that if there's still a reference on the dst entry
> at the time we try to free it, it gets placed in the gc list by
> __dst_free and the dst_destroy() call is invoked by the gc task once the
> refcount is 0.  If the gc task processes a 10th or less of its entries
> on a single pass, it increases the amount of time it waits between gc
> intervals.
>
> Looking at the gc_task intervals, they started at 663ms when we invoked
> __dst_free().  After that, they increased to 1663, 3136, 5567, 8191,
> 10751, and 14848.  The release that set the refcnt to 0 on our dst entry
> occurred after the gc_task was enqueued for a 14 second interval so we had
> to wait longer than the warning time in wait_allrefs in order for the
> dst entry to get free'd and the hold on 'lo' to be released.
>

I am glad to see you don't have a dst leak here.

But in my experience with a similar bug (refcnt wait on lo), the wait goes
on indefinitely rather than for just 14 seconds, so it looked more like a
real leak than a gc delay. So in your case, this annoying warning
eventually disappears, right?

Thanks.
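
P.S. For reference, the gc back-off described in the quoted text can be
modelled roughly as below. This is only a sketch, not code taken from the
kernel: the function name and the three constants (minimum interval,
increment step, cap, all in milliseconds) are assumptions chosen for
illustration, and the "busy" callback is a hypothetical stand-in for "this
pass freed more than a tenth of the pending entries".

```python
# Simplified model of a gc timer that backs off while little work is done.
# All constants are illustrative assumptions, expressed in milliseconds.
GC_MIN_MS = 100       # assumed interval after a productive pass
GC_INC_MS = 500       # assumed amount the step grows by on each idle pass
GC_MAX_MS = 120_000   # assumed upper bound on the interval


def gc_intervals(passes, busy=lambda i: False):
    """Return the delay scheduled after each of `passes` gc passes.

    busy(i) should return True when pass i freed more than a tenth of
    the pending entries; that resets the back-off to its minimum.
    """
    expires, inc = GC_MIN_MS, GC_INC_MS
    delays = []
    for i in range(passes):
        if busy(i):
            # Productive pass: go back to the shortest interval.
            expires, inc = GC_MIN_MS, GC_INC_MS
        else:
            # Mostly idle pass: wait longer, and grow the step as well.
            expires = min(expires + inc, GC_MAX_MS)
            inc += GC_INC_MS
        delays.append(expires)
    return delays


if __name__ == "__main__":
    print(gc_intervals(7))
```

With no productive passes, this model grows from well under a second to
about 14 seconds by the seventh pass. That is the same shape as the
intervals quoted above, though not the same numbers. A dst entry whose
last reference is dropped just after a long interval has been scheduled
must wait out that whole interval before it is destroyed.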