On Mon, May 8, 2017 at 7:18 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Mon, 2017-05-08 at 21:22 -0400, David Miller wrote:
>> From: Eric Dumazet <eric.duma...@gmail.com>
>> Date: Mon, 08 May 2017 17:01:20 -0700
>>
>> > On Mon, 2017-05-08 at 14:35 -0400, David Miller wrote:
>> >> From: Cong Wang <xiyou.wangc...@gmail.com>
>> >> Date: Thu, 4 May 2017 14:54:17 -0700
>> >>
>> >> > IPv4 dst could use fi->fib_metrics to store metrics but fib_info
>> >> > itself is refcnt'ed, so without taking a refcnt fi and
>> >> > fi->fib_metrics could be freed while dst metrics still points to
>> >> > it. This triggers use-after-free as reported by Andrey twice.
>> >> >
>> >> > This patch reverts commit 2860583fe840 ("ipv4: Kill rt->fi") to
>> >> > restore this reference counting. It is a quick fix for -net and
>> >> > -stable, for -net-next, as Eric suggested, we can consider doing
>> >> > reference counting for metrics itself instead of relying on fib_info.
>> >> >
>> >> > IPv6 is very different, it copies or steals the metrics from mx6_config
>> >> > in fib6_commit_metrics() so probably doesn't need a refcnt.
>> >> >
>> >> > Decnet has already done the refcnt'ing, see dn_fib_semantic_match().
>> >> >
>> >> > Fixes: 2860583fe840 ("ipv4: Kill rt->fi")
>> >> > Reported-by: Andrey Konovalov <andreyk...@google.com>
>> >> > Tested-by: Andrey Konovalov <andreyk...@google.com>
>> >> > Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com>
>> >>
>> >> Applied and queued up for -stable, thanks.
>> >
>> > Although I now have on latest net tree these messages when I reboot my
>> > test machine.
>> >
>> > [ 224.085873] unregister_netdevice: waiting for eth0 to become free.
>> > Usage count = 43
>>
>> Strange, the refcounting looks quite OK in the patch you're quoting.
>> I looked over it a few times and cannot figure out a possible cause
>> there.

Eric, how did you produce it? I guess it's because of nh_dev which
is the only netdevice pointer inside fib_info. Let me take a deeper
look.

>>
>> I am assuming you are quite confident it is this change?
>
> At least, reverting the patch resolves the issue for me.
>
> Keeping fib (and their reference to netdev) is apparently too much,
> we probably need to implement a refcount on the metrics themselves,
> being stand alone objects.

I don't disagree, just that it may need to change too much code
which goes beyond a stable candidate.

Thanks for the bug report!
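
To make the "refcount on the metrics themselves" idea concrete, here is a rough userspace sketch, not kernel code. Every name in it (struct metrics_blob, metrics_alloc(), metrics_get(), metrics_put(), NUM_METRICS) is hypothetical and made up for illustration. It assumes the metrics array becomes a standalone object with its own counter, so a dst would pin only the metrics and not the whole fib_info along with its netdevice reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_METRICS 16 /* placeholder size, an assumption for this sketch */

/* Hypothetical standalone metrics object that carries its own refcount. */
struct metrics_blob {
	atomic_int refcnt;
	uint32_t values[NUM_METRICS];
};

/* Allocate the object with one reference, held by the creator
 * (the role fib_info plays in the discussion above). */
static struct metrics_blob *metrics_alloc(const uint32_t *init)
{
	struct metrics_blob *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	atomic_init(&m->refcnt, 1);
	if (init)
		memcpy(m->values, init, sizeof(m->values));
	return m;
}

/* Take an extra reference (the role of a dst pointing at the metrics). */
static struct metrics_blob *metrics_get(struct metrics_blob *m)
{
	atomic_fetch_add(&m->refcnt, 1);
	return m;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int metrics_put(struct metrics_blob *m)
{
	int old = atomic_fetch_sub(&m->refcnt, 1);

	assert(old > 0); /* catch an unbalanced put */
	if (old == 1) {
		free(m);
		return 1;
	}
	return 0;
}
```

With something along these lines, the creator could drop its reference while a second holder keeps using the metrics, and only the last metrics_put() would free them. No device reference is involved in the object's lifetime at all.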