Re: Refcount mismatch when unregistering netdevice from kernel

2021-02-11 Thread Alexei Starovoitov
On Thu, Feb 11, 2021 at 5:28 PM Jakub Kicinski wrote:
>
> On Thu, 11 Feb 2021 11:21:26 -0800 Alexei Starovoitov wrote:
> > On Tue, Jan 5, 2021 at 11:11 AM Wei Wang wrote:
> > > On Mon, Jan 4, 2021 at 8:58 PM David Ahern wrote:
> > > > On 1/4/21 8:05 PM, stran...@codeaurora.org wrote:
> > > Ah, I

Re: Refcount mismatch when unregistering netdevice from kernel

2021-02-11 Thread Jakub Kicinski
On Thu, 11 Feb 2021 11:21:26 -0800 Alexei Starovoitov wrote:
> On Tue, Jan 5, 2021 at 11:11 AM Wei Wang wrote:
> > On Mon, Jan 4, 2021 at 8:58 PM David Ahern wrote:
> > > On 1/4/21 8:05 PM, stran...@codeaurora.org wrote:
> > Ah, I see now. rt6_flush_exceptions is called by fib6_del_route, but

Re: Refcount mismatch when unregistering netdevice from kernel

2021-02-11 Thread Alexei Starovoitov
On Tue, Jan 5, 2021 at 11:11 AM Wei Wang wrote:
>
> On Mon, Jan 4, 2021 at 8:58 PM David Ahern wrote:
> >
> > On 1/4/21 8:05 PM, stran...@codeaurora.org wrote:
> > >
> > > We're able to reproduce the refcount mismatch after some experimentation
> > > as well.
> > > Essentially, it consists of
> >

Re: Refcount mismatch when unregistering netdevice from kernel

2021-01-04 Thread David Ahern
On 1/4/21 8:05 PM, stran...@codeaurora.org wrote:
>
> We're able to reproduce the refcount mismatch after some experimentation
> as well.
> Essentially, it consists of
> 1) adding a default route (ip -6 route add dev XXX default)
> 2) forcing the creation of an exception route via manually injecti

Re: Refcount mismatch when unregistering netdevice from kernel

2021-01-04 Thread stranche
On 2020-12-11 09:10, David Ahern wrote: Could we further distinguish between dst added to the uncached list by icmp6_dst_alloc() and xfrm6_fill_dst(), and confirm which ones are the ones leaking reference? I suspect it would be the xfrm ones, but I think it is worth verifying. After diggi

Re: Refcount mismatch when unregistering netdevice from kernel

2020-12-11 Thread David Ahern
On 12/10/20 6:12 PM, stran...@codeaurora.org wrote:
>>> BTW, have you tried your previous proposed patch and confirmed it
>>> would fix the issue?
>>>
>
> Yes, we shared this with the customer and the refcount mismatch still
> occurred, so this doesn't seem sufficient either.
>
>>> Could we furth

Re: Refcount mismatch when unregistering netdevice from kernel

2020-12-10 Thread stranche
BTW, have you tried your previous proposed patch and confirmed it would fix the issue? Yes, we shared this with the customer and the refcount mismatch still occurred, so this doesn't seem sufficient either. Could we further distinguish between dst added to the uncached list by icmp6_dst_all

Re: Refcount mismatch when unregistering netdevice from kernel

2020-12-08 Thread stranche
Hi Wei and Eric, Thanks for the replies. This was reported to us on the 5.4.61 kernel during a customer regression suite, so we don't have an exact reproducer unfortunately. From the trace logs we've added it seems like this is happening during IPv6 transport mode XFRM data transfer and the d

Re: Refcount mismatch when unregistering netdevice from kernel

2020-12-08 Thread David Ahern
On 12/8/20 2:51 PM, Wei Wang wrote:
> On Tue, Dec 8, 2020 at 11:13 AM wrote:
>>
>> Hi Wei and Eric,
>>
>> Thanks for the replies.
>>
>> This was reported to us on the 5.4.61 kernel during a customer
>> regression suite, so we don't have an exact reproducer unfortunately.
>> From the trace logs we

Re: Refcount mismatch when unregistering netdevice from kernel

2020-12-08 Thread Eric Dumazet
On 12/8/20 4:55 AM, stran...@codeaurora.org wrote:
> Hi everyone,
>
> We've recently been investigating a refcount problem when unregistering a
> netdevice from the kernel. It seems that the IPv6 module is still holding
> various references to the inet6_dev associated with the main netdevice

Refcount mismatch when unregistering netdevice from kernel

2020-12-07 Thread stranche
Hi everyone, We've recently been investigating a refcount problem when unregistering a netdevice from the kernel. It seems that the IPv6 module is still holding various references to the inet6_dev associated with the main netdevice struct that are not being released, preventing the unregistra