On 12/10/20 6:12 PM, stran...@codeaurora.org wrote:
>>> BTW, have you tried your previous proposed patch and confirmed it
>>> would fix the issue?
>>>
> 
> Yes, we shared this with the customer and the refcount mismatch still
> occurred, so this doesn't seem sufficient either.
> 
>>> Could we further distinguish between dst added to the uncached list by
>>> icmp6_dst_alloc() and xfrm6_fill_dst(), and confirm which ones are the
>>> ones leaking reference?
>>> I suspect it would be the xfrm ones, but I think it is worth verifying.
>>>
> 
> After digging into the DST allocation/destroy a bit more, it seems that
> there are some cases where the DST's refcount does not hit zero, causing
> them to never be freed and release their references.
> One case comes from here on the IPv6 packet output path (these DST
> structs would hold references to both the inet6_dev and the netdevice)
> ip6_pol_route_output+0x20/0x2c -> ip6_pol_route+0x1dc/0x34c ->
> rt6_make_pcpu_route+0x18/0xf4 -> ip6_rt_pcpu_alloc+0xb4/0x19c

This is the normal data path, and this refers to a per-cpu dst cache.
Delete the route and the cached entries get removed.

> 
> We also see two DSTs where they are stored as the xdst->rt entry on the
> XFRM path that do not get released. One is allocated by the same path as
> above, and the other like this
> xfrm6_esp_err+0x7c/0xd4 -> esp6_err+0xc8/0x100 ->
> ip6_update_pmtu+0xc8/0x100 -> __ip6_rt_update_pmtu+0x248/0x434 ->
> ip6_rt_cache_alloc+0xa0/0x1dc

This entry goes into an exception cache. I have lost track of kernel
versions and features. Try listing the route cache to see these:  ip -6
ro ls cache

Reply via email to