On 7/29/20 5:43 AM, Ido Schimmel wrote:
> On Tue, Jul 28, 2020 at 05:52:44PM -0700, Ashutosh Grewal wrote:
>> Hello David and all,
>>
>> I hope this is the correct way to report a bug.
>
> Sure
>
>>
>> I observed this problem with 256 v4 next-hops or 128 v6 next-hops (or
>> 128 or so # of v4 next-hops with labels).
>>
>> Here is an example -
>>
>> root@a6be8c892bb7:/# ip route show 2.2.2.2
>> Error: Buffer too small for object.
>> Dump terminated
>>
>> Kernel details (though I recall running into the same problem on 4.4*
>> kernel as well) -
>> root@ubuntu-vm:/# uname -a
>> Linux ch1 5.4.0-33-generic #37-Ubuntu SMP Thu May 21 12:53:59 UTC 2020
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> I think the problem may be to do with the size of the skbuf being
>> allocated as part of servicing the netlink request.
>>
>> static int netlink_dump(struct sock *sk)
>> {
>> <snip>
>>
>> skb = alloc_skb(...)
>
> Yes, I believe you are correct. You will get an skb of size 4K and it
> can't fit the entire RTA_MULTIPATH attribute with all the nested
> nexthops. Since it's a single attribute it cannot be split across
> multiple messages.
yep, well known problem.

>
> Looking at the code, I think a similar problem was already encountered
> with IFLA_VFINFO_LIST. See commit c7ac8679bec9 ("rtnetlink: Compute and
> store minimum ifinfo dump size").
>
> Maybe we can track the maximum number of IPv4/IPv6 nexthops during
> insertion and then consult it to adjust 'min_dump_alloc' for
> RTM_GETROUTE.

That seems better than the current design for GETLINK which walks all
devices to determine max dump size. Not sure how you will track that
efficiently though - add is easy, delete is not.

>
> It's a bit complicated for IPv6 because you can append nexthops, but I
> believe anyone using so many nexthops is already using RTA_MULTIPATH to
> insert them, so we can simplify.

I hope so.

>
> David, what do you think? You have a better / simpler idea? Maybe one
> day everyone will be using the new nexthop API and this won't be needed
> :)

exactly. You won't have this problem with separate nexthops since each
one is small (< 4k) and the group (multipath) is a set of ids, not the
full set of attributes.
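
To put rough numbers on the size difference, here is a back-of-the-envelope
sketch in user-space C (not kernel code). It assumes each legacy nexthop is
dumped as a struct rtnexthop plus a single RTA_GATEWAY attribute, and that
each group member is one struct nexthop_grp. The byte sizes are hard-coded
from my reading of the uapi struct layouts, and the helper names are made up
for illustration. Encap/label attributes and the rest of the route message
are ignored, so these are lower bounds for the one attribute only.

```c
#include <stddef.h>

/* Sizes assumed from the uapi headers, hard-coded so this builds anywhere. */
enum {
    ATTR_HDR_LEN = 4, /* struct rtattr / nlattr: u16 len + u16 type */
    RTNH_LEN     = 8, /* struct rtnexthop: u16 len, u8 flags, u8 hops, int ifindex */
    NH_GRP_LEN   = 8, /* struct nexthop_grp: u32 id, u8 weight, 3 reserved bytes */
    ATTR_ALIGNTO = 4  /* netlink attributes are padded to 4 bytes */
};

/* Round an attribute length up to the 4-byte netlink alignment. */
size_t attr_align(size_t len)
{
    return (len + ATTR_ALIGNTO - 1) & ~(size_t)(ATTR_ALIGNTO - 1);
}

/*
 * Hypothetical helper: lower bound on the size of one RTA_MULTIPATH
 * attribute carrying nhops nexthops, each with only a gateway address
 * of gw_addr_len bytes (4 for IPv4, 16 for IPv6).
 */
size_t multipath_attr_len(size_t nhops, size_t gw_addr_len)
{
    size_t per_hop = RTNH_LEN + attr_align(ATTR_HDR_LEN + gw_addr_len);

    return ATTR_HDR_LEN + nhops * per_hop;
}

/* Hypothetical helper: size of one NHA_GROUP attribute listing members ids. */
size_t nh_group_attr_len(size_t members)
{
    return ATTR_HDR_LEN + members * NH_GRP_LEN;
}
```

Under those assumptions multipath_attr_len(256, 4) is 4100 bytes, already
past 4K for the attribute by itself, and multipath_attr_len(128, 16) is 3588
bytes before anything else in the message is counted. nh_group_attr_len(256)
is 2052 bytes, since a group only carries ids and weights.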