On Tue, 25 Sep 2018 14:34:08 +0200 Christian Brauner <christ...@brauner.io> wrote:
> On Tue, Sep 25, 2018, 14:07 Stephen Hemminger <step...@networkplumber.org> > wrote: > > > On Tue, 25 Sep 2018 11:49:10 +0200 > > Christian Brauner <christ...@brauner.io> wrote: > > > > > On Mon, Sep 24, 2018 at 09:19:06PM -0600, David Ahern wrote: > > > > On top of net-next I am see a dmesg error: > > > > > > > > netlink: 16 bytes leftover after parsing attributes in process `ip'. > > > > > > > > I traced it to address lists and commit: > > > > > > > > commit 6ecf4c37eb3e89b0832c9616089a5cdca3747da7 > > > > Author: Christian Brauner <christ...@brauner.io> > > > > Date: Tue Sep 4 21:53:50 2018 +0200 > > > > > > > > ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR > > > > > > > > Per the commit you are trying to guess whether the ancillary header is > > > > an ifinfomsg or a ifaddrmsg. I am guessing you are guessing wrong. > > :-) > > > > > > Well, I currently don't guess at all. :) I'm parsing with struct > > > ifaddrmsg as assumed header size but ignore parsing errors when that > > > fails. You don't get the niceties of the new property if you don't pack > > > > There are legacy parts of netlink interface with kernel. > > The ABI has evolved over time but some old parts are stuck in the past. > > > > > > > > I don't have time to take this to ground, but address listing is not > > the > > > > only area subject to iproute2's SNAFU of infomsg everywhere on dumps. I > > > > have thought about this for route dumps, but its solution does not work > > > > here. You'll need to find something because the current warning on > > every > > > > address dump is not acceptable. > > > > > > Two points before I propose a migitation: > > > > > > 1. The burded of seeing pr_warn_ratelimited() messages in dmesg when > > > userspace is doing something wrong is imho justifiable. > > > Actually, I would argue that we should not hide the problem from > > > userspace at all. The rate-limited (so no logging DOS afaict) warning > > > messages are a perfect indicator that a tool is doing something wrong > > > *without* introducing any regressions. > > > The rtnetlink manpage clearly indicates that ifaddrmsg is supposed to > > > be used too. Additionally, userspace stuffs an ifinfomsg in there but > > > expects to receive ifaddrmsg. They should be warned loudly. :) So I > > > actually like the warning messages. > > > 2. Userspace should be fixed. Especially such an important standard tool > > > as iproute2 that is maintained on git.kernel.org (glibc is already > > > doing the right.). > > > > > > So if people really want to hide this issue as much as we can then we > > > can play the guessing game. I could send a patch that roughly does the > > > following: > > > > > > if (nlmsg_len(cb->nlh) < sizeof(struct ifinfomsg)) > > > guessed_header_len = sizeof(struct ifaddrmsg); > > > else > > > guessed_header_len = sizeof(struct ifinfomsg); > > > > > > This will work since sizeof(ifaddrmsg) == 8 and sizeof(ifinfomsg) == 16. > > > The only valid property for RTM_GETADDR requests is IFA_TARGET_NETNSID. > > > This propert is a __s32 which should bring the message up to 12 bytes > > > (not sure about alignment requiremnts and where we might wend up ten) > > > which is still less than the 16 bytes without that property from > > > ifinfomsg. That's a hacky hacky hack-hack and will likely work but will > > > break when ifaddrmsg grows a new member or we introduce another property > > > that is valid in RTM_GETADDR requests. It also will not work cleanly > > > when users stuff additional properties in there that are valid for the > > > address family but are not used int RTM_GETADDR requests. > > > > > > I would like to hear what other people and davem think we should do. > > > Patch it away or print the warning. > > > > > > Christian > > > > You can't break old programs. That is one of the rules of kernel. > > Therefore, please either revert the kernel change or put the new attribute > > in a place where old versions do not cause problem. > > > > Sorry, there's a misunderstanding here. The code doesn't regress anything. > The patch was written in a backward compatible way. The only thing it > causes are rate-limited logging messages when the wrong struct is passed. > But the request still succeeds. The issue is with the logging afaict. > > Christian That still means enterprise distributions that use the current kernel will get customer complaints. You need to remove the warning.