On 19.10.2017 [09:35:19 -0000], Jan Gutter wrote:
> @nacc
>
> Thanks so much for the explanation. I also found
> https://wiki.ubuntu.com/ServerTeam/KnowledgeBase#Merge_Proposals_and_Reviewing
> that details a bit more of the internal processes. As relative outsiders
> to the Ubuntu process, I'd appreciate it very much if you could handle
> that part for Monique's patches. I can be on hand to answer technical
> questions if required.

And to be clear, the MP based workflow for the Git trees is brand new and
experimental :) I'm happy to integrate the updated debdiffs (I'll reply
to those comments directly).

> Regarding the buffer size choice, it's very arbitrary as Phil said. I'm
> pretty sure we came to the same conclusion independently (libvirt and
> libnl had very similar issues) and the workaround is obvious. 32k seems
> to work for 64 VF's (our test case), but breaks with 128 VF's. Not a lot
> of machines can handle 128 concurrent VF's. I typed 64k "just because".
> libvirt+libnl allow message peeking. However, iproute2 uses netlink
> directly. So, implementing a similar idea would require an entirely new
> receive codepath with all the fun of finding out where new exception
> paths occur: something to be done on tip and not suitable for backport
> without thorough vetting.

Absolutely. My concern is the upstream code is at 32k as is Artful. I'm
hesitant to backport something different (64k) to X and T without also
ensuring Artful gets it (and BB when it opens), and presumably also
fixing it upstream.

So I see two routes forward:

1) File an upstream issue to request they bump to 64k, as you note 32k
is insufficient for 128 VFs. Link to that issue in this bug and we'll
fix AA, X and T with the suggested change (presuming upstream acks it).

2) Backport the upstream change as-is to X and T (AA already has the
necessary fix). This will be faster, of course, but does mean the 128 VF
case is broken. Given that it is less likely to be hit in the field,
perhaps that is ok -- and in the meanwhile, upstream can work on a
proper fix which, when available, we can backport accordingly (or decide
at that point, in any case).

I prefer 2), because I do not like diverging from upstream (or at least
not without an upstream bug report). If you and Monique are ok with 2),
I can update the debdiffs before sponsoring them.

> I'm sure it'll save a lot of time once the kinks have been worked out of
> the automation, backports are quite the double-edged sword.

Definitely :)

--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to iproute2 in Ubuntu.
https://bugs.launchpad.net/bugs/1720126

Title:
  [ip link] Message truncated error for large number of passthrough VFs

Status in iproute2 package in Ubuntu:
  Fix Released
Status in iproute2 source package in Trusty:
  New
Status in iproute2 source package in Xenial:
  Confirmed
Status in iproute2 source package in Zesty:
  Fix Released
Status in iproute2 package in CentOS:
  Unknown

Bug description:
  [Impact]

  When querying a Physical Function netdev with a large amount of VF's
  (more than 30), the resulting return message can overflow the 16K
  netlink message buffer. This can be fixed by enabling message peeking
  on the socket and resizing the buffer on receive, or by simply
  enlarging the receive buffer. Since there's an upper limit to the
  number of VF's per PF, it's relatively sane to just enlarge the
  receive buffer. Please see the attached patch.

  [Test Case]

  # Set up 60 VF's on an SR-IOV device
  ip link show > /dev/null

  Observe the following:
  Message truncated
  Message truncated
  Message truncated

  [Regression Potential]

  1) Applications relying on the broken behaviour will need to be
  updated, but it would be a really dubious use case.
  2) Increasing the rx buffer size increases the memory footprint (but
  realistically, this is tiny).
  3) Extra processing time is now needed to parse the larger buffer, in
  the case that a call to "ip link" is on the critical time path of an
  application, (called multiple times in a tight loop, for example), it
  would affect load.
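
For readers unfamiliar with the two fixes named under [Impact], here is a rough sketch of the difference between a fixed receive buffer and peek-then-resize. It is not iproute2's code (iproute2 is C and talks to netlink directly). It assumes Linux, uses an AF_UNIX datagram socket pair as a stand-in for the netlink socket, and the 40000-byte payload and all function names are made up for illustration.

```python
import socket


def recv_fixed(sock, bufsize):
    """Receive one datagram into a fixed-size buffer.

    Returns (data, truncated). Bytes beyond bufsize are discarded by the
    kernel, which is the condition reported as "Message truncated".
    """
    data, _ancdata, msg_flags, _addr = sock.recvmsg(bufsize)
    return data, bool(msg_flags & socket.MSG_TRUNC)


def recv_peek_resize(sock):
    """Peek at the next datagram's real length, then read it in full.

    Assumption: Linux semantics, where MSG_PEEK | MSG_TRUNC makes the
    receive call report the full datagram length without dequeuing it.
    """
    probe = bytearray(1)
    needed = sock.recv_into(probe, 1, socket.MSG_PEEK | socket.MSG_TRUNC)
    buf = bytearray(max(needed, 1))  # size the buffer to fit the message
    got = sock.recv_into(buf)
    return bytes(buf[:got])


def demo():
    # Hypothetical stand-in for a large reply; not a real netlink message.
    payload = b"x" * 40000
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        a.send(payload)
        data, truncated = recv_fixed(b, 16 * 1024)  # 16K fixed buffer
        a.send(payload)
        full = recv_peek_resize(b)
        return len(data), truncated, len(full)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

Enlarging the fixed buffer only moves the point at which truncation starts, while peeking adapts to any message size at the cost of an extra system call per message and a new receive path to vet.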

  [Other Info]

  Observed on Ubuntu kernel 4.4.0-93-generic on both 14.04 and 16.04

  ========================================================================
  Ubuntu16 system

  stack@cluster04:~$ lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 16.04.3 LTS
  Release:        16.04
  Codename:       xenial

  stack@cluster04:~$ uname -r
  4.4.0-93-generic

  stack@cluster04:~$ apt-cache policy iproute2
  iproute2:
    Installed: 4.3.0-1ubuntu3.16.04.1
    Version table:
   *** 4.3.0-1ubuntu3.16.04.1 500
          500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages

  ========================================================================
  Ubuntu14 system:

  root@boomslang:~# lsb_release -a
  No LSB modules are available.
  Distributor ID: Ubuntu
  Description:    Ubuntu 14.04.3 LTS
  Release:        14.04
  Codename:       trusty

  root@boomslang:~# uname -r
  4.4.0-96-generic

  root@boomslang:~# apt-cache policy iproute2
  iproute2:
    Installed: 3.12.0-2ubuntu1
    Version table:
   *** 3.12.0-2ubuntu1 0
          500 http://za.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/iproute2/+bug/1720126/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp