The systemtap script below can be used to monitor the dst count for all net namespaces. When any of the counts goes significantly negative (below -(32 * CPUS)), the bug has been reproduced: counts from one net namespace were incorrectly shifted to another net namespace, and once that happens enough times, one or more net namespaces end up with a negative count (which should not be possible), while other net namespaces have counts that are much higher than they should be. Note that this script covers only ipv4, but the bug also exists for ipv6 (and the patch fixes ipv6 as well).
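As a companion to the stap script that follows, the threshold check described above can be automated. The sketch below is only an illustration, not part of the original report: it assumes the script's output (one line of space-separated counts per second) is available as text lines, and the names `BATCH` and `find_negative` are made up for this example.

```python
BATCH = 32  # per-CPU slack, taken from the "32 * CPUS" rule of thumb above


def find_negative(lines, ncpus):
    """Yield (line_number, counts) for every output line of the stap script
    that contains at least one dst count below -(BATCH * ncpus)."""
    limit = -BATCH * ncpus
    for lineno, line in enumerate(lines, start=1):
        # Each line holds one count per net namespace, separated by spaces.
        counts = [int(tok) for tok in line.split()]
        bad = [c for c in counts if c < limit]
        if bad:
            yield lineno, bad
```

One possible use is to pipe the stap output into a small wrapper that calls `find_negative(sys.stdin, os.cpu_count())` and prints each hit; any hit suggests the bug has been reproduced.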
#!/usr/bin/stap

global dst_count

probe kernel.function("xfrm_resolve_and_create_bundle") {
    if ($family == 2) {
        dst_count[&$pols[0]->xp_net] = $pols[0]->xp_net->xfrm->xfrm4_dst_ops->pcpuc_entries->count
    }
}

probe timer.sec(1) {
    foreach (c in dst_count) {
        printf("%ld ", dst_count[c])
    }
    print("\n")
}

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1486670

Title:
  using ipsec, many connections result in no buffer space error

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Precise:
  Invalid
Status in linux source package in Trusty:
  Fix Committed
Status in linux source package in Vivid:
  Fix Committed
Status in linux source package in Wily:
  Fix Committed

Bug description:
  Reproduction info:

  set up two LXC containers (although this probably isn't specific to
  LXC containers), and inside each setup ipsec with something similar
  to:

  conn nodeN
      aggressive=yes
      authby=secret
      auto=start
      closeaction=restart
      dpdaction=restart
      esp=aes256-aes256gmac-modp1024
      ike=aes256-sha512-modp1024
      keyexchange=ikev2
      left=10.0.3.145
      leftid=10.0.3.145
      lifetime=12h
      reauth=no
      right=10.0.3.199
      type=transport

  then repeatedly open connections to the peer, e.g.:

  while true; do ping -c1 10.0.3.199 ; sleep 0.1 ; done

  eventually, the connections will fail with:

  connect: No buffer space available

  the reproduction can be sped up by reducing the xfrm4_gc_thresh,
  e.g.:

  echo 5 > /proc/sys/net/ipv4/xfrm4_gc_thresh

  Once the error occurs, no more connections can be made to the peer
  (all fail with no buffer space available), however after a long
  period (e.g. overnight) the buffers will be cleaned up and connections
  can be made again.

  this happens even on the latest net-next kernel.