Control: retitle 818362 connection timeouts while a large number of containers boot
Control: forwarded 818362 https://bugs.freedesktop.org/show_bug.cgi?id=74788

On Thu, 17 Mar 2016 at 12:47:06 +0100, Harald Dunkel wrote:
> > How many LXC containers are you booting, on what hardware, and what service
> > is connecting to the system bus and getting rejected?
>
> My test case is 31 containers on a quad core (+ht) Xeon E5420 CPU.
> The production machine is a 2 * 6core (+ht) Xeon E5-2630 running
> about 20 containers. On both hosts it can take 5 or 10 minutes until
> the last container gets its IP address via network (DHCP).

It might be a useful workaround to stagger startup so not everything is
starting at the same time? I'm a little surprised this is necessary,
though; the timeout is reasonably generous, and the handshake that the
connections have to do before the timeout is hit is relatively small
and shouldn't involve any significant I/O or computation.

> Wouldn't you agree that a high watermark on the number of used
> connection slots to enable the timeout restriction would have been
> a better choice?

Thanks, I've noted that suggestion upstream on
<https://bugs.freedesktop.org/show_bug.cgi?id=74788>.

Because this was treated as a security issue, the initial solution
was developed under embargo and designed to be minimal/targeted,
but that doesn't mean we can't improve on it later. I might not be
able to implement this soon, but I'd be happy to review patches from
anyone interested in making this more scalable.

> Probably its reasonable to ignore the timeout for uid0, but surely it
> will take some time till this change appears in a future Debian release.

This is the price we have to pay for a stable distribution: we avoid
changing anything non-critical in stable because it might introduce
a regression, but then non-critical bugs don't get fixed for a while.

If someone improves this in development versions of dbus-daemon,
there'll at least be something that you could backport locally if you
have machines that are hit particularly badly (like the LXC host
you've described).

    S
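
To make the high-watermark suggestion concrete, here is a rough sketch
of the policy in Python. It is not dbus-daemon code and does not
reflect how the daemon is structured; the names PendingConnection and
connections_to_drop, and the values for auth_timeout and watermark,
are all hypothetical and chosen only for illustration. The idea is
that the authentication timeout is only enforced once the number of
used connection slots reaches some fraction of the maximum.

```python
from dataclasses import dataclass


@dataclass
class PendingConnection:
    """A connection that has not finished its handshake (hypothetical type)."""
    connected_at: float  # seconds on a monotonic clock


def connections_to_drop(pending, now, used_slots, max_slots,
                        auth_timeout=30.0, watermark=0.5):
    """Return the pending connections that should be disconnected.

    auth_timeout and watermark are illustrative values, not the
    defaults of any real bus configuration.
    """
    # Below the watermark there is no pressure on connection slots,
    # so slow handshakes are tolerated and nothing is dropped.
    if used_slots < watermark * max_slots:
        return []
    # At or above the watermark, enforce the timeout as usual.
    return [c for c in pending if now - c.connected_at > auth_timeout]
```

With these assumed values, a connection that has been waiting for 31
seconds is left alone while 10 of 256 slots are in use, and is dropped
once 200 of 256 slots are in use.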

