Greetings, Adrian Pepper!

> I'll start this lengthy message with a table-of-contents of sorts.

Next time, please post a new message when you open a new thread to the list.

> === Only a limited number of containers could run usefully ===
>
> I had had problems on my workstation running more than about 10
> containers; subsequent ones would show as RUNNING, but have no IP
> address.  lxc-attach suggested /sbin/init was actually hung, with
> no apparent way to recover them.  I used to resort to shutting down
> lesser-needed containers to allow new ones to run usefully.
>
> Then one day, I decided to try and pursue the problem a little harder.
>
> === github lxc/lxd production-setup.md ===
>
> Eventually, mostly by checking my mbox archive of this list
> ([email protected]), I stumbled on...
>
>     https://github.com/lxc/lxd/blob/master/doc/production-setup.md
>
> It's not clear to me what the context of that document really is.
> Does it end up in the contents of lxd?  (I still use lxc.)
> But even referenced directly from the git repository, it still
> provides useful information.
>
> I summarized that production-setup.md for myself...
>
> /etc/security/limits.conf
>     # <domain>  <type>  <item>    <value>    (default)
>     *           soft    nofile    1048576    unset
>     *           hard    nofile    1048576    unset
>     root        soft    nofile    1048576    unset
>     root        hard    nofile    1048576    unset
>     *           soft    memlock   unlimited  unset
>     *           hard    memlock   unlimited  unset
>
> /etc/sysctl.conf (effective)
>     fs.inotify.max_queued_events       1048576  # def:16384
>     fs.inotify.max_user_instances      1048576  # def:128
>     fs.inotify.max_user_watches        1048576  # def:8192
>     vm.max_map_count                    262144  # def:65530, max mmap areas per proc
>     kernel.dmesg_restrict                    1  # def:0
>     net.ipv4.neigh.default.gc_thresh3     8192  # def:1024, arp table limit
>     net.ipv6.neigh.default.gc_thresh3     8192  # def:1024, arp table limit
>     kernel.keys.maxkeys                   2000  # def:200, non-root key limit;
>                                                 # s.b. > number of containers
>     net.core.netdev_max_backlog     "increase"  # doc suggests 182757 (from 1000!)
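As an aside, the sysctl ones take effect without a reboot.  A minimal
sketch of applying and verifying one of them, using your quoted value
as the example:

    # (as root) apply a value immediately
    sysctl -w fs.inotify.max_user_watches=1048576

    # persist across reboots: add "key = value" lines to
    # /etc/sysctl.conf, then reload that file
    sysctl -p

    # verify what is currently in effect
    sysctl fs.inotify.max_user_watches

Note the limits.conf entries are different: they only take effect for
new login sessions, not for already-running ones.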
> During this time of my most recent investigation, I had happened to
> suspect fs.inotify.max_user_watches (because a "tail" I ran indicated
> that it could not use "inotify" and needed to poll instead).
> (Hey, there I sound like a natural kernel geek, but actually I needed
> to do a few web searches to correlate the tail diagnostic to the setting.)
>
> production-setup.md also has suggestions about txqueuelen, but I will
> assume for now those apply only to systems wanting to generate or
> receive a lot of real network traffic.
>
> === Recommended values seem arbitrary, perhaps excessive in some cases ===
>
> In the suggestions above, 1048576 is 1024*1024 and seems very arbitrary.
> Hopefully, this is mostly increasing the size of edge-pointer tables,
> and so doesn't consume a lot of memory unless the resources do get
> close to the maximum.  I actually used smaller values, a little more
> in line with the proportions of the defaults (shown above):
>
>     cscf-adm@scspc578-1804:~$ grep '^' /proc/sys/fs/inotify/*
>     /proc/sys/fs/inotify/max_queued_events:262144
>     /proc/sys/fs/inotify/max_user_instances:131072
>     /proc/sys/fs/inotify/max_user_watches:262144
>     cscf-adm@scspc578-1804:~$
>
> Searching for more info about netdev_max_backlog found
>
>     https://community.mellanox.com/s/article/linux-sysctl-tuning
>
> which suggests raising net.core.netdev_max_backlog to 250000, so I
> went with that.  I still haven't figured out the significance of
> 182757, the apparent product of two primes, 3 * 60919.  Nor can I
> see any significance to any of its near-adjacent numbers.
>
> After applying changes similar to the above, I observed very good
> results.  Whereas before I seemed to run into problems at around 12
> containers, I am currently running 17, and have run more.

It would be useful if you could discover/describe some direct ways to
investigate limits congestion.  That would be way more helpful for
tuning container host systems to specific needs.
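For a few of these, current usage can be compared against the limit
directly.  A sketch of the ones I know off-hand (run as root, or the
counts will be incomplete):

    # inotify instances currently open (system-wide; the limit itself
    # is per user), vs. the configured maximum
    find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l
    cat /proc/sys/fs/inotify/max_user_instances

    # ptys allocated vs. allowed
    cat /proc/sys/kernel/pty/nr /proc/sys/kernel/pty/max

    # IPv4 neighbour (ARP) table entries vs. the gc_thresh3 ceiling
    ip -4 neigh show | wc -l
    cat /proc/sys/net/ipv4/neigh/default/gc_thresh3

    # open file handles: allocated, free, maximum
    cat /proc/sys/fs/file-nr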
> === /sbin/init sometimes missing on apparently healthy containers? ===
>
> Also, I previously observed that the number of /sbin/init processes
> running was significantly fewer than the number of apparently properly
> functional containers.  The good news is that today there are almost
> as many /sbin/init processes running as containers.  The bad news is
> that N(/sbin/init) == N(containers) - 1, whereas I would think it
> should equal N(containers) + 1.
> (That is, I confirmed, by sshing to each container in turn and looking
> in "ps" output for /init, that two containers had no /init running,
> but they both seem to be generally working.)

Were they created from custom images?  What do they report as pid 1?
(See the P.S. below for a quick way to check that from the host.)

> The total number of processes I run is, according to "ps", nearly
> always less than 1000.  (Usually "ps -adelfww".)
>
> I almost wonder if that was a transitory problem in Ubuntu 18.04 which
> gets fixed in the containers as the appropriate dist-upgrade gets done.
>
> === Using USB disk on container-heavy host used to exceed some queue limit ===
>
> One of these changes, probably either net.core.netdev_max_backlog or
> fs.inotify.max_queued_events, seems to have had the pleasant side
> effect of allowing me to write backups to a USB drive without getting
> flakiness in my user interface.  It also removed the diagnostics which
> used to occur in that situation, about some queue limit needing to be
> raised because of observed lost events.

More likely fs.inotify.max_queued_events.

> === My previous pty tweaking now raises a distinct question ===
>
> Another distinct problem caused me to raise /proc/sys/kernel/pty/max.
> Given the apparent value /proc/sys/kernel/pty/reserve:1024,
> does one need to set kernel/pty/max to (N*1024 plus the total number
> of ptys you expect to allocate), where N is the number of containers
> you expect to run concurrently?
>
> /proc/sys/kernel/pty/nr never seems particularly high now.
> (/proc/sys/kernel/pty/max being another of the apparent few system
> limits for which you can monitor the current usage, via pty/nr.)

Now, this is interesting.  I was routinely killing the three login
sessions started inside each container by default, for no apparent
reason.  Seems like I wasn't far off, doing that.

> === Trivial observation re: sysctl which helped me when I noted it ===
>
> "sysctl kernel.pty.max" <=> "cat /proc/sys/kernel/pty/max", sort of.
> I.e. "sysctl A.B.C.D" <=> "cat /proc/sys/A/B/C/D"

Yep.  sysctl is a sort of wrapper; you can achieve similar results to
sysctl / sysctl -w with a simple cat/echo to the respective "files"
in /proc.


--
With best regards,
Andrey Repin
Saturday, September 14, 2019 8:55:58

Sorry for my terrible english...
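P.S.  Re the /sbin/init question: rather than sshing into each
container, something along these lines should show what every running
container reports as pid 1 (an untested sketch, plain lxc tools
assumed):

    #!/bin/sh
    # for each running container, print its name and the command
    # name of pid 1 as seen from inside that container
    for c in $(lxc-ls --running); do
        printf '%s: ' "$c"
        lxc-attach -n "$c" -- ps -p 1 -o comm=
    done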

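P.P.S.  To make the sysctl <=> /proc/sys equivalence concrete (8192 is
just an example value here):

    # read, two equivalent ways
    sysctl kernel.pty.max
    cat /proc/sys/kernel/pty/max

    # write, two equivalent ways; immediate, but lost on reboot
    # unless also persisted in /etc/sysctl.conf
    sysctl -w kernel.pty.max=8192
    echo 8192 > /proc/sys/kernel/pty/max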