Re: [lxc-users] LXD 2.14 - Ubuntu 16.04 - kernel 4.4.0-57-generic - SWAP continuing to grow

Marat Khalili Sat, 15 Jul 2017 09:27:05 -0700

Marat/Fajar:  How many servers do you guys have running in production, and what 
are their characteristics (RAM, CPU, workloads, etc)?

I have to admit I'm not running a farm; I administer a few, but they are all different depending on the task. Still, even the smallest has 64GB RAM. In 2017, 8GB is small even for a user notebook IMO.

After digging into this a bit, it seems “top”, “htop”, and “free” report similar 
swap usage, however, other tools report much less swap usage.

Yes, this is known; they get confused in containers. Run them on the host to produce more meaningful results.

All that said, the real issue is to find out if one of our containers/processes 
has a memory leak (per Marat’s suggestion below).  Unfortunately, LXD does not 
provide an easy way to track per-container stats, thus we must “roll our own” 
tools.

Here's a typical top output (on the host system with 19 LXC containers currently running):

top - 16:00:01 up 12 days, 10:35,  5 users,  load average: 0.67, 0.58, 0.61
Tasks: 501 total,   2 running, 499 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.8 us,  1.4 sy,  0.0 ni, 91.5 id,  1.1 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem : 65853268 total,   379712 free,  8100284 used, 57373272 buff/cache
KiB Swap: 24986931+total, 24782081+free,  2048496 used. 56852384 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 6671 root      20   0 5450952 3.728g   1564 S   0.3  5.9  93:14.29 qemu-system-x86
 6670 root      20   0 5411084 2.073g   1456 S   0.0  3.3  33:32.07 qemu-system-x86
 6979 999       20   0 5251132 244532  19436 S   0.0  0.4 101:47.88 drwcsd.real
 4338 lxd       20   0 1968400 229004   8052 S   5.3  0.3 639:52.03 mysqld
 8135 root      20   0 6553852 198224   4280 S   0.0  0.3  41:52.66 java
 4231 root      20   0  150072  99596  99472 S   0.0  0.2   0:19.43 systemd-journal

It shows all processes, including those running in containers (the first 5 are). I sorted by RES/%MEM; in your case I'd also try sorting by VIRT. I don't know how to directly find a process that occupies much swap, but most likely it will have high RES and VIRT values too. As soon as you find the problem processes, it is trivial to find the container they run in with ps -AFH or pstree -p on the host system. (Note that user names and PIDs are different inside and outside of containers, so don't rely on them.)

I don't have much experience with LXD, but I suppose it's the same in this respect.
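
On finding swap-heavy processes directly: Linux exposes a VmSwap line in /proc/<pid>/status, so a small script run as root on the host can rank processes by swap. The following is only a minimal sketch and not something from the original thread; the function names and the top-10 cut-off are assumptions made for illustration. Note that VmSwap covers anonymous private memory only, so swapped-out shared memory is not counted.

```python
#!/usr/bin/env python3
"""Rank processes by swap usage using the VmSwap field of /proc/<pid>/status."""
import os


def parse_vmswap_kb(status_text):
    """Return the VmSwap value in kB from the text of a status file, or 0."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # Line format: "VmSwap:      1234 kB"
            return int(line.split()[1])
    return 0  # kernel threads have no VmSwap line


def parse_name(status_text):
    """Return the process name from the Name: line, or "?" if absent."""
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            parts = line.split(None, 1)
            return parts[1].strip() if len(parts) > 1 else "?"
    return "?"


def swap_by_process(proc_root="/proc"):
    """Return [(swap_kb, pid, name), ...] sorted by swap usage, largest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status"),
                      errors="replace") as f:
                text = f.read()
        except OSError:
            continue  # process exited meanwhile, or is not readable
        swap_kb = parse_vmswap_kb(text)
        if swap_kb > 0:
            rows.append((swap_kb, int(entry), parse_name(text)))
    rows.sort(reverse=True)
    return rows


if __name__ == "__main__":
    # Print the ten processes with the most swapped-out memory.
    for swap_kb, pid, name in swap_by_process()[:10]:
        print(f"{swap_kb:>10} kB  {pid:>7}  {name}")
```

The PIDs printed are host PIDs, so they can be fed straight into ps -AFH or pstree -p on the host.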


--

With Best Regards,
Marat Khalili

On 15/07/17 18:48, Ron Kelley wrote:

Thanks for the great replies.

Marat/Fajar: How many servers do you guys have running in production, and what
are their characteristics (RAM, CPU, workloads, etc)? I am trying to see if
our systems generally align to what you are running. Running without swap
seems rather drastic and removes the “safety net” in the case of a bad program.
In the end, we must have all containers/processes running 24/7.

tldr;
----
After digging into this a bit, it seems “top”, “htop”, and “free” report similar
swap usage, however, other tools report much less swap usage. I found the
following threads on the ‘net which include simple scripts to look in /proc and
examine swap usage per process:

https://stackoverflow.com/questions/479953/how-to-find-out-which-processes-are-swapping-in-linux
https://www.cyberciti.biz/faq/linux-which-process-is-using-swap

As some people pointed out, top/htop don’t accurately report the swap usage as
they combine a number of memory fields together. And, indeed, running the
script in each container (looking at /proc) shows markedly different results
when all the numbers are added up. For example, the “free” command on one of
our servers reports 3G of swap in use, but the script that scans the /proc
directory only shows 1.3G of real swap in use. Very odd.
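
One plausible contributor to that gap (an assumption, not something verified on these servers): “free” takes its swap figure from the system-wide SwapTotal and SwapFree counters in /proc/meminfo, while the per-process scripts add up VmSwap, which leaves out swapped-out shared memory. A minimal Python sketch for computing the system-wide number to compare against; the function names are made up for illustration:

```python
#!/usr/bin/env python3
"""Compute system-wide swap usage from /proc/meminfo (sketch)."""


def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: value}; sizes are in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def swap_used_kb(info):
    """Swap in use the way free computes it: SwapTotal - SwapFree."""
    return info["SwapTotal"] - info["SwapFree"]


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("swap used (kB):  ", swap_used_kb(info))
    print("swap cached (kB):", info.get("SwapCached", 0))
```

Running this on the host next to a per-process VmSwap total shows how much swap is not attributed to any single process.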

All that said, the real issue is to find out if one of our containers/processes
has a memory leak (per Marat’s suggestion below). Unfortunately, LXD does not
provide an easy way to track per-container stats, thus we must “roll our own”
tools.
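
A rough idea of what such a “roll our own” tool could look like when run on the host: group each process's VmSwap by the container named in /proc/<pid>/cgroup. This is a sketch only. It assumes a cgroup v1 layout in which container processes sit under a path like /lxc/<container-name> (this may differ between LXC/LXD versions), and the helper names are hypothetical.

```python
#!/usr/bin/env python3
"""Sum VmSwap per LXC/LXD container, run on the host (sketch)."""
import os
import re
from collections import defaultdict

# Assumption: cgroup v1 lines such as "4:memory:/lxc/web01" or deeper paths.
_LXC_CGROUP = re.compile(r":/lxc/([^/\s]+)")


def container_of(cgroup_text):
    """Return the container name found in cgroup text, or "(host)"."""
    match = _LXC_CGROUP.search(cgroup_text)
    return match.group(1) if match else "(host)"


def vmswap_kb(status_text):
    """Return the VmSwap value in kB from status text, or 0 if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            return int(line.split()[1])
    return 0


def _read(path):
    """Read a /proc file, returning "" if the process has gone away."""
    try:
        with open(path, errors="replace") as f:
            return f.read()
    except OSError:
        return ""


def swap_per_container(proc_root="/proc"):
    """Return {container name: total VmSwap in kB} over all processes."""
    totals = defaultdict(int)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        swap = vmswap_kb(_read(os.path.join(base, "status")))
        if swap:
            name = container_of(_read(os.path.join(base, "cgroup")))
            totals[name] += swap
    return dict(totals)


if __name__ == "__main__":
    ranked = sorted(swap_per_container().items(), key=lambda kv: -kv[1])
    for name, kb in ranked:
        print(f"{kb:>10} kB  {name}")
```

Run periodically (for example from cron) and logged, the per-container totals would show which container's swap usage keeps growing.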

-Ron

On Jul 15, 2017, at 5:11 AM, Marat Khalili <[email protected]> wrote:

I'm using LXC, and I frequently observe some unused containers get swapped out, 
even though the system has plenty of RAM and no RAM limits are set. The only bad 
effect I observe is a couple of seconds' delay when you first log into them after 
some time. I guess it is absolutely normal since the kernel tries to maximize 
the amount of memory available for disk caches.

If you don't like this behavior, instead of trying to fine tune kernel 
parameters why not disable swap altogether? Many people run it this way, it's 
mostly a matter of taste these days. (But first check your software for leaks.)

For example, our “server-4” machine shows 8G total RAM, 500MB free, 2.5G 
available, and 5G of buff/cache. Yet, swap is at 5.5GB and has been slowly 
growing over the past few days. It seems something is preventing the apps from 
using the RAM.

Did you identify what processes all this virtual memory belongs to?

To be honest, we have been battling lots of memory/swap issues using LXD. We 
started with no tuning, but the app stack quickly ran out of memory.

LXC/LXD is hardly responsible for your app stack memory usage. Either you 
underestimated it or there's a memory leak somewhere.

Given all the issues we have had with memory and swap using LXD, we are 
seriously considering moving back to the traditional VM approach until LXC/LXD 
is better “baked”.

Did your VMs use less memory? I don't think so. Limits could be better 
enforced, but VMs don't magically give you infinite RAM.
--

With Best Regards,
Marat Khalili

On July 14, 2017 9:58:57 PM GMT+03:00, Ron Kelley <[email protected]> wrote:
Wondering if anyone else has similar issues.

We have 5x LXD 2.12 servers running (U16.04 - kernel 4.4.0-57-generic - 8G RAM, 19G 
SWAP).  Each server is running about 50 LXD containers - Wordpress w/Nginx and PHP7.  
The servers have been running for about 15 days now, and swap space continues to grow.  
In addition, the kswapd0 process starts consuming CPU until we flush the system cache 
via the "/bin/echo 3 > /proc/sys/vm/drop_caches" command.

Our LXD profile looks like this:
-------------------------
config:
   limits.cpu: "2"
   limits.memory: 512MB
   limits.memory.swap: "true"
   limits.memory.swap.priority: "1"
-------------------------


We also have added these to /etc/sysctl.conf
-------------------------
vm.swappiness=10
vm.vfs_cache_pressure=50
-------------------------

A quick “top” output shows plenty of available Memory and buff/cache.  But, for 
some reason, the system continues to swap out the app.  For example, our 
“server-4” machine shows 8G total RAM, 500MB free, 2.5G available, and 5G of 
buff/cache.  Yet, swap is at 5.5GB and has been slowly growing over the past 
few days.  It seems something is preventing the apps from using the RAM.


To be honest, we have been battling lots of memory/swap issues using LXD.  We 
started with no tuning, but the app stack quickly ran out of memory.  After 
editing the profile to allow 512MB RAM per container (and restarting the 
container), the kswapd0 issue happens.  Given all the issues we have had with 
memory and swap using LXD, we are seriously considering moving back to the 
traditional VM approach until LXC/LXD is better “baked”.


-Ron

lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users
