High kswapd
Good morning,

Ubuntu 16.04.6 LTS
PostgreSQL 9.6.5

On one of our database servers, we're regularly seeing kswapd at the top of "top" output, regularly using over 50 %CPU. We should have well over 80GB of available memory according to "free -m".

# free -m
              total        used        free      shared  buff/cache   available
Mem:         125910       41654         820         857       83435       82231
Swap:           511         448          63

We've already got vm.swappiness and vm.zone_reclaim_mode set to 0, and NUMA is disabled from what I can see:

# dmesg | grep -i numa
[0.00] No NUMA configuration found

We are using HugePages and things look good there as well.

Curious what would be causing kswapd to run hot like it is, or if it is a red herring as I look into high CPU usage on this box (although, again, it is the single-highest CPU user).

--
Don Seiler
www.seiler.us
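One way to tell whether kswapd is really reclaiming pages, rather than just showing up in top, is to watch the kernel's reclaim counters. The following is a minimal sketch, assuming /proc/vmstat exposes counters whose names begin with pgscan_kswapd (split per zone on older kernels, a single line on newer ones). The function names kswapd_scanned and kswapd_scan_rate are made up for illustration.

```shell
#!/bin/sh
# Sum every counter whose name starts with pgscan_kswapd.
# Reads /proc/vmstat unless another vmstat-format file is given as $1.
kswapd_scanned() {
    awk '/^pgscan_kswapd/ { total += $2 } END { printf "%.0f\n", total }' \
        "${1:-/proc/vmstat}"
}

# Print the average number of pages kswapd scanned per second over an
# interval (default 10 seconds), using two samples of /proc/vmstat.
kswapd_scan_rate() {
    interval="${1:-10}"
    before=$(kswapd_scanned)
    sleep "$interval"
    after=$(kswapd_scanned)
    echo $(( (after - before) / interval ))
}
```

Running kswapd_scan_rate 10 during a CPU spike and again during a quiet period would show whether the kswapd CPU time lines up with actual page scanning.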
Re: High kswapd
On Mon, Apr 13, 2020 at 09:34:23AM -0500, Don Seiler wrote:
> Good morning,
>
> Ubuntu 16.04.6 LTS
> PostgreSQL 9.6.5
>
> On one of our database servers, we're regularly seeing kswapd at the top of
> "top" output, regularly using over 50 %CPU. We should have well over 80GB
> of available memory according to "free -m".

Do you have THP enabled? Or KSM?

tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

--
Justin
Re: High kswapd
On Mon, Apr 13, 2020 at 9:42 AM Justin Pryzby wrote:
>
> tail /sys/kernel/mm/ksm/run
> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
> /sys/kernel/mm/transparent_hugepage/enabled
> /sys/kernel/mm/transparent_hugepage/defrag
>

# tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
==> /sys/kernel/mm/ksm/run <==
0

==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
1

==> /sys/kernel/mm/transparent_hugepage/enabled <==
always madvise [never]

==> /sys/kernel/mm/transparent_hugepage/defrag <==
always madvise [never]

--
Don Seiler
www.seiler.us
Re: High kswapd
On Mon, Apr 13, 2020 at 09:46:22AM -0500, Don Seiler wrote:
> ==> /sys/kernel/mm/ksm/run <==
> 0

Was it off to begin with?
If not, you can set it to "2" to "unshare" pages.

> ==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
> 1

So I'd suggest trying with this disabled.

I don't know if I ever fully understood the problem, but it sounds like at least in your case it's related to large shared_buffers, and hugepages, which cannot be swapped out.

--
Justin
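The two suggestions above come down to writes to the sysfs files already listed in this thread. This is a sketch only: the helper names are hypothetical, the optional argument exists just so the functions can be pointed at a scratch directory, and writing to the real /sys/kernel/mm tree requires root and does not persist across a reboot.

```shell
#!/bin/sh
# Hypothetical helper: turn off khugepaged defragmentation and echo back
# the resulting value. $1 overrides the sysfs root (default /sys/kernel/mm).
disable_khugepaged_defrag() {
    mm_root="${1:-/sys/kernel/mm}"
    echo 0 > "$mm_root/transparent_hugepage/khugepaged/defrag"
    cat "$mm_root/transparent_hugepage/khugepaged/defrag"
}

# Hypothetical helper: only relevant if KSM had been running.
# Writing 2 stops KSM and unmerges the pages it had shared.
unmerge_ksm_pages() {
    mm_root="${1:-/sys/kernel/mm}"
    echo 2 > "$mm_root/ksm/run"
    cat "$mm_root/ksm/run"
}
```

Called with no argument as root, disable_khugepaged_defrag acts on the live system. Re-running the tail command from earlier in the thread would confirm the new value.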
Re: PostgreSQL DBA consulting
https://www.cybertec-postgresql.com/

With Warm Regards,
Amol P. Tarte
Project Manager,
Rajdeep InfoTechno Pvt. Ltd.
Visit us at http://it.rajdeepgroup.com

On Tue, Apr 7, 2020, 5:21 PM daya airody wrote:
> hi folks,
>
> we are looking for a PostgreSQL DBA to help us in tuning our database.
>
> Could you please recommend somebody in your network?
> thanks,
> --daya--
Re: High kswapd
On Mon, Apr 13, 2020 at 9:58 AM Justin Pryzby wrote:
> On Mon, Apr 13, 2020 at 09:46:22AM -0500, Don Seiler wrote:
> > ==> /sys/kernel/mm/ksm/run <==
> > 0
>
> Was it off to begin with?
> If not, you can set it to "2" to "unshare" pages.
>

Yes we haven't changed this. It was already set to 0.

> > ==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
> > 1
>
> So I'd suggest trying with this disabled.
>

My understanding was that THP is disabled anyway. What would this defrag feature be doing now?

> I don't know if I ever fully understood the problem, but it sounds like at
> least in your case it's related to large shared_buffers, and hugepages,
> which cannot be swapped out.
>

Basically the problem is our DB host getting slammed with connections (even with pgbouncer in place). We see the CPU load spiking, and when we check "top" we regularly see "kswapd" at the top of the list. For example, just now kswapd is at 72 %CPU in top. The next highest is a postgres process at 6.6 %CPU.

Our shared_buffers is set to 32GB, and HugePages is set to 36GB:

# grep Huge /proc/meminfo
AnonHugePages:         0 kB
HugePages_Total:   18000
HugePages_Free:     1897
HugePages_Rsvd:       41
HugePages_Surp:        0
Hugepagesize:       2048 kB

Also FWIW this host is actually a VSphere VM. We're looking into any underlying events during these spikes as well.

Don.

--
Don Seiler
www.seiler.us
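The HugePages figures above can be turned into sizes with a little arithmetic: 18000 pages of 2048 kB is 36000 MiB for the pool, and pages that are neither free nor reserved are in use. A small sketch, assuming the /proc/meminfo field names shown in the output above; the function name hugepage_summary is made up for illustration.

```shell
#!/bin/sh
# Summarise the huge page pool in MiB from a meminfo-format file
# (default /proc/meminfo).
hugepage_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^HugePages_Rsvd:/  { rsvd = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END {
            # in_use: pages already faulted in; uncommitted: free pages
            # that are not reserved for a later fault.
            printf "pool_mib=%d in_use_mib=%d uncommitted_mib=%d\n",
                total * size_kb / 1024,
                (total - free) * size_kb / 1024,
                (free - rsvd) * size_kb / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

For the numbers posted above this works out to a 36000 MiB pool with 32206 MiB in use and 3712 MiB neither used nor reserved. Since huge pages cannot be swapped out, the whole pool is off-limits to reclaim whether or not it is in use.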
