We're getting zombies here which aren't being reaped:

  130428 ?  Z  0:00 [stress-ng-brk] <defunct>
  130432 ?  Z  0:00 [stress-ng-brk] <defunct>
  130434 ?  Z  0:00 [stress-ng-brk] <defunct>
  130436 ?  Z  0:00 [stress-ng-brk] <defunct>

The reason for this is that memory stressors like brk have a parent that forks off a child. The child performs the stressing, and if it gets OOM'd the parent can spawn off another stressor. So I think the SIGKILL on the stress-ng brk stressor is killing the parent, but the child (which is still holding onto a load of memory on the heap) is not being waited for and hence is left in a memory-hogging zombie state.

We may be in a pathologically memory-hogging state because the zombies may be holding brk regions that are swapped out to disk due to memory pressure, and we're hitting a low-memory state which is not being cleared up.

I suggest modifying the test bash script as follows:

1. run stress-ng with the -k flag (so that all the processes have the same stress-ng name)
2. kill with ALRM first
3. then, after a small grace period, kill all the stress-ng processes with KILL
4. report on unkillable stressors

Refer to the changes I made to https://launchpadlibrarian.net/296974522/disk_stress_ng

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for Power architecture for 16.04

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  In Progress

Bug description:
  memory_stress_ng, as part of server certification, is failing for IBM Power S812LC (TN71-BP012) in bare metal mode. Failing in this case is defined by the test locking up the server in an unrecoverable state which only a reboot will fix.

  I will be attaching screen and kern logs for the failures and a successful run on 14.04 on the same server.
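
For illustration, the shutdown sequence suggested in the comment above (ALRM first, a grace period, then KILL, then a report) could be sketched roughly as below. This is only a sketch, not the code from the linked disk_stress_ng script: the function name stop_by_name and the default 10-second grace period are made up for this example. It assumes pgrep and pkill from procps are available, and that stress-ng was started with -k so that every process is named stress-ng.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and defaults are assumptions for this sketch).
# Usage: stop_by_name NAME [GRACE_SECONDS]
stop_by_name() {
    local name="$1"
    local grace="${2:-10}"
    local waited=0

    # Step 2: send SIGALRM first so the processes can shut down cleanly.
    pkill -ALRM -x "$name" 2>/dev/null || true

    # Grace period: poll once a second until they are gone or time is up.
    while pgrep -x "$name" >/dev/null 2>&1 && [ "$waited" -lt "$grace" ]; do
        sleep 1
        waited=$((waited + 1))
    done

    # Step 3: SIGKILL anything that survived the grace period.
    if pgrep -x "$name" >/dev/null 2>&1; then
        pkill -KILL -x "$name" 2>/dev/null || true
        sleep 1
    fi

    # Step 4: report whatever is still listed under that name.
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "unkillable $name processes:" >&2
        pgrep -l -x "$name" >&2
        return 1
    fi
    return 0
}
```

A test script might start the stressor with something like "stress-ng -k --brk 4 --timeout 60 &" (the stressor count and timeout here are arbitrary) and then call "stop_by_name stress-ng 10" during cleanup, treating a non-zero return as a failure worth logging. Matching on the exact name with -x only catches every stressor because -k keeps the children from renaming themselves to stress-ng-brk and the like.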