We're getting zombies here which aren't being reaped:

130428 ?        Z      0:00 [stress-ng-brk] <defunct>
130432 ?        Z      0:00 [stress-ng-brk] <defunct>
130434 ?        Z      0:00 [stress-ng-brk] <defunct>
130436 ?        Z      0:00 [stress-ng-brk] <defunct>

The reason for this is that memory stressors like brk have a parent that
forks off a child. The child performs the stressing, and if it gets OOM'd
the parent can spawn off another stressor. So I think the SIGKILL on
the stress-ng brk stressor is killing the parent, but the child (which is
still holding onto a load of memory on the heap) is not being waited for
and hence is in a memory-hogging zombie state. We may be in a
pathologically memory-hogging state because the zombies may be holding
brk regions that are swapped out to disk due to memory pressure, so
we're hitting a low-memory state that is not being cleared up.
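
To check which parent should have been reaping these, something like the
following could be used. This is only a sketch: the function names
filter_zombies and list_zombies are made up for illustration and are not
part of stress-ng or the test script. It prints each defunct process along
with its parent PID.

```shell
# Illustrative helpers (hypothetical names, not from any existing script).

# Reduce "pid ppid stat comm" lines to the defunct (state Z) entries.
filter_zombies() {
    awk '$3 ~ /^Z/ { print "zombie pid=" $1 " ppid=" $2 " comm=" $4 }'
}

# List every zombie on the system together with its parent PID.
list_zombies() {
    ps -eo pid=,ppid=,stat=,comm= | filter_zombies
}

# Usage: list_zombies
```

If the reported ppid is 1, the original parent has already gone and the
child has been reparented.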

I suggest modifying the test bash script as follows:

1. Run stress-ng with the -k flag (so that all the processes have the same
   stress-ng name).
2. Kill with ALRM first.
3. After a small grace period, kill all the stress-ng processes with KILL.
4. Report on unkillable stressors.

For reference, see the changes I made to
https://launchpadlibrarian.net/296974522/disk_stress_ng
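
As a rough sketch of steps 2 to 4 (this is not the actual disk_stress_ng
change; the function name stop_stressors and the default 10 second grace
period are assumptions for illustration), the kill sequence could look
something like this. It relies on pkill/pgrep from procps matching the exact
process name, which is why step 1 runs stress-ng with -k.

```shell
# Sketch only: stop_stressors and its defaults are hypothetical.
# $1 = process name to match exactly, $2 = grace period in seconds.
stop_stressors() {
    local name="$1" grace="${2:-10}"

    # Step 2: ask the stressors to terminate cleanly with SIGALRM.
    pkill -ALRM -x "$name" 2>/dev/null
    sleep "$grace"

    # Step 3: force-kill anything that is still around.
    if pgrep -x "$name" >/dev/null; then
        pkill -KILL -x "$name"
        sleep 1
    fi

    # Step 4: report whatever survived SIGKILL (zombies included).
    local left
    left=$(pgrep -x "$name")
    if [ -n "$left" ]; then
        echo "unkillable $name processes:" $left
        return 1
    fi
    return 0
}

# Usage (step 1 keeps every process named stress-ng):
#   stress-ng -k --brk 0 --timeout 300 &
#   stop_stressors stress-ng 10
```

A non-zero return lets the calling test script fail the run when stressors
cannot be cleaned up.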

https://bugs.launchpad.net/bugs/1573062

Title:
  memory_stress_ng failing for Power architecture for 16.04

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  In Progress

Bug description:
  memory_stress_ng, as part of server certification, is failing for IBM
  Power S812LC (TN71-BP012) in bare metal mode. Failing in this case is
  defined by the test locking up the server in an unrecoverable state
  which only a reboot will fix.

  I will be attaching screen and kern logs for the failures and a
  successful run on 14.04 on the same server.
