This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:

apport-collect 1798127

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1798127

Title:
  CPU Soft Lockups when stress-ng stack stressor runs with M.2 NVMe as
  root FS

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  This was reported by a hardware partner.  The system set up is a
  server with 512GB RAM and an M.2 NVMe drive as the root
  filesystem/boot device.

  Per the customer, when running the certification Memory Stress test
  (utilizing several stress-ng stressors run in sequence) the system
  freezes with CPU Soft Lockup errors appearing on console whe the
  "stack" stressor is run.

  Tester has tried with 2.5” SATA (1TB), 2.5” NVMe (800GB), and M.2 NVMe
  (1.9TB).

  So far, this only seems to affect the 4.15 kernel.  The tester has
  tried using the 2.5" SATA SSD as the RootFS/Boot device and the tests
  pass on all attempts.  It is ONLY when using the M.2 NVMe as the root
  / boot device that the tests cause a lockup.  The tester is re-trying
  now with the 2.5" NVMe device to see if this only occurs with the M.2
  NVMe.

  The tester has tried this on the following while using the M.2 NVMe as the 
rootFS/Boot device:
  Test run #1 – 16.04.5 at kernel 4.15; Result: Failed stress-ng memory on 
stack stressor
  Test run #2 – 18.04.1 at kernel 4.15; Result: Failed stress-ng memory on 
stack stressor
  Test run #3 – 16.04.5 at kernel 4.4; Result: Passed stress-ng memory test

  
  The stress-ng command invoked at the time the soft lockups occur is this:

  'stress-ng -k --aggressive --verify --timeout 300 --stack 0'

  This can be reproduced by running the memory_stress_ng test script
  from the cert suite:

  sudo /usr/lib/plainbox-provider-certification-
  server/bin/memory_stress_ng

  It may be more easily reproducible running the stack stressor alone,
  or the whole memory stress script without dealing with Checkbox.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798127/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to