I am sorry about the delay in uploading the logs.

Update:
* I wanted to experiment with several things before concluding that this is a kernel defect.
* A few firmware updates (including BIOS) were done.
* After these firmware updates, I am unable to hit the same crash/hang issue. Instead, I only see a couple of my stress threads getting killed by the oom-killer, while the other threads exit gracefully after a 20-hour I/O stress run. This seems OK to me. I've tried 5 full runs now.
* # uname -r
  4.13.0-32-generic
  This is the same kernel where the hang was previously seen during the 20-hour stress run.

Give me a couple of days to get back here and update if this looks to be a genuine Xenial defect.

Thanks,
Sujith

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1749746

Title:
  DellEMC AMD servers hang when running IO stress on NVMe disks

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Description:
  On Ubuntu 16.04 running the 4.13.0-32 kernel, when file IO stress is run on multiple NVMe disks (ext4 partitioned), the system hangs with multiple kernel crashes in the logs.

  Steps:
  1. Set up a DellEMC AMD server with a few NVMe disks.
  2. Run the file IO stress on these disks for 24 hours.
  3. Observe that the system becomes unresponsive after a few minutes or hours.

  Additional Info:
  * Stress ran fine for 24 hours with 4.12.0-041200-generic.
  * Stress ran fine for 13 hours with 4.10.0-28-generic. Had to stop the run manually for unrelated reasons.
  * Stress fails with linux-image-4.13.0-25-generic.

  Attaching the logs. Will update here once we have more data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1749746/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp