I'll add an OOM check wrapper around this stressor so that it can detect when it has been OOM-killed and handle the error sanely.
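
As a rough illustration of the idea, and not the actual stress-ng change, an OOM check wrapper could run the stressor body in a forked child and treat death by SIGKILL as a probable OOM kill, restarting a bounded number of times before giving up. All names here (run_with_oom_check, stressor_fn, the oom_result codes and the demo_* functions) are hypothetical, and SIGKILL is only a heuristic for "killed by the OOM killer":

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not stress-ng's own. */
enum oom_result { RUN_OK = 0, RUN_FAILED = 1, RUN_OOM_GAVE_UP = 2 };

/* Hypothetical stressor body: returns 0 on success, non-zero on failure. */
typedef int (*stressor_fn)(void *ctx);

/*
 * Run fn(ctx) in a child process. If the child dies from SIGKILL, assume
 * (heuristically) that the OOM killer got it and restart, up to
 * max_restarts times. The parent survives, so it can report the outcome
 * instead of leaving half-updated state behind.
 */
enum oom_result run_with_oom_check(stressor_fn fn, void *ctx, int max_restarts)
{
    assert(fn != NULL);

    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();
        if (pid < 0)
            return RUN_FAILED;          /* fork failed */
        if (pid == 0)                   /* child: run the stressor body */
            _exit(fn(ctx) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);

        int status = 0;
        pid_t ret;
        do {                            /* retry if interrupted by a signal */
            ret = waitpid(pid, &status, 0);
        } while (ret < 0 && errno == EINTR);
        if (ret < 0)
            return RUN_FAILED;

        if (WIFEXITED(status))
            return WEXITSTATUS(status) == 0 ? RUN_OK : RUN_FAILED;

        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) {
            fprintf(stderr, "child %d: assuming killed by OOM killer "
                    "(attempt %d of %d)\n",
                    (int)pid, attempt + 1, max_restarts + 1);
            continue;                   /* try again, if attempts remain */
        }
        return RUN_FAILED;              /* killed by some other signal */
    }
    return RUN_OOM_GAVE_UP;
}

/* Demo bodies for illustration only. */
int demo_ok(void *ctx)   { (void)ctx; return 0; }
int demo_fail(void *ctx) { (void)ctx; return 1; }
/* Simulates an OOM kill by sending SIGKILL to itself. */
int demo_self_kill(void *ctx) { (void)ctx; raise(SIGKILL); return 0; }
```

A caller would invoke something like run_with_oom_check(demo_ok, NULL, 3) and, on RUN_OOM_GAVE_UP, skip or discard that instance's metrics rather than validating counters that the killed child may have left half-written.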

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1961076

Title:
  linux-hwe-5.4 ADT test failure (ubuntu_stress_smoke_test) with
  linux-hwe-5.4/5.4.0-100.113~18.04.1

Status in Stress-ng:
  New
Status in ubuntu-kernel-tests:
  New
Status in linux-hwe-5.4 package in Ubuntu:
  New
Status in linux-hwe-5.4 source package in Bionic:
  New

Bug description:
  The 'dev-shm' stress-ng test is failing with bionic/linux-hwe-5.4
  5.4.0-100.113~18.04.1 on ADT, only on ppc64el.

  Testing failed on:
      ppc64el: https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/ppc64el/l/linux-hwe-5.4/20220216_115416_c1d6c@/log.gz

  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] stress-ng 0.13.11 g48be8ff4ffc4
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] system: Linux autopkgtest 5.4.0-100-generic #113~18.04.1-Ubuntu SMP Mon Feb 7 15:02:55 UTC 2022 ppc64le
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] RAM total: 7.9G, RAM free: 3.3G, swap free: 1023.9M
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] 4 processors online, 4 processors configured
  11:35:08 DEBUG| [stdout] stress-ng: info: [26897] setting to a 5 second run per stressor
  11:35:08 DEBUG| [stdout] stress-ng: info: [26897] dispatching hogs: 4 dev-shm
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] cache allocate: using cache maximum level L1
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] cache allocate: shared cache buffer size: 32K
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] starting stressors
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26899] stress-ng-dev-shm: started [26899] (instance 0)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26900] stress-ng-dev-shm: started [26900] (instance 1)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26901] stress-ng-dev-shm: started [26901] (instance 2)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26902] stress-ng-dev-shm: started [26902] (instance 3)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] 4 stressors started
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26899] stress-ng-dev-shm: assuming killed by OOM killer, restarting again (instance 0)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26902] stress-ng-dev-shm: assuming killed by OOM killer, restarting again (instance 3)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26901] stress-ng-dev-shm: assuming killed by OOM killer, restarting again (instance 2)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26900] stress-ng-dev-shm: assuming killed by OOM killer, restarting again (instance 1)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26899] (stress-ng-dev-shm) terminated on signal: 9 (Killed)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26899] (stress-ng-dev-shm) was killed by the OOM killer
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26899] terminated
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26900] (stress-ng-dev-shm) terminated on signal: 9 (Killed)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26900] (stress-ng-dev-shm) was possibly killed by the OOM killer
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26900] terminated
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26901] stress-ng-dev-shm: exited [26901] (instance 2)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26901] terminated
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26902] (stress-ng-dev-shm) terminated on signal: 9 (Killed)
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26902] (stress-ng-dev-shm) was killed by the OOM killer
  11:35:08 DEBUG| [stdout] stress-ng: debug: [26897] process [26902] terminated
  11:35:08 DEBUG| [stdout] stress-ng: info: [26897] successful run completed in 5.06s
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] dev_shm instance 0 corrupted bogo-ops counter, 14 vs 0
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] dev_shm instance 0 hash error in bogo-ops counter and run flag, 2146579844 vs 0
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] dev_shm instance 1 corrupted bogo-ops counter, 13 vs 0
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] dev_shm instance 1 hash error in bogo-ops counter and run flag, 1093487894 vs 0
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] dev_shm instance 3 corrupted bogo-ops counter, 13 vs 0
  11:35:08 DEBUG| [stdout] info: 5 failures reached, aborting stress process
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] dev_shm instance 3 hash error in bogo-ops counter and run flag, 1093487894 vs 0
  11:35:08 DEBUG| [stdout] stress-ng: fail: [26897] metrics-check: stressor metrics corrupted, data is compromised
  11:35:08 DEBUG| [stdout]
  11:35:08 DEBUG| [stdout] [ 3840.094607] stress-ng invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=1000
  11:35:08 DEBUG| [stdout] [ 3840.094611] CPU: 2 PID: 26903 Comm: stress-ng Tainted: P OE 5.4.0-100-generic #113~18.04.1-Ubuntu
  11:35:08 DEBUG| [stdout] [ 3840.094612] Call Trace:
  11:35:08 DEBUG| [stdout] [ 3840.094618] [c00000011edd77d0] [c000000000f05e28] dump_stack+0xbc/0x104 (unreliable)
  11:35:08 DEBUG| [stdout] [ 3840.094623] [c00000011edd7810] [c0000000003863fc] dump_header+0x5c/0x2c0
  11:35:08 DEBUG| [stdout] [ 3840.094625] [c00000011edd78a0] [c000000000386c1c] oom_kill_process+0x1ac/0x2a0
  11:35:08 DEBUG| [stdout] [ 3840.094626] [c00000011edd78e0] [c000000000387d78] out_of_memory+0x128/0x750
  11:35:08 DEBUG| [stdout] [ 3840.094630] [c00000011edd7980] [c00000000046a948] mem_cgroup_out_of_memory+0x118/0x150
  11:35:08 DEBUG| [stdout] [ 3840.094632] [c00000011edd7a00] [c000000000470ab4] try_charge+0xa04/0xac0
  11:35:08 DEBUG| [stdout] [ 3840.094634] [c00000011edd7b00] [c00000000047477c] mem_cgroup_try_charge+0xdc/0x330
  11:35:08 DEBUG| [stdout] [ 3840.094635] [c00000011edd7b50] [c000000000474a0c] mem_cgroup_try_charge_delay+0x3c/0x80
  11:35:08 DEBUG| [stdout] [ 3840.094637] [c00000011edd7b90] [c0000000003aa748] shmem_getpage_gfp+0x218/0xd00
  11:35:08 DEBUG| [stdout] [ 3840.094639] [c00000011edd7c90] [c0000000003ad268] shmem_fallocate+0x348/0x610
  11:35:08 DEBUG| [stdout] [ 3840.094641] [c00000011edd7d60] [c00000000048c994] vfs_fallocate+0x174/0x330
  11:35:08 DEBUG| [stdout] [ 3840.094642] [c00000011edd7db0] [c00000000048e428] ksys_fallocate+0x68/0xf0
  11:35:08 DEBUG| [stdout] [ 3840.094643] [c00000011edd7e00] [c00000000048e4d8] sys_fallocate+0x28/0x40
  11:35:08 DEBUG| [stdout] [ 3840.094646] [c00000011edd7e20] [c00000000000b378] system_call+0x5c/0x68
  11:35:08 DEBUG| [stdout] [ 3840.094649] --- interrupt: c01 at 0x71d6d5855ab0

  This doesn't seem to be a kernel regression, as it has already failed
  with 5.4.0-94.106~18.04.1
  (https://autopkgtest.ubuntu.com/results/autopkgtest-bionic/bionic/ppc64el/l/linux-hwe-5.4/20220108_161257_2ca9f@/log.gz),
  but passed with versions -97 and -99, and it doesn't fail in the
  regression tests infrastructure. The focal/linux main kernel also did
  not have any failure.

  I have checked the stress-ng repo and didn't find any recent change to
  this testcase.

To manage notifications about this bug go to:
https://bugs.launchpad.net/stress-ng/+bug/1961076/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp