The tests (both yakkety and zesty) are still looping, I'm afraid. I
didn't see the "BUG soft lockup" in the one instance I was watching.
There were no obvious kernel call traces in dmesg, and the test output
tail is just:

06:27:14 DEBUG| [stdout] 
Killed
DEBUG - Leave sh
ERROR - The command (AUTOTEST_PATH=/home/ubuntu/autotest sudo -E 
autotest/client/autotest-local --verbose 
autotest/client/tests/ubuntu_qrt_apparmor/control) return
ed a non-zero exit status (137).
ERROR - 06:03:39 INFO | Writing results to 
/tmp/autopkgtest.XDxmzj/build.2sj/linux-4.8.0/autotest/client/results/default
ERROR - 06:03:39 DEBUG| Initializing the state engine
ERROR - 06:03:39 DEBUG| Persistent state client.steps now set to []
ERROR - 06:03:39 DEBUG| Persistent option harness now set to None
ERROR - 06:03:39 DEBUG| Persistent option harness_args now set to None
ERROR - 06:03:39 DEBUG| Selected harness: standalone
ERROR - 06:03:39 INFO | START   ----    ----    timestamp=1477029819    
localtime=Oct 21 06:03:39
ERROR - 06:03:39 DEBUG| Persistent state client._record_indent now set to 1
ERROR - 06:03:39 DEBUG| Test has timeout: 7200 sec.
ERROR - 06:03:39 INFO |         START   ubuntu_qrt_apparmor.setup       
ubuntu_qrt_apparmor.setup       timestamp=1477029819    timeout=7200    
localtime=Oct 21 06
:03:39
ERROR - 06:03:39 DEBUG| Persistent state client._record_indent now set to 2
ERROR - 06:03:39 DEBUG| Persistent state client.unexpected_reboot now set to 
('ubuntu_qrt_apparmor.setup', 'ubuntu_qrt_apparmor.setup')
ERROR - 06:03:39 DEBUG| Waiting for pid 2611 for 7200 seconds
ERROR - 06:03:39 DEBUG| Running 'apt-get update'
[...]
ERROR - 06:05:02 DEBUG| [stdout] Unpacking libgraphite2-3:i386 (1.3.8-1ubuntu1) 
...
ERROR - 06:05:02 DEBUG| [stdout] Selecting previously unselected package 
libharfbuzz0b:i386.
ERROR - 06:05:02 DEBUG| [stdout] Preparing to unpack 
.../035-libharfbuzz0b_1.2.7-1_i386.deb ...
ERROR - 06:05:02 DEBUG| [stdout] Unpacking libharfbuzz0b:i386 (1.2.7-1) ...
ERROR - 06:05:02 DEBUG| [stdout] Selecting previously unselected package 
libpangoft2-1.0-0:i386.
ERROR - 06:05:02 DEBUG| [stdout] Preparing to unpack 
.../036-libpangoft2-1.0-0_1.40.3-2_i386.deb ...
ERROR - 06:05:02 DEBUG| [stdout] Unpacking libpangoft2-1.0-0:i386 (1.40.3-2) ...
Killed
Connection to 10.220.46.18 closed by remote host.
Exit request sent.
autopkgtest [06:31:15]: ERROR: testbed failure: testbed auxverb failed with 
exit code 255

So this time it seems apt gets killed. The VM went away after that, so I
don't have it any more for investigation.

I'm watching another current run and will get its journalctl -f and test
output tail, and will then kill the current runs again as there are a
lot of them, they loop, and they block large amounts of capacity.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1634519

Title:
  linux autopkgtest: eternal hang/ BUG: soft lockup

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Following up to bug 1632252, tests still hang (but not due to the
  suspend issue any more). In a local QEMU run:

  $ autopkgtest --testname ubuntu-regression-suite linux -- qemu 
--ram-size=4096 -o /var/cache/martin /srv/vm/autopkgtest-yakkety-amd64.img
  [...]
  14:46:41 DEBUG| [stdout] Test icebp [Ok]
  14:46:41 DEBUG| [stdout] Test int 3 trap [Ok]
  14:46:41 DEBUG| [stdout] selftests: breakpoint_test [PASS]

  and since then (1 hour) it's hung. But now I still can log into ttyS0.
  dmesg is almost empty:

  [ 5501.499217] ata2.01: NODEV after polling detection
  [ 5501.500138] ata2.00: configured for MWDMA2

  (something in the test clears the ring buffer), and journalctl
  confirms that suspend/resume worked fine:

  Oct 18 14:47:11 autopkgtest kernel: PM: suspend of devices complete after 
65.273 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: late suspend of devices complete 
after 0.172 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: noirq suspend of devices complete 
after 1.963 msecs
  Oct 18 14:47:11 autopkgtest kernel: ACPI: Preparing to enter system sleep 
state S3
  Oct 18 14:47:11 autopkgtest kernel: PM: Saving platform NVS memory
  Oct 18 14:47:11 autopkgtest kernel: Disabling non-boot CPUs ...
  Oct 18 14:47:11 autopkgtest kernel: kvm-clock: cpu 0, msr 1:3fff4001, primary 
cpu clock, resume
  Oct 18 14:47:11 autopkgtest kernel: ACPI: Low-level resume complete
  Oct 18 14:47:11 autopkgtest kernel: PM: Restoring platform NVS memory
  Oct 18 14:47:11 autopkgtest kernel: ACPI: Waking up from system sleep state S3
  Oct 18 14:47:11 autopkgtest kernel: PM: noirq resume of devices complete 
after 6.973 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: early resume of devices complete 
after 0.105 msecs
  Oct 18 14:47:11 autopkgtest kernel: pci 0000:00:01.0: PIIX3: Enabling Passive 
Release
  Oct 18 14:47:11 autopkgtest kernel: rtc_cmos 00:00: System wakeup disabled by 
ACPI
  Oct 18 14:47:11 autopkgtest kernel: PM: resume of devices complete after 
8.898 msecs
  Oct 18 14:47:11 autopkgtest kernel: PM: Finishing wakeup.
  Oct 18 14:47:11 autopkgtest systemd[1]: Time has been changed
  Oct 18 14:47:11 autopkgtest systemd[1]: apt-daily.timer: Adding 10h 56min 
28.694439s random time.
  Oct 18 14:47:11 autopkgtest systemd[3986]: Time has been changed
  Oct 18 14:47:11 autopkgtest kernel: Restarting tasks ... done.
  Oct 18 14:47:11 autopkgtest sudo[30731]: pam_unix(sudo:session): session 
closed for user root
  Oct 18 14:47:11 autopkgtest kernel: ata2.01: NODEV after polling detection
  Oct 18 14:47:11 autopkgtest kernel: ata2.00: configured for MWDMA2

  But also, no messages beyond that (last message one hour ago).

  On the currently running amd64 yakkety test on the ifrastructure, the
  int trap bug fails:

  13:15:46 DEBUG| [stdout] Test icebp [Ok]
  13:15:46 DEBUG| [stdout] Test int 3 trap [Failed]
  13:15:46 DEBUG| [stdout] selftests: breakpoint_test [PASS]

  and then it's apparently hung at exactly the same problem. But unlike
  with local QEMU I cannot ssh in any more (Connection refused), and
  console-log shows trouble:

  autopkgtest login: [ 1927.192886] AppArmor: change_hat: Invalid input '^open'
  [ 1927.194060] AppArmor: change_hat: Invalid input '^'[ 1927.198501] 
AppArmor: change_hat: Invalid input, NULL hat and NULL magic
  [ 1927.203181] AppArmor: change_hat: Invalid input, NULL hat and NULL magic
  [ 1927.207249] AppArmor: change_hat: Invalid input '^open'AppArmor: 
change_hat: Invalid input '^'
  [ 1927.208826] AppArmor: change_hat: Invalid input '^'[[0;32m  OK  [0m] 
Stopped target Graphical Interface.
  [[0;32m  OK  [0m] Stopped target Multi-User System.
           Stopping OpenBSD Secure Shell server...
           Stopping autopkgtest root shell on ttyS1...
  [ 2909.360975] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 2937.360965] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 2941.368025] INFO: rcu_sched self-detected stall on CPU
  [ 2941.368095]  3-...: (1 GPs behind) idle=ea7/2/0 softirq=544443/544443 
fqs=6600 
  [ 2941.368095]   (t=15000 jiffies g=205236 c=205235 q=134637)
  [ 2969.360984] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 2997.360987] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 3021.796060] INFO: task systemd:1 blocked for more than 120 seconds.
  [ 3021.797626]       Tainted: G        W  OEL  4.8.0-25-generic #27-Ubuntu
  [ 3021.797627] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 3021.797727] INFO: task systemd-journal:285 blocked for more than 120 
seconds.
  [ 3021.797728]       Tainted: G        W  OEL  4.8.0-25-generic #27-Ubuntu
  [ 3021.797728] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 3021.797772] INFO: task kworker/1:2:303 blocked for more than 120 seconds.
  [ 3021.797772]       Tainted: G        W  OEL  4.8.0-25-generic #27-Ubuntu
  [ 3021.797773] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 3021.797814] INFO: task kworker/2:2:350 blocked for more than 120 seconds.
  [ 3021.797815]       Tainted: G        W  OEL  4.8.0-25-generic #27-Ubuntu
  [ 3021.797815] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 3021.797853] INFO: task acpid:659 blocked for more than 120 seconds.
  [ 3021.797853]       Tainted: G        W  OEL  4.8.0-25-generic #27-Ubuntu
  [ 3021.797854] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 3021.797940] INFO: task (exim4):31259 blocked for more than 120 seconds.
  [ 3021.797941]       Tainted: G        W  OEL  4.8.0-25-generic #27-Ubuntu
  [ 3021.797941] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [ 3025.361006] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 3053.360962] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 3081.360962] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]
  [ 3109.360971] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[swapper/3:0]

  ... and a couple of similar repetitions.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1634519/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to