** Tags added: cscc

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1775732

Title:
  arm64 soft lock crashes on nova-compute charm running

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed

Bug description:
  Discovered on bionic, arm64 (Moonshot, verified on multiple swirlix
  cartridges), 4.15.0-22-generic.

  After deploying the nova-compute Juju charm, on subsequent reboots,
  within a few seconds after complete boot, everything will freeze and
  eventually display on the serial console (just these, no traces):

  [ 188.010510] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [juju-log:2272]
  [ 216.010292] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [juju-log:2272]

  (From here on, "lock up" refers to that sequence: boot a kernel, it
  completes boot to login prompt, then everything freezes a few seconds
  later, then BUGs.)

  It's usually but not always juju-log, sometimes a relation-ids or
  similar. I was able to briefly notice that it was in its startup
  config-changed hook. I've separated out and tested nearly everything
  it does during its startup config-changed (sets up bridging, writes
  some config files, restarts libvirtd/nova-compute/etc) without being
  able to trigger the bug, but I suspect proximity to boot is a factor.
  If I disable jujud-unit-nova-compute startup, boot, log in, re-enable
  and start (by which time over a minute or so has elapsed from boot
  finish), it will not lock up. Similarly, if I wrap the jujud startup
  in a `strace -Ff -o /var/log/strace.log` (which slows it down
  massively), it will not lock up. Watched pot syndrome.

  I've tried kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/ .
  I noticed most of the recent arm64 mainline kernels had failed builds,
  notified the kernel team channel and apw fixed the issue and started
  some rebuilds. What I've discovered (after many dead ends and a futile
  bisection) is that mainline builds before the rebuilds lock up, but
  fixed mainline builds initiated by apw DO NOT lock up. e.g.
  4.16.3-041603.201804190730 locks up, but 4.16.6-041606.201806042022
  does not lock up. (4.16.4 and 4.16.5 appear to have never been rebuilt
  and don't have arm64 debs, and that period is what I tried to bisect
  after figuring a fix must be in there.)

  But when I try to compile any of these recent kernels myself, they
  lock up when booted. Same kernel configs, tried on both bionic and in
  a cosmic chroot, tried both native arm64 compile and cross-compile
  from amd64. e.g. 4.16.6-041606.201806042022 from k.u.c does not lock
  up, but when I build it myself, it does.

  TBC, I've verified lock ups on the following kernels (all assume
  kernel configs from their respective Ubuntu or k.u.c mainline builds):

  - 4.15.0-22-generic from bionic (both Ubuntu-provided and my own recompile)
  - v4.16 (and all point releases)
  - v4.17

  As I write this, my compiled v4.10 DOES NOT appear to lock up. I will
  attempt to bisect at a macro level from 4.10..4.15 and dig deeper.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-22-generic 4.15.0-22.24
  ProcVersionSignature: Ubuntu 4.15.0-22.24-generic 4.15.17
  Uname: Linux 4.15.0-22-generic aarch64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116, 1 Jun 2 04:22 seq
   crw-rw---- 1 root audio 116, 33 Jun 2 04:22 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.2
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Fri Jun 8 00:13:05 2018
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  PciMultimedia:

  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB:

  ProcKernelCmdLine: console=ttyS0,9600n8r ro
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-22-generic N/A
   linux-backports-modules-4.15.0-22-generic N/A
   linux-firmware 1.173.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1775732/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp