Ok, never mind about the sosreport - I got the info from some older emails from axino: nova is using qemu-nbd to locally mount images and access the partitions inside them. I was able to trivially reproduce this by creating an image, attaching it to /dev/nbd0 with qemu-nbd, partitioning it, running mkfs on its p1 and mounting it, and then, while copying a file to it, running qemu-nbd -d to detach it from /dev/nbd0. That causes the spam of "Attempted..." error messages.
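For anyone who wants to repeat this, the steps above can be sketched roughly as the script below. This is only an approximation of what I ran, not the exact commands: the image path, mount point, sizes, partition layout, ext4 and the use of dd as the "copy" are all placeholder assumptions, and the function name and RUN_REPRO switch are made up for the sketch. It needs root, the nbd module, qemu-utils and parted, and should only be run on a scratch machine.

```shell
#!/bin/sh
# Reproduction sketch; run as root on a scratch machine only.
# Assumptions (not from the original message): image path, mount point,
# sizes, msdos label and ext4 are placeholder choices.
IMG=${IMG:-/tmp/nbd-test.img}
MNT=${MNT:-/mnt/nbd-test}
DEV=/dev/nbd0

reproduce_nbd_spam() {
    modprobe nbd max_part=8              # so that /dev/nbd0p1 shows up
    qemu-img create -f raw "$IMG" 1G     # create an empty image
    qemu-nbd -f raw -c "$DEV" "$IMG"     # attach it to /dev/nbd0
    parted -s "$DEV" mklabel msdos mkpart primary ext4 1MiB 100%
    mkfs.ext4 -q "${DEV}p1"              # mkfs its p1
    mkdir -p "$MNT"
    mount "${DEV}p1" "$MNT"              # mount it
    # Start a write so there is I/O in flight...
    dd if=/dev/zero of="$MNT/junk" bs=1M count=512 &
    sleep 1
    # ...then detach the device underneath it.
    qemu-nbd -d "$DEV"
    wait || true
    dmesg | tail -n 20                   # look for the "Attempted..." lines
}

# Only run when explicitly asked, since this takes over /dev/nbd0.
if [ "${RUN_REPRO:-0}" = "1" ]; then
    reproduce_nbd_spam
fi
```

Run it with RUN_REPRO=1 set in the environment; without that it only defines the function. It does not clean up afterwards, so the mount point and image are left for inspection.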
So this appears to be a simple case of nova calling qemu-nbd -d while there is still I/O to the image. The right thing to do is simply to ratelimit the error messages (which they really should be anyway, since they're printed directly inside a loop). The messages themselves do not indicate any kernel error, only that the nbd device was removed while being written to.

Can you try this kernel PPA to see if it fixes the problem? You will still see the error messages, but only a few lines, since they'll be ratelimited.

Of course there is still the (probably more serious) problem of the serial port driver hanging a cpu and eating up memory; that probably deserves its own bug, since it's triggered by this but is a separate issue.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1505564

Title:
  Soft lockup with "block nbdX: Attempted send on closed socket" spam

Status in linux package in Ubuntu:
  In Progress

Bug description:
  Some of our nova compute hosts regularly freeze, sometimes for a few hours, with kern.log getting spammed with:

  block nbdX: Attempted send on closed socket

  and a few "CPU soft lockup" messages (see attached log).

  This clears up when the queue gets cleared, e.g.:

  block nbdX: queue cleared

  trusty hosts with kernel version 3.19.0-30-generic.
  ---
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Nov 24 12:23 seq
   crw-rw---- 1 root audio 116, 33 Nov 24 12:23 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.19
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  DistroRelease: Ubuntu 14.04
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: HP ProLiant DL385 G7
  Package: linux (not installed)
  PciMultimedia:
  ProcEnviron:
   TERM=screen-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.19.0-36-generic root=UUID=13289ac9-8dc9-4feb-b6bd-ca7db66b21d6 ro console=tty0 console=ttyS1,38400 nosplash crashkernel=384M-:512M nox2apic intremap=off
  ProcVersionSignature: Ubuntu 3.19.0-36.41~14.04.1hf00090138v20151122b1-generic 3.19.8-ckt9
  RelatedPackageVersions:
   linux-restricted-modules-3.19.0-36-generic N/A
   linux-backports-modules-3.19.0-36-generic N/A
   linux-firmware 1.127.18
  RfKill: Error: [Errno 2] No such file or directory
  Tags: trusty uec-images
  Uname: Linux 3.19.0-36-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
  _MarkForUpload: True
  dmi.bios.date: 02/02/2014
  dmi.bios.vendor: HP
  dmi.bios.version: A18
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrA18:bd02/02/2014:svnHP:pnProLiantDL385G7:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL385 G7
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1505564/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp