Greetings!
No luck with 5.4.0-80.90: we are still getting the same bug as before, even
on kernel version 5.4.0-86. We still have no clue how to reproduce it;
hypervisor nodes just crash at random. I have attached the dmesg output from
the most recent occurrence, but it seems identical to the previous ones.

Here is a fresh crash dump:
https://drive.google.com/file/d/1skA238DVtxpY8t8ANdzX1gBC8muChxto/view?usp=sharing


** Attachment added: "crash-260122.log"
   
https://bugs.launchpad.net/ubuntu/+source/linux-hwe-5.4/+bug/1921355/+attachment/5557608/+files/crash-260122.log

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1921355

Title:
  cgroups related kernel panics

Status in linux package in Ubuntu:
  Incomplete
Status in linux-hwe-5.4 package in Ubuntu:
  Confirmed

Bug description:
  Hi!

  Over the last 6 months we've upgraded our hypervisor compute hosts
  from the Ubuntu Bionic 4.15.* kernel to the Ubuntu Bionic HWE 5.4
  kernel.

  This month we noticed that several nodes failed due to bugs in cgroups.
  The trace was different almost every time, but it always revolves around
  cgroups: either null pointer dereferences or a panic caught by the
  BUG_ON() macro. It looks as if some cgroup no longer existed but something
  still tried to access it, causing the kernel panic.
  Please find the logs attached.

  3 of 4 cases happened after a VM shutdown. We tried spawning lots of VMs,
  loading them, and shutting them down, but didn't manage to reproduce the
  behavior.
  Actually, every case is somewhat different: the kernel patch versions
  differ (5.4.0-42 to 5.4.0-66) and the uptimes vary (from 1 day to about
  half a year). There are also lots of hosts with several months of uptime
  and no issues. Also, on 4.15 we never saw this behavior at all.
  That's quite disturbing, as I don't want dozens of VMs to crash (due to a
  host outage) at random times for some vague reason...
  I didn't manage to find any related bugs on the bug tracker, hence this
  new report.

  I wonder if anybody in the community has come across something like this.
  Could somebody advise on how to debug further, or where else to report or
  look for a similar case?

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1921355/+subscriptions
