Hi Shixiong, thanks for the report.

At the moment the bisect sits here:

4d60b13f267d workqueue: Don't call cpumask_test_cpu() with -1 CPU in wq_update_node_max_active()
adc1b642f72f workqueue: Implement system-wide nr_active enforcement for unbound workqueues
929b7fbecbcc workqueue: Introduce struct wq_node_nr_active
afd774d513f5 workqueue: RCU protect wq->dfl_pwq and implement accessors for it
31a8e16645d7 workqueue: Make wq_adjust_max_active() round-robin pwqs while activating
e4bbec8ce062 workqueue: Move nr_active handling into helpers
865f7641cf47 workqueue: Replace pwq_activate_inactive_work() with [__]pwq_activate_work()
a88074533304 workqueue: Factor out pwq_is_empty()
5d378b3d47e1 workqueue: Move pwq->max_active to wq->max_active
eb182ba1f6cb workqueue.c: Increase workqueue name length
a0fcae282d10 do_sys_name_to_handle(): use kzalloc() to fix kernel-infoleak
fa1cbadd64bc UBUNTU: [Packaging] add Real-time Linux Analysis tool (rtla) to linux-tools
888e7c48a1ff UBUNTU: SAUCE: rtla: fix deb build
51c8aee42179 UBUNTU: [Packaging] provide a wrapper module for python-perf
48357b9b6d27 UBUNTU: [Packaging] enable perf python module
759436dbdae1 drm/amdgpu: respect the abmlevel module parameter value if it is set
716ec855fa62 drm/amd/display: add panel_power_savings sysfs entry to eDP connectors
b4275c751289 UBUNTU: Start new release
782e3646d110 UBUNTU: [Packaging] update annotations scripts
b43457da65e4 UBUNTU: [Packaging] update variants
80ebe4152d65 UBUNTU: [Packaging] drop getabis data
7fdb45c9bbbc (tag: Ubuntu-6.8.0-31.31, refs/bisect/good-7fdb45c9bbbc95a3300b4d8de3f751f4c05c98e2) UBUNTU: Ubuntu-6.8.0-31.31

These workqueue patches are definitely suspicious, but they are actually
part of the v6.8.2 stable tree, hence I'm a bit puzzled. Before we move
forward, may I ask you to:

1) test the latest v6.8 Noble kernel from proposed:

linux-generic | 6.8.0-58.60 | noble-proposed | amd64, arm64, armhf, ppc64el, s390x

2) test the Noble GA kernel again:

linux-generic | 6.8.0-31.31 | noble          | amd64, arm64, armhf, ppc64el, s390x

I want to double check we didn't overlook anything.
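
To make the two boots easy to compare, a minimal Python sketch along
these lines could pull the hard-lockup reports out of a saved dmesg log.
The function name and the sample text are illustrative assumptions, not
an existing tool; the pattern is modelled on the watchdog line quoted in
the bug description.

```python
import re

# Pattern modelled on the watchdog line quoted in the bug description:
# "[   15.241164] kernel: watchdog: Watchdog detected hard LOCKUP on cpu 124"
LOCKUP_RE = re.compile(
    r"\[\s*(\d+\.\d+)\].*hard LOCKUP on cpu\s+(\d+)", re.IGNORECASE
)

# Sample text for illustration only (taken from the bug description).
SAMPLE = """\
[   15.241164] kernel: watchdog: Watchdog detected hard LOCKUP on cpu 124
[   15.241164] kernel:  ? watchdog_hardlockup_check+0x1cb/0x3b0
"""


def find_hard_lockups(dmesg_text):
    """Return (timestamp_seconds, cpu) for every hard-lockup report found."""
    hits = []
    for line in dmesg_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match:
            hits.append((float(match.group(1)), int(match.group(2))))
    return hits
```

Running find_hard_lockups() on the dmesg text saved from each kernel
(6.8.0-58.60 and 6.8.0-31.31) and comparing the two lists would show
whether the report appears on one, both, or neither.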

Thank you.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2081685

Title:
  [Ubuntu 24.04-generic Kernel-6.8]Hard lockup on 8 Socket System,
  ThinkSystem SR950 V3.

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Noble:
  In Progress
Status in linux source package in Oracular:
  Confirmed

Bug description:
  A CPU hard lockup is detected under Ubuntu 24.04 LTS (kernel
  6.8.0-38). See attachment "dmesg0723-Lockup-Ubuntu24.04.log".

  ubuntu@SR950V3:~$ cat /var/log/dmesg | grep -i  lockup

  [   15.241164] kernel: watchdog: Watchdog detected hard LOCKUP on cpu 124

  [   15.241164] kernel:  ? watchdog_hardlockup_check+0x1cb/0x3b0

  
  Besides, the issue does not occur on upstream kernels 6.8, 6.9, 6.10,
  or 6.11-rc*, so it appears to be an Ubuntu-kernel-only issue. See
  attachment "dmesg0923-No-Lockup-Kernel 6-10.log".
  According to the dmesg log, the "hard lockup" is not a real lockup:
  many CPUs try to take the cache_disable_lock spinlock at the same time
  during kernel boot, and they contend for it.
  Each CPU's TLB is flushed inside the critical section. Flushing the TLB
  is a time-consuming operation, and because there are so many CPUs, the
  kernel detects a false "hard lockup". To avoid confusing customers,
  when will Canonical provide a fix?
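
The contention argument above can be put into rough numbers. The sketch
below is a back-of-the-envelope model only: it assumes every core takes
the lock strictly one after another, and it assumes a hard-lockup
threshold of 10 seconds (the usual kernel.watchdog_thresh default, which
may differ on this system). The function names are hypothetical, and no
per-CPU hold time was measured in this report.

```python
# Core count from the HW config in this report: 8 sockets x 60 cores.
SOCKETS = 8
CORES_PER_SOCKET = 60
N_CORES = SOCKETS * CORES_PER_SOCKET

# Assumption: hard-lockup threshold in seconds (not taken from the report).
WATCHDOG_THRESH_S = 10.0


def worst_case_wait_s(n_cpus, hold_time_s):
    """Time the last CPU spins if all others hold the lock before it."""
    return (n_cpus - 1) * hold_time_s


def hold_time_to_trip_s(n_cpus, threshold_s):
    """Per-CPU hold time at which the last waiter reaches the threshold."""
    return threshold_s / (n_cpus - 1)


if __name__ == "__main__":
    limit = hold_time_to_trip_s(N_CORES, WATCHDOG_THRESH_S)
    print(f"{N_CORES} cores: last waiter hits {WATCHDOG_THRESH_S:.0f}s "
          f"once each hold exceeds {limit * 1000:.1f} ms")
```

Under these assumptions, a hold time of roughly 21 ms per core would be
enough for the last waiter to reach the threshold; with hyperthreading
enabled the logical CPU count is higher and the tolerable hold time
correspondingly lower.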

  
  HW Config:
  ThinkSystem SR950 V3

  CPU: 8*  Intel(R) Xeon(R) Platinum 8490H 60 Core 3.5GHz

  MEM:  2TB = SK Hynix 356GB DDR5 4800MHz 3DS (2015.1GB)

  Raid: ThinkSystem RAID 940-8i 4GB Flash PCIe Gen4 12Gb Adapter

  Storage: Micron_7450_MTFDKBA960TFR *1

  Samsung 30.7TB 24Gbps SAS 2.5" SSD

  NIC: ThinkSystem Intel X710-T4L 10GBASE-T 4-Port OCP Ethernet Adapter

  OS: Ubuntu 24.04 LTS (kernel 6.8.0-38-generic)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081685/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
