Public bug reported:

== Summary ==

When a Docker container with a restart policy is crash-looping (creating
and destroying veth interfaces repeatedly), the ath10k WiFi driver times
out waiting for peer stats from the firmware, enters a spin-wait on one
CPU, and causes a hard lockup that freezes the entire system. The NMI
watchdog detects the lockup, but with hardlockup_panic=0 (default) no
panic is triggered and no recovery is possible; the machine must be
power-cycled.

== System ==

Ubuntu 24.04.4 LTS
Kernel: 6.17.0-23.23~24.04.1-generic (HWE)
CPU: AMD Ryzen 7 5700U (Zen 2, 8 cores / 16 threads)
WiFi: Qualcomm Atheros QCA6174 802.11ac [168c:003e] rev 32
      ath10k_pci, qca6174 hw3.2, firmware WLAN.RM.4.4.1-00309-

== Steps to reproduce ==

1. Have a Docker container configured with restart: unless-stopped (or
   always) that exits immediately on start (e.g. due to a
   misconfiguration).
2. Leave the system running. Docker will restart the container about
   every 60 seconds, each time creating and destroying a veth interface
   pair attached to a Docker bridge.
3. After roughly 100 to 600 restart cycles, the ath10k driver times out
   waiting for peer stats from the firmware and the system hard-locks.
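
A minimal sketch of one way to set up such a crash-looping container.
The function names, the default container name and the busybox image
are assumptions for illustration and are not taken from the affected
system. This variant also attaches to Docker's default bridge rather
than a user-defined one.

```shell
# Hypothetical reproduction helpers; nothing runs until a function is called.
crashloop_start() {
    name="${1:-crashloop-test}"
    # `false` exits immediately with status 1, so the restart policy keeps
    # restarting the container, recreating its veth pair each time.
    docker run -d --restart unless-stopped --name "$name" busybox false
}

crashloop_stop() {
    name="${1:-crashloop-test}"
    # Force-remove the container, which also ends the restart loop.
    docker rm -f "$name"
}
```

Call crashloop_start to begin the loop and crashloop_stop to clean up
afterwards.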

== Observed in journal (journalctl -b -1 -k) ==

After hundreds of veth create/destroy cycles:

  ath10k_pci 0000:03:00.0: timed out waiting peer stats info
  watchdog: CPU6: Watchdog detected hard LOCKUP on cpu 6

The ath10k timeout appeared at 23:19:23; the hard lockup was detected at
23:19:28 (5 seconds later). The system was completely unresponsive from
that point: display off, no keyboard/mouse response. There was no panic
and no kdump (hardlockup_panic=0). The machine had to be powered off
with the power button after ~37 minutes.
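
To look for the same two messages on another machine, a small filter
along these lines may help. The function name lockup_lines is
hypothetical, and the patterns simply match the two log lines quoted
above.

```shell
# lockup_lines: reduce kernel log text on stdin to the two messages
# quoted in this report (ath10k peer stats timeout, hard lockup).
lockup_lines() {
    grep -E 'ath10k_pci .*timed out waiting peer stats info|Watchdog detected hard LOCKUP'
}

# Usage against the previous boot's kernel log:
#   journalctl -b -1 -k -o short-precise | lockup_lines
```

The short-precise output format prints microsecond timestamps, which
makes the gap between the timeout and the lockup easier to read.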

In the 10-hour session before the crash, the container restarted 608
times, producing 608 veth interface create/destroy cycles on bridge
br-f396b81ca142.
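
As a quick consistency check, 608 restarts over 10 hours works out to
about one restart per minute, in line with the ~60 second interval
given in the steps to reproduce. The helper name below is hypothetical.

```shell
# avg_restart_interval HOURS RESTARTS
# Prints the mean number of whole seconds between restarts
# (integer division, so the result is rounded down).
avg_restart_interval() {
    echo $(( $1 * 3600 / $2 ))
}

avg_restart_interval 10 608   # prints 59
```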

== Expected behaviour ==

An ath10k firmware timeout should be handled gracefully (log, reset
firmware, or abort) without holding a spinlock or disabling interrupts
on a CPU indefinitely. The kernel should not become unresponsive due to
a WiFi driver error recovery path.

== Additional notes ==

- kernel.hardlockup_panic = 0 (default) means no vmcore was captured
- No MCE errors were present, so this looks like a pure driver/locking
  issue rather than a hardware fault
- The ath10k device is the Qualcomm Atheros QCA6174 in a Dell Vostro 5515
  (subsystem 1028:0310)
- CPUs were in native amd_idle mode (no processor.max_cstate
  restriction), which allows deep CC6 states; IRQ delivery to a
  CC6-sleeping CPU6 may have contributed to the ath10k timeout
  escalating into a lockup
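
To have a vmcore available if the lockup recurs, one option is to turn
on kernel.hardlockup_panic. This is a sketch only: the function name and
the sysctl.d file name are hypothetical, and it assumes kdump is
installed and working (a crashkernel= reservation is already on the
kernel command line).

```shell
# Emit a sysctl.d fragment that makes a detected hard lockup panic the
# kernel, so that kdump (if configured) can capture a vmcore.
hardlockup_panic_conf() {
    printf 'kernel.hardlockup_panic = 1\n'
}

# Usage (needs root; the file name is a hypothetical choice):
#   hardlockup_panic_conf | sudo tee /etc/sysctl.d/90-hardlockup-panic.conf
#   sudo sysctl --system
# Check the running value:
#   sysctl kernel.hardlockup_panic
```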

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: linux-image-6.17.0-23-generic 6.17.0-23.23~24.04.1
ProcVersionSignature: Ubuntu 6.17.0-23.23~24.04.1-generic 6.17.13
Uname: Linux 6.17.0-23-generic x86_64
ApportVersion: 2.28.1-0ubuntu3.8
Architecture: amd64
CRDA: N/A
CasperMD5CheckResult: unknown
CurrentDesktop: ubuntu:GNOME
Date: Thu May 14 00:21:22 2026
InstallationDate: Installed on 2022-04-01 (1504 days ago)
InstallationMedia: Ubuntu 20.04.4 LTS "Focal Fossa" - Release amd64 (20220223)
MachineType: Dell Inc. Vostro 5515
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.17.0-23-generic 
root=UUID=615dcedf-89c6-42dc-9f4a-c35fea943a7e ro 
psmouse.synaptics_intertouch=0 nvme_core.default_ps_max_latency_us=0 
crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
RelatedPackageVersions:
 linux-restricted-modules-6.17.0-23-generic N/A
 linux-backports-modules-6.17.0-23-generic  N/A
 linux-firmware                             20240318.git3b128b60-0ubuntu2.27
SourcePackage: linux-hwe-6.17
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 01/30/2026
dmi.bios.release: 5.3
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.33.0
dmi.board.name: 0P3R55
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 10
dmi.chassis.vendor: Dell Inc.
dmi.chassis.version: 1.33.0
dmi.modalias: 
dmi:bvnDellInc.:bvr1.33.0:bd01/30/2026:br5.3:svnDellInc.:pnVostro5515:pvr1.33.0:rvnDellInc.:rn0P3R55:rvrA00:cvnDellInc.:ct10:cvr1.33.0:sku0A7A:
dmi.product.family: Vostro
dmi.product.name: Vostro 5515
dmi.product.sku: 0A7A
dmi.product.version: 1.33.0
dmi.sys.vendor: Dell Inc.

** Affects: linux-hwe-6.17 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug ath10k hw-specific kernel-bug noble wayland-session

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2152582

Title:
  ath10k firmware timeout causes hard lockup on CPU6 when veth
  interfaces cycle repeatedly (Docker restart loop)
