The tracer at https://github.com/chaiken/BPF-sandbox/commit/e0dda39cbe92e0f80805a2a06aa80d6fb3b065b2 finally was running when apcupsd triggered the lockup. The stack below has appeared over and over for days. The time output in the trace has a 1000x math error:

2025-04-24T11:20:23.394893-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: pci_mmcfg_read() for comm apcupsd with tid 3183 took 222794 s 2025-04-24T11:20:38.809232-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: apcupsd: lock contended for 222810s 2025-04-24T11:20:38.809380-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __down_common+521 2025-04-24T11:20:38.809396-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __down_common+521 2025-04-24T11:20:38.809406-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: down_timeout+85 2025-04-24T11:20:38.809420-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_os_wait_semaphore+76 2025-04-24T11:20:38.809431-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ut_acquire_mutex+58 2025-04-24T11:20:38.809444-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ex_enter_interpreter+16 2025-04-24T11:20:38.809458-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ex_system_wait_mutex+82 2025-04-24T11:20:38.809470-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ds_begin_method_execution+137 2025-04-24T11:20:38.809482-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ds_call_control_method+79 2025-04-24T11:20:38.809493-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ps_parse_aml+718 2025-04-24T11:20:38.809504-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ps_execute_method+319 2025-04-24T11:20:38.809515-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_ns_evaluate+318 2025-04-24T11:20:38.809528-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_evaluate_object+334 2025-04-24T11:20:38.809539-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_device_set_power+495 2025-04-24T11:20:38.809551-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: acpi_pci_set_power_state+109 2025-04-24T11:20:38.809562-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: pci_power_up+48 2025-04-24T11:20:38.809573-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: pci_pm_runtime_resume+51 2025-04-24T11:20:38.809584-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __rpm_callback+65 2025-04-24T11:20:38.809596-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: rpm_callback+85 2025-04-24T11:20:38.809608-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: rpm_resume+1235 2025-04-24T11:20:38.809618-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: rpm_resume+723 2025-04-24T11:20:38.809630-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __pm_runtime_resume+75 2025-04-24T11:20:38.809641-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: usb_autoresume_device+30 2025-04-24T11:20:38.809651-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: usbdev_open+259 2025-04-24T11:20:38.809662-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: chrdev_open+178 2025-04-24T11:20:38.809673-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_dentry_open+332 2025-04-24T11:20:38.809685-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: vfs_open+46 2025-04-24T11:20:38.809698-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: path_openat+2094 2025-04-24T11:20:38.809708-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_filp_open+196 2025-04-24T11:20:38.809719-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_sys_openat2+174 2025-04-24T11:20:38.809730-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: __x64_sys_openat+85 2025-04-24T11:20:38.809742-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: do_syscall_64+130 2025-04-24T11:20:38.809753-07:00 schallkreis run-investigate-pcie-lockup.sh[2622305]: entry_SYSCALL_64_after_hwframe+118

Here is the failure shortly thereafter. It happened while the system was idle except for browser, IRC etc. left running. Perhaps he pulseaudio failure is related?

2025-04-24T11:22:43.698237-07:00 schallkreis plasmashell[3874]: org.kde.pulseaudio: context kaput 2025-04-24T11:22:43.698345-07:00 schallkreis plasmashell[3874]: org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo" 2025-04-24T11:22:43.698364-07:00 schallkreis plasmashell[3874]: org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo.monitor" 2025-04-24T11:22:44.139668-07:00 schallkreis kded6[3770]: org.kde.pulseaudio: context kaput 2025-04-24T11:22:44.139719-07:00 schallkreis kded6[3770]: org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo" 2025-04-24T11:22:44.139733-07:00 schallkreis kded6[3770]: org.kde.pulseaudio: No object for name "alsa_output.pci-0000_73_00.6.analog-stereo.monitor" 2025-04-24T11:22:44.139817-07:00 schallkreis systemd[3492]: pulseaudio.service: Main process exited, code=killed, status=9/KILL 2025-04-24T11:22:44.580911-07:00 schallkreis systemd[3492]: pulseaudio.service: Failed with result 'signal'. 2025-04-24T11:22:44.581246-07:00 schallkreis systemd[3492]: pulseaudio.service: Consumed 33min 45.125s CPU time, 11.4M memory peak. 2025-04-24T11:22:44.801449-07:00 schallkreis systemd[3492]: pulseaudio.service: Scheduled restart job, restart counter is at 1. 2025-04-24T11:22:45.022289-07:00 schallkreis systemd[3492]: Starting pulseaudio.service - Sound Service... 2025-04-24T11:22:47.669907-07:00 schallkreis plasmashell[3874]: pw.conf: can't load config client.conf: No such file or directory 2025-04-24T11:22:47.669986-07:00 schallkreis plasmashell[3874]: pw.conf: can't load config client.conf: No such file or directory 2025-04-24T11:22:47.670011-07:00 schallkreis plasmashell[3874]: kpipewire_logging: Failed to create PipeWire context 2025-04-24T11:22:47.670024-07:00 schallkreis plasmashell[3874]: received error while creating the stream "Failed to create PipeWire context" Media monitor will not work. 2025-04-24T11:22:52.523562-07:00 schallkreis plasmashell[3874]: pw.conf: can't load config client.conf: No such file or directory 2025-04-24T11:22:52.523678-07:00 schallkreis plasmashell[3874]: pw.conf: can't load config client.conf: No such file or directory 2025-04-24T11:22:52.523700-07:00 schallkreis plasmashell[3874]: kpipewire_logging: Failed to create PipeWire context 2025-04-24T11:22:52.523713-07:00 schallkreis plasmashell[3874]: received error while creating the stream "Failed to create PipeWire context" Media monitor will not work. 2025-04-24T11:22:52.744151-07:00 schallkreis kernel: watchdog: BUG: soft lockup - CPU#16 stuck for 23s! [apcupsd:3183] 2025-04-24T11:22:52.744159-07:00 schallkreis kernel: Modules linked in: qrtr nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc nls_ascii nls_cp437 vfat fat amd_atl intel_rapl_msr intel_rapl_common mt7925e mt7925_common mt792x_lib mt76_connac_lib edac_mce_amd snd_hda_codec_realtek mt76 kvm_amd snd_hda_codec_generic mac80211 snd_hda_scodec_component snd_hda_codec_hdmi kvm snd_hda_intel snd_intel_dspcfg libarc4 snd_intel_sdw_acpi snd_hda_codec cfg80211 crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_hda_core sha256_ssse3 sha1_ssse3 snd_hwdep aesni_intel snd_pcm gf128mul crypto_simd cryptd wmi_bmof snd_timer spd5118 rapl ccp snd pcspkr k10temp rfkill soundcore joydev evdev parport_pc ppdev lp parport efi_pstore configfs nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_generic usbhid hid amdgpu amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper ttm xhci_pci xhci_hcd drm_kms_helper ahci libahci libata drm thunderbolt r8169 usbcore nvme sp5100_tco realtek 2025-04-24T11:22:52.744161-07:00 schallkreis kernel: scsi_mod watchdog mdio_devres nvme_core libphy crc32_pclmul crc32c_intel i2c_piix4 i2c_smbus usb_common scsi_common nvme_auth video crc16 gpio_amdpt wmi gpio_generic button 2025-04-24T11:22:52.744162-07:00 schallkreis kernel: CPU: 16 UID: 0 PID: 3183 Comm: apcupsd Not tainted 6.12.21-amd64 #1 Debian 6.12.21-1 2025-04-24T11:22:52.744162-07:00 schallkreis kernel: Hardware name: System76 Thelio Mira/Thelio Mira, BIOS 3.11.SP01 12/05/2024 2025-04-24T11:22:52.744163-07:00 schallkreis kernel: RIP: 0010:pci_mmcfg_read+0xa4/0xe0 2025-04-24T11:22:52.744164-07:00 schallkreis kernel: Code: fe 01 75 0b 4c 01 e0 8a 00 0f b6 c0 89 45 00 e8 a2 6a 51 ff 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 4c 01 e0 8b 00 <89> 45 00 eb e0 4c 01 e0 66 8b 00 0f b7 c0 89 45 00 eb d2 e8 74 6a 2025-04-24T11:22:52.744164-07:00 schallkreis kernel: RSP: 0018:ffff9d0645f9f4b8 EFLAGS: 00000286 2025-04-24T11:22:52.744164-07:00 schallkreis kernel: RAX: 00000000ffffffff RBX: 0000000000f00000 RCX: 0000000000000ffc 2025-04-24T11:22:52.744165-07:00 schallkreis kernel: RDX: 00000000000000ff RSI: 000000000000000f RDI: 0000000000000000 2025-04-24T11:22:52.744165-07:00 schallkreis kernel: RBP: ffff9d0645f9f4f4 R08: 0000000000000004 R09: ffff9d0645f9f4f4 2025-04-24T11:22:52.744170-07:00 schallkreis kernel: R10: 000000000000000f R11: ffff9d0645f9f2e0 R12: 0000000000000ffc 2025-04-24T11:22:52.744170-07:00 schallkreis kernel: R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000000 2025-04-24T11:22:52.744170-07:00 schallkreis kernel: FS: 00007f8da22fc0c0(0000) GS:ffff8c8b5fa00000(0000) knlGS:0000000000000000 2025-04-24T11:22:52.744171-07:00 schallkreis kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2025-04-24T11:22:52.744171-07:00 schallkreis kernel: CR2: 00007f9fdd253680 CR3: 0000000111982000 CR4: 0000000000f50ef0
2025-04-24T11:22:52.744171-07:00 schallkreis kernel: PKRU: 55555554
2025-04-24T11:22:52.744172-07:00 schallkreis kernel: Call Trace:
2025-04-24T11:22:52.744172-07:00 schallkreis kernel:  <IRQ>
2025-04-24T11:22:52.744172-07:00 schallkreis kernel: ? watchdog_timer_fn.cold+0x3d/0xa1 2025-04-24T11:22:52.744173-07:00 schallkreis kernel: ? __pfx_watchdog_timer_fn+0x10/0x10 2025-04-24T11:22:52.744174-07:00 schallkreis kernel: ? __hrtimer_run_queues+0x12f/0x2a0 2025-04-24T11:22:52.744174-07:00 schallkreis kernel: ? hrtimer_interrupt+0xfa/0x210 2025-04-24T11:22:52.744174-07:00 schallkreis kernel: ? __sysvec_apic_timer_interrupt+0x52/0x100 2025-04-24T11:22:52.744175-07:00 schallkreis kernel: ? sysvec_apic_timer_interrupt+0x6c/0x90
2025-04-24T11:22:52.744176-07:00 schallkreis kernel:  </IRQ>
2025-04-24T11:22:52.744176-07:00 schallkreis kernel:  <TASK>
2025-04-24T11:22:52.744176-07:00 schallkreis kernel: ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 2025-04-24T11:22:52.744177-07:00 schallkreis kernel: ? pci_mmcfg_read+0xa4/0xe0 2025-04-24T11:22:52.744177-07:00 schallkreis kernel: ? pci_mmcfg_read+0x4f/0xe0 2025-04-24T11:22:52.744177-07:00 schallkreis kernel: trace_clock_x86_tsc+0x20/0x20 2025-04-24T11:22:52.744178-07:00 schallkreis kernel: pci_find_next_ext_capability+0x89/0xf0 2025-04-24T11:22:52.744178-07:00 schallkreis kernel: pci_restore_ltr_state+0x28/0x50 2025-04-24T11:22:52.744179-07:00 schallkreis kernel: pci_restore_state.part.0+0x29/0x370 2025-04-24T11:22:52.744179-07:00 schallkreis kernel: pci_pm_runtime_resume+0x45/0xf0 2025-04-24T11:22:52.744179-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10 2025-04-24T11:22:52.744179-07:00 schallkreis kernel: __rpm_callback+0x41/0x170 2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10 2025-04-24T11:22:52.744180-07:00 schallkreis kernel: rpm_callback+0x55/0x60 2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10 2025-04-24T11:22:52.744181-07:00 schallkreis kernel: rpm_resume+0x4d3/0x700 2025-04-24T11:22:52.744181-07:00 schallkreis kernel: ? sched_clock_cpu+0xf/0x1d0 2025-04-24T11:22:52.744181-07:00 schallkreis kernel: ? psi_group_change+0x138/0x300 2025-04-24T11:22:52.744182-07:00 schallkreis kernel: rpm_resume+0x2d3/0x700 2025-04-24T11:22:52.744183-07:00 schallkreis kernel: ? update_curr+0x98/0x280 2025-04-24T11:22:52.744183-07:00 schallkreis kernel: rpm_resume+0x2d3/0x700 2025-04-24T11:22:52.744183-07:00 schallkreis kernel: rpm_resume+0x2d3/0x700 2025-04-24T11:22:52.744184-07:00 schallkreis kernel: __pm_runtime_resume+0x4b/0x80 2025-04-24T11:22:52.744184-07:00 schallkreis kernel: usb_autoresume_device+0x1e/0x50 [usbcore] 2025-04-24T11:22:52.744184-07:00 schallkreis kernel: usbdev_open+0x103/0x260 [usbcore] 2025-04-24T11:22:52.744185-07:00 schallkreis kernel: ? bpf_prog_e8932b6bae2b9745_restrict_filesystems+0xb7/0x132 2025-04-24T11:22:52.744185-07:00 schallkreis kernel: chrdev_open+0xb2/0x230 2025-04-24T11:22:52.744186-07:00 schallkreis kernel: ? __pfx_chrdev_open+0x10/0x10 2025-04-24T11:22:52.744186-07:00 schallkreis kernel: do_dentry_open+0x14c/0x440
2025-04-24T11:22:52.744186-07:00 schallkreis kernel:  vfs_open+0x2e/0xe0
2025-04-24T11:22:52.744186-07:00 schallkreis kernel: path_openat+0x82e/0x12d0 2025-04-24T11:22:52.744187-07:00 schallkreis kernel: ? kick_pool+0x5c/0x140 2025-04-24T11:22:52.744187-07:00 schallkreis kernel: do_filp_open+0xc4/0x170 2025-04-24T11:22:52.744188-07:00 schallkreis kernel: do_sys_openat2+0xae/0xe0 2025-04-24T11:22:52.744188-07:00 schallkreis kernel: __x64_sys_openat+0x55/0xa0 2025-04-24T11:22:52.744188-07:00 schallkreis kernel: do_syscall_64+0x82/0x190 2025-04-24T11:22:52.744188-07:00 schallkreis kernel: ? vfs_read+0x15e/0x370 2025-04-24T11:22:52.744189-07:00 schallkreis kernel: ? vfs_read+0x15e/0x370 2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210 2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190 2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? terminate_walk+0x61/0x100 2025-04-24T11:22:52.744190-07:00 schallkreis kernel: ? path_openat+0x495/0x12d0 2025-04-24T11:22:52.744191-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190 2025-04-24T11:22:52.744191-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x4d/0x210 2025-04-24T11:22:52.744191-07:00 schallkreis kernel: ? do_filp_open+0xc4/0x170 2025-04-24T11:22:52.744192-07:00 schallkreis kernel: ? __rseq_handle_notify_resume+0xa2/0x4a0 2025-04-24T11:22:52.744192-07:00 schallkreis kernel: ? syscall_exit_to_user_mode+0x172/0x210 2025-04-24T11:22:52.744193-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190 2025-04-24T11:22:52.744194-07:00 schallkreis kernel: ? do_syscall_64+0x8e/0x190 2025-04-24T11:22:52.744194-07:00 schallkreis kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e 2025-04-24T11:22:52.744194-07:00 schallkreis kernel: RIP: 0033:0x7f8da23979ee 2025-04-24T11:22:52.744195-07:00 schallkreis kernel: Code: 08 0f 85 f5 4b ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 80 00 00 00 00 48 83 ec 08 2025-04-24T11:22:52.744195-07:00 schallkreis kernel: RSP: 002b:00007fff105c8718 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 2025-04-24T11:22:52.744196-07:00 schallkreis kernel: RAX: ffffffffffffffda RBX: 00007f8da22fc0c0 RCX: 00007f8da23979ee 2025-04-24T11:22:52.964717-07:00 schallkreis kernel: RDX: 0000000000080002 RSI: 00005608b22ab500 RDI: ffffffffffffff9c 2025-04-24T11:22:52.964718-07:00 schallkreis kernel: RBP: 0000000000000007 R08: 0000000000000000 R09: 0000000000000000 2025-04-24T11:22:52.964718-07:00 schallkreis kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff105c8850 2025-04-24T11:22:52.964719-07:00 schallkreis kernel: R13: 00007fff105c8800 R14: 0000560886d392cb R15: 0000000000000004
2025-04-24T11:22:52.964719-07:00 schallkreis kernel:  </TASK>

Perhaps this

2025-04-24T11:22:52.744179-07:00 schallkreis kernel: pci_pm_runtime_resume+0x45/0xf0 2025-04-24T11:22:52.744179-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10 2025-04-24T11:22:52.744179-07:00 schallkreis kernel: __rpm_callback+0x41/0x170 2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10 2025-04-24T11:22:52.744180-07:00 schallkreis kernel: rpm_callback+0x55/0x60 2025-04-24T11:22:52.744180-07:00 schallkreis kernel: ? __pfx_pci_pm_runtime_resume+0x10/0x10

is the output of the actual bug? That's where the stack which printed out over and over and the failure stack originated.

Best wishes,

Alison Chaiken

---
Alison Chaiken                   ali...@she-devel.com
https://she-devel.com
"What respite from her thrilling toil did Beauty ever take -- But Work might be Electric Rest To those that Magic make" -- Emily Dickson

Reply via email to