Control: tags -1 + moreinfo

Hi Dominique,

On Sat, Feb 05, 2022 at 11:33:33AM +0100, Dominique Dumont wrote:
> Package: src:linux
> Version: 5.15.15-2
> Severity: normal
> Tags: upstream
>
> Dear Maintainer,
>
>
> Since upgrade to linux-image-5.15.0-3-amd6, suspending my machine no
> longer works correctly: the screen goes blank as usual, but comes back
> after 10s or so.
>
> The most relevant kernel logs are:
>
> [ 257.531771] PM: suspend entry (s2idle)
> [ 257.610570] Filesystems sync: 0.078 seconds
> [ 257.610723] (NULL device *): firmware: direct-loading firmware
> regulatory.db
> [ 257.610726] (NULL device *): firmware: direct-loading firmware
> regulatory.db.p7s
> [ 257.610745] (NULL device *): firmware: direct-loading firmware
> intel/ibt-17-16-1.ddc
> [ 257.610954] (NULL device *): firmware: direct-loading firmware
> intel/ibt-17-16-1.sfi
> [ 257.610986] (NULL device *): firmware: direct-loading firmware
> iwlwifi-9000-pu-b0-jf-b0-46.ucode
> [ 257.611211] (NULL device *): firmware: direct-loading firmware
> i915/kbl_dmc_ver1_04.bin
> [ 257.726247] Freezing user space processes ... (elapsed 0.002 seconds) done.
> [ 257.728699] OOM killer disabled.
> [ 257.728700] Freezing remaining freezable tasks ... (elapsed 0.001 seconds)
> done.
> [ 257.730085] printk: Suspending console(s) (use no_console_suspend to debug)
> [ 257.839817] amdgpu:
> last message was failed ret is 65535
> [ 257.839842] amdgpu:
> failed to send message 261 ret is 65535
>
> [ ... lots of failed message ...]
>
> [ 257.840748] ------------[ cut here ]------------
> [ 257.840751] WARNING: CPU: 4 PID: 58 at
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:2014
> dm_suspend+0x19e/0x1c0 [amdgpu]
> [ 257.841665] Modules linked in: rfcomm xt_conntrack nft_chain_nat
> xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nft_counter xt_addrtype nft_compat nf_tables libcrc32c
> nfnetlink br_netfilter bridge stp llc xfrm_user xfrm_algo nvme_fabrics
> typec_displayport cmac algif_hash algif_skcipher af_alg overlay bnep
> binfmt_misc nls_ascii nls_cp437 squashfs vfat fat loop x86_pkg_temp_thermal
> intel_powerclamp mei_hdcp snd_sof_pci_intel_cnl coretemp dell_rbtn
> intel_rapl_msr snd_sof_intel_hda_common soundwire_intel
> soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci
> kvm_intel snd_sof_xtensa_dsp snd_hda_codec_hdmi snd_sof soundwire_bus btusb
> btrtl kvm snd_ctl_led snd_soc_skl btbcm btintel dell_laptop irqbypass iwlmvm
> snd_soc_hdac_hda rapl bluetooth snd_hda_ext_core snd_soc_sst_ipc
> snd_soc_sst_dsp snd_hda_codec_realtek snd_soc_acpi_intel_match snd_soc_acpi
> dell_smm_hwmon snd_hda_codec_generic intel_cstate ledtrig_audio mac80211
> snd_soc_core
> [ 257.841816] dell_wmi intel_uncore snd_compress dell_smbios
> jitterentropy_rng dcdbas snd_hda_intel sha512_ssse3 serio_raw pcspkr libarc4
> snd_intel_dspcfg sha512_generic efi_pstore dell_wmi_descriptor uvcvideo
> snd_intel_sdw_acpi iwlwifi snd_usb_audio snd_hda_codec dell_wmi_sysman
> videobuf2_vmalloc videobuf2_memops firmware_attributes_class iTCO_wdt
> videobuf2_v4l2 intel_pmc_bxt snd_hda_core drbg iTCO_vendor_support
> snd_usbmidi_lib videobuf2_common intel_wmi_thunderbolt wmi_bmof ee1004
> watchdog snd_hwdep ansi_cprng joydev snd_rawmidi hid_multitouch videodev
> cfg80211 snd_seq_device mc snd_pcm processor_thermal_device_pci_legacy
> processor_thermal_device snd_timer processor_thermal_rfim
> processor_thermal_mbox ucsi_acpi processor_thermal_rapl snd mei_me typec_ucsi
> intel_rapl_common ecdh_generic roles soundcore mei ecc rfkill
> intel_soc_dts_iosf intel_pch_thermal typec int3403_thermal evdev
> int340x_thermal_zone dell_smo8800 intel_hid int3400_thermal intel_pmc_core
> acpi_thermal_rel acpi_pad
> [ 257.841935] sparse_keymap ac parport_pc ppdev sunrpc lp parport fuse
> configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2
> crc32c_generic dm_crypt dm_mod hid_jabra usbhid r8152 mii hid_generic amdgpu
> i915 rtsx_pci_sdmmc mmc_core crc32_pclmul crc32c_intel ghash_clmulni_intel
> gpu_sched nvme aesni_intel e1000e crypto_simd cryptd nvme_core i2c_algo_bit
> drm_ttm_helper ptp t10_pi ttm psmouse pps_core xhci_pci i2c_i801
> drm_kms_helper thunderbolt crc_t10dif xhci_hcd cec i2c_smbus rc_core
> crct10dif_generic crct10dif_pclmul crct10dif_common rtsx_pci drm usbcore
> i2c_hid_acpi intel_lpss_pci i2c_hid intel_lpss idma64 usb_common hid wmi
> battery button video
> [ 257.842049] CPU: 4 PID: 58 Comm: kworker/u16:7 Not tainted 5.15.0-3-amd64
> #1 Debian 5.15.15-2
> [ 257.842057] Hardware name: Dell Inc. Precision 3540/0M14W7, BIOS 1.9.1
> 07/06/2020
> [ 257.842062] Workqueue: events_unbound async_run_entry_fn
> [ 257.842075] RIP: 0010:dm_suspend+0x19e/0x1c0 [amdgpu]
> [ 257.842795] Code: ff 31 d2 4c 89 e6 4c 89 ef e8 4e d7 15 00 83 f8 01 74 1e
> 89 c2 48 c7 c6 40 36 f5 c0 48 c7 c7 50 bc 01 c1 e8 14 89 61 ff eb c2 <0f> 0b
> e9 95 fe ff ff 4c 89 e6 4c 89 ef e8 60 26 15 00 eb ae e8 d9
> [ 257.842801] RSP: 0018:ffffac778029fcf0 EFLAGS: 00010286
> [ 257.842808] RAX: 0000000000000000 RBX: ffff9e72cb1b5b08 RCX:
> 0000000000000027
> [ 257.842812] RDX: 0000000000000009 RSI: 0000000000000001 RDI:
> ffff9e72cb1a0000
> [ 257.842816] RBP: ffff9e72cb1a0000 R08: 0000000000000032 R09:
> 0000000000000004
> [ 257.842819] R10: 000000000000000f R11: ffffffffb1b82693 R12:
> ffff9e72cb1a0000
> [ 257.842823] R13: 0000000000000004 R14: 0000000000000002 R15:
> ffff9e72c0145f05
> [ 257.842826] FS: 0000000000000000(0000) GS:ffff9e762e500000(0000)
> knlGS:0000000000000000
> [ 257.842831] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 257.842835] CR2: 000055620fb334f6 CR3: 0000000412e10003 CR4:
> 00000000003706e0
> [ 257.842840] Call Trace:
> [ 257.842846] <TASK>
> [ 257.842851] ? vi_common_set_clockgating_state+0x229/0x2f0 [amdgpu]
> [ 257.843356] amdgpu_device_ip_suspend_phase1+0x5e/0xc0 [amdgpu]
> [ 257.843771] amdgpu_device_suspend+0x62/0xc0 [amdgpu]
> [ 257.844184] amdgpu_pmops_suspend+0x36/0x70 [amdgpu]
> [ 257.844631] pci_pm_suspend+0x71/0x160
> [ 257.844643] ? pci_pm_freeze+0xb0/0xb0
> [ 257.844651] dpm_run_callback+0x47/0x120
> [ 257.844658] __device_suspend+0x10e/0x470
> [ 257.844664] async_suspend+0x1b/0x90
> [ 257.844669] async_run_entry_fn+0x2d/0x130
> [ 257.844677] process_one_work+0x1ee/0x390
> [ 257.844685] worker_thread+0x53/0x3e0
> [ 257.844690] ? process_one_work+0x390/0x390
> [ 257.844696] kthread+0x124/0x150
> [ 257.844706] ? set_kthread_struct+0x40/0x40
> [ 257.844715] ret_from_fork+0x1f/0x30
> [ 257.844728] </TASK>
> [ 257.844730] ---[ end trace f4b6157e346cd3f6 ]---
> [ 258.419015] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend
> of IP block <vce_v3_0> failed -110
> [ 258.878568] amdgpu:
> last message was failed ret is 65535
>
> [ ... lots of failed message ...]
>
> [ 259.957788] amdgpu: Failed to force to switch arbf0!
> [ 259.957789] amdgpu: [disable_dpm_tasks] Failed to disable DPM!
> [ 259.957789] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend
> of IP block <powerplay> failed -22
> [ 261.029543] amdgpu 0000:3b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]]
> *ERROR* ring kiq_2.1.0 test failed (-110)
> [ 261.029632] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
> [ 263.171945] amdgpu: cp is busy, skip halt cp
> [ 264.242959] amdgpu: rlc is busy, skip halt rlc
>
> [ ... another kernel warning ... ]
>
> [ 265.315820] amdgpu 0000:3b:00.0: amdgpu: PCI CONFIG reset
> [ 266.386163] PM: pci_pm_suspend(): amdgpu_pmops_suspend+0x0/0x70 [amdgpu]
> returns -22
> [ 266.386248] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -22
> [ 266.386253] amdgpu 0000:3b:00.0: PM: failed to suspend async: error -22
> [ 266.386382] PM: Some devices failed to suspend, or early wake event
> detected
> [ 266.681752] r8152 4-1.3:1.0 enx00e04c680aef: carrier on
> [ 267.069698] OOM killer enabled.
> [ 267.069700] Restarting tasks ...
>
>
> Not that suspend works fine when booting linux-image-5.15.0-2-amd6.

Does the issue persist if you upgrade to the most recent 5.16.y
version? That is currently 5.16.4-1~exp1 (5.16.7-1 should land soon as
well).

Any chance you can bisect the commit introducing the issue?

Regards,
Salvatore
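
P.S. In case it helps while bisecting: below is a minimal Python sketch for deciding whether a saved kernel log from a test boot shows the failed suspend. It is only an illustration. The function name `suspend_failed` is hypothetical, and the two patterns are taken from the "PM:" messages in the log quoted above, on the assumption that a failing kernel always prints at least one of them.

```python
import re

# Patterns copied from the kernel log in the report; assumption: a bad
# kernel prints at least one of these when suspend is aborted.
FAILURE_PATTERNS = (
    re.compile(r"PM: Some devices failed to suspend"),
    re.compile(r"PM: failed to suspend"),
)


def suspend_failed(log_text: str) -> bool:
    """Return True if any line of the log matches a known failure message."""
    return any(
        pattern.search(line)
        for line in log_text.splitlines()
        for pattern in FAILURE_PATTERNS
    )
```

Feeding it the kernel log captured after each suspend attempt gives a quick good/bad answer for the kernel under test, without having to read the full trace every time.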