Update: this is NOT a separate bug from the short freezes - it is the same display-core failure, and it reproduces on the 6.19.10 mainline kernel that several of us have been using as a workaround.
Today I hit a ~2 minute full freeze on 6.19.10 (not 7.0.x). From my side:

- Counter-Strike 2 on monitor 1, YouTube on monitor 2.
- ~18:24 both monitors froze. After a few seconds monitor 1 recovered on its own; monitor 2 stayed completely frozen on a single YouTube frame.
- Closing Chrome did nothing - the frozen frame stayed on screen.
- I opened Settings -> Display and toggled the second monitor. Both screens went inactive, then both came back. Recovery at ~18:26.

journalctl -k for that window:

May 29 18:24:01 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:367:crtc-1] flip_done timed out
May 29 18:24:01 kernel: amdgpu 0000:03:00.0: amdgpu: [drm] *ERROR* [CRTC:367:crtc-1] hw_done or flip_done timed out
May 29 18:25:12 kernel: workqueue: dm_handle_vmin_vmax_update [amdgpu] hogged CPU for >10000us 19 times, consider switching to WQ_UNBOUND
May 29 18:25:32 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
May 29 18:25:32 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:367:crtc-1] commit wait timed out
May 29 18:25:43 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
May 29 18:25:43 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CONNECTOR:387:DP-2] commit wait timed out
May 29 18:25:53 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* flip_done timed out
May 29 18:25:53 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [PLANE:148:plane-2] commit wait timed out
May 29 18:25:53 kernel: amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn32_program_compbuf_size line:147

Key points:

1. Same signatures as the 7.0.0 reports. The dm_handle_vmin_vmax_update CPU hogging, the dcn32_program_compbuf_size REG_WAIT timeout, and the flip_done / commit wait timeouts are exactly the signatures seen on 7.0.0-15. So 6.19.10 is NOT free of this defect.

2. This is a display/KMS commit wedge, not a GPU hang. There is no ring timeout, no GPU reset, and no IP block message anywhere in the window.
The GPU kept rendering (the game continued underneath) - only the presentation/atomic commit on one pipe got stuck.

3. The freeze was localized to crtc-1 / DP-2 (my second monitor). Only that CRTC appears in the errors; monitor 1's pipe recovered on its own, which matches what I saw. Closing Chrome changed nothing because the wedge is at the hardware-commit level, not tied to the app holding the buffer.

4. The 18:25:32 / :43 / :53 timeouts are my own modeset attempts from Settings also timing out because the pipe was still wedged. The disable -> re-enable finally forced a full modeset that reset the pipe and recovered it.

5. The REG_WAIT itself is short (1us * 100 tries = ~100us), so it is the tell, not the cause: dcn32_program_compbuf_size waited for a hardware ack that never arrived, so the pipe programming never completed, flip_done never signaled, and the atomic commit hung until the manual modeset. This same REG_WAIT failure path therefore exists on DC 3.2.359 (6.19.10), not only on DC 3.2.369 (7.0.0).

Frequency / workaround caveat:

I have been running 6.19.10 as a workaround for 20 days now. In those 20 days this is the second time I have had a ~2 minute full freeze. The first time was identical: both screens froze, monitor 1 recovered, monitor 2 stayed frozen for a few minutes, then recovered on its own. Separately, the "hogged CPU ... 19 times" counter is cumulative since boot, which shows the short-freeze path is still active on 6.19.10 - I just rarely notice the sub-second ones.

Conclusion:

6.19.10 reduces the frequency of the freezes but does not fix them. The same dm_handle_vmin_vmax_update / dcn32_program_compbuf_size / flip_done failure path is present on both 6.19.10 (DC 3.2.359) and 7.0.0-15 (DC 3.2.369). This looks less like a clean "introduced in 7.0" regression and more like a latent display-core defect that 7.0 made fire far more often.
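
For anyone who wants to check their own kernel log for the same pattern, the reasoning in the key points (display-side timeouts present, GPU-hang messages absent, short REG_WAIT budget) can be sketched roughly as below. This is only an illustration, not a tool from this bug: the names classify, reg_wait_budget_us and GPU_HANG_PATTERNS are made up here, the display patterns are copied from the log excerpt in this comment, and the GPU-hang patterns are my assumption about how a ring timeout or GPU reset would be worded and may need adjusting for your kernel.

```python
import re
from collections import Counter

# Display-side signatures, taken from the journalctl excerpt in this comment.
DISPLAY_PATTERNS = {
    "flip_done": re.compile(r"flip_done timed out"),
    "commit_wait": re.compile(r"commit wait timed out"),
    "reg_wait": re.compile(r"REG_WAIT timeout .* - dcn32_program_compbuf_size"),
    "vmin_vmax_hog": re.compile(r"dm_handle_vmin_vmax_update \[amdgpu\] hogged CPU"),
}

# ASSUMPTION: rough wording of GPU-hang messages; not taken from this report.
GPU_HANG_PATTERNS = {
    "ring_timeout": re.compile(r"ring \S+ timeout"),
    "gpu_reset": re.compile(r"GPU reset", re.IGNORECASE),
}

# KMS objects as they appear in the errors, e.g. [CRTC:367:crtc-1].
OBJECT_RE = re.compile(r"\[(CRTC|CONNECTOR|PLANE):\d+:([^\]]+)\]")

# REG_WAIT budget, e.g. "REG_WAIT timeout 1us * 100 tries".
REG_WAIT_RE = re.compile(r"REG_WAIT timeout (\d+)us \* (\d+) tries")


def reg_wait_budget_us(line):
    """Return delay * tries in microseconds, or None if the line has no REG_WAIT."""
    match = REG_WAIT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)) * int(match.group(2))


def classify(lines):
    """Count signatures and label the window as gpu-hang, display-wedge or clean."""
    display, gpu, objects = Counter(), Counter(), set()
    for line in lines:
        for name, pattern in DISPLAY_PATTERNS.items():
            if pattern.search(line):
                display[name] += 1
        for name, pattern in GPU_HANG_PATTERNS.items():
            if pattern.search(line):
                gpu[name] += 1
        for kind, label in OBJECT_RE.findall(line):
            objects.add(f"{kind}:{label}")
    if gpu:
        verdict = "gpu-hang"
    elif display:
        verdict = "display-wedge"
    else:
        verdict = "clean"
    return {
        "verdict": verdict,
        "display": dict(display),
        "gpu": dict(gpu),
        "objects": sorted(objects),
    }


# Three lines from the excerpt above, used as a small sample.
SAMPLE = [
    "May 29 18:24:01 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CRTC:367:crtc-1] flip_done timed out",
    "May 29 18:25:43 kernel: amdgpu 0000:03:00.0: [drm] *ERROR* [CONNECTOR:387:DP-2] commit wait timed out",
    "May 29 18:25:53 kernel: amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn32_program_compbuf_size line:147",
]

print(classify(SAMPLE))
print(reg_wait_budget_us(SAMPLE[2]))
```

To run it on a real freeze, the lines could come from something like journalctl -k --since "18:24" --until "18:27" on the day of the freeze. A "display-wedge" verdict only means the display-side messages were found and none of the assumed GPU-hang patterns matched; it does not prove the root cause.
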

It is worth flagging to the amdgpu display team that the dcn32_program_compbuf_size REG_WAIT timeout and the resulting flip-completion wedge reproduce on 3.2.359 as well, so a fix targeting only the 7.0 frequency change may not cover the underlying pipe wedge.

For anyone using a 6.19.x kernel as a workaround: it is a mitigation, not a fix - the same total freeze can still happen.

--
https://bugs.launchpad.net/bugs/2150776

Title: Ubuntu 26.04 GNOME Wayland: random short display/presentation freezes on AMD RX 7900 XT while apps continue running
