This happens regularly to me too.

$ uname -a
Linux ionian 5.4.0-54-generic #60-Ubuntu SMP Fri Nov 6 10:37:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Ubuntu 20.04.

$ sudo dpkg -l | grep xserver-xorg-video
ii  xserver-xorg-video-all          1:7.7+19ubuntu14          amd64  X.Org X server -- output driver metapackage
ii  xserver-xorg-video-amdgpu       19.1.0-1                  amd64  X.Org X server -- AMDGPU display driver
ii  xserver-xorg-video-ati          1:19.1.0-1                amd64  X.Org X server -- AMD/ATI display driver wrapper
ii  xserver-xorg-video-fbdev        1:0.5.0-1ubuntu1          amd64  X.Org X server -- fbdev display driver
ii  xserver-xorg-video-intel        2:2.99.917+git20200226-1  amd64  X.Org X server -- Intel i8xx, i9xx display driver
rc  xserver-xorg-video-modesetting  0.9.0-1build1             amd64  X.Org X server -- Generic modesetting driver
ii  xserver-xorg-video-nouveau      1:1.0.16-1                amd64  X.Org X server -- Nouveau display driver
ii  xserver-xorg-video-openchrome   1:0.6.0-3build1           amd64  X.Org X server -- OpenChrome display driver
ii  xserver-xorg-video-qxl          0.1.5+git20200331-1       amd64  X.Org X server -- QXL display driver
ii  xserver-xorg-video-radeon       1:19.1.0-1                amd64  X.Org X server -- AMD/ATI Radeon display driver
ii  xserver-xorg-video-vesa         1:2.4.0-2                 amd64  X.Org X server -- VESA display driver
ii  xserver-xorg-video-vmware       1:13.3.0-3                amd64  X.Org X server -- VMware display driver


journalctl gives this at the start of the crash. It coincides with starting
Slack (the chat program, it uses google-chrome):

Dec 09 08:56:07 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10156msec
Dec 09 08:56:07 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000010df8 last fence id 0x0000000000010e00 on ring 0)
Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10216msec
Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000002eef last fence id 0x0000000000002ef1 on ring 3)
Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10664msec
Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000010df8 last fence id 0x0000000000010e00 on ring 0)
Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10728msec
Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000002eef last fence id 0x0000000000002ef1 on ring 3)
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 11176msec
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000010df8 last fence id 0x0000000000010e00 on ring 0)
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: ring 3 stalled for more than 11240msec
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000002eef last fence id 0x0000000000002ef1 on ring 3)
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 11688msec
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000010df8 last fence id 0x0000000000010e00 on ring 0)
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: ring 3 stalled for more than 11756msec
Dec 09 08:56:09 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000002eef last fence id 0x0000000000002ef1 on ring 3)
Dec 09 08:56:10 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 12200msec
Dec 09 08:56:10 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000010df8 last fence id 0x0000000000010e00 on ring 0)
Dec 09 08:56:10 ionian /usr/lib/gdm3/gdm-x-session[5666]: [12/09/20, 08:56:10:433] info: [ACTION:CHANNEL-SIDEBAR] Selected GFFT85V5M

After a while it goes into a loop like this. The symptoms are the same as
the OP's: no response from the keyboard and a frozen screen. The sound from
my web meeting kept going, though with some hiccups (probably in phase with
the radeon errors):

Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Dec 09 08:56:18 ionian kernel: [drm] PCIE gen 2 link speeds already enabled
Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Dec 09 08:56:18 ionian kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000014C000).
Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: WB enabled
Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x000000009ed2883c
Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x00000000c6312e78
Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0x0000000009b8d531
Dec 09 08:56:18 ionian kernel: [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Dec 09 08:56:18 ionian kernel: [drm:evergreen_resume [radeon]] *ERROR* evergreen startup failed on resume
Dec 09 08:56:28 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10176msec
Dec 09 08:56:28 ionian kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000010df8 last fence id 0x0000000000010e18 on ring 0)
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0: Saved 98855 dwords of commands on ring 0.
Dec 09 08:56:29 ionian kernel: snd_hda_intel 0000:00:1b.0: Unstable LPIB (29688 >= 8192); disabling LPIB delay counting
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0: GPU softreset: 0x0000001D
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003828
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200006C0
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010002
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00020182
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80038243
Dec 09 08:56:29 ionian kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100100
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003828
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200006C0
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Dec 09 08:56:30 ionian kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
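
In case it helps anyone compare logs, here is a minimal Python sketch that
counts the "ring N stalled" messages per ring and records the longest stall.
The regular expression and the function name summarize_stalls are my own
invention, not part of any radeon tooling, and the sketch assumes each log
message sits on a single line (a mail client may have wrapped them).

```python
import re
from collections import defaultdict

# Pattern for the radeon stall message as it appears in the kernel log.
# Assumption: the whole message is on one line, not wrapped by a mail client.
STALL_RE = re.compile(r"radeon (\S+): ring (\d+) stalled for more than (\d+)msec")


def summarize_stalls(lines):
    """Return {(pci_device, ring): (stall_count, longest_stall_msec)}."""
    summary = defaultdict(lambda: [0, 0])
    for line in lines:
        match = STALL_RE.search(line)
        if match is None:
            continue  # not a stall message
        device = match.group(1)
        ring = int(match.group(2))
        msec = int(match.group(3))
        entry = summary[(device, ring)]
        entry[0] += 1
        entry[1] = max(entry[1], msec)
    return {key: (count, longest) for key, (count, longest) in summary.items()}


# A few lines copied from the log in this message, used as sample input.
sample = [
    "Dec 09 08:56:07 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10156msec",
    "Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: ring 3 stalled for more than 10216msec",
    "Dec 09 08:56:08 ionian kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10664msec",
    "Dec 09 08:56:18 ionian kernel: radeon 0000:01:00.0: WB enabled",
]

for (device, ring), (count, longest) in sorted(summarize_stalls(sample).items()):
    print(f"{device} ring {ring}: {count} stalls, longest {longest} msec")
```

To run it on a real log, the sample list could be replaced with lines read
from the output of "journalctl -k".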

The graphics card is a Radeon HD 5770.

Is there anything more I could send, or anything I could do to fix it?

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to xserver-xorg-video-ati in Ubuntu.
https://bugs.launchpad.net/bugs/1863390

Title:
  GPU lockup ring 0 stalled for more than X msec

Status in xserver-xorg-video-ati package in Ubuntu:
  Incomplete

Bug description:
  Since the update:

   xserver-xorg-video-ati-hwe-18.04 (1:19.0.1-1ubuntu1~18.04.1) bionic;

  which resulted from:

   https://bugs.launchpad.net/fedora/+source/xserver-xorg-video-ati/+bug/1841718

  I've experienced GPU freezes where all video becomes unresponsive,
  both Xorg and Ctrl+Alt terminal switching, and the GPU fan goes to
  full. I am still able to access the system via SSH.

  Sometimes dmesg ends up full of this message repeating over and over:

   radeon 0000:01:00.0: ring 0 stalled for more than 24040msec
   radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000009e44 last fence id 0x0000000000009e49 on ring 0)

  I sometimes get a few GPU soft resets, which seem to fail in drm(?):

   radeon 0000:01:00.0: Saved 110839 dwords of commands on ring 0.
   radeon 0000:01:00.0: GPU softreset: 0x00000008
   ...
   radeon 0000:01:00.0: Wait for MC idle timedout !
   radeon 0000:01:00.0: Wait for MC idle timedout !
   [drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
   radeon 0000:01:00.0: WB enabled 
   radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x00000000725651ad
   radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x00000000c3678ed8
   radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0x00000000dbd9e01b
   [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
   [drm:evergreen_resume [radeon]] *ERROR* evergreen startup failed on resume

  Even if the above reset doesn't happen, this freeze always results in
  an "unable to handle page fault" BUG in radeon_ring_backup, entered from
  various call paths, e.g.:

   BUG: unable to handle page fault for address: ffffbc2d80574ffc
   ...
   Oops: 0000 [#1] SMP PTI 
   CPU: 2 PID: 11243 Comm: kworker/2:1H Not tainted 5.5.0-050500-generic #202001262030
   Workqueue: radeon-crtc radeon_flip_work_func [radeon]
   RIP: 0010:radeon_ring_backup+0xc9/0x140 [radeon]
   Call Trace:
    radeon_gpu_reset+0xc3/0x2f0 [radeon]
    radeon_flip_work_func+0x1f3/0x250 [radeon]
    ? __schedule+0x2e0/0x760
    process_one_work+0x1b5/0x370
    worker_thread+0x50/0x3d0
    kthread+0x104/0x140
    ? process_one_work+0x370/0x370
    ? kthread_park+0x90/0x90
    ret_from_fork+0x35/0x40

  or:

   BUG: unable to handle page fault for address: ffffc03901000ffc
   ...
   Oops: 0000 [#1] SMP PTI

   CPU: 3 PID: 2227 Comm: compton Not tainted 5.3.0-28-generic #30~18.04.1-Ubuntu
   RIP: 0010:radeon_ring_backup+0xd3/0x140 [radeon]
   Call Trace:
    radeon_gpu_reset+0xb9/0x340 [radeon]
    ? dma_fence_wait_timeout+0x48/0x110
    ? reservation_object_wait_timeout_rcu+0x19d/0x340
    radeon_gem_handle_lockup.part.4+0xe/0x20 [radeon]
    radeon_gem_wait_idle_ioctl+0xa6/0x110 [radeon]
    ? radeon_gem_busy_ioctl+0x80/0x80 [radeon]
    drm_ioctl_kernel+0xb0/0x100 [drm]
    drm_ioctl+0x389/0x450 [drm]
    ? radeon_gem_busy_ioctl+0x80/0x80 [radeon]
    ? __switch_to_asm+0x40/0x70
    ? __switch_to_asm+0x34/0x70
    ? __switch_to_asm+0x40/0x70
    ? __switch_to_asm+0x40/0x70
    ? __switch_to_asm+0x34/0x70
    ? __switch_to_asm+0x40/0x70
    ? __switch_to_asm+0x34/0x70
    ? __switch_to_asm+0x40/0x70
    radeon_drm_ioctl+0x4f/0x80 [radeon]
    do_vfs_ioctl+0xa9/0x640
    ? __schedule+0x2b0/0x670
    ksys_ioctl+0x75/0x80
    __x64_sys_ioctl+0x1a/0x20
    do_syscall_64+0x5a/0x130
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

  I've tried both 5.3.0-28-generic and 5.5.0-050500-generic from
  kernel-ppa, but that made no difference. It appears to be a bug in radeon.

  Nothing specific makes this happen, just regular usage with a
  compositing window manager. I'm not playing games or particularly
  exercising the GPU. The last two times I was just reading in a web
  browser. It has also happened in the middle of the night while I was
  asleep. Sometimes I have a few days of uptime, sometimes it happens
  less than 24 hours after boot.

  This never happened before the radeon update mentioned on the first
  line.

  I'll attach two files of dmesg output. As per
  https://wiki.ubuntu.com/X/Troubleshooting/Freeze I've installed and
  started apport for the next time it happens.
