The issue occurred again after the BIOS update, during make -j12. I also
had chrome and vmplayer running. Kernel errors from journalctl:

kernel: pcieport 0000:00:1b.0: AER: Multiple Corrected error received: 0000:01:00.0
kernel: pcieport 0000:00:1b.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
kernel: pcieport 0000:00:1b.0: AER:   device [8086:a340] error status/mask=00001000/00002000
kernel: pcieport 0000:00:1b.0: AER:    [12] Timeout
kernel: nvme 0000:01:00.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
kernel: nvme 0000:01:00.0: AER:   device [1344:5410] error status/mask=00000040/00002000
kernel: nvme 0000:01:00.0: AER:    [ 6] BadTLP
kernel: nvme 0000:01:00.0: AER:   Error of this Agent is reported first
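
For anyone triaging the status/mask fields in the AER lines above, here is a
minimal sketch of how the hex values map to the bit names the kernel prints.
The bit table follows the corrected-error names used by the Linux AER driver
as far as I know them; the helper names (decode_corrected, parse_status_mask)
are made up for illustration and are not part of any tool.

```python
# Corrected-error bit positions and the short names printed by the
# Linux AER driver (assumed from the kernel's corrected-error string table).
CORRECTED_BITS = {
    0: "RxErr",
    6: "BadTLP",
    7: "BadDLLP",
    8: "Rollover",
    12: "Timeout",
    13: "NonFatalErr",
    14: "CorrIntErr",
    15: "HeaderOF",
}


def decode_corrected(value_hex):
    """Return (bit, name) pairs for every bit set in a 32-bit hex string."""
    value = int(value_hex, 16)
    return [
        (bit, CORRECTED_BITS.get(bit, "unknown"))
        for bit in range(32)
        if value & (1 << bit)
    ]


def parse_status_mask(field):
    """Split a 'status/mask' field as logged and decode both halves."""
    status, mask = field.split("/")
    return decode_corrected(status), decode_corrected(mask)


if __name__ == "__main__":
    # Values copied from the two 'error status/mask=' lines in the log.
    for field in ("00001000/00002000", "00000040/00002000"):
        status, mask = parse_status_mask(field)
        print(field, "status:", status, "mask:", mask)
```

Decoded this way, the root port (00:1b.0) status is bit 12 (Timeout) and the
NVMe device (01:00.0) status is bit 6 (BadTLP), which matches the bracketed
lines the kernel already printed. The mask value has only bit 13 set in both
cases.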

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to intel-gpu-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1861294

Title:
  Gpu watchdog segfault and video+kbd+mouse freeze on optiplex 7060
  intel gpu

Status in intel-gpu-tools package in Ubuntu:
  New

Bug description:
  Running up-to-date Ubuntu-18.04.3 with kernel 5.3.0-26 on a Dell
  Optiplex 7060 with an i7-8700 CPU and Intel UHD Graphics 630
  (Coffeelake 3x8 GT2).

  I had chrome, slack and vmware-player running in Gnome. During a git
  clone, the screen, mouse and keyboard froze for 2 minutes, after which
  xorg and everything else recovered. I saw this in dmesg:

  kernel: show_signal_msg: 2 callbacks suppressed
  kernel: GpuWatchdog[20399]: segfault at 0 ip 0000556fd1665ded sp 00007efbf17e46c0 error 6 in chrome[556fcd72a000+7171000]
  kernel: Code: 48 c1 c9 03 48 81 f9 af 00 00 00 0f 87 c9 00 00 00 48 8d 15 a9 5a 9c fb f6 04 11 20 0f 84 b8 00 00 00 be 01 00 00 00 ff 50 30 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 c1 6d
  kernel: nvme nvme0: I/O 202 QID 6 timeout, aborting
  kernel: nvme nvme0: I/O 203 QID 6 timeout, aborting
  kernel: nvme nvme0: I/O 204 QID 6 timeout, aborting
  kernel: nvme nvme0: I/O 205 QID 6 timeout, aborting
  kernel: nvme nvme0: Abort status: 0x0
  kernel: nvme nvme0: Abort status: 0x0
  kernel: nvme nvme0: Abort status: 0x0
  kernel: nvme nvme0: Abort status: 0x0
  kernel: nvme nvme0: I/O 202 QID 6 timeout, reset controller
  kernel: nvme nvme0: 12/0/0 default/read/poll queues
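
The "error 6" in the GpuWatchdog segfault line is the x86 page-fault error
code. A minimal sketch of how its low bits are usually read, assuming the
standard x86 bit layout; the function name decode_page_fault_error is
hypothetical and only for illustration:

```python
def decode_page_fault_error(code):
    """Decode the low bits of an x86 page-fault error code.

    Assumes the standard x86 layout: bit 0 = page was present,
    bit 1 = access was a write, bit 2 = fault came from user mode,
    bit 3 = reserved bit set in a page-table entry,
    bit 4 = fault on an instruction fetch.
    """
    return {
        "page_present": bool(code & 0x1),
        "write": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
        "reserved_bit": bool(code & 0x8),
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    # 6 is the value from "segfault at 0 ... error 6" in the log.
    print(decode_page_fault_error(6))
```

For the value 6 this gives a user-mode write to a page that was not present,
which together with "segfault at 0" reads as a write through a null pointer
in the chrome process.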

  While writing this bug report, the system froze again, and this time
  it didn't recover. After a cold reset I didn't see any other
  GpuWatchdog messages in journalctl.

  Ubuntu applied a BIOS firmware update before the first freeze, so my
  BIOS was updated as part of the cold reset I did. Not sure if this is
  relevant to reproducing the freeze.
