On Fri, May 22, 2026 at 12:05:35PM -0700, Linus Torvalds wrote:
> On Fri, 22 May 2026 at 11:55, Maarten Lankhorst <[email protected]> wrote:
> >
> > There's a
> > May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Timedout
> > job: seqno=4485322, lrc_seqno=4485322, guc_id=0, flags=0x73 in no process
> > [-1]
> > May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Xe device coredump has
> > been created
> > May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Check your
> > /sys/class/drm/card0/device/devcoredump/data
> >
> > Do you have this coredump too?
>
> Nope. I was assuming it didn't survive the reboot.

It doesn't. In this kind of setup, the best way to deal with devcoredump is
to create a udev rule that copies the data file to a persistent place.

>
> (This machine doesn't allow any remote logins - very much on purpose -
> so when the GPU hangs, it's toast).

Is there any journal saving the kernel log buffer from previous boots?
Preferably with some drm.debug flags enabled, likely 0xf.

Also: is any bisect possible in this setup? I imagine it might be painful
though...

What was the last drm-fixes pull you got in this
7.1.0-rc3-00073-ga6920214ba75?

I believe the quickest path might be to simply drop the xe fixes you might
have recently gotten there until we identify the culprit.

Thanks,
Rodrigo.

>
> Linus
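
P.S. For the udev approach mentioned above, here is a minimal sketch of a
helper that a rule could run to save the dump before a reboot loses it. The
script name, the destination directory /var/lib/devcoredump and the rule
shown in the comment are all assumptions for illustration, not something
tested on this machine. The default source path is the one from the xe log
message.

```python
#!/usr/bin/env python3
"""Copy a devcoredump data file to persistent storage (illustrative sketch)."""
import shutil
import sys
import time
from pathlib import Path

# Hypothetical udev rule that could invoke this script (assumption, untested):
#   ACTION=="add", SUBSYSTEM=="devcoredump", \
#     RUN+="/usr/local/bin/save-devcoredump.py"

# Path taken from the xe log message in this thread.
DEFAULT_SOURCE = Path("/sys/class/drm/card0/device/devcoredump/data")
# Hypothetical destination; any directory that survives a reboot will do.
DEST_DIR = Path("/var/lib/devcoredump")


def save_coredump(data_file: Path, dest_dir: Path = DEST_DIR) -> Path:
    """Stream data_file into a timestamped file under dest_dir; return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest_dir / f"{data_file.parent.name}-{stamp}.dump"
    # Read and write in chunks rather than relying on the file size that
    # sysfs reports, which does not reflect the real dump length.
    with open(data_file, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


if __name__ == "__main__" and len(sys.argv) <= 2:
    # Optional single argument: path to the devcoredump data file.
    source = Path(sys.argv[1]) if len(sys.argv) == 2 else DEFAULT_SOURCE
    if source.exists():
        print(save_coredump(source))
```

The copy has to happen soon after the hang is reported, since the dump is
held in memory and is gone after the reboot.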
