On Fri, May 22, 2026 at 12:05:35PM -0700, Linus Torvalds wrote:
> On Fri, 22 May 2026 at 11:55, Maarten Lankhorst <[email protected]> wrote:
> >
> > There's a
> > May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Tile0: GT0: Timedout 
> > job: seqno=4485322, lrc_seqno=4485322, guc_id=0, flags=0x73 in no process 
> > [-1]
> > May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Xe device coredump has 
> > been created
> > May 22 11:09:19 3970x kernel: xe 0000:4b:00.0: [drm] Check your 
> > /sys/class/drm/card0/device/devcoredump/data
> >
> > Do you have this coredump too?
> 
> Nope. I was assuming it didn't survive the reboot.

It doesn't. In this kind of setup the best way to deal with devcoredump
is to create a udev rule that copies the data file to a persistent place.
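
A rough sketch of what I mean, untested on this setup. The rule would be
something along the lines of

  SUBSYSTEM=="devcoredump", ACTION=="add", RUN+="/usr/local/bin/save-devcoredump.py %p"

where the script name, its install path and the destination directory are
placeholders I made up for illustration. The helper only streams the
"data" attribute of the new devcoredump device into a directory that
survives a reboot:

```python
#!/usr/bin/env python3
"""Copy a devcoredump data file to persistent storage (illustrative sketch)."""
import shutil
import sys
import time
from pathlib import Path

# Hypothetical destination; any directory that survives a reboot will do.
DEST_DIR = Path("/var/lib/devcoredump")


def save_coredump(dev_dir: Path, dest_dir: Path) -> Path:
    """Copy <dev_dir>/data into dest_dir and return the new file's path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest_dir / f"{dev_dir.name}-{stamp}.dump"
    # sysfs attributes do not report a useful size, so stream until EOF.
    with open(dev_dir / "data", "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# udev passes the device path relative to /sys (the %p substitution).
if __name__ == "__main__" and len(sys.argv) == 2:
    save_coredump(Path("/sys") / sys.argv[1].lstrip("/"), DEST_DIR)
```

Programs started from a udev RUN key are expected to finish quickly, so
if the dump is large it may be safer to have the rule trigger a separate
service instead; I have not checked which is needed here.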

> 
> (This machine doesn't allow any remote logins - very much on purpose -
> so when the GPU hangs, it's toast).

Is there any journal saving the kernel log buffer from previous boots?
Preferably with some drm.debug flags enabled, likely drm.debug=0xf.

Also:

Any bisect possible in this setup? I imagine it might be painful though...

What was the last drm-fixes pull you got in this 7.1.0-rc3-00073-ga6920214ba75?

I believe the quickest path might be to simply drop the xe fixes you might
have recently gotten there until we identify the culprit.

Thanks,
Rodrigo.

> 
>                Linus
