On Sun, Jul 24, 2022 at 12:09:50PM +0100, Chris Narkiewicz wrote:
> Hi,
> 
> Some time ago I posted a bug on bugs@ (subject: X11 hangs on StarLabs Mk IV -
> snapshot 06-06-2022).
> We did some checks with the help of one of the developers, but we could not
> find a cause then.
> 
> I started digging deeper and I think I found a cause of the lockup, but I'm
> struggling to get to the bottom of it.
> 
> All the source code with my patches is on my GitHub. Apologies if somebody
> doesn't do GitHub - I could not find a better way to share this messy set of
> changes:
> 
> https://github.com/ezaquarii/xenocara/tree/bug
> https://github.com/ezaquarii/xenocara/commit/1d6e50bf668adfc07a4da0860d6c8f738ec1228a
> 
> https://github.com/ezaquarii/src/tree/bug
> https://github.com/ezaquarii/src/commit/0047e0f206896aa5287cad250c6bee1c994cdf88
> 
> Playing with a debugger first, I managed to find that it locks up in an
> ioctl() originating from _mesa_MapBuffer and ending up in libiris. I added
> printfs to demonstrate this; the output is attached.

to see ioctls you can use ktrace(1) and kdump(1)

> 
> Then, I instrumented the kernel with some more printfs and located the
> place in the DRM code where the task is awaiting a wakeup - infinitely -
> inside drm_syncobj_array_wait_timeout. Due to the large number of printfs,
> line numbers do not make much sense, so here is the exact location in my git
> source tree:
> 
> https://github.com/ezaquarii/src/commit/0047e0f206896aa5287cad250c6bee1c994cdf88#r79272404
> 
> The timeout value sent via ioctl() is "infinite", so it hangs there
> infinitely.
> I also did a nasty experiment by overriding it to some large but finite
> number here:
> https://github.com/ezaquarii/src/commit/0047e0f206896aa5287cad250c6bee1c994cdf88#r79273004

what value do you mean by infinite?

for linux's MAX_SCHEDULE_TIMEOUT we use INT32_MAX (0x7fffffff).
tsleep(9) with a timo of 0 is how a process sleeps without a timeout.

drm_syncobj_array_wait_timeout() calls schedule_timeout() which calls
sleep_setup() with timo 0 if the argument is MAX_SCHEDULE_TIMEOUT.

A sleep without a timeout itself is not a problem as a wakeup would
come from another part of the kernel.
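
To make the mapping above concrete, here is a rough userspace model of
it. This is only a sketch, not the kernel's code: timeout_to_timo() is a
made-up helper name, and the only facts taken from this thread are that
MAX_SCHEDULE_TIMEOUT is INT32_MAX and that a timo of 0 means "no timeout".

```c
#include <assert.h>
#include <stdint.h>

/* Value used for Linux's MAX_SCHEDULE_TIMEOUT, as stated above. */
#define MAX_SCHEDULE_TIMEOUT	INT32_MAX

/*
 * Hypothetical helper (not a real kernel function): translate a
 * Linux-style timeout into the timo handed to the sleep primitive.
 * A timo of 0 means "sleep until someone calls wakeup".
 * Only positive timeouts are modelled here.
 */
static int
timeout_to_timo(int32_t timeout)
{
	assert(timeout > 0);

	if (timeout == MAX_SCHEDULE_TIMEOUT)
		return 0;	/* no timeout: wait for a wakeup */
	return timeout;		/* finite sleep, in ticks */
}
```

With this model, timeout_to_timo(0x7fffffff) gives 0, while any smaller
positive value is passed through unchanged. So an "infinite" wait only
ends when the matching wakeup is actually delivered.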

It may be interesting to see what the i965 Mesa driver does.
You can force it to load with MESA_LOADER_DRIVER_OVERRIDE=i965 in your
environment, or try moving /usr/X11R6/lib/modules/dri/iris_dri.so out of
the way.
