Hi Andi,

On 10/30/25 14:20, Andi Shyti wrote:
> Hi Christian,
>
> I'm now jumping into this discussion as there have been several
> patches from Nitin, Janusz and in igt as well.
>
> On Thu, Feb 27, 2025 at 03:11:39PM +0100, Christian König wrote:
>> Am 27.02.25 um 13:52 schrieb Andi Shyti:
>>> On Wed, Feb 26, 2025 at 09:25:34PM +0530, Nitin Gote wrote:
>>>> Give the scheduler a chance to breath by calling cond_resched()
>>>> as some of the loops may take some time on old machines (like apl/bsw/pnv),
>>>> and so catch the attention of the watchdogs.
>>>>
>>>> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
>>>> Signed-off-by: Nitin Gote <[email protected]>
>>> This patch goes beyond the intel-gfx domain so that you need to
>>> add some people in Cc. By running checkpatch, you should add:
>>>
>>> Sumit Semwal <[email protected]> (maintainer:DMA BUFFER SHARING
>>> FRAMEWORK)
>>> "Christian König" <[email protected]> (maintainer:DMA BUFFER SHARING
>>> FRAMEWORK)
>>> [email protected] (open list:DMA BUFFER SHARING FRAMEWORK)
>>> [email protected] (open list:DMA BUFFER SHARING FRAMEWORK)
>>>
>>> I added them now, but you might still be asked to resend.
>>>
>>> Said that, at a first glance, I don't have anything against this
>>> patch.
>>
>> There has been some push to deprecate cond_resched() cause it is almost
>> always not appropriate.
>
> Yes, there have been ideas and patches, but so far I haven't seen
> anything concrete to deprecate cond_resched() and so far I see it
> used normally. Or am I missing something?
>
>> Saying that if I'm not completely mistaken that here is also not 100%
>> correct usage.
>>
>> Question is why is the test taking 26 (busy?) seconds to complete? That
>> sounds really long even for a very old CPU.
>>
>> Do we maybe have an udelay() here which should have been an usleep() or
>> similar?
>
> mmhhh... it doesn't look right, sleeps and cond_resched() are
> different kind of beasts, I wouldn't like random sleeps added, as
> you explained in Nitin's second patch.

This issue has developed quite a bit and is now understood much better.

The problem is that the test cases test what happens if userspace makes
1k submissions in the order A..Z and the HW then reports back that those
submissions finished in the order Z..A.

While there can be some parallelism and out-of-order execution, the HW
reporting that 1k submissions complete in the exact opposite order is
really unrealistic.

So what we should rather do is either remove the test cases completely
or at least make them realistic somehow.

Regards,
Christian.

>
> Thanks,
> Andi
