On Thu, 5 Mar 2026 21:46:21 -0800 Chia-I Wu wrote:
> On Thu, Mar 5, 2026 at 3:10 PM Hillf Danton <[email protected]> wrote:
>> On Wed, Mar 04, 2026 at 02:51:39PM -0800, Chia-I Wu wrote:
>> > Hi,
>> >
>> > Our system compositor (surfaceflinger on android) submits gpu jobs
>> > from a SCHED_FIFO thread to an RT gpu queue. However, because
>> > workqueue threads are SCHED_NORMAL, the scheduling latency from submit
>> > to run_job can sometimes cause frame misses. We are seeing this on
>> > panthor and xe, but the issue should be common to all drm_sched users.
>> >
>> > Using a WQ_HIGHPRI workqueue helps, but it is still not RT (and won't
>> > meet future android requirements). It seems either workqueue needs to
>> > gain RT support, or drm_sched needs to support kthread_worker.
>> >
>> RT in general means that, to some extent, the game of EEVDF is played in
>> __userspace__. You are not PeterZ, so an issue like a frame miss is
>> understandably expected.
>> Who made the workqueue worker a victim if the CPU cycles are not tight?
>> Who is the new victim of an RT kthread worker?
>> RT is not free, so what do you pay for it, given how few RT successes
>> there are on the market?
>>
> It is a deliberate decision for android that avoiding frame misses
> is a top priority.
>
> Also, I think most drm drivers already signal their fences from irq
> handlers or rt threads for a similar reason. And the reasoning applies
> to submissions as well.
>
If RT submission alone works for you, then your CPU cycles are tight.
And if your workloads are sane and correct, then making the workqueue and/or
kthread worker RT barely makes sense, because the right option is to buy a CPU
with higher capacity.