On 11/26/25 17:11, Lucas Stach wrote:
> On Wednesday, 26.11.2025 at 16:44 +0100, Philipp Stanner wrote:
>> On Wed, 2025-11-26 at 16:03 +0100, Christian König wrote:
>>
>>>>
> [...]
>>>> My hope would be that in the mid-term future we'd get firmware rings
>>>> that can be preempted through a firmware call for all major hardware.
>>>> Then a huge share of our problems would disappear.
>>>
>>> At least on AMD HW pre-emption is actually horribly unreliable as
>>> well.
>>
>> Do you mean new GPUs with firmware scheduling, or what is "HW pre-
>> emption"?
>>
>> With firmware interfaces, my hope would be that you could simply tell
>>
>> stop_running_ring(nr_of_ring)
>> // time slice for someone else
>> start_running_ring(nr_of_ring)
>>
>> Thereby getting real scheduling and all that. And eliminating many
>> other problems we know well from drm/sched.
> 
> It doesn't really matter if you have firmware scheduling or not for
> preemption to be a hard problem on GPUs. CPUs have limited software
> visible state that needs to be saved/restored on a context switch and
> even there people start complaining now that they need to context
> switch the AVX512 register set.

Yeah, that has been discussed for the last 20 years or so, ever since the first
MMX extension came out.

> GPUs have megabytes of software visible state. Which needs to be
> saved/restored on the context switch if you want fine grained
> preemption with low preemption latency. There might be points in the
> command execution where you can ignore most of that state, but reaching
> those points can have basically unbounded latency. So either you can
> reliably save/restore lots of state or you are limited to very coarse
> grained preemption with all the usual issues of timeouts and DoS
> vectors.
> I'm not totally up to speed with the current state across all relevant
> GPUs, but until recently NVidia was the only vendor to have real
> reliable fine-grained preemption.

Completely agree. You won't believe how often that is a topic in discussions.

AMD has Compute Wave Save Restore now on newer HW, but both the reliability and 
performance are unfortunately questionable at best.
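
To put the trade-off Lucas describes into rough numbers, here is a toy model.
It is only a sketch: it assumes the state swap is purely memory-bandwidth
bound, and the state sizes, the bandwidth and the 50 ms wait are made-up
illustrative values, not measurements of any real GPU. The function names are
hypothetical as well.

```python
# Back-of-the-envelope model of preemption latency.
# All numbers are illustrative assumptions, not measurements of any real GPU.


def save_restore_seconds(state_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Time to write out the outgoing context and read in the incoming one.

    Assumes the swap is limited only by memory bandwidth.
    """
    return 2 * state_bytes / bandwidth_bytes_per_s


def preemption_latency_seconds(wait_for_preemption_point_s: float,
                               state_bytes: int,
                               bandwidth_bytes_per_s: float) -> float:
    """Latency = time until the hardware can stop + time to swap the state."""
    return wait_for_preemption_point_s + save_restore_seconds(
        state_bytes, bandwidth_bytes_per_s)


if __name__ == "__main__":
    MiB = 1024 * 1024
    bandwidth = 100e9  # assumed: 100 GB/s of usable memory bandwidth

    # Fine-grained: stop almost immediately, but swap a lot of state.
    fine = preemption_latency_seconds(0.0, 8 * MiB, bandwidth)

    # Coarse-grained: little state to swap, but the wait for a safe point is
    # dictated by the running workload (assumed here: one 50 ms dispatch).
    coarse = preemption_latency_seconds(50e-3, 64 * 1024, bandwidth)

    print(f"fine-grained:   {fine * 1e6:.1f} us")
    print(f"coarse-grained: {coarse * 1e6:.1f} us")
```

In this model the fine-grained cost is fixed by the state size, while the
coarse-grained latency is whatever the running job decides it is. That second
term is the unbounded part, and it is where the timeout and DoS issues come
from.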

Regards,
Christian.

> 
> Regards,
> Lucas
> 
> 
