[top-posting to make sure everybody is aware that involving Linus for a
judgement call is on the table]
Geert: Many thx for the clarification. In that case:
Geert or Marek: could one of you two please submit a revert for
330e76d3169721 ("drm/imagination: Add power domain control") [v6.16-rc1]?
With a bit of luck the responsible drm maintainers will pick it up
quickly -- or it might lead to a different solution for the problem. If
neither happens within a few days, I'll point Linus to this thread and
the revert. Then he can make the judgement call if he wants to.
Ciao, Thorsten
On 2/16/26 10:00, Geert Uytterhoeven wrote:
> On Sat, 14 Feb 2026 at 13:38, Thorsten Leemhuis
> <[email protected]> wrote:
>> On 2/13/26 23:52, Marek Vasut wrote:
>>> On 2/12/26 4:56 PM, Thorsten Leemhuis wrote:
>>>> On 2/12/26 15:38, Marek Vasut wrote:
>>>>> On 2/12/26 10:00 AM, Matt Coster wrote:
>>>>>> On 11/02/2026 19:17, Marek Vasut wrote:
>>>>>>> On 1/23/26 2:50 PM, Geert Uytterhoeven wrote:
>>>>>>>> On Fri, 23 Jan 2026 at 14:36, Matt Coster <[email protected]>
>>>>>>>> wrote:
>>>>>>>>> On 22/01/2026 16:08, Geert Uytterhoeven wrote:
>>>>>>>>>> Call the dev_pm_domain_attach_list() and
>>>>>>>>>> dev_pm_domain_detach_list()
>>>>>>>>>> helpers instead of open-coding multi PM Domain handling.
>>>>>>>>>>
>>>>>>>>>> This changes behavior slightly:
>>>>>>>>>> - The new handling is also applied in case of a single PM
>>>>>>>>>> Domain,
>>>>>>>>>> - PM Domains are now referred to by index instead of by
>>>>>>>>>> name, but
>>>>>>>>>> "make dtbs_check" enforces the actual naming and ordering
>>>>>>>>>> anyway,
>>>>>>>>>> - There are no longer device links created between virtual
>>>>>>>>>> domain
>>>>>>>>>> devices, only between virtual devices and the parent device.
>>>>>>>>>
>>>>>>>>> We still need this guarantee, both at start and end of day. In the
>>>>>>>>> current implementation dev_pm_domain_attach_list() iterates
>>>>>>>>> forwards,
>>>>>>>>> but so does dev_pm_domain_detach_list(). Even if we changed that,
>>>>>>>>> I'd
>>>>>>>>> prefer not to rely on the implementation details when we can
>>>>>>>>> declare the
>>>>>>>>> dependencies explicitly.
>>>>>>>>
>>>>>>>> Note that on R-Car, the PM Domains are nested (see e.g.
>>>>>>>> r8a7795_areas[]),
>>>>>>>> so they are always (un)powered in the correct order. But that may
>>>>>>>> not
>>>>>>>> be the case in the integration on other SoCs.
>>>>>>>>
>>>>>>>>> We had/have a patch (attached) kicking around internally to use the
>>>>>>>>> *_list() functions but keep the inter-domain links in place; it got
>>>>>>>>> held
>>>>>>>>> up by discussions as to whether we actually need those dependencies
>>>>>>>>> for
>>>>>>>>> the hardware to behave correctly. Your patch spurred me to run
>>>>>>>>> around
>>>>>>>>> the office and nag people a bit, and it seems we really do need to
>>>>>>>>> care
>>>>>>>>> about the ordering.
>>>>>>>>
>>>>>>>> OK.
>>>>>>>>
>>>>>>>>> Can you add the links back in for a V2 or I can properly send the
>>>>>>>>> attached patch instead, I don't mind either way.
>>>>>>>>
>>>>>>>> Please move forward with your patch, you are the expert.
>>>>>>>> I prefer not to be blamed for any breakage ;-)
>>>>>>>
>>>>>>> Has there been any progress on fixing this kernel crash ?
>>>>>>>
>>>>>>> There are already two proposed solutions, but no fix is upstream.
>>>>>>
>>>>>> Yes and no. Our patch to use dev_pm_domain_attach_list() has landed in
>>>>>> drm-misc-next as commit e19cc5ab347e3 ("drm/imagination: Use>>
>>>>>> dev_pm_domain_attach_list()"), but this does not fix the underlying
>>>>>> issue of missing synchronization in the PM core[1] is still unresolved
>>>>>> as far as I'm aware.
>>>>>
>>>>> OK, but the pvr driver can currently easily crash the kernel on boot if
>>>>> firmware is missing, so that should be fixed soon, right ?
>>>>
>>>> Well, drm-misc-next afaik means that the above mentioned fix would only
>>>> be merged in 7.1, which is ~4 months away, which is not really "soon"
>>>> I'd say. Or did I misjudge this?
>>>
>>> The PM domain issue here crashes the kernel, so I think this would be
>>> material for drm-misc-fixes .
>>
>> Yeah, sounds a lot like it.
>>
>>>>> I added the regressions list onto CC, because this seems like a problem
>>>>> worth tracking.
>>>>
>>>> Noticed that and wondered what change caused the regression.
>>>
>>> I think this one:
>>>
>>> 330e76d31697 ("drm/imagination: Add power domain control")
>>
>> Thx; FWIW, that was merged for v6.16-rc1.
>>
>>>> Did not
>>>> find a answer in a quick search on lore[1]. Because if it's a
>>>> regression, we maybe should just revert the culprit for now according to
>>>> Linus:
>>>> https://lore.kernel.org/lkml/CAHk-=wi86AosXs66-
>>>> [email protected]/
>>>>
>>>> [1] I guess this was the initial report from Geert?
>>>> https://lore.kernel.org/all/
>>>> camuhmdwapt40hv3c+csbqfow05awcv1a6v_nijygoyi0i9_...@mail.gmail.com/
>>>
>>> It is.
>>>
>>> I think there are other SoCs which depend on the power domain commit, so
>>> revert is not so clear cut anymore.
>>
>> Well, it's a judgement call. 330e76d31697 was merged less then a year
>> ago, so I'd not be surprised at all if Linus would revert it in a case
>> like this. But it seems it doesn't revert clearly anymore, which
>> complicates things.
>>
>>> But SoCs which have hierarchical
>>> power domains and which manage to probe this driver without having a
>>> firmware available for the GPU will simply end with crashed kernel,
>>> which is really not good.
>>
>> Does the patch Matt mentioned fix the crash? His "this does not fix the
>> underlying issue [...]" (see quote earlier) makes it sound like the
>> crash or some other problem (theoretical or practical? regression or
>> not?) remains. If that's the case and no quick fix in sight I guess it
>> would be best if someone affected could post a revert and then we can
>> ask Linus if he wants to pick it up.
>
> I don't think that patch would fix the crash. The Adreno and Panfrost
> GPU drivers do similar things (explicit multi-PM Domain handling),
> so I am wondering if the issue can be triggered with them too (e.g. on
> unbind).