Hi Thorsten,

On Sat, 14 Feb 2026 at 13:38, Thorsten Leemhuis
<[email protected]> wrote:
> On 2/13/26 23:52, Marek Vasut wrote:
> > On 2/12/26 4:56 PM, Thorsten Leemhuis wrote:
> >> On 2/12/26 15:38, Marek Vasut wrote:
> >>> On 2/12/26 10:00 AM, Matt Coster wrote:
> >>>> On 11/02/2026 19:17, Marek Vasut wrote:
> >>>>> On 1/23/26 2:50 PM, Geert Uytterhoeven wrote:
> >>>>>> On Fri, 23 Jan 2026 at 14:36, Matt Coster <[email protected]>
> >>>>>> wrote:
> >>>>>>> On 22/01/2026 16:08, Geert Uytterhoeven wrote:
> >>>>>>>> Call the dev_pm_domain_attach_list() and
> >>>>>>>> dev_pm_domain_detach_list()
> >>>>>>>> helpers instead of open-coding multi PM Domain handling.
> >>>>>>>>
> >>>>>>>> This changes behavior slightly:
> >>>>>>>>      - The new handling is also applied in case of a single PM
> >>>>>>>> Domain,
> >>>>>>>>      - PM Domains are now referred to by index instead of by
> >>>>>>>> name, but
> >>>>>>>>        "make dtbs_check" enforces the actual naming and ordering
> >>>>>>>> anyway,
> >>>>>>>>      - There are no longer device links created between virtual
> >>>>>>>> domain
> >>>>>>>>        devices, only between virtual devices and the parent device.
> >>>>>>>
> >>>>>>> We still need this guarantee, both at start and end of day. In the
> >>>>>>> current implementation dev_pm_domain_attach_list() iterates
> >>>>>>> forwards,
> >>>>>>> but so does dev_pm_domain_detach_list(). Even if we changed that,
> >>>>>>> I'd
> >>>>>>> prefer not to rely on the implementation details when we can
> >>>>>>> declare the
> >>>>>>> dependencies explicitly.
> >>>>>>
> >>>>>> Note that on R-Car, the PM Domains are nested (see e.g.
> >>>>>> r8a7795_areas[]),
> >>>>>> so they are always (un)powered in the correct order.  But that may
> >>>>>> not
> >>>>>> be the case in the integration on other SoCs.
> >>>>>>
> >>>>>>> We had/have a patch (attached) kicking around internally to use the
> >>>>>>> *_list() functions but keep the inter-domain links in place; it got
> >>>>>>> held
> >>>>>>> up by discussions as to whether we actually need those dependencies
> >>>>>>> for
> >>>>>>> the hardware to behave correctly. Your patch spurred me to run
> >>>>>>> around
> >>>>>>> the office and nag people a bit, and it seems we really do need to
> >>>>>>> care
> >>>>>>> about the ordering.
> >>>>>>
> >>>>>> OK.
> >>>>>>
> >>>>>>> Can you add the links back in for a V2 or I can properly send the
> >>>>>>> attached patch instead, I don't mind either way.
> >>>>>>
> >>>>>> Please move forward with your patch, you are the expert.
> >>>>>> I prefer not to be blamed for any breakage ;-)
> >>>>>
> >>>>> Has there been any progress on fixing this kernel crash ?
> >>>>>
> >>>>> There are already two proposed solutions, but no fix is upstream.
> >>>>
> >>>> Yes and no. Our patch to use dev_pm_domain_attach_list() has landed in
> >>>> drm-misc-next as commit e19cc5ab347e3 ("drm/imagination: Use
> >>>> dev_pm_domain_attach_list()"), but this does not fix the underlying
> >>>> issue: the missing synchronization in the PM core[1] is still unresolved
> >>>> as far as I'm aware.
> >>>
> >>> OK, but the pvr driver can currently easily crash the kernel on boot if
> >>> firmware is missing, so that should be fixed soon, right ?
> >>
> >> Well, drm-misc-next afaik means that the above mentioned fix would only
> >> be merged in 7.1, which is ~4 months away, which is not really "soon"
> >> I'd say. Or did I misjudge this?
> >
> > The PM domain issue here crashes the kernel, so I think this would be
> > material for drm-misc-fixes .
>
> Yeah, sounds a lot like it.
>
> >>> I added the regressions list onto CC, because this seems like a problem
> >>> worth tracking.
> >>
> >> Noticed that and wondered what change caused the regression.
> >
> > I think this one:
> >
> > 330e76d31697 ("drm/imagination: Add power domain control")
>
> Thx; FWIW, that was merged for v6.16-rc1.
>
> >> Did not
> >> find an answer in a quick search on lore[1]. Because if it's a
> >> regression, we maybe should just revert the culprit for now according to
> >> Linus:
> >> https://lore.kernel.org/lkml/CAHk-=wi86AosXs66-
> >> [email protected]/
> >>
> >> [1] I guess this was the initial report from Geert?
> >> https://lore.kernel.org/all/
> >> camuhmdwapt40hv3c+csbqfow05awcv1a6v_nijygoyi0i9_...@mail.gmail.com/
> >
> > It is.
> >
> > I think there are other SoCs which depend on the power domain commit, so
> > revert is not so clear cut anymore.
>
> Well, it's a judgement call. 330e76d31697 was merged less than a year
> ago, so I'd not be surprised at all if Linus would revert it in a case
> like this. But it seems it doesn't revert clearly anymore, which
> complicates things.
>
> > But SoCs which have hierarchical
> > power domains and which manage to probe this driver without having a
> > firmware available for the GPU will simply end with crashed kernel,
> > which is really not good.
>
> Does the patch Matt mentioned fix the crash? His "this does not fix the
> underlying issue [...]" (see quote earlier) makes it sound like the
> crash or some other problem (theoretical or practical? regression or
> not?) remains. If that's the case and no quick fix in sight I guess it
> would be best if someone affected could post a revert and then we can
> ask Linus if he wants to pick it up.

I don't think that patch would fix the crash.  The Adreno and Panfrost
GPU drivers do similar things (explicit multi-PM Domain handling),
so I am wondering if the issue can be triggered with them too (e.g. on
unbind).

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
