On 3/25/25 3:52 PM, Boris Brezillon wrote:

Hello Boris,

sorry for the late reply.

Hm, that might be the cause of the fast reset issue (which is a fast
resume more than a fast reset BTW): if you re-assert the reset line on
runtime suspend, I guess this causes a full GPU reset, and the MCU ends
up in a state where it needs a slow reset (all data sections reset to
their initial state). Can you try to move the reset_control_[de]assert
to the unplug/init functions?
Is it correct to assume that if I remove all reset_control_assert()
calls (and keep only the _deassert() calls), the slow resume problem
should go away too?

Yeah, dropping the _assert()s should do the trick.
Hmmm, no, that does not help. I was hoping maybe NXP can chime in and
suggest something too?

Can you try keeping all the clks/regulators/power-domains/... on after
init, and see if the fast resume works with that? If it does,
re-introduce one resource at a time to find out which one causes the
MCU to lose its state.

I already tried that too. I spent quite a while until I reached that L2
workaround in fact.

So, with your RPM suspend/resume being NOPs, it still doesn't work?
Unless the FW is doing something behind our back, I don't really see
why this would fail on your platform, but not on the rk3588. Are you
sure the power domains are kept on at all times? I'm asking, because if
you linked all the PDs, the on/off sequence is automatically handled by
the RPM core at suspend/resume time.

I revisited this now.

Can you please test the following patch (also attached) on one of your devices and tell me what the status is at the end? The diff sets the GLB_HALT bit and then clears it again, which I suspect should first halt the GPU and then (this is the part I am unsure about) un-halt/resume it.

"
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 9bf06e55eaeea..57c0d4fd29aa2 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1087,8 +1087,16 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
                struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
                u32 status;

+pr_err("%s[%d] pre-halt status=%x\n", __func__, __LINE__, gpu_read(ptdev, MCU_STATUS));
+
                panthor_fw_update_reqs(glb_iface, req, GLB_HALT, GLB_HALT);
                gpu_write(ptdev, CSF_DOORBELL(CSF_GLB_DOORBELL_ID), 1);
+mdelay(100);
+pr_err("%s[%d] likely-halted status=%x\n", __func__, __LINE__, gpu_read(ptdev, MCU_STATUS));
+               panthor_fw_update_reqs(glb_iface, req, 0, GLB_HALT);
+mdelay(100);
+pr_err("%s[%d] likely-running ? status=%x\n", __func__, __LINE__, gpu_read(ptdev, MCU_STATUS));
+
                if (!gpu_read_poll_timeout(ptdev, MCU_STATUS, status,
                                           status == MCU_STATUS_HALT, 10,
                                           100000)) {
"

In my case, the relevant output looks like this:

"
[    3.326805] panthor_fw_pre_reset[1090] pre-halt status=1
[    3.432151] panthor_fw_pre_reset[1095] likely-halted status=2
[    3.542179] panthor_fw_pre_reset[1098] likely-running ? status=2
"

That means the GPU remains halted at the end, even though the GLB_HALT bit is cleared before the last print. Clearing GLB_HALT is also what panthor_fw_post_reset() does.

I suspect the extra soft reset I did before "un-halted" the GPU and allowed it to proceed.

I wonder if there is some way to un-halt the GPU using direct gpu_write() register access, is there? Maybe the GPU remains halted because setting GLB_HALT stops command stream processing, so the GPU never samples the clearing of GLB_HALT and therefore remains halted forever?
