Hi Tanmay,

On Mon, Oct 27, 2025 at 09:57:28PM -0700, Tanmay Shah wrote:
>Remote processor can crash or hang during normal execution. Linux
>remoteproc framework supports different mechanisms to recover the
>remote processor and re-establish the RPMsg communication in such case.
>
>Crash reporting:
>
>1) Using debugfs node
>
>User can report the crash to the core framework via debugfs node using
>following command:
>
>echo 1 > /sys/kernel/debug/remoteproc/remoteproc0/crash
>
>2) Remoteproc notify to the host about crash state and crash reason
>via the resource table
>
>This is a platform specific method where the remote firmware contains
>vendor specific resource to update the crash state and the crash
>reason. Then the remote notifies the crash to the host via mailbox
>notification. The host then will check this resource on every mbox
>notification and reports the crash to the core framework if needed.
>
>Crash recovery mechanism:
>
>There are two mechanisms available to recover the remote processor from
>the crash. 1) boot recovery, 2) attach on recovery
>
>Remoteproc core framework will choose proper mechanism based on the
>rproc features set by the platform driver.
>
>1) Boot recovery
>
>This is the default mechanism to recover the remote processor.
>In this method core framework will first stop the remote processor,
>load the firmware again and then starts the remote processor. On
>AMD-Xilinx platforms this method is supported. The coredump callback in
>the platform driver isn't implemented so far, but that shouldn't cause
>the recovery failure.
>
>2) Attach on recovery
>
>If RPROC_ATTACH_ON_RECOVERY feature is enabled by the platform driver,
>then the core framework will choose this method for recovery.
>
>On zynqmp platform following is the sequence of events expected during
>remoteproc crash and attach on recovery:
>
>a) rproc attach/detach flow is working, and RPMsg comm is established
>b) Remote processor (RPU) crashed (crash not reported yet)
>c) Platform management controller stops and reloads elf on inactive
>   remote processor before reboot
>d) platform management controller reboots the remote processor
>e) Remote processor boots again, and detects previous crash (platform
>   specific mechanism to detect the crash)
>f) Remote processor Reports crash to the Linux (Host) and wait for
>   the recovery.
>g) Linux performs full detach and reattach to remote processor.
>h) Normal RPMsg communication is established.
>
>It is required to destroy all RPMsg related resource and re-create them
>during recovery to establish successful RPMsg communication. To achieve
>this complete rproc_detach followed by rproc_attach calls are needed.
>
>
>Tanmay Shah (3):
>  remoteproc: xlnx: enable boot recovery
>  remoteproc: core: full attach detach during recovery
>  remoteproc: xlnx: add crash detection mechanism
>
I tested this on i.MX8QM-MEK and see failures: the first recovery
passes, but the second one fails, with the virtio_rpmsg_bus probe
returning -12 (-ENOMEM). Without this patch, I do not see the failures.

root@imx8qmmek:~# remoteproc remoteproc0: crash detected in imx-rproc: type watchdog
Partition3 reset!
remoteproc remoteproc0: handling crash #1 in imx-rproc
remoteproc remoteproc0: detached remote processor imx-rproc
rproc-virtio rproc-virtio.1.auto: assigned reserved memory node vdevbuffer@90400000
virtio_rpmsg_bus virtio0: rpmsg host is online
rproc-virtio rproc-virtio.1.auto: registered virtio0 (type 7)
rproc-virtio rproc-virtio.2.auto: assigned reserved memory node vdevbuffer@90400000
virtio_rpmsg_bus virtio1: rpmsg host is online
rproc-virtio rproc-virtio.2.auto: registered virtio1 (type 7)
remoteproc remoteproc0: remote processor imx-rproc is now attached
virtio_rpmsg_bus virtio1: creating channel rpmsg-openamp-demo-channel addr 0x1e
remoteproc remoteproc0: crash detected in imx-rproc: type watchdog
Partition3 reset!
remoteproc remoteproc0: handling crash #2 in imx-rproc
rproc-virtio rproc-virtio.1.auto: assigned reserved memory node vdevbuffer@90400000
virtio_rpmsg_bus virtio4: probe with driver virtio_rpmsg_bus failed with error -12
rproc-virtio rproc-virtio.1.auto: registered virtio4 (type 7)
rproc-virtio rproc-virtio.2.auto: assigned reserved memory node vdevbuffer@90400000
virtio_rpmsg_bus virtio5: probe with driver virtio_rpmsg_bus failed with error -12
rproc-virtio rproc-virtio.2.auto: registered virtio5 (type 7)
rproc-virtio rproc-virtio.5.auto: assigned reserved memory node vdevbuffer@90400000
virtio_rpmsg_bus virtio6: probe with driver virtio_rpmsg_bus failed with error -12
rproc-virtio rproc-virtio.5.auto: registered virtio6 (type 7)
rproc-virtio rproc-virtio.6.auto: assigned reserved memory node vdevbuffer@90400000
virtio_rpmsg_bus virtio7: probe with driver virtio_rpmsg_bus failed with error -12
rproc-virtio rproc-virtio.6.auto: registered virtio7 (type 7)
remoteproc remoteproc0: remote processor imx-rproc is now attached
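
To compare runs with and without the change, the log can be summarized
per crash with a small script. This is only a rough sketch: the function
name summarize_recovery, the SAMPLE string and the regular expressions
are assumptions derived from the messages in the log, not part of the
series or of any kernel tooling.

```python
import re

# Patterns are assumptions taken from the message formats seen in the log.
CRASH_RE = re.compile(r"handling crash #(\d+) in (\S+)")
ONLINE_RE = re.compile(r"virtio_rpmsg_bus (virtio\d+): rpmsg host is online")
PROBE_FAIL_RE = re.compile(
    r"virtio_rpmsg_bus (virtio\d+): probe with driver virtio_rpmsg_bus "
    r"failed with error (-?\d+)"
)


def summarize_recovery(log_text):
    """Group rpmsg probe results under the crash number that precedes them."""
    summary = {}
    current = None
    for line in log_text.splitlines():
        m = CRASH_RE.search(line)
        if m:
            current = int(m.group(1))
            summary[current] = {"online": [], "failed": []}
            continue
        if current is None:
            # Ignore anything logged before the first crash is handled.
            continue
        m = ONLINE_RE.search(line)
        if m:
            summary[current]["online"].append(m.group(1))
            continue
        m = PROBE_FAIL_RE.search(line)
        if m:
            summary[current]["failed"].append((m.group(1), int(m.group(2))))
    return summary


# Hypothetical excerpt: four lines copied from the log, one per message type.
SAMPLE = (
    "remoteproc remoteproc0: handling crash #1 in imx-rproc\n"
    "virtio_rpmsg_bus virtio0: rpmsg host is online\n"
    "remoteproc remoteproc0: handling crash #2 in imx-rproc\n"
    "virtio_rpmsg_bus virtio4: probe with driver virtio_rpmsg_bus "
    "failed with error -12\n"
)

for crash, result in summarize_recovery(SAMPLE).items():
    print(crash, result)
```

Feeding it the full dmesg output instead of SAMPLE gives, for each
"handling crash #N" line, which virtio devices came online and which
failed to probe, together with the error code.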

Thanks,
Peng

