AM6422: How to monitor R5 FreeRTOS running status by A53 linux

Part Number: AM6422


Hi TI experts,

I'm using a custom AM6422 board with the A53 running Linux and the R5 running FreeRTOS.

How can Linux on the A53 monitor whether FreeRTOS on the R5 is still running?

In my tests, when FreeRTOS on the R5 had a stack overflow, /sys/class/remoteproc/remoteproc0/state still showed "running". Is this expected?

I also found the coredump and recovery attributes under /sys/class/remoteproc/remoteproc0/. How should I use them after changing their values to "enabled"?

BR

xixiguo

  • Hi

    Are there any updates here? Thanks!

    BR

    xixiguo

  • Hello Xixiguo,

    Apologies for the delays. I will probably not be able to reply again until after Christmas - feel free to set a calendar reminder to ping me in January.

    Yes, that is expected: the remoteproc state attribute will not change if the core crashes.

    Please find other information about debugging R5F here:
    https://dev.ti.com/tirex/explore/node?node=A__AU9Punu4yTQu9hRP62aoug__AM64-ACADEMY__WI1KRXP__LATEST

    I haven't played around with coredump or recovery yet. If you figure out anything cool, please share!

    The simplest way to monitor would probably be just to have a "heartbeat" IPC message that gets sent between your Linux app and the remote core once in a while to see if it is still alive, if crashing is a concern.

    Regards,

    Nick
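
For readers who want to inspect the attributes discussed above from a Linux application, here is a minimal Python sketch. It only reads the `state`, `coredump`, and `recovery` files named in the question; the function names `read_attr` and `snapshot` are made up for illustration. As noted in the reply, `state` reflects what the Linux remoteproc framework believes, so a value of `running` does not prove the firmware is healthy. To the best of my knowledge the kernel's sysfs ABI documentation lists `enabled`, `disabled`, and `recover` as values for `recovery`, and `disabled`, `enabled`, and `inline` for `coredump`. Whether the R5F driver in a given SDK ever reports a crash that would trigger them is not confirmed in this thread.

```python
from pathlib import Path

# Standard remoteproc sysfs class directory on Linux.
REMOTEPROC_BASE = Path("/sys/class/remoteproc")


def read_attr(instance: str, attr: str, base: Path = REMOTEPROC_BASE) -> str:
    """Return the stripped contents of one remoteproc sysfs attribute."""
    return (base / instance / attr).read_text().strip()


def snapshot(instance: str = "remoteproc0", base: Path = REMOTEPROC_BASE) -> dict:
    """Collect the attributes mentioned in this thread.

    Attributes that are missing or unreadable map to None, since not every
    kernel configuration exposes coredump and recovery.
    """
    result = {}
    for attr in ("state", "coredump", "recovery"):
        try:
            result[attr] = read_attr(instance, attr, base)
        except OSError:
            result[attr] = None
    return result


if __name__ == "__main__":
    # On the target board this prints something like the current state of
    # remoteproc0; the exact values depend on the system.
    print(snapshot())
```

The `base` parameter exists only so the helper can be pointed at a test directory instead of the real sysfs tree.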

  • Hi Nick,

    Thanks for your reply! We can discuss a suitable solution for monitoring the R5 running status in January, after Christmas.

    BR

    xixiguo

    Hello,

    The simplest way to monitor would probably be just to have a "heartbeat" IPC message that gets sent between your Linux app and the remote core once in a while to see if it is still alive, if crashing is a concern.

    The solution quoted above can be used to detect an R5F crash from Linux.

    Send an IPC message from the R5F core to the A53 core every 1 second, or at whatever interval you define.

    Then check whether the IPC interrupt is triggered. If it is not triggered for several consecutive intervals, conclude that the R5F has hung.

    I have one more question. Suppose the R5F core hangs. What do you do then? Do you reset the entire SoC?

    Regards,

    Anil
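
The heartbeat scheme described in the reply above can be sketched on the Linux side as follows. This is only the bookkeeping, written under assumptions: the class name `HeartbeatMonitor`, the 1-second period, and the threshold of 3 missed periods are illustrative choices, and the code deliberately leaves out how the message arrives (for example over RPMsg), since the thread does not specify the transport.

```python
import time


class HeartbeatMonitor:
    """Declare the remote core hung after several consecutive missed heartbeats."""

    def __init__(self, period_s=1.0, max_missed=3, clock=time.monotonic):
        self.period_s = period_s      # expected interval between heartbeats
        self.max_missed = max_missed  # missed periods tolerated before "hung"
        self._clock = clock           # injectable so the logic can be tested
        self._last_beat = clock()

    def beat(self):
        """Call whenever a heartbeat message arrives from the R5F."""
        self._last_beat = self._clock()

    def missed(self):
        """Number of whole heartbeat periods elapsed since the last beat."""
        return int((self._clock() - self._last_beat) // self.period_s)

    def is_hung(self):
        """True once max_missed or more whole periods passed without a beat."""
        return self.missed() >= self.max_missed
```

A Linux application would call `beat()` from whatever code receives the IPC message and poll `is_hung()` from a timer. Requiring several missed periods rather than one avoids treating a single delayed message as a hang.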

  • Hello Anil,

    I have one more question. Suppose the R5F core hangs. What do you do then? Do you reset the entire SoC?

    Good question. The answer is different for every customer use case. Please refer to the discussion on the "Graceful shutdown" page of the AM64x multicore academy, especially the section "Why is graceful shutdown needed? Why can't Linux just force the remote core to shut down?": https://dev.ti.com/tirex/explore/node?node=A__AVt2qZLTY3BgKr3D4YIv8w__AM64-ACADEMY__WI1KRXP__LATEST

    During development and debug, the developer may want to preserve the current state of the R5F cores and the rest of the system to reverse-engineer what caused the behavior. That may be different from what the customer decides to program in their final application.

    To summarize some key ideas from the "Graceful Shutdown" page:
    * Did the R5F core crash? Or is the R5F just waiting for an input from somewhere else? (e.g., ADC data or Ethernet packets that stopped coming in)
    * If the core has not crashed, is there anything Linux can do to "unblock" the R5F on its other tasks? (e.g., put a message on the display asking the user to ensure the Ethernet cable is plugged in)
    * If the R5F crashed, does it make sense to reset the entire system, or just the R5F core? (Note that the Linux driver blocks you from forcing the remote core to shut down, so the customer would need to choose to modify the driver themselves. See the linked document for more.)
    * If you need to reset the entire system, does just the processor need to get reset? Or are there other devices in the system that also need to be reset?

    Regards,

    Nick
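
One way to turn the first two questions in the list above into code is to have the heartbeat carry a small status value, so that Linux can tell "alive but waiting for input" apart from "no heartbeat at all". The sketch below is hypothetical: the status values, the action names, and the `choose_action` function are not from the academy page or any TI API, and what to do after a suspected crash remains the use-case-specific decision discussed above.

```python
from enum import Enum


class R5Status(Enum):
    """Hypothetical status field reported inside each heartbeat message."""
    RUNNING = "running"
    WAITING_FOR_INPUT = "waiting_for_input"


class Action(Enum):
    """Hypothetical reactions the Linux application can take."""
    NONE = "none"
    PROMPT_USER = "prompt_user"              # e.g. ask the user to check a cable
    INVESTIGATE_CRASH = "investigate_crash"  # policy is left to the product


def choose_action(heartbeat_alive: bool, last_status: R5Status) -> Action:
    """Map the heartbeat result and the last reported status to an action."""
    if not heartbeat_alive:
        # No heartbeat: treat as a suspected crash or hang.
        return Action.INVESTIGATE_CRASH
    if last_status is R5Status.WAITING_FOR_INPUT:
        # Core is alive but blocked on an external input.
        return Action.PROMPT_USER
    return Action.NONE
```

During development, `INVESTIGATE_CRASH` might mean preserving the system state for debugging, while a production build might reset the core or the whole system instead.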