[FAQ] How to ensure that the DM task is the only code running on the DM R5F?

This is a companion FAQ for

 [FAQ] [Alert] DM R5F can crash in certain conditions: AM62x, AM62Ax, AM62Dx, AM62Px, AM67, AM67A  

Please read the alert first. You can also find more information in the main FAQ here:

 [FAQ] DM R5F can crash in certain conditions: AM62x, AM62Ax, AM62Dx, AM62Px, AM67, AM67A  

This FAQ applies to AM62x, AM62Ax, AM62Dx, AM62Px.

The DM R5F does not need to be running the ipc_rpmsg_echo_linux project, as discussed in [FAQ] Does the DM R5F need to run the IPC Echo demo task?

We can use the empty project if we want to remove all other code from the DM R5F core, other than the critical device manager (DM) task. However, there are multiple things that we need to keep in mind if we want to use the empty project.

  • Things to keep in mind when using the empty project alongside Linux

    Do not pass garbage data to the Linux drivers

    By default, the DM R5F core is defined in the Linux devicetree file for these processors. That means that the Linux Remoteproc driver will attempt to read "resource table" data in the DM R5F's DDR memory allocation.

    If the ipc_rpmsg_echo_linux project is loaded, then a valid resource table is populated in DDR. 

    However, the empty project was not written with Linux in mind, so the unmodified empty project does NOT place a resource table in DDR. If the DM R5F core is defined in the Linux devicetree, the Linux remoteproc driver will still attempt to read the "resource table" memory region, but that DDR region will be uninitialized. DDR memory is not initialized to a set value, so the region will contain a random pattern of 1s and 0s.

    If the Linux Remoteproc driver reads in random data, then the Linux driver could behave in an unexpected way. So if we switch to using the empty project, we must ensure that Linux does not read random data.

  • There are multiple ways to prevent Linux from reading in garbage data.

    Option 1: Disable the DM R5F devicetree node in the Linux devicetree

    When the DM R5F is "disabled" in the Linux devicetree, this does NOT mean that Linux actually turns off the DM R5F core. Instead, it means that Linux assumes that the DM R5F has been disabled, so the Linux remoteproc and RPMsg drivers never try to interact with the DM R5F core.

    All the TI-SCI communication between Linux drivers and the DM task will continue working as usual.

    An example is attached:

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/2477.0001_2D00_remove_2D00_DM_2D00_R5F_2D00_from_2D00_the_2D00_Linux_2D00_devicetree_2D00_file.patch

    NOTE #1

    The DM R5F's memory regions should still be defined in the Linux devicetree. This prevents Linux from overwriting the DM R5F data. The 1 MByte VIRTIO region that was previously used for RPMsg communication with the DM R5F can be removed.

    NOTE #2 

    The DM R5F devicetree node should only be disabled if the ipc_rpmsg_echo_linux firmware has been replaced by different firmware, like the empty project. If the default ipc_rpmsg_echo_linux firmware is still running on the DM R5F, then disabling the DM R5F devicetree node in the Linux devicetree can cause the DM R5F core to crash after 49 days on certain releases. Refer to the DM R5F crash alert for more information.

  • Option 2: Initialize the resource table region to a known value 

    There is another option if you want to modify the DM R5F project instead of the Linux devicetree (or if you want the system to behave in a deterministic manner, regardless of whether the DM R5F node is disabled in the Linux devicetree).

    If you can guarantee that the DM R5F resource table region of DDR is filled with a known set of values, then you can guarantee that the Linux remoteproc driver will always behave in the same way.

  • Option 2A (preferred): Add an empty resource table

    text
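
As a rough sketch of what an "empty" resource table could look like, the C fragment below declares a table header with zero entries. It assumes the header layout that Linux defines in struct resource_table (a version word, an entry count, and two reserved words) and that Linux expects version 1. The type name, symbol name, and RSC_TABLE_SECTION macro are hypothetical, and the linker command file would still need to place the .resource_table section at the start of the DDR region that Linux reads. Whether a real project should also carry a trace entry is outside the scope of this sketch.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/*
 * Fixed header of a remoteproc resource table. The field layout mirrors the
 * fixed part of struct resource_table in Linux's include/linux/remoteproc.h.
 * The type name is hypothetical.
 */
struct empty_rsc_table {
    uint32_t ver;          /* table format version */
    uint32_t num;          /* number of resource entries that follow */
    uint32_t reserved[2];  /* reserved, kept at zero */
};

/* Four 32-bit words, so no padding is expected. */
static_assert(sizeof(struct empty_rsc_table) == 16, "unexpected padding");

/* Named-section placement as written here only applies to ELF targets. */
#if defined(__ELF__)
#define RSC_TABLE_SECTION __attribute__((section(".resource_table"), used))
#else
#define RSC_TABLE_SECTION
#endif

/* Hypothetical symbol name: version 1 (assumed), zero entries. */
const struct empty_rsc_table gEmptyResourceTable RSC_TABLE_SECTION = {
    .ver = 1u,
    .num = 0u,
    .reserved = { 0u, 0u },
};
```

Because the entry count is zero, a parser that walks the table has nothing to process, which is the property this option relies on.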

  • Option 2B: Zero out the resource table region

    A less elegant solution involves zeroing out the resource table region. Linux will not be able to attach to the DM R5F and view the trace logs, but at least you can guarantee that the Linux remoteproc driver will always act in the same way, since it always reads the same data from the DDR memory region.

    Here is an example linker command file change for AM62Px. Note that the ORIGIN address of the new region must be set to match the beginning of the DM R5F memory region in DDR on your device:

    diff --git a/examples/empty/am62px-sk/wkup-r5fss0-0_freertos/ti-arm-clang/linker.cmd b/examples/empty/am62px-sk/wkup-r5fss0-0_freertos/ti-arm-clang/linker.cmd
    index 1e27ee9..b29f55a 100644
    --- a/examples/empty/am62px-sk/wkup-r5fss0-0_freertos/ti-arm-clang/linker.cmd
    +++ b/examples/empty/am62px-sk/wkup-r5fss0-0_freertos/ti-arm-clang/linker.cmd
    @@ -61,6 +61,11 @@ SECTIONS
             .text:abort: palign(8) /* this helps in loading symbols when using XIP mode */
         } load = R5F_TCMB, run = R5F_TCMA
    
    +    GROUP {
    +        /* create a hole and fill it with 0x00000000 */
    +        .resource_table: fill = 0x00000000 { .+= 0x0400; } palign(1024)
    +    } > DDR_RESOURCE_TABLE
    +
         .lpm_data (NOLOAD)      : {} align(4)       > DDR_LPM_DATA
         .text                   : {} palign(8)      > DDR
         .const                  : {} palign(8)      > DDR
    @@ -161,6 +166,11 @@ MEMORY
    
         WKUP_SRAM_TRACE_BUFF (RWIX) : ORIGIN = 0x41880000 LENGTH = 0x0000800
    
    +    /*
    +     * Fill the resource table region with zeros so Linux remoteproc driver does
    +     * not read garbage data.
    +     */
    +    DDR_RESOURCE_TABLE          : ORIGIN = 0x9C900000, LENGTH = 0x400
         /* DDR for DM LPM data [ size 640.00 KB ] */
         DDR_LPM_DATA    (RWIX)      : ORIGIN = 0x9CA00000 LENGTH = 0x000A0000
         /* DDR for DM R5F code/data [ size 27MiB + 416 KB ] */

    Here is an example of how Linux can behave with random resource table values:

    [    7.562659] platform 78000000.r5f: R5F core may have been powered on by a different host, programmed state (0) != actual state (1)
    [    7.593146] platform 78000000.r5f: configured R5F for IPC-only mode
    [    7.625007] platform 78000000.r5f: assigned reserved memory node r5f-dma-memory@9c800000
    [    7.681768] remoteproc remoteproc1: 78000000.r5f is available
    [    7.691070] remoteproc remoteproc1: attaching to 78000000.r5f
    [    7.698364] remoteproc remoteproc1: rsc table is truncated
    [    7.704812] remoteproc remoteproc1: Failed to process resources: -22
    [    7.773275] k3_r5_rproc bus@f0000:bus@b00000:r5fss@78000000: rproc_add failed, ret = -22
    [    7.783217] k3_r5_rproc bus@f0000:bus@b00000:r5fss@78000000: k3_r5_cluster_rproc_init failed, ret = -22
    [    7.793242] k3_r5_rproc: probe of bus@f0000:bus@b00000:r5fss@78000000 failed with error -22
    [    7.886106] remoteproc remoteproc1: releasing 78000000.r5f 

    And here is how Linux behaves with a resource table region that is filled with zeros:

    [    7.455715] platform 78000000.r5f: R5F core may have been powered on by a different host, programmed state (0) != actual state (1)
    [    7.456773] platform 78000000.r5f: configured R5F for IPC-only mode
    [    7.456873] platform 78000000.r5f: assigned reserved memory node r5f-dma-memory@9c800000
    [    7.457671] remoteproc remoteproc1: 78000000.r5f is available
    [    7.457779] remoteproc remoteproc1: attaching to 78000000.r5f
    [    7.457792] remoteproc remoteproc1: remote processor 78000000.r5f is now attached
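
One way to picture why the two logs differ is a simplified host-side model of the bounds check that a resource table parser performs. This is not the kernel's code. The function and macro names are hypothetical, the layout (a 16-byte header followed by 32-bit entry offsets) is assumed from Linux's struct resource_table, and the 256-byte buffer size mentioned afterwards is only an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSC_HDR_SIZE   16u  /* ver + num + reserved[2], 4 bytes each */
#define RSC_ENTRY_HDR   4u  /* each entry begins with a 32-bit type field */
#define MODEL_EINVAL   22   /* same value as the -22 in the failing log */

/*
 * Simplified model: returns 0 if every entry offset points inside the
 * table, and -22 if the table looks truncated. Values are read in the
 * host's native byte order.
 */
static int model_check_rsc_table(const uint8_t *table, size_t table_sz)
{
    uint32_t num;

    assert(table != NULL);
    if (table_sz < RSC_HDR_SIZE)
        return -MODEL_EINVAL;

    /* 'num' is the second 32-bit word, right after 'ver'. */
    memcpy(&num, table + 4, sizeof(num));

    for (uint32_t i = 0; i < num; i++) {
        size_t slot = RSC_HDR_SIZE + (size_t)i * sizeof(uint32_t);
        uint32_t offset;

        /* The offset slot itself must fit inside the table. */
        if (slot + sizeof(offset) > table_sz)
            return -MODEL_EINVAL;
        memcpy(&offset, table + slot, sizeof(offset));

        /* The entry header at that offset must fit inside the table. */
        if ((uint64_t)offset + RSC_ENTRY_HDR > table_sz)
            return -MODEL_EINVAL;
    }
    return 0;
}
```

With a 256-byte buffer filled with zeros, the entry count is 0, the loop never runs, and the model returns 0, which corresponds to the successful attach. With a nonzero entry count and an offset that points past the end of the buffer, the model returns -22, which corresponds to the "rsc table is truncated" failure.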