AM6442: Failed to run code from TCM memory

Part Number: AM6442


Hello Champs,

SW:C:\ti\mcu_plus_sdk_am64x_09_00_00_31\examples\drivers\udma\udma_memcpy_polling

The customer wants to run the udma_memcpy_polling example from TCM memory, so he modified the linker .cmd file and moved the .bss section from MSRAM to R5F_TCMB0 or R5F_TCMA. With this change, the code does not run successfully.

The log:

ASSERT: 0.8870s: ../udma_memcpy_polling.c:udma_memcpy_polling_main:112: CSL_UDMAP_TR_RESPONSE_STATUS_COMPLETE == trRespStatus failed !!!





Thanks
Regards
Shine

  • Hi Shine-san

    When I make the same change in the `linker.cmd` file, the build fails, probably because the sections assigned to R5F_TCMB0 are too large to fit into the 32 KB TCMB. The build error is shown below.

    Scrubing: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling ...
    Generating SysConfig files ...
    Running script...
    Validating...
    info: /kernel/dpl/debug_log uartLog.baudRate: Actual Baudrate Possible: 115385 (0 % error)
    Generating Code (example.syscfg)...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_dpl_config.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_dpl_config.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_drivers_config.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_drivers_config.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_drivers_open_close.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_drivers_open_close.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_pinmux_config.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_power_clock_config.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_board_config.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_board_config.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_board_open_close.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_board_open_close.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_config.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_config.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_open_close.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_open_close.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_soc.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_lwipif.c...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_enet_lwipif.h...
    Writing /home/p-shivhare/ti/mcu_plus_sdk/am64x/examples/drivers/udma/udma_memcpy_polling/am64x-evm/r5fss0-0_nortos/ti-arm-clang/generated/ti_pru_io_config.inc...
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: ../../../udma_memcpy_polling.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: ../main.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_drivers_config.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_drivers_open_close.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_board_config.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_board_open_close.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_dpl_config.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_pinmux_config.c
    Compiling: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out: generated/ti_power_clock_config.c
    .
    Linking: am64x:r5fss0-0:nortos:ti-arm-clang udma_memcpy_polling.debug.out ...
    "linker.cmd", line 64: error: program will not fit into available memory, or
       the section contains a call site that requires a trampoline that can't be
       generated for this section. run placement with alignment fails for section
       "GROUP_4" size 0xab3d.  Available memory ranges:
       R5F_TCMB0    size: 0x8000       unused: 0x8000       max hole: 0x8000
    error: errors encountered during linking; "udma_memcpy_polling.debug.out" not
       built
    tiarmclang: error: tiarmlnk command failed with exit code 1 (use -v to see invocation)
    make: *** [makefile:168: udma_memcpy_polling.debug.out] Error 1
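
    The numbers in the linker error above can be checked with a few lines of C. This is only a sketch of the arithmetic: the two sizes are copied from the log, and `overflow_bytes` is a hypothetical helper name, not part of the SDK or the linker.

    ```c
    #include <stdint.h>

    /* Values copied from the linker error in the log. */
    #define GROUP_4_SIZE  0xab3dU  /* size of the group that failed placement */
    #define TCMB0_SIZE    0x8000U  /* R5F_TCMB0 capacity reported by the linker */

    /* Hypothetical helper: number of bytes by which a section exceeds a
     * memory region, or 0 if the section fits. */
    static uint32_t overflow_bytes(uint32_t section_size, uint32_t region_size)
    {
        return (section_size > region_size) ? (section_size - region_size) : 0U;
    }
    ```

    Calling `overflow_bytes(GROUP_4_SIZE, TCMB0_SIZE)` gives the amount by which the group exceeds R5F_TCMB0, which is why the linker reports that the program will not fit even though the region is completely unused.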

    Could you please ask the customer to share the exact steps that result in a successful build?

    Regards,

    Prashant

  • Hello Prashant,

    Thank you very much for your great support. 

    The customer only modified --stack_size and --heap_size; please see the snapshots below.




    Thanks
    Regards
    Shine

  • Hi Shine-san,

    Thank you for providing the exact changes. I can now build the example and see that it indeed fails with the same log the customer shared.

    After some experimenting, I can say that the failure occurs only when the `.bss` section is allocated to TCM memory. If I allocate `.sysmem` and `.stack` to TCM while keeping `.bss` in MSRAM, as shown below, the application runs successfully.

    /* This is rest of uninitialized data. This can be placed in DDR if DDR is available and needed */
    GROUP {
        .bss:    {} palign(8)   /* This is where uninitialized globals go */
        RUN_START(__BSS_START)
        RUN_END(__BSS_END)
    } > MSRAM
    
    .sysmem: {} palign(8) > R5F_TCMB0  /* This is where the malloc heap goes */
    .stack:  {} palign(8) > R5F_TCMB0  /* This is where the main() stack goes */

    One possible reason the application fails when `.bss` is also placed in TCM is that the TCMB address 0x41010000 is not a SoC address but an address local to the R5F core. No other bus master can reach the TCMB through the local address 0x41010000, and that is exactly the problem here. The UDMA examples use the DMA peripheral, which is such a separate master and needs the SoC addresses of the memories it accesses. So the DMA can reach the TCMB only through its SoC address 0x78100000, not through the local address 0x41010000.
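
    The local-to-SoC mapping described above can be sketched in C as follows. This is an illustration under stated assumptions, not the SDK's implementation: the two base addresses are the ones quoted in this reply, the 32 KB size comes from the linker log, and `is_local_tcmb` and `local_to_soc` are hypothetical helper names. It covers only the TCMB window of the core discussed here.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Addresses quoted in this thread; size taken from the linker log. */
    #define TCMB_LOCAL_BASE  0x41010000U  /* R5F-local view, not usable by the DMA */
    #define TCMB_SOC_BASE    0x78100000U  /* SoC-level view of the same memory */
    #define TCMB_SIZE        0x00008000U  /* 32 KB */

    /* Hypothetical helper: true if addr lies inside the R5F-local TCMB window. */
    static bool is_local_tcmb(uint32_t addr)
    {
        return (addr >= TCMB_LOCAL_BASE) && ((addr - TCMB_LOCAL_BASE) < TCMB_SIZE);
    }

    /* Hypothetical helper: translate a local TCMB address to its SoC address.
     * Addresses outside the local TCMB window are returned unchanged. */
    static uint32_t local_to_soc(uint32_t addr)
    {
        if (is_local_tcmb(addr)) {
            return TCMB_SOC_BASE + (addr - TCMB_LOCAL_BASE);
        }
        return addr;
    }
    ```

    A buffer at local address 0x41010040, for example, would have to be handed to the DMA as 0x78100040. A check such as `is_local_tcmb` could also be used in a debug build to flag DMA buffers that were accidentally linked into the local TCM window.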

    The UDMA_MEMCPY_POLLING example uses the following variables, which are allocated to `.bss` and accessed by the DMA. To read or write these variables, the DMA would need their SoC-equivalent addresses, but the example simply passes their local addresses to the DMA, which causes the failure.

    /* UDMA TRPD Memory */
    uint8_t gUdmaTestTrpdMem[UDMA_TEST_TRPD_SIZE] __attribute__((aligned(UDMA_CACHELINE_ALIGNMENT)));
    
    /* Application Buffers */
    uint8_t gUdmaTestSrcBuf[UDMA_ALIGN_SIZE(UDMA_TEST_NUM_BYTES)] __attribute__((aligned(UDMA_CACHELINE_ALIGNMENT)));
    uint8_t gUdmaTestDestBuf[UDMA_ALIGN_SIZE(UDMA_TEST_NUM_BYTES)] __attribute__((aligned(UDMA_CACHELINE_ALIGNMENT)));

    So, to use TCM memory in these examples, I suggest keeping `.bss` in MSRAM. If required, other sections such as `.data` can be allocated to TCMB memory, as shown below.

    /* This is rest of initialized data. This can be placed in DDR if DDR is available and needed */
    GROUP {
        .data:   {} palign(8)   /* This is where initialized globals and static go */
    } > R5F_TCMB0
    
    /* This is rest of uninitialized data. This can be placed in DDR if DDR is available and needed */
    GROUP {
        .bss:    {} palign(8)   /* This is where uninitialized globals go */
        RUN_START(__BSS_START)
        RUN_END(__BSS_END)
    } > MSRAM
    
    .sysmem: {} palign(8) > R5F_TCMB0  /* This is where the malloc heap goes */
    .stack:  {} palign(8) > R5F_TCMB0  /* This is where the main() stack goes */

    Regards,

    Prashant