AM625: MCU_M4 execute code from external memory

Tony Tang

Mastermind 30262 points

Part Number: AM625
Other Parts Discussed in Thread: SYSCONFIG

How can the MCU M4F execute code from external memory? Is it enough to change the .cmd file to allocate code to DDR memory space?

over 3 years ago

Nick Saulnier over 3 years ago

TI Guru 109380 points

Copy-pasting the offline discussion around this thread while e2e was down:

FROM TONY:

Based on MCU SDK project:

<image, ipc_rpmsg_echo_linux_am62x-sk_m4fss0-0_freertos>

#1. Modify the MCU project .cmd file to add a MEMORY region DDR_1, allocate all sections to DDR_1, and build.

MEMORY
{
    M4F_VECS : ORIGIN = 0x00000000 , LENGTH = 0x00000200
    M4F_IRAM : ORIGIN = 0x00000200 , LENGTH = 0x0002FE00
    M4F_DRAM : ORIGIN = 0x00030000 , LENGTH = 0x00010000

    /* when using multi-core application's i.e more than one R5F/M4F active, make sure
     * this memory does not overlap with R5F's
     */
    /* Resource table must be placed at the start of DDR_0 when M4 core is early booting with Linux */
    DDR_0    : ORIGIN = 0x9cc00000 , LENGTH = 0x1000
    DDR_1    : ORIGIN = 0x9cb00000 , LENGTH = 0x100000 //0x9cc01000
}

#2. Download the .out file; the M4 suspends at the address shown below.

#3. Then reload; the M4 suspends at the entry point, and CCS reports the following error in the console:

BLAZAR_Cortex_M4F_1: Can't Run Target CPU: (Error -1268 @ 0x1090001) Device is locked up in Hard Fault or in NMI. Reset the device, and retry the operation. If error persists, confirm configuration, power-cycle the board, and/or try more reliable JTAG settings (e.g. lower TCLK). (Emulation package 9.8.0.00235)

Does the A53 core need to configure something to enable the M4 to execute from external memory?

Nick Saulnier over 3 years ago in reply to Nick Saulnier

TI Guru 109380 points

NICK REPLY

Hello Tony,

Just to double check: Are you loading the M4F with the Linux remoteproc driver, then attaching to the M4F in CCS following the steps here?
https://software-dl.ti.com/mcu-plus-sdk/esd/AM62X/08_04_00_16/exports/docs/api_guide_am62x/CCS_LAUNCH_PAGE.html

If so, what happens if you do NOT reset the CPU? Instead of resetting the CPU, try loading symbols to debug the software that was initialized by Linux and is already running:

Select the down arrow here, select “Load Symbols”, and select the .out file that is currently running.

If that still does not work, please attach the full linker.cmd file.

Regards,

Nick

Nick Saulnier over 3 years ago in reply to Nick Saulnier

TI Guru 109380 points

FROM TONY

Nick,

Thanks. Currently I boot the A53 with Linux, then connect CCS and download the M4 .out from CCS as instructed in the MCU+ SDK user guide. It can't run to main.

FROM NICK

Hello Tony,

Please try loading symbols as shown in the previous response instead of re-loading the firmware from CCS. Does that allow you to debug?

It is possible that the SDK Instructions only work in certain situations. If so, we need to figure out what those situations are.

Regards,

Nick

Tony Tang over 3 years ago in reply to Nick Saulnier

TI Mastermind 30262 points

Nick,

Everything is OK when the code is allocated in on-chip memory. If I only modify the .cmd file of the same project to use DDR, the .out can't run.

Would you please try it out on your side?

Nick Saulnier over 3 years ago in reply to Tony Tang

TI Guru 109380 points

Hello Tony,

Follow-up questions

1) Did you try loading symbols?

2) We need to see your full .cmd file if you want us to test it. E.g., how is the rest of the linker file placing data into DDR_1?

Explaining why I want you to try loading symbols

Let's assume that everything in the .cmd file looks correct. Then I want to make sure that this is an issue when the Linux remoteproc driver loads the M4F, in addition to when CCS reloads the M4F after Linux has already initialized the M4F. If you are able to debug the core by loading symbols, then that tells us that the Linux remoteproc driver was able to successfully load the M4F even with data in DDR. If you are NOT able to debug the core by loading symbols, then that means we have to debug something else.

Based on your feedback, this is what I think you have tested. Please correct me if I got anything wrong:

CCS reloads M4 program that does NOT use DDR after Linux initializes M4 - TONY TESTED, WORKING

CCS reloads M4 program that uses DDR after Linux initializes M4 - TONY TESTED, NOT WORKING

CCS loads symbols to debug M4 program that was loaded by Linux. Program does NOT use DDR - NOT TESTED

CCS loads symbols to debug M4 program that was loaded by Linux. Program uses DDR - NOT TESTED

Regards,

Nick

Tony Tang over 3 years ago in reply to Tony Tang

TI Mastermind 30262 points

#1. In order to load from Linux without IPC, change the .cmd file and add a resource table in main.c:

https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/6266.linker.cmd

#2. Load the M4 firmware from Linux:

root@am62xx-evm:~# cat /sys/class/remoteproc/remoteproc0/state
running
root@am62xx-evm:~# echo stop> /sys/class/remoteproc/remoteproc0/state
[ 398.667178] remoteproc remoteproc0: stopped remote processor 5000000.m4fss
root@am62xx-evm:~# cat /sys/class/remoteproc/remoteproc0/state
offline
root@am62xx-evm:~# ln -sf /lib/firmware/i2c_read_am62_loadable.out am62-mcu-m4f0_0-fw

root@am62xx-evm:~# echo start> /sys/class/remoteproc/remoteproc0/state
[ 584.444982] remoteproc remoteproc0: powering up 5000000.m4fss
[ 584.451945] remoteproc remoteproc0: Booting fw image am62-mcu-m4f0_0-fw, size 382804
[ 584.477119] remoteproc0#vdev0buffer: assigned reserved memory node m4f-dma-memory@9cb00000
[ 584.486289] virtio_rpmsg_bus virtio0: rpmsg host is online
[ 584.495330] remoteproc0#vdev0buffer: registered virtio0 (type 7)
[ 584.501661] remoteproc remoteproc0: remote processor 5000000.m4fss is now up
root@am62xx-evm:~# cat /sys/class/remoteproc/remoteproc0/state
running

#3. Connect JTAG without a GEL file and load symbols. CCS reports the following errors in the console:

BLAZAR_Cortex_M4F_1: Trouble Setting Breakpoint with the Action "Process CIO" at 0x9cb0eece: (Error -1066 @ 0x9CB0EECE) Unable to set/clear requested breakpoint. Verify that the breakpoint address is in valid memory. (Emulation package 9.8.0.00235)
BLAZAR_Cortex_M4F_1: Breakpoint Manager: Retrying with a AET breakpoint
BLAZAR_Cortex_M4F_1: Breakpoint Manager: _JobHardwareBreakpoint::ARM_DEBUG_V7M_fpb_add_breakpoint: Bad parameter passed to IP driver[22002]
BLAZAR_Cortex_M4F_1: Trouble Setting Breakpoint with the Action "Terminate Program Execution" at 0x9cb0f9d0: (Error -1066 @ 0x9CB0F9D0) Unable to set/clear requested breakpoint. Verify that the breakpoint address is in valid memory. (Emulation package 9.8.0.00235)
BLAZAR_Cortex_M4F_1: Breakpoint Manager: Retrying with a AET breakpoint
BLAZAR_Cortex_M4F_1: Breakpoint Manager: _JobHardwareBreakpoint::ARM_DEBUG_V7M_fpb_add_breakpoint: Bad parameter passed to IP driver[22002]

Tony Tang over 3 years ago in reply to Tony Tang

TI Mastermind 30262 points

Nick,

I would appreciate some of your time to help with this.

Tushar Thakur over 3 years ago in reply to Tony Tang

TI Guru 60608 points

Hi Tony,

I am working on the above issue. Please allow me some time to get it resolved.

Terry Wu83110 over 3 years ago in reply to Tushar Thakur

TI Prodigy 620 points

Hi, Tushar,

Thank you. Is there a target date by which we could give the customer a suggestion?

Anshu Choudhary over 3 years ago in reply to Terry Wu83110

TI Genius 12420 points

Terry,

Tushar has been able to reproduce the issue that Tony pointed out 3 days ago. We are working with the software team to understand the cause. Once we understand the issue in more detail, we can comment on a resolution timeline.

Please understand that we acknowledge the urgency of the issue and are working towards a resolution.

Regards

Anshu

Terry Wu83110 over 3 years ago in reply to Anshu Choudhary

TI Prodigy 620 points

Anshu, understood. Thanks in advance.

Tushar Thakur over 3 years ago in reply to Terry Wu83110

TI Guru 60608 points

Hi Terry,

https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/791/example.syscfg

I changed the RAT configuration in the example.syscfg attached above and tried to load the program from the CCS debugger. Loading the program does not give any error, but nothing is printed on the console.

When loading symbols from the CCS debugger, I get the error attached below. We are working with the hardware and software teams to get it resolved.

Regards,

Tushar Thakur

While Loading program :

While Loading symbols :

Nick Saulnier over 3 years ago in reply to Tushar Thakur

TI Guru 109380 points

edited later on Oct 14

Hello Tony & Terry,

How much instruction data is there?

Keep in mind that the ENTIRE 256 KB of local SRAM can be used for instruction code (not just the 192KB of I-RAM). See AM62x TRM, section "MCU_M4FSS Internal RAMs" for more information.

If the customer use case needs less than 256 KB of instruction code, I would suggest placing the instruction code in M4F local SRAM, and the data somewhere else (the 64 KB OCSRAM for faster accesses if not used by other cores, or the DDR for more memory space).

Are there any hardware limitations?

I checked with the hardware team today. From a signal connectivity standpoint, it looks like M4F should be able to access instruction data that is stored in the rest of the processor (ICode --> CBASS --> RAT --> DDR):

From a processor behavior standpoint, I am not sure what the core would do if it was expecting to move the program counter to the next instruction, and then the instruction data read had not completed by that clock signal (e.g., just halt for however many clock cycles it takes to execute the read? something else?) I am also not sure if the instructions are read into the M4F as a batch, or one at a time (i.e., would you need to wait potentially hundreds of clock cycles between EVERY SINGLE assembly instruction, or would you only need to wait every ??? number of assembly instructions?)

Next steps for debugging

It is possible that the program is running, but CCS does not know how to set breakpoints for instructions that are stored somewhere other than the local IRAM.

You could still check to see if the M4F code is running as expected, or freezing in certain locations, by using the M4F to read/write to certain known memory addresses, and then reading and writing those memory addresses from Linux. For an example of how to do that from the PRU core, reference the PRU Getting Started Labs Debug lab: https://software-dl.ti.com/processor-sdk-linux/esd/AM62X/08_04_01_03/exports/docs/common/PRU-ICSS/PRU-Getting-Started-Labs_Lab5.html#debugging-the-pru-from-linux-core
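The checkpoint idea above can be sketched in C as follows. This is only an illustration: `checkpoint_t`, `checkpoint_init`, `checkpoint_mark` and `CHECKPOINT_MAGIC` are hypothetical names, not part of the MCU+ SDK, and the address where the record lives on the target is an assumption that has to be chosen so it does not collide with memory used by other cores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHECKPOINT_MAGIC 0xC0DE0001u  /* hypothetical marker value */

/* Hypothetical "breadcrumb" record placed at an address both cores can reach. */
typedef struct {
    volatile uint32_t magic;  /* tells the reader the record was initialized */
    volatile uint32_t stage;  /* last checkpoint the firmware reached        */
    volatile uint32_t count;  /* incremented on every mark (shows liveness)  */
} checkpoint_t;

/* Clear the record, then write the marker after the other fields. */
static inline void checkpoint_init(checkpoint_t *cp)
{
    assert(cp != NULL);
    cp->stage = 0u;
    cp->count = 0u;
    cp->magic = CHECKPOINT_MAGIC;
}

/* Record that execution reached the given stage. */
static inline void checkpoint_mark(checkpoint_t *cp, uint32_t stage)
{
    assert(cp != NULL);
    cp->stage = stage;
    cp->count = cp->count + 1u;
}
```

On the target, the firmware would cast a fixed address to `checkpoint_t *` and call `checkpoint_mark()` at the points of interest. Reading the same address from Linux then shows the last stage reached, even when CCS cannot set breakpoints.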

Another debug option is to check the sysfs log for the remote core. NOTE: If using the resource table workaround I provided on your other thread, there is an additional step:

You will need to go into SysConfig and ensure that the "Enable Memory Log" option is selected (I am not sure where it is in SysConfig). You can tell that the option was properly selected by doing a build.
Look in this generated file: project/board/m4fss0-0_freertos/ti-arm-clang/generated/ti_dpl_config.c
Ensure that the generated file has this function:

      void putchar_(char character)
      {
          // Output to memory trace buffer
          DebugP_memLogWriterPutChar(character);
      }

After that, Linux should be able to see the debug log as described here: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1131412/am6442-redirect-logs-from-core-r5-to-linux

Regards,

Nick

Nick Saulnier over 3 years ago in reply to Nick Saulnier

TI Guru 109380 points

Example of checking how much instruction memory a project takes

I looked at AM62x MCU+ SDK 8.4:
make -s -C examples/drivers/ipc/ipc_rpmsg_echo_linux/am62x-sk/m4fss0-0_freertos/ti-arm-clang

From the linker.cmd file, it looks like these memory sections are getting stored in the local SRAM IRAM section:

SECTIONS
{
...
    .text:   {} palign(8) > M4F_IRAM     /* This is where code resides */

...
    .sysmem: {} palign(8) > M4F_IRAM     /* This is where the malloc heap goes */
    .stack:  {} palign(8) > M4F_IRAM     /* This is where the main() stack goes */

...

    /* Sections needed for C++ projects */
    .ARM.exidx:     {} palign(8) > M4F_IRAM  /* Needed for C++ exception handling */
    .init_array:    {} palign(8) > M4F_IRAM  /* Contains function pointers called before main */
    .fini_array:    {} palign(8) > M4F_IRAM  /* Contains function pointers called after main */
}

Now let's look at the map file:

vi examples/drivers/ipc/ipc_rpmsg_echo_linux/am62x-sk/m4fss0-0_freertos/ti-arm-clang/ipc_rpmsg_echo_linux.release.map

In this example, 0x14b50 = 84816 bytes (about 82.8 KB) of IRAM is used:

MEMORY CONFIGURATION

         name            origin    length      used     unused   attr    fill
----------------------  --------  ---------  --------  --------  ----  --------
  M4F_VECS              00000000   00000200  00000140  000000c0  RWIX
  M4F_IRAM              00000200   0002fe00  00014b50  0001b2b0  RWIX
  M4F_DRAM              00030000   00010000  0000e8f8  00001708  RWIX
  DDR_0                 9cc00000   00001000  00001000  00000000  RWIX

And then you can scroll further down in the map file to see exactly what is taking up each part of memory.
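The arithmetic on the map file columns can be sketched in C as below. The helper names are hypothetical and the values in the usage note are the ones from the map file above; this is a convenience for checking numbers, not an SDK utility.

```c
#include <assert.h>
#include <stdint.h>

/* Total M4F local SRAM per the thread: 192 KB I-RAM + 64 KB D-RAM. */
#define M4F_LOCAL_SRAM_BYTES (256u * 1024u)

/* Convert a byte count to whole KB (1024 bytes), rounding up. */
static inline uint32_t bytes_to_kb_ceil(uint32_t bytes)
{
    return (bytes + 1023u) / 1024u;
}

/* Bytes still free in a region, given the "length" and "used" columns. */
static inline uint32_t unused_bytes(uint32_t used, uint32_t length)
{
    assert(used <= length);
    return length - used;
}

/* Returns 1 if a code image of code_bytes fits in the local SRAM. */
static inline int fits_in_local_sram(uint32_t code_bytes)
{
    return code_bytes <= M4F_LOCAL_SRAM_BYTES;
}
```

For the M4F_IRAM row above, `unused_bytes(0x14b50, 0x2fe00)` reproduces the `unused` column (0x1b2b0), and `bytes_to_kb_ceil(0x14b50)` rounds the usage up to 83 KB.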

Regards,

Nick

Yi Liao over 3 years ago in reply to Nick Saulnier

Prodigy 30 points

/* make sure below retain is there in your linker command file, it keeps the vector table in the final binary */
--retain="*(.vectors)"
/* This is the stack that is used by code running within main()
 * In case of NORTOS,
 * - This means all the code outside of ISR uses this stack
 * In case of FreeRTOS
 * - This means all the code until vTaskStartScheduler() is called in main()
 *   uses this stack.
 * - After vTaskStartScheduler() each task created in FreeRTOS has its own stack
 */
--stack_size=163840
/* This is the heap size for malloc() API in NORTOS and FreeRTOS
 * This is also the heap used by pvPortMalloc in FreeRTOS
 */
--heap_size=327680


SECTIONS
{
    /* This has the M4F entry point and vector table, this MUST be at 0x0 */
    .vectors:{} palign(8) > M4F_VECS
    .text:   {} palign(8) > DDR_AIRAM     /* This is where code resides */

    .bss:    {} palign(8) > DDR_ADRAM     /* This is where uninitialized globals go */
    RUN_START(__BSS_START)
    RUN_END(__BSS_END)

//	.cinit:	 {} palign(8) > M4F_DRAM
    .data:   {} palign(8) > DDR_ADRAM     /* This is where initialized globals and static go */
    .rodata: {} palign(8) > DDR_ADRAM     /* This is where const's go */
    .sysmem: {} palign(8) > DDR_AIRAM     /* This is where the malloc heap goes */
    .stack:  {} palign(8) > DDR_AIRAM     /* This is where the main() stack goes */

	GROUP {
        /* This is the resource table used by linux to know where the IPC "VRINGs" are located */
        .resource_table: {} palign(4096)
    } > DDR_0

    /* Sections needed for C++ projects */
    .ARM.exidx:     {} palign(8) > DDR_AIRAM  /* Needed for C++ exception handling */
    .init_array:    {} palign(8) > DDR_AIRAM  /* Contains function pointers called before main */
    .fini_array:    {} palign(8) > DDR_AIRAM  /* Contains function pointers called after main */
}

MEMORY
{
    M4F_VECS : ORIGIN = 0x00000000 , LENGTH = 0x00000200
    M4F_IRAM : ORIGIN = 0x00000200 , LENGTH = 0x0002FE00
    M4F_DRAM : ORIGIN = 0x00030000 , LENGTH = 0x00010000

	/* when using multi-core application's i.e more than one R5F/M4F active, make sure
     * this memory does not overlap with R5F's
     */
    /* Resource table must be placed at the start of DDR_0 when M4 core is early booting with Linux */
    DDR_0       : ORIGIN = 0x9CC00000 , LENGTH = 0x100000
    DDR_AIRAM	: ORIGIN = 0x9CD00000 , LENGTH = 0x80000
    DDR_ADRAM	: ORIGIN = 0x9CD80000 , LENGTH = 0x80000
}

Yi Liao over 3 years ago in reply to Yi Liao

Prodigy 30 points

Hi Nick,

Our goal is to build and run with this linker.cmd normally, but the build now fails with error code "#10099-D". In short, we need to move what is currently in the internal IRAM and DRAM to external memory in DDR.
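The TI linker reports #10099-D when a section does not fit into the memory region it is assigned to. One thing worth checking in the file above is how much of DDR_AIRAM is already reserved by the stack and heap, since .stack and .sysmem are placed in the same 0x80000-byte region as .text. The sketch below only does that arithmetic with the values from the posted file; the helper names are hypothetical, alignment padding is ignored, and whether this is the actual cause of the error in this project is not confirmed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the posted linker command file. */
#define DDR_AIRAM_LENGTH 0x80000u  /* DDR_AIRAM region length */
#define STACK_SIZE       163840u   /* --stack_size            */
#define HEAP_SIZE        327680u   /* --heap_size             */

/* Bytes left in a region after fixed reservations; 0 if oversubscribed. */
static inline uint32_t bytes_left(uint32_t region_len, uint32_t reserved)
{
    assert(region_len > 0u);
    return (reserved >= region_len) ? 0u : (region_len - reserved);
}

/* Returns 1 if text_bytes of code would still fit next to stack and heap. */
static inline int text_fits(uint32_t text_bytes)
{
    return text_bytes <= bytes_left(DDR_AIRAM_LENGTH, STACK_SIZE + HEAP_SIZE);
}
```

With these numbers, the stack and heap take 491520 of the 524288 bytes, which leaves 32768 bytes for .text and the C++ sections. The linker's error text names the section and region that failed, so that is the first place to confirm which region is actually short.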

Tony Tang over 3 years ago in reply to Tushar Thakur

TI Mastermind 30262 points

Tushar Thakur said:
I changed the RAT configuration in example.syscfg attached above and tried to load the program from CCS debugger.

If the RAT is changed in the project, the code has to execute for the change to take effect, so it won't affect the firmware load.

I have some questions:

Does the RAT need to be configured before loading firmware? From the M4 memory view, the M4 can't access DDR space by default, so why can CCS load the program without the RAT being configured?

From the test results, if only data is allocated to DDR as below and .text remains in IRAM, the program can load and run.

Nick Saulnier over 3 years ago in reply to Tony Tang

TI Guru 109380 points

Hello Tony,

We are still checking on when the RAT is configured during M4F boot. However, I am not sure that the RAT actually has to be configured before DDR can be accessed. It is possible that if the RAT is not configured, a memory access to local address 0x4_0000 - 0x83_FFFF and 0x6000_0000 - 0xDFFF_FFFF will just go to the exact same system address (as opposed to being blocked).

Regards,

Nick

Tony Tang over 3 years ago in reply to Nick Saulnier

TI Mastermind 30262 points

Nick Saulnier said:
It is possible that if the RAT is not configured, a memory access to local address 0x4_0000 - 0x83_FFFF and 0x6000_0000 - 0xDFFF_FFFF will just go to the exact same system address (as opposed to being blocked).

Thanks, it helps me understand:

Any transactions hitting this address range coming from MCU M4F will go through RAT for address re-mapping before reaching other end points.

The RAT is checked first; if the address is not mapped, the access goes to the same system address. That explains why downloading from CCS and loading from Linux both succeed. From the current test results, the M4F just can't execute from DDR, but it can access data in DDR.
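The behavior described in this exchange can be written down as a small host-side model in C. It is a model of the hypothesis in the thread, not confirmed hardware behavior: `rat_region_t`, `in_rat_window` and `rat_translate` are hypothetical names, the region fields are simplified, and the second window's upper bound is read as 0xDFFF_FFFF on the assumption that the quoted 0xDFF_FFFF is a typo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical description of one RAT mapping entry. */
typedef struct {
    uint32_t local_base;   /* start of the window as seen by the M4F */
    uint32_t size;         /* window size in bytes                   */
    uint64_t system_base;  /* where the window lands in the SoC map  */
    int      enabled;      /* nonzero if the entry is active         */
} rat_region_t;

/* Returns 1 if a local address falls in the ranges quoted in the thread. */
static inline int in_rat_window(uint32_t addr)
{
    return (addr >= 0x00040000u && addr <= 0x0083FFFFu) ||
           (addr >= 0x60000000u && addr <= 0xDFFFFFFFu);
}

/* Translate a local address: remap on a hit, otherwise pass through. */
static inline uint64_t rat_translate(const rat_region_t *regions, size_t n,
                                     uint32_t local)
{
    assert(regions != NULL || n == 0u);
    for (size_t i = 0; i < n; i++) {
        const rat_region_t *r = &regions[i];
        if (r->enabled && local >= r->local_base &&
            (local - r->local_base) < r->size) {
            return r->system_base + (uint64_t)(local - r->local_base);
        }
    }
    return (uint64_t)local;  /* no matching entry: same system address */
}
```

With an empty table, `rat_translate(NULL, 0, 0x9CB00000u)` returns 0x9CB00000, which matches the observation that DDR data accesses work without any RAT setup. The model says nothing about why instruction fetches from DDR fail; that question is still open in the thread.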
