AM6442: Program stops without response when accessing Cortex-M3

Part Number: AM6442
Other Parts Discussed in Thread: TMDS64EVM, SYSCONFIG

Hi Support Team,

Our customer is investigating a problem in which the program stops responding
when it accesses the Cortex-M3 (DMSC-L) on the AM6442.
Could you provide answers to the following questions?

When EnetAppUtils_setDeviceState() is executed, the program stalls until a timeout expires:
Sciclient_waitForMessage(), reached through the MCU+ SDK SCI Client API
Sciclient_pmGetModuleState(), receives no response from the Cortex-M3.
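
The stall described above follows a common pattern: a caller waits on a responder that never answers. As a minimal sketch of that pattern only, the C fragment below shows a bounded wait that returns an error once a poll budget is exhausted instead of blocking indefinitely. All names here (`response_ready_fn`, `wait_for_response`, `fake_ready_after`) are hypothetical and are not part of the Sciclient API; the real driver's waiting logic may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns true once a response has arrived. */
typedef bool (*response_ready_fn)(void *ctx);

/*
 * Poll `ready` at most `max_polls` times.
 * Returns 0 if a response arrived, -1 if the poll budget ran out.
 */
static int wait_for_response(response_ready_fn ready, void *ctx, uint32_t max_polls)
{
    assert(ready != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (ready(ctx)) {
            return 0;
        }
    }
    return -1;
}

/* Stand-in responder for illustration: reports ready after `*remaining` polls. */
static bool fake_ready_after(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;
    if (*remaining == 0u) {
        return true;
    }
    (*remaining)--;
    return false;
}
```

With a counter of 3 and a budget of 10 polls the wait succeeds; with a counter of 100 and the same budget it returns -1, which lets the caller log the failure and continue rather than hang.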

Q1: We would like to check whether Cortex-M3 is working or not. Is there any way to check?

We would also like to know whether this is the first access after power-on, so please answer the following as well.

Q2: Does the SBL access the Cortex-M3 (DMSC-L)?

Q3: Does FreeRTOS access the Cortex-M3 (DMSC-L) at startup?

Best Regards,
Kanae

  • Hi Kanae,

    The SYSFW logs from boot show a firewall exception when R5F0-1 tries to access 0x4983ffe8 (DMASS0_RINGACC_CFG). The customer is not even booting an application on R5F0-1, and I am not sure why this is observed on only some of the customer's boards. I am checking internally whether I can get more information. Meanwhile, please check with the customer whether they have any lead on R5F0-1 accessing this memory and whether there is any way to avoid it.

  • Hi Meet,

    Thank you for your support.

    Meet said:
    The SYSFW logs show a firewall exception when R5F0-1 is trying to access 0x4983ffe8 (DMASS0_RINGACC_CFG) from the boot logs,

    Where exactly in the logs can “0x4983ffe8” be confirmed?

    Best Regards,
    Kanae

  • The address is either 0x4983FFC8 or 0x4983FFD0; either way, it falls in the DMASS0_RINGACC_CFG memory region.
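
To check which region a logged address belongs to, a simple range test is enough. The sketch below is illustrative only: the base address and size used for DMASS0_RINGACC_CFG are assumptions (they are not stated in this thread) and should be verified against the AM64x TRM memory map before being relied on. `addr_in_region` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED values for illustration; verify against the AM64x TRM memory map. */
#define ASSUMED_RINGACC_CFG_BASE 0x49800000u
#define ASSUMED_RINGACC_CFG_SIZE 0x00040000u

/* Returns true if `addr` lies in [base, base + size). */
static bool addr_in_region(uint32_t addr, uint32_t base, uint32_t size)
{
    assert(size != 0u);
    /* Subtract first so the comparison cannot overflow near the top of the address space. */
    return (addr >= base) && ((addr - base) < size);
}
```

Under these assumed bounds, all three addresses mentioned in the thread (0x4983FFC8, 0x4983FFD0, and 0x4983FFE8) fall inside the region, while 0x49840000 is the first address past it.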

  • Hi Meet,

    If that is what you meant, I understand.

    Kanae

  • Hi Meet,

    Meet said:
    The SYSFW logs show a firewall exception when R5F0-1 is trying to access 0x4983ffe8 (DMASS0_RINGACC_CFG) from the boot logs,

    Does this firewall exception mean that an exception occurred on the M3 core?
    If an exception occurs on the M3 core, does executing a function like Sciclient_getVersionCheck()
    cause the M3 core to stop responding?

    Best Regards,
    Kanae

  • Hi Meet,

    Thank you for your support.

    Meet said:
    Meanwhile, please check with the customer if they have any lead on R5F0-1 accessing this memory and if there is any way to avoid it.

    On the customer's system, the firmware (FreeRTOS) runs on only one A53 core.
    R5F is not used.
    Upon verifying SBL operation, we observed that a reset is performed
    when transitioning to the loaded program. This reset causes the A53, R5F0-0,
    and R5F0-1 to start operating.

    Both R5F0-0 and R5F0-1 always raise an exception,
    but R5F0-0 executes the exception handling without any problem.

    On R5F0-1, however, the stack save operation during the exception
    writes to an unspecified location, and this is the problem.
    Since this is the first operation, the stack pointer is not initialized,
    so R5F0-1 writes to an arbitrary address.
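
The failure mode described above (a stack save through an uninitialized stack pointer) can be made concrete with a small address calculation. The sketch below assumes a full-descending stack, where a push of N words writes the N * 4 bytes immediately below the current SP, as the ARM `PUSH`/`STMDB` instructions do. Whether the exception path in question saves registers this way, and how many registers it saves, is an assumption to check against the actual handler. `write_range_t` and `push_write_range` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive byte range touched by a push. */
typedef struct {
    uint32_t lowest;  /* lowest byte address written */
    uint32_t highest; /* highest byte address written */
} write_range_t;

/*
 * Byte range written when `word_count` 32-bit words are pushed onto a
 * full-descending stack whose stack pointer currently holds `sp`.
 * Unsigned arithmetic wraps, mirroring a 32-bit address computation.
 */
static write_range_t push_write_range(uint32_t sp, uint32_t word_count)
{
    assert(word_count > 0u);
    write_range_t r;
    r.lowest = sp - (4u * word_count);
    r.highest = sp - 1u;
    return r;
}
```

For an SP value read in the debugger, the returned range can be compared with the address reported in the firewall exception log to see whether a stack save could plausibly explain the access.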

    As a first step, could you please answer the following questions from my previous post?
    Q1: Does this firewall exception occurrence mean an exception happened on the M3 core?
    Q2: If an exception occurs on the M3 core, does executing a function like
      `Sciclient_getVersionCheck()` cause the M3 core to stop responding?

    Best Regards,
    Kanae

  • Hi Meet,

    I need to report this to the customer, so could you please let me know the status?
    If there is anything that needs to be confirmed on the customer side, please let me know.

    Best Regards,
    Kanae

  • Hi Kanae,

    Apologies for the delay. I will post an update on Thursday.

    Best Regards,

    Meet.

  • Hi Meet,

    Thank you for your reply.

    I will wait for your update.

    Best Regards,
    Kanae 

  • Does this firewall exception occurrence mean an exception happened on the M3 core?

    The exception does not take place on the M3 core; SYSFW just reports a firewall exception.

    If an exception occurs on the M3 core, does executing a function like
      `Sciclient_getVersionCheck()` cause the M3 core to stop responding?

    Ideally this should not cause SYSFW to crash, but as your SYSFW traces only show this exception being reported and nothing else, I wanted to check if there is an issue due to this exception.

  • Hi Meet-san,

    Thank you for your update.
    I'm Machida, and I work with Kawakami. Sorry to interrupt, but let me comment on the points below.

    We think that the firewall exception is associated with the M3 core getting stuck (stopping).
    However, you explained that the firewall exception does not cause an exception on the M3 core and does not stop the M3 core.

    Here is the customer's investigation of the firewall exception.
    (Quoted from Kawakami's comment posted earlier in this thread.)
    ---
    Upon verifying SBL operation, we observed that a reset is performed
    when transitioning to the loaded program. This reset causes the A53, R5F0-0, and R5F0-1 to start operating.

    Both R5F0-0 and R5F0-1 always raise an exception,
    but R5F0-0 executes the exception handling without any problem.

    On R5F0-1, however, the stack save operation during the exception
    writes to an unspecified location, and this is the problem.
    Since this is the first operation, the stack pointer is not initialized, so R5F0-1 writes to an arbitrary address.
    ---

    Some MCU/MPU cores (for example, Cortex-M) define a default address for the stack pointer.
    According to the above, however, the R5 core does not seem to define one.
    The customer assumes that certain devices (possibly the ones that show the issue) try to access the region that triggers the firewall exception at this point.
    What do you think?

    It may not be meaningful to confirm this, but I would like to move this problem forward.
    So, could you comment on the above?

    Best Regards,

  • Hi, 

    It may not be meaningful to confirm this, but I would like to move this problem forward.

    This seems unlikely, and it still doesn't explain why the issue is observed on only a few devices while the others work properly.

    It was previously mentioned that the issue is resolved with SDK 10.1 but still present with 10.0. Could you please ask them to share both sets of images (SBL + appimage): the 10.1 set where the issue is not observed and the 10.0 set where it still occurs?

    Best Regards,

    Meet.

  • Hi,

    1. Boot R5F0-0 and R5F0-1 in addition to the A53 and see whether the issue is resolved. You can use any default SDK example, such as hello world, for R5F0-0 and R5F0-1.

    2. You mentioned that they are trying to reproduce this issue on the EVM. If they can reproduce it there, please share the results and the steps so that we can test and debug it at our end.

    3. Please share these:

    It was previously mentioned that the issue is resolved with SDK 10.1 but still present with 10.0. Could you please ask them to share both sets of images (SBL + appimage): the 10.1 set where the issue is not observed and the 10.0 set where it still occurs?

    4. If booting R5F0-0 and R5F0-1 doesn't resolve the issue, then please check this: https://github.com/TexasInstruments/mcupsdk-core/commit/71cc50389cae4ed91ec835423fe992b19a52702f#diff-4e40750a24c97efe51bdf15dd520fd109b87d9e8dc746aad94f2356d9addd211

    This contains the changes made to the SBL from SDK V10 to V10.1. You can try to port these changes from V10.1 to 9.1 and see whether that resolves the issue.

    Best Regards.

    Meet.

  • Hi,

    I sent the answers for Q2 and Q3 to you and Anil-san by email.
    Please check them.

    I will check the R5-related item with the customer.

    Best Regards,

  • Hi,

    I have already reported this by email, but I will also post the answer for Q1 (regarding R5) here.

    The customer imported the "Hello World" project for both R5F0-0 and R5F0-1.
    After applying this to the custom board on which they had observed the M3 problem, they did NOT observe the M3 problem.
    (The following image shows the A53 core clock frequency obtained from the M3 core.)

    Best Regards,

  • Hi Machida-San,

    Thanks for the update.

    1. Can we ask the customer to boot dummy images like this on R5F0-0 and R5F0-1 for now, to unblock their further development? We can continue to debug why this issue occurs when R5F0-0 and R5F0-1 are not booted.

    2. To debug further, please ask them to remove the R5F0-0 and R5F0-1 images from their setup again and reproduce the issue. Once the SBL has executed, they can connect to R5F0-0 and R5F0-1 in CCS and check what address they observe:

    3. Is there any update on their testing to try to reproduce this issue on EVM? 

    Best Regards,
    Meet.

  • Hi,

    I will pass items 1 and 2 on to the customer.
    For item 3, I have already reported this by email, but I will post the same content on this thread as well.
    ---
    As I said above, I lent them a TMDS64EVM, but it may have been damaged in transit.
    The customer will therefore buy a new TMDS64EVM, so please wait a while for our feedback.
    ---

    I also sent details about the exception handler and the stack pointer by email (topic "-4-" in the email).
    Please review them and give us your feedback.
    (Perhaps the assembly code was changed between SDK 10.0 and SDK 10.1.)

    Best Regards, 

  • Hi Meet,

    Thank you for your support.

    Here is a result of No.2 from our customer.

    **********************************************************************************

    We reproduced the symptom again with only the Cortex-A53 running,
    and took a screenshot of the CCS debug screen showing the hung state.

    ***********************************************************************************************

    I would appreciate it if you could share your views on the above result and the progress on other matters.

    Meet said:
    1. Can we ask the customer to boot any dummy images like this to R5F0-0 and R5F0-1 for now, to unblock them for further development?

    In addition, although our customer understands the above request,
    I have not asked them to do so in order to allow them to focus on debugging.
    Thank you for your understanding.

    Best Regards,
    Kanae

  • Hi Kanae,

    Thanks for sharing this. It does show the SP for R5F0-1 pointing into the restricted memory, which can cause the exception. However, I am not able to reproduce this on my EVM. Even if R5F0-1 is not loaded, it is supposed to be in the WFI state, which is what I observe at my end:

    Was any change made to the default self reset sequence (Bootloader_socCpuResetReleaseSelf) in the customer's setup?

  • Hi Meet,

    Thank you for your support.
    Here is a reply from our customer to your question.

    Meet said:
    Was there any change made to the default self reset sequence (Bootloader_socCpuResetReleaseSelf) in the customer's setup?

    Regarding the self reset sequence (Bootloader_socCpuResetReleaseSelf),
    we have not made any changes to it.

    Just to be sure, we reinstalled SDK10_00 where the hang problem was observed
    and compared it to the SDK10_00 folder referenced in the build environment.

    We compared the entire folder containing the Bootloader_socCpuResetReleaseSelf source,
    but found no differences.
    \mcu_plus_sdk_am64x_10_00_00_20\source\drivers\bootloader

    Best Regards,
    Kanae

  • Hi Meet,

    Thank you for your support.

    Here is the data from our customer regarding the following item.

    Meet said:
    3. Is there any update on their testing to try to reproduce this issue on EVM?

    We ran the sample program on EVM (Rev.C) and confirmed the exception on both R5F0-0 and R5F0-1 cores.
    The sample program used for verification was built in the following environment:

    We made no changes whatsoever to the imported source code.

    CCS: 12.5.0

    SysConfig: 1.18.0

    MCU+ SDK for AM64x: 9.1.0.41

    SBL: Import the following sample program from the SDK:
             \mcu_plus_sdk_am64x_09_01_00_41\examples\drivers\boot\sbl_sd\am64x-evm\r5fss0-0_nortos\ti-arm-clang

    Application: Import the following sample program from the SDK:
             \mcu_plus_sdk_am64x_09_01_00_41\examples\hello_world\am64x-evm\a53ss0-0_freertos\gcc-aarch64

    After building, rename the output file “hello_world_am64x-evm_a53ss0-0_freertos_gcc-aarch64.appimage.hs_fs”
    to “app”. Save the “tiboot3.bin” and “app” files to the SD card, verify operation via SD card boot,
    and confirm that after outputting “Hello World!”, exception handling occurs on both R5F0-0 and R5F0-1 as shown in the image below.

    I will also share the “tiboot3.bin” and “app” files.

    20251107_files.zip

    Please also confirm the status of the following points.

    Meet said:
    We can continue debugging why this issue occurs when R5F0-0 and R5F0-1 are not booted.

    Best Regards,
    Kanae

  • Hi Kanae,

    I will need some more time to debug/check this.

    Best Regards,

    Meet.