AM2634: Multicore Program Fails CacheP_getEnabled() On First Execution

Part Number: AM2634
Other Parts Discussed in Thread: SYSCONFIG

Hello,

I have a four-core program. On first launch in DevBoot mode, Cores 1, 2 and 3 always fail an assert check in MpuP_enable():

The assert expression highlighted below is always false on first execution:

In other words, type is not zero but 15 instead. I tried to find out what type==15 actually means but I'm having difficulty finding the relevant information.
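One plausible reading of type == 15, sketched as host-side C. It assumes the CacheP_TYPE_* values in the SDK's CacheP.h use one bit per cache (L1P = 0x1, L1D = 0x2, L2P = 0x4, L2D = 0x8); under that assumption, 15 (0xF) would mean all four bits are reported as set. The SKETCH_* constants and the helper name are illustrative, not SDK identifiers, so check the values against the header in your SDK version.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed one-bit-per-cache values; verify against CacheP.h in your SDK. */
#define SKETCH_TYPE_L1P 0x1u  /* L1 program cache */
#define SKETCH_TYPE_L1D 0x2u  /* L1 data cache */
#define SKETCH_TYPE_L2P 0x4u  /* L2 program cache */
#define SKETCH_TYPE_L2D 0x8u  /* L2 data cache */

/* Hypothetical helper: writes the names of the set bits into buf and
 * returns the length of the resulting text (0 when no bit is set). */
static int sketch_describe_type(uint32_t type, char *buf, size_t len)
{
    return snprintf(buf, len, "%s%s%s%s",
                    (type & SKETCH_TYPE_L1P) ? "L1P " : "",
                    (type & SKETCH_TYPE_L1D) ? "L1D " : "",
                    (type & SKETCH_TYPE_L2P) ? "L2P " : "",
                    (type & SKETCH_TYPE_L2D) ? "L2D " : "");
}
```

Calling the helper with the value seen in the debugger lists which assumed cache bits are set; a value of 0 produces an empty string.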

However, if I do a group Reset -> Restart -> Resume then this check passes (i.e. type must be zero) and all four cores run just fine. So it appears a CPU Reset fixes the issue but I need to understand how.

Questions:

1) What are the "current enabled bits" returned from CacheP_getEnabled()?

2) Why are they already set when DevBoot mode is launched?

3) How might I reset them programmatically?

Thank you.

  • If I add a call to CacheP_disable(CacheP_TYPE_ALL); before MpuP_init(); in ti_dpl_config.c the problem goes away. This further demonstrates that cache "bits" are somehow enabled before MpuP_init(); has been called:

    However, this is a very inconvenient fix since ti_dpl_config.c is generated by SysConfig.

    I guess that answers Q3 but Q2 remains. How do I find out what is setting the cache enabled bits before MpuP_init() is called? I can't see any cache configuration registers so am feeling around in the dark. Also, there appears to be no MPU settings in the GEL files.
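The workaround order described in this reply can be modelled on a host PC. This is only a sketch with mock state, not the SDK implementation: MockCore, the mock_* functions and MOCK_CACHE_ALL (standing in for CacheP_TYPE_ALL, value assumed) are all hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_CACHE_ALL 0xFu  /* assumed stand-in for CacheP_TYPE_ALL */

/* Host-side stand-in for the per-core state involved. */
typedef struct {
    uint32_t cache_enabled;  /* bits a getEnabled-style call would report */
    bool     mpu_enabled;
} MockCore;

static void mock_cache_disable(MockCore *core, uint32_t type)
{
    core->cache_enabled &= ~type;
}

/* Models the failing check: enabling the MPU is refused (the real code
 * asserts) if any cache bit is already set. */
static bool mock_mpu_enable(MockCore *core)
{
    if (core->cache_enabled != 0u) {
        return false;
    }
    core->mpu_enabled = true;
    return true;
}

/* Workaround order: force all caches off, then enable the MPU. */
static bool mock_safe_mpu_init(MockCore *core)
{
    mock_cache_disable(core, MOCK_CACHE_ALL);
    return mock_mpu_enable(core);
}
```

With cache_enabled starting at 15, mock_mpu_enable() alone is refused, while mock_safe_mpu_init() succeeds. The model does not explain why the bits are set in the first place.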

  • Can TI answer Q2 please?

  • Hi Kier,

    The assert expression highlighted below is always false on first execution:

    The check mentioned is to ensure that the MPU settings are done first and cache is enabled next. We have observed issues if the sequence is reversed. This means, if your cache is enabled, before the MPU is enabled, the software will be stuck in the assert.

    2) Why are they already set when DevBoot mode is launched?

    Let me check this. This may be due to your debugger settings that the cache of the device is enabled.

    Can you also try this method of initialization? Does this give you the same result?

    https://software-dl.ti.com/mcu-plus-sdk/esd/AM263X/latest/exports/docs/api_guide_am263x/ADDITIONAL_DETAILS_PAGE.html#autotoc_md38

    Best Regards,
    Aakash

  • Hi Aakash,

    Thank you.

    I tried the SBL init method:

    But Core 1 would not connect at all and I could not see any debug symbols on Cores 0 and 3. Only Core 2 ran properly.

    In any case, it seems like this method also executes GEL files.

    After examining the contents of CacheP_armv7r_asm.S, I think what would help is to understand how to examine the System Control Register in CCS:

    Cortex-A7 MPCore Technical Reference Manual r0p3 (arm.com)

    I guess this isn't memory mapped so how can I view its status please?

  • Hi Aakash,

    I found the System Control Register now:

    Also, now I think there's an issue with our custom boot assembly code. It could be that mpu_init is being called twice. Let's put this on hold for the moment while I investigate.

  • OK, so the boot code is not calling MpuP_init() twice; it seems to be a quirk of the debugger. It stops once at the BP, and after pressing Resume it stops again at the same place, but I confirmed by other means that MpuP_init() is only called once.

    I guess the next step is to confirm the contents of the CP15_SYSTEM_CONTROL register when this occurs.

    By the way Aakash, why is it that there's no bitfield breakdown of the CP15 registers in the Register view like there is for other registers?
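Until the Register view offers a bitfield breakdown, the raw CP15_SYSTEM_CONTROL value can be decoded by hand. A minimal sketch, using the ARMv7 SCTLR bit positions M (bit 0, MPU enable), C (bit 2, data cache enable) and I (bit 12, instruction cache enable); the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_M (1u << 0)   /* MPU enable */
#define SCTLR_C (1u << 2)   /* data cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Hypothetical container for the three flags of interest. */
typedef struct {
    bool mpu_on;
    bool dcache_on;
    bool icache_on;
} SctlrFlags;

/* Splits a raw SCTLR value, as copied from the debugger, into flags. */
static SctlrFlags sctlr_decode(uint32_t sctlr)
{
    SctlrFlags f;
    f.mpu_on    = (sctlr & SCTLR_M) != 0u;
    f.dcache_on = (sctlr & SCTLR_C) != 0u;
    f.icache_on = (sctlr & SCTLR_I) != 0u;
    return f;
}
```

A value with bits 12 and 2 set and bit 0 clear would correspond to the failing case in this thread: both caches on while the MPU is still disabled.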

  • Hi Kier,

    For CP15_SYSTEM_CONTROL registers, I can take that feedback to our concerned team and plan to get that added.

    Do keep us posted on any luck with your debugging.

    Best Regards,
    Aakash

  • Hi Aakash,

    I seem to have reached a dead end with this problem. The weird thing is that when I add a breakpoint to debug it seems to actually fix the problem! Let me explain.

    As mentioned in the original post, the assert trap in MpuP_enable() fails because type == 15. This is because, for reasons unknown, instruction and data caches are already enabled as indicated by the I and C flags in the CP15_SYSTEM_CONTROL register:

    However, if I repeat the test and just put a BP at line 140, just before the assert trap, then type is always 0. The CP15_SYSTEM_CONTROL register has its reset value (no cache enabled), which means MpuP_enable() continues successfully if allowed to run on:

    It's like a quantum observation problem. When I try to examine the issue, the behaviour changes. I'm at a loss as to how to debug the issue. Do you have any suggestions please?

  • Hi Kier,

    We will try running the IPC notify application in dev boot mode on the TI EVM (control card) to confirm whether the issue is reproducible.

    Best Regards,
    Aakash

  • Hi Kier,

    I have tried the IPC notify application on the am263x (control card), and for me CacheP_getEnabled() returns 0, so the issue is not reproducible.

    Regards,
    Gunjan

  • I'm also experiencing this issue.

    The application behaves as normal when using SDK version 08.05.00.24, but the exact symptoms Kier describes are present when building with version 09.01.00.41.

    I am able to consistently get past this point (specifically, line 140 of MpuP_armv7r.c), but only by removing an MPU entry in SysConfig which marks a small region of OCRAM as non-cached (for the .bss:ENET_CPPI_DESC section). Of course, this is no good, as the Enet library then throws an assert because that must be non-cached.

    We have no custom boot code, i.e. are using boot_armv7r_asm.S from the SDK.

    Resets, power cycling, and whether or not the debugger is attached before this assert is hit make no difference.

    "Kier said:

    It could be that mpu_init is being called twice"

    Modifying the MpuP_enable function to use a static variable to track how many times the if(MpuP_isEnable()==0U) branch is entered shows that it is being entered twice.
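The static-counter instrumentation mentioned in this reply might look roughly like the host-side sketch below. It is a model only: g_mock_mpu_on stands in for the state that MpuP_isEnable() reports, and mock_mpu_enable_counted() is a hypothetical name, not the SDK function.

```c
#include <stdint.h>

/* Stand-in for the MPU enable state that MpuP_isEnable() would report. */
static uint32_t g_mock_mpu_on = 0u;

/* Models an enable function instrumented with a static entry counter.
 * Returns how many times the "MPU not yet enabled" branch has run. */
static uint32_t mock_mpu_enable_counted(void)
{
    static uint32_t entry_count = 0u;  /* keeps its value across calls */

    if (g_mock_mpu_on == 0u) {
        entry_count++;
        /* the real function would check the cache state and then
         * enable the MPU at this point */
        g_mock_mpu_on = 1u;
    }
    return entry_count;
}
```

In this model a count of 2 can only happen if the MPU-enabled state was cleared between the two calls, for example by a reset. Whether a plain static keeps its value on target depends on when the C startup initialisation runs relative to the call.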

  • Hello,

    I can run the code successfully when I compile and debug the ipc_notify example for the first time. But if I do a CPU reset on all cores and then debug once again, my program gets 'type=15'. I have raised a JIRA to fix this issue.

    Regards,
    Gunjan

  • I'm also experiencing this issue.

    Thanks Adam. Good to know the problem isn't just local to me.

    Modifying the MpuP_enable function to use a static variable

    I thought of doing that, but auto init is called after MPU init so I assumed that any variable would be wiped. I will try that again.

    So just to be sure, you can show by this method that MpuP_enable() is called two times?

  • Hi,

    Did you get this issue even when you power cycle the board and run the IPC example for the very first time after the power cycle?

    Thanks,
    Gunjan

  • Hi Gunjan,

    I tried the example ipc_notify_echo_am263x-cc_r5fss0-0_freertos_ti-arm-clang from the 9.1.0.41 SDK with CCS 12.5; however, it does not seem to work at all.

    After power on then Resume, I get no output in the Console window. After a few seconds, I stop the cores and see the following:

    After Reset, Resume and Restart, I still get no output. After a few seconds, I stop the cores and see the following:

  • Hi Kier,

    From your first snapshot it looks like the program executed successfully. Can you try (after a power cycle) launching a serial terminal for output in CCS (using your respective UART com<> port)?
    Steps: View -> Terminal -> Open a terminal ->

    For second snapshot it seems they are not able to sync with Cortex_R5_0.

    Thanks,
    Gunjan

  • Thank you. Yes, UART Log works, I assumed incorrectly that CCS Log would be used.

    The example runs first time and repeated resets don't produce the issue for me. I guess we just have to hope that the resolution to your Jira will also fix my issue. Can you post a link to the Jira please?

  • Hello,

    Jira link for issue: https://jira.itg.ti.com/browse/MCUSDK-13200

    Regards,
    Gunjan

  • Thanks but that one doesn't work for me. I was expecting a JIRA link similar to this one:

    [EXT_EP-11682] CLB Tile Design settings reset when changing tile Name - Software Issue Report (SIR)

  • Hello Kier,
    You can track it with the help of FAE. Our jira link is accessible internally.

    Also, Can you do these steps on each core when you get issue in MpuP_enable(for each core):

    Step1: Pause program
    Step2: CPU reset
    Step3: Restart

    For me it removes the issue.

    Thanks,
    Gunjan

  • Hello Gunjan,

    You can track it with the help of FAE. Our jira link is accessible internally.

    Why is your JIRA link internal and others are external? Just wondering why you seem to have two different systems.

    Can you do these steps on each core when you get issue in MpuP_enable(for each core)

    You can see in my original question text that I describe this action. The complaint is that this occurs on first (and therefore most important) execution.

    The only progress made on this topic is that you have created a JIRA ticket I cannot see. Please let me know the cause and solution of this problem.

  • Hello Kier,

    Accessibility of JIRA links differs from project to project. You can reach out to your FAE for tracking it.

    Thanks,
    Gunjan

  • I have made some progress.

    There have been a lot of code changes in our application since I raised this issue, and now the original trigger (the assert trap in MpuP_enable()) does not occur, for reasons unknown.

    However, I did speculate previously that a possible cause could be that __mpu_init() / MpuP_init() is being called twice. I now have evidence that was/is probably the case all along.

    I instrumented the code (purple box) such that a separate counter is updated by each core just before MpuP_init() is called. I am using a pointer to unused memory instead of a variable because this circumvents the cinit initialisation routines which would otherwise reset the counter. In other words, the pointer contents are not reset if the start-up code is called again for any reason.

    The results show that both cores in SS0 are, somehow, calling __mpu_init() a second time.

    - I do not think the program is starting again from the software vector. If I put a BP at 0x0, in all the cores, I only hit it once.

    - I then added a SW breakpoint at the call to MpuP_init() with a Skip Count of 1 to understand the reason for the second execution of MpuP_init(). The code ran to main() successfully as expected but on Resuming, the SW breakpoint seems to cause a PABT:

    - I then started a 'PC Trace' and I find that the end of the PC Trace does not record anything to do with the PABT handler:

    However, notice that the PC Trace has stopped at line 444 in Port.c (WFI). This could be a coincidence but it reminded me of this issue:

    AM2634: Multicore FreeRTOS Empty Project Doesn't Work Correctly (TI E2E support forums)

    The advice there is to remove the WFI instruction from line 444 of Port.c and rebuild. I did so here and now this particular problem of double __mpu_init() has gone although the code does appear to fall over at other points but let's ignore that for the moment.

    On the one hand I'm relieved to have made some progress but on the other, the normal debug tools have let me down and I've learned very little.

    So I hope to find answers to the following questions on Wednesday in our arranged debug session.

    1) Why does the PC Trace stop at the WFI instruction?

    2) Why does a SW BP with Skip Count > 0 cause a PABT?

    3) Why would the WFI instruction apparently cause the double call to __mpu_init()?
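The counter instrumentation described earlier in this reply relies on keeping the count in memory that the startup initialisation does not touch. The idea can be modelled on a host PC as below. Everything here is a stand-in: on target the persistent word would sit at a fixed address in unused RAM (not shown), and the mock_* names and NUM_CORES are hypothetical.

```c
#include <stdint.h>

#define NUM_CORES 4u

/* Stand-in for an ordinary variable that startup init resets. */
static uint32_t mock_bss_counter[NUM_CORES];
/* Stand-in for words in unused RAM that startup init never touches. */
static uint32_t mock_unused_ram[NUM_CORES];

/* Models the startup initialisation clearing this core's variables. */
static void mock_cinit(uint32_t core)
{
    mock_bss_counter[core] = 0u;
}

/* Models one pass through the boot code on a core: bump both counters
 * just before the point where MpuP_init() would be called, then let
 * the startup initialisation run. */
static void mock_boot_pass(uint32_t core)
{
    volatile uint32_t *persistent = &mock_unused_ram[core];

    mock_bss_counter[core]++;
    (*persistent)++;
    mock_cinit(core);
}
```

After two passes on the same core, the ordinary counter reads 0 again while the persistent word reads 2, which is the kind of evidence used above to show a second call.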

  • Hi Gunjan,

    As per your request, stepping out of the second __mpu_init() call simply returns the PC to the boot code.

    There are no function calls before __mpu_init() so we can infer that the program is starting again from the entry point. However, as demonstrated, the reason cannot be deduced because:

    - If I set a BP at 0x0, I just get a PABT.

    - The PC Trace ends with WFI instruction in vApplicationIdleHook().

    I had one more thought to check if a reset is occurring so I checked the RST_STATUS_CAUSE of both sub systems:

    Bit 0: POR Reset

    Bit 1: Warm Reset

    Bit 7: Reset for CORE0 and MSS_CORE00_VIM caused because of reset request by debugger in CORE00

    Bit 9: Reset for CR5SS0 by the RESET FSM using MSS_CTRL::R5SS0_CONTROL_RESET_FSM_TRIGGER.

    Bit 10: MSS_RCM.MSS_CR5SS_POR_RST_CTRL0 Reset Source: mod_g_rst_n.

    The registers for SS0 and SS1 are the same except for Bit 7 and moreover, the register values are identical to the values seen before the problem occurs.

    The conclusion is that I cannot rule out a reset but if it is occurring, it cannot be distinguished from the normal condition.
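The before/after comparison of RST_STATUS_CAUSE described in this reply can be sketched as a small helper. The bit positions are the ones listed in the post; the macro and function names are hypothetical, and no real register values are assumed.

```c
#include <stdint.h>

/* Bit positions as listed in the post above. */
#define RST_CAUSE_POR          (1u << 0)   /* POR reset */
#define RST_CAUSE_WARM         (1u << 1)   /* warm reset */
#define RST_CAUSE_DBG_CORE0    (1u << 7)   /* debugger reset request */
#define RST_CAUSE_FSM_TRIGGER  (1u << 9)   /* reset FSM trigger */
#define RST_CAUSE_POR_RST_CTRL (1u << 10)  /* POR_RST_CTRL0 source */

#define RST_CAUSE_KNOWN (RST_CAUSE_POR | RST_CAUSE_WARM | \
                         RST_CAUSE_DBG_CORE0 | RST_CAUSE_FSM_TRIGGER | \
                         RST_CAUSE_POR_RST_CTRL)

/* Returns the listed bits that differ between two register snapshots,
 * e.g. one taken before and one after the suspected reset. */
static uint32_t rst_cause_changed(uint32_t before, uint32_t after)
{
    return (before ^ after) & RST_CAUSE_KNOWN;
}
```

A zero result only shows that none of the listed bits changed between the two reads, which is why it cannot rule a reset in or out here.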

  • Hello Kier,

    It might have happened that your program accessed some wrong/unexpected address and it caused a reset.
    For debugging this, you can corrupt location 0x0.
    When you start your program, just before the first call to mpu_init, corrupt the value at 0x0 (so that it no longer jumps to the Common_StartUp_Steps address). Then, when execution comes back to 0x0, an exception will occur and you can trace back what caused it.

    Best regards,
    Gunjan

  • Hello Gunjan,

    I'm not sure why this is different to inserting a breakpoint at 0x0. I shall try again with the latest code.

  • Hello Kier,

    Were you able to debug the reset issue in your code?

    Thanks,
    Gunjan