This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

CCS/AM6548: SMP debug issue

Part Number: AM6548
Other Parts Discussed in Thread: TMDX654IDKEVM, SYSBIOS, CODECOMPOSER

Tool/software: Code Composer Studio

Hello All,

I have problem to run and debug SMP on AM6548.

My configuration is:
CCS ver: 9.2.0.00013
PDK - pdk_am65xx_1_0_7 (processor_sdk_rtos_am65xx_6_03_00_106)
AM65x (IDK) - TMDX654IDKEVM
SYSBIOS - 6.76.3.01

I already tried to run SMP with previous PDK versions but without success, more details can be found in following posts:

https://e2e.ti.com/support/processors/f/791/t/801637

e2e.ti.com/.../3128559

https://e2e.ti.com/support/processors/f/791/t/798502

With newest processor SDK version (processor_sdk_rtos_am65xx_6_03_00_106), first i tried to create simple SMP application. Then I set two A53 cores (first cluster) as Group 1 with "Sync Group cores" and when i tried to run SMP application, the application was stuck in function "Void Core_startup()" on following line:

/* wait for Core 0 to perform C runtime initialize */
while (ti_sysbios_family_arm_v8a_smp_Core_waitForCore0 != 0x1234567) {
;
}

Then I tried to build and run the UART_SMP_TestApp example from pdk_107. I created sync group on first A53 cluster and i load UART_SMP_TestApp app, but the application was stuck in function  "Void Core_startCoreX(Void)" on folowing line:

while (!(Core_module->syncCores[0][idx]));

I followed instructions from link below how to debug SMP Debug with Code Composer Studio:
software-dl.ti.com/.../ccs_smp-debug.html

Has anyone tried and managed to run SMP on AM6548?

Best regards,
Novica

  • Novica,

    I'll be taking a look into your issue and will follow up.

    Best regards,

    Dave

  • Novica,

    I was able to reproduce your observations, and with the SDK 6.3 release there are indeed issues with the SMP examples.

    We recently released SDK 7.0 over the weekend and I've gone through the examples there and they are working now with this update. Would you be able to give a try and is working with 7.0 a possibility for you? The download links are updated at https://www.ti.com/tool/PROCESSOR-SDK-AM65X 

    For the UART examples specifically, I confirmed both UART_SMP_TestApp and  UART_DMA_SMP_TestApp are working fine for me now.

    Best regards,

    Dave

  • Hello Dave,

    Sorry for the late reply.

    Thank you on the link.

    I will install SDK 7.0 and i will test SMP on my side. I will get back to you with test results as soon as possible.

    Best regards,

    Novica

  • Hello Dave,

    I insalled new SDK 7.0 on my side. I was able to build and run both UART_SMP_TestApp and UART_DMA_SMP_TestApp examples from SDK 7.0, and i can confirmed those examples are working on my side.

    Then i made simple SYSBIOS project in CCS without configured SMP. My idea was to create two simple tasks in which i will increment the counter variables each time when some of those two task is run by BIOS. I created both task dynamically in main() function by using "Task_create()" function. But when i tried to build that project i had build error:

    sdk_07_00_00_Bios_DynamicallyTasks.zip

    Then I created both task statically in bios config file (.cfg) and when i tried to build project i didn't have any error and i was able to run that code on IDK board.

    sdk_07_00_00_Bios_StaticallyTasks.zip

    NOTE:
    To build code i was using tools from SDK 7.0 (xdctools_3_61_00_16, bios_6_82_01_19 and gcc-arm-9.2-2019.12-mingw-w64-i686-aarch64-none-elf) 

     

    Just to be shure, i tried to build both version of projects (with statically and dynamically created tasks) with previous version of SDK 6.3 (xdc_tool_3_55_02_22, bios_6_76_03_01, gcc-linaro-7.2.1-2017.11-i686-mingw32_aarch64-elf) . Both version of project were compiled without any error.

    In our application we are creating all Bios object (task, semaphore...) dynamically so it is important for us to have this option.

    Q1: Can you help me to figure out why when i use SDK 7.0 i am not able to build project in which tasks were created dynamically ?

     

    Then i continued to work on SMP by using bios project where task were created statically. I made all modifictaion that are needed to be done to run SMP.
    For first task i set affinity to 0 (run first task on first core) and for second task i set affinity to 1. I was able to compile code without error and to run code on both cores.

    But the problem occured when i tried to pause and continue with code execution or when i tried to continue with execution after breakpoint.

    Source, bios config and linker file that i am using are given above under name  sdk_07_00_00_Bios_StaticallyTasks.zip

    I have created a video with screen capture to show what i did and after which action the problem occurs.

    Q2: Do you have idea what can cause this problem?

    Thank You very much in advance,

    Novica

  • Hello Dave,

    One update from my side.  I noticed few strange things regarding to newest SDK.

    1. First one is when i tried to import rtos_template_app_am65xx_a53 example from processor_sdk_rtos_am65xx_07_00_00_05 i noticed that example is configured to use tools/components from previous SDK 6.3 instead of newest SDK 7.0. When i changed tools/component to newest SDK 7.0 and i tried to build this project i got following error:

    2. Then i tried to generate usb CCS project by using pdkProjectCreate.bat script from newest pdk_jacinto_07_00_00. But i had following error:

    3. Then i tried include and use newest PDK in SYS/BIOS project with statically created tasks from previous post. I included UART driver from PDK but when i tried to build this example i got an error:

    Best regards,

    Novica

  • Novica,

    I had previously recompiled the SMP examples from the command line and used CCS to load and run only... I did not check from within a CCS project.

    I do see the same error when importing and compiling the BIOS version of the template app (baremetal compiles fine). I'll follow up and see what the issue is here.

    Best regards,

    Dave

  • Regarding the undefined reference to ti_sysbios_rts_gnu_ReentSupport_checkIfCorrectLibrary:

    We've had the same issue after migrating our application to the new SDK. There is a Library search path in the template application that points at ${xdc_find:gnu/targets/arm/libs/install-native/arm-none-eabi/lib:${ProjName}}

    This is apparently wrong for the A53, although I'm not sure how this worked for SDK 06.03. Anyway, we fixed this by changing the library search path to ${xdc_find:gnu/targets/arm/libs/install-native/aarch64-none-elf/lib:${ProjName}}

    Regards,

    Dominic

  • Dominic, 

    You are right. There are 5 files that need to be updated (carry-over from SDK6.3 that weren't modified):

    <pdk>\demos\rtos_template_app\am65xx\components_am65xx.dtd:  change to "com.ti.pdk.jacinto:7.0.0"

    <pdk>\demos\rtos_template_app\common\cgt_a53_ver.dtd: change to "GNU_9.2.1:null aarch64"

    <pdk>\demos\rtos_template_app\common\common_components.dtd: change to "com.ti.rtsc.SYSBIOS:6.82.01.19" 

    <pdk>\demos\rtos_template_app\common\xdc_ver.dtd: change to "3_61_00_16_core"

    <pdk>\demos\rtos_template_app\am65xx\evmAM65xx\A53\template_app\rtos\rtos_template_app_am65xx_a53_evmAM65xx.projectspec: change lib path to "-L${xdc_find:gnu/targets/arm/libs/install-native/aarch64-none-elf/lib:${ProjName}}"

    With these changes the CCS project should build properly. This has been filed for correction to the SDK for the next release.

    Best regards,

    Dave

  • Hello Dominic,

    Sorry for the late reply.

    Thank you very much for sharing information related to this issue, it was really helpful for us.

    Regards,

    Novica

  • Hello Dave,

    I updated 5 files as you described in your latest post and now i am able to build project without error.

    But still i have the same issue regarding to SMP debugging.

    • I can compile+link simple SYSBIOS application in SMP mode with SDK7
    • But i cannot debug it

    Do you have any information about progress on this SMP issue?

    Best regards,

    Novica

  • Novica,

    I'm working on your debug issue, though progress is slow. Advice to pass along is to be sure that hardware breakpoints are used exclusively (disable SW interrupts in the debug configuration), and load the binary to both CPUs in the sync cluster and not just one. 

    I have some some further follow up internally later today get further assistance.

    Best regards,

    Dave

  • Novica,

    I have some additional guidance that should hopefully help.

    We recommend to create a user-defined idle function that does nothing / something innocuous. OPtions include WFI or spin loop. This will be the lowest-priority task in the system. Without it then execution spins in a loop directly in the BIOS kernel

    Some tips for debug: Select the group when loading the cores. Select cores individually for run/halt/breakpoint configuration. Also, disable the automatic run to main in the debug configuration..

    I have your task configuration on my board and will get some addition updates and screen captures tomorrow.

    Best regards,

    Dave

  • Hello Dave,



    Thank you on the reply and suggestion.

    Today I will apply tips that you suggested and i will go back to you with results.

    Best regards,

    Novica

  • Hello Dave,

    I was forced to postpone my activity on SMP for one day.
    I will continue with SMP activities as soon as possible

    Best regards,
    Novica

  • Hello Dave,

    I'm working on your debug issue, though progress is slow. Advice to pass along is to be sure that hardware breakpoints are used exclusively (disable SW interrupts in the debug configuration), and load the binary to both CPUs in the sync cluster and not just one. 

    To use hardware breakpoints  I unselected Allow software breakpoints to be used in the debug configuration (but i didn't know how to disable SW interupts in the debug configuration). Then i load the binary to both CPUs in the sync cluster, as you mentioned, but the execution is stuck on the first line of main() and i was not able to start execution by pressing Resume(F8) button (the code execution remained stuck on first line of main() function).

    We recommend to create a user-defined idle function that does nothing / something innocuous. OPtions include WFI or spin loop. This will be the lowest-priority task in the system. Without it then execution spins in a loop directly in the BIOS kernel

    I created Idle Task, some counter variable is incremented inside of that Idle task. But i didn't understand what you mean when you said "OPtions include WFI or spin loop"?


    Some tips for debug: Select the group when loading the cores. Select cores individually for run/halt/breakpoint configuration. Also, disable the automatic run to main in the debug configuration..

    - I disabled the automatic run to main in the debug configuration.
    - I am not shure now should I load binary to both CPUs in the sync cluster separatly or should i load code to sync group (to both cores in the same time)?
    - I am not shure how to select cores individually for run/halt/breakpoint configuration. Can you cleraife to me how to do this?


    Best regards,

    Novica

  • Novica,

    I did a number of experiments today and I am not having success with the sync group enabled for Core 0 & Core 1. As you are, I see Core 0 stuck in Main and Core 1 stuck waiting for Core 0. I'm checking with colleagues.

    However, I am having success leaving the CPUs un-grouped and doing explicit commands. If I load Core 1 before Core 0 then things are running as expected. If, however, I load Core 0 before Core 1 then I see similar behavior to what we have with the sync group.

    Can you try the same? Some sequences I ran through;

    NO SYNC GROUP
    =============
    Load C0
    Load C1 - Halt
    Run C0
    Restart C1
    ** FAIL
    C0 stuck at Main
    C1 spinning waiting for C0

    Load C0
    Run C0
    Load C1
    ** FAIL
    C0 waiting for C1
    C1 in the weeds

    Load C1 - Halt
    Load C0
    Run C0
    Restart C1
    SUCCESS

    Load C1 - Halt
    Load C0
    Restart C1
    Run C0
    SUCCESS

    Load C1
    Load C0
    Run C0
    SUCCESS

    SYNC GROUP
    Load Group
    Run Group
    ** FAIL

    Load Group
    Run Group
    Halt Group
    Restart Group
    ** FAIL

    Load Group
    Run C0
    Run C1
    ** FAIL

    Load C1
    Load C0
    Run
    ** FAIL

  • Novica,

    I had a configuration error on my tests and did not properly adjust my debug configuration for Core 1. This explains why I was needing to explicitly halt Core 1 in some of my runs. Both cores need to independently have the run-to de-selected in the debug configuration. Let me know if his was already the case for you in your earlier observations.

    With this corrected I'm getting more consistent behavior with the Sync Group enabled as well.

    Best regards,

    Dave

  • Hello Dave,

    Sorry for delay, I was on short vacation but now i am back.

    I performed the same tests as you did. I results are

    • All the tests that don't work on your side don't work on mine either
    • All the tests that work on your side partially works on my side. For example if i do folowing sequences: Load C1 - Halt->Load C0->Run C0->Restart C1 then application will work fine and with success. But then if i HALT core 0 and core 1 and i tried to continue with code execution i will have the following error:

     

    The same thing happens for the other two tests that you was able to run with success.

    "Both cores need to independently have the run-to de-selected in the debug configuration. Let me know if his was already the case for you in your earlier observations." 

    I saw that you attached a picture in your replay but I can't see it and I was not successful in finding run-to de-selected option. Can you give me instruction where to find  run-to de-selected option in debug configuration?


    Best regards,

    Novica

     

  • Novica,

    Thanks for pointing out the image didn't post properly. I've attached two images. What I'm looking to point out and confirm is that you have disabled the run-to-symbol option for both cores in the SMP cluster. In my earlier testing I had disabled only for Core0, but not for Core1 and this was causing problems.

    Best regards,

    Dave

  • Hi Dave,


    Thanks for clarifying this.

    I disabled the run-to-symbol option for both cores for the SMP cluster in the debug configuration and i made following tests:


    SYNC GROUP
    =============

    Load Group
    Run Group
    **Application is running on both core with SUCCESS
    Halt Group
    Run Group
    **FAIL
    In Console window is printed:
    CortexA53_0_0: Unhandled ADP_Stopped exception 0x7002B5E0

    Load Group
    Put breakpoint
    Run Group
    **Application is running on both core with SUCCESS until breakpoint is reached
    Run Group
    **FAIL
    In Console window is printed:
    CortexA53_0_0: Unhandled ADP_Stopped exception 0x7002B5E0


    NO SYNC GROUP
    =============

    Load C0
    Load C1
    Run C0
    Run C1
    ** Application is running on both core with SUCCESS
    Halt C0
    Halt C1
    Run C0
    Run C1 - **FAIL after this step
    In Console window is printed:
    CortexA53_0_1: Unhandled ADP_Stopped exception 0x7002EF80
    CortexA53_0_0: Unhandled ADP_Stopped exception 0x7002B5E0


    Load C0
    Run C0
    Load C1
    Run C1
    ** FAIL - execution of application fails

    Load C1
    Run C1
    Load C0
    Run C0
    ** Application is running on both core with SUCCESS
    Halt C1
    Run C1
    ** Application is running on both core with SUCCESS
    Halt C0
    Run C0 - **FAIL after this step
    In Console window is printed:
    CortexA53_0_0: Unhandled ADP_Stopped exception 0x7002B5E0
    CortexA53_0_1: Unhandled ADP_Stopped exception 0x7002EF80

    Can you try the same on your side?
    Do you have the same results of testing on your side as i did or these tests fails only on my side.

    Best regards,

    Novica

  • Novica,

    After fixing my debug configuration on Core 1 I'm seeing correct behavior. I'm able to halt/resume at the group and core level and step. I see the application continue to run and see the count variables increment.

    Let me do a couple variations and I'll confirm I see consistency for the sync-group and no-sync group, etc.

    To confirm, I'm on CCS v10.

    Best regards,

    Dave

  • Novica,

    I re-checked changes to your files I did on my side and they were minor. I did add an Idle function to the project.

    in the cfg file:

    >>>

    /* Add Idle function */
    var Idle = xdc.useModule('ti.sysbios.knl.Idle');
    Idle.addFunc('&taskFxn_2');

    <<<

    In main.c:

    >>>

    Void taskFxn_2(UArg a0, UArg a1)
    {
    while(1){}
    }

    <<<

    Both sync group and non-sync-group are letting me load/run/step without throwing an error. The sync-group operation does this more smoothly. I additionally set hardware breakpoints on the counter increments and when running you can see cores reach the "Suspended - HW Breakpoint" (for the core hitting the BP) or "Suspended" (for the other).

    Best regards,

    Dave

  • Hello Dave,

    I re-checked changes to my files and I found what was causing problem on my side. I did create Idle task but i didn't put spin loop inside Idle task. After i put spin loop I do not have SMP debug problem  anymore.  

    Just one note, for SMP testing i was using the project which has statically created tasks in BIOS config file.

    Then in the same project (with statically created tasks) i add UART driver to print somthing on UART0. I can build project without error but when i tried to run code the code execution is stuck on line 97 in file  Hwi_asm_vecs_gnu.sv8A:

    But when i remove MMU init code from BIOS config file i can run the code and i can do SMP debuging without problem. I removed following code from BIos config file.

    var Mmu = xdc.useModule('ti.sysbios.family.arm.v8a.Mmu');
    Mmu.initFunc = "&HalInitMmu";
    Mmu.tableArrayLen = 30;

    Does in the case when the SMP is enabled, the initialization of the MMU must be done differently?

    sysbios_UART_Minimal_staticallyTasks.7z

    Then i continued with further testing. Now, instead to create all tasks statically in bios config file I create tasks dynamically in main function (in our project all BIOS object are created dynamically).

     I can build project without error but when i tried to run code on sync group the code is stuck in file C:\ti\bios_6_82_01_19\packages\ti\sysbios\family\arm\v8a\smp\Core.c 


    sysbios_Typical_dynamicallyTasks.7z

    Can you try to run both projects on your side?

    Do you have the same results of testing on your side as i did or these tests fails only on my side.

    Best regards,

    Novica

  • Novica,

    Thanks for your update, and glad to see that you were able to get through the initial configuration so we're seeing the same. I'll look through the MMU configuration and dynamic task creation issues you're reporting and come back to you.

    Best regards,

    Dave

  • Hi Novica,

    Re-opening this thread based on request. I am looking into this issue and will report back with updates.

    Can you tell me if it would be possible for you to move to SDK 7.1 for AM65 ? It would be easier to support.

    Regards

    Vineet

  • Hello Vineet,

    Thank you on the reply and the effort.
    I have taken over this task from Novica.

    I moved to SDK v07.01 for AM6548 and I also used CCS 10.2
     
    and still have exactly the same issue as Novica did.

    Best regards,
    Milan

  • Hi Milan,

    Thanks for replying. I will take a look

    Regards

    Vineet

  • Hi,

    Sorry for the delay.

    I have made some progress on this topic, I was able to build and run an SMP application after some experimentation.

    Here are the steps I followed

    1. Download and install the AM65x 7.01 RTOS SDK

    2. Import the rtos_template_app_am65xx_a53 in CCS (I am using v 9.3) inside processor_sdk_rtos_am65xx_07_01_00_14/demos/rtos_template_app

    3. Apply the attached patch which modifies the BIOS configuration and application to enable SMP, adds Idle task and sets task affinity

    4. Re-build the project

    5. Group the first two cores in the AM65 target configuration CortexA53_0_0 and CortexA53_0_1. (Don't use the other two cores as SMP is not supported for them)

    6. Run the GEL initialization script <SDK_INSTALL_PATH>/pdk_am65xx_07_01_00_55/packages/ti/drv/sciclient/tools/ccsLoadDmsc/am65xx/launch.js (modify the variable thisJsFileDirectory inside the script before). This will initialize the IDK.

    7. Connect the SMP group composed of CortexA53_0_0 and CortexA53_0_1 once initialization is complete.

    8. Load the project and step through the code.

    Assumptions: Modules which are part of SDK 7.01 have been imported in CCS and those toolchains are selected for the project

    Patch for SMP application

    SMP_app.patch

    Regards

    Vineet

  • Hello Vineet,

    Milan tried to answer but got an error message from the (new) forum.
    But he anyway hands over this topic to me, because he is going to work on other urgent things.

    He told me that he tried your patched project rtos_template_app and it works fine on his side.
    He can place breakpoint in task that is running on A53 core 0 and also breakpoint in task running on core 1 and debug seems to be working fine.

    Our plan is to use core 1 running comunication protocol based on ethernet (using emac driver) so in a way this is connected to this post:
    e2e.ti.com/.../am6548-using-emac-driver-in-mii-mode-on-am6548-sr2-0
    The rest of the application will be running on core 0.

    My plan is now to also try your example on my own, then find out the difference between our failing app and yours which works fine.
    Hopefully we just did something wrong.

    Afterwords I will try to migrate our 'real' application to SMP (with fix affinities to core 0) and do some tests.
    So this will take some time and then I come back.

    Best regards,
    Ruediger

  • Hi Vineet,
    now I tested according to your last post.
    But I did the tests with latest CCS v10.2.

    As you described, I imported rtos_template_app_am65xx_a53 and applyed your patch.
    In your patch, in function biosTaskCreate() the new function argument affinity was not applied to the taskParams.
    I corrected that.

    Then I created the Sync Group, loaded the program and let it run.
    It seem to run as expected.
    To confirm SMP behavior, I replaced LED blinking in GPIO task with just incrementing counter and removed task_sleep() so that task would never yield.
    As GPIO task has higher priority than uart task if both are running on the same core uart task would never execute which is not the case.

    Then I inspected the debugging problems we had so far.
    Result:

    1. Using HW Breakpoints seem to work
    2. Single-stepping seem to work
    3. Using SW Breakpoints does NOT reliably work well.
      The SW breakpoint adds a "hlt #0" instruction at the location, and does not remove it anymore.
      So the execution gets stuck at this location and does never resume.
    4. "Step-Over" does NOT work, since this implicitely uses SW breakpoints

    How to reproduce this debug-issue using your sample application:

    METHOD A: use core 0 and Step-over

    1. After loading the program, place a SW breakpoint in function gpio_toggle_led_task() just at the line GPIO_toggle()
    2. Start the cores and the target will halt at the breakpoint. gpio_toggle_led_task() runs at core 0.
    3. In the disassembly window you can see that the core is at
      mov w0, #0
      bl #0x7000b420
    4. Press F6 (Step over)
    5. Now, execution is at the next line "Task_sleep(LED_BLINK_DELAY_VALUE);"
    6. In the disassembly window, press icon "refresh view" - then you see that the core is at
      hlt #0
      bl #0x70018a70
      See attached screendump. [I did not manage to attach something in this new forum, I can enter just an URL]
    7. Now you cannot step anymore, and you cannot run the core anymore. It got stuck at his breakpoint forever.

    I had the same effect not by "Step Over" but just by using several SW breakpoints, just running from one to the next.

    METHOD B: use core 1 just with a simple SW breakpoint

    1. After loading the program, place a SW breakpoint in function uart_task() just at the line UART_printf(echoPrompt);
    2. Start the cores and the target will halt at the breakpoint. uart_task() runs at core 1.
    3. You need to select core 1 and in the disassembly window you can see that the core is at
      UART_printf(echoPrompt);
      hlt #0
      bl #0x7000fa10
    4. Now you cannot step anymore, and you cannot run the core anymore. It got stuck at his breakpoint forever.

    I guess this issue comes from the way the debugger manages SW breakpoints, so it's maybe a CCS/debugger issue and not a SYSBIOS issue?

    Regards,
    Ruediger

  • Ruediger,

    I'm unlocking this thread to allow further discussions here. My colleague will start posting updates here soon.

    Regards

    Karthik

  • Hi Karthik,
    I try again to upload the screendumps.

    Ok, it worked now. Maybe I had an issue with my web browser before (chrome).

    Regards,
    Ruediger

  • Hi Ruediger, 

    I tried both the UART SMP demo and the RTOS_template_app as you and Vineet tried. I set several breakpoints and it seems working for me. One note though - I used the run,stepping... from the debug pane, instead of the top level tool bars. You can show the debug pane tool bar by clocking the three dots on the top right corner of the debug window. see below screenshots where I circled areas 1 and 2. 

    I also attached my complete project, so you can try on your side. If we are still seeing differences, and your test code is free of any private code, you can send yours so i can cross check. 

    regards

    Jian

    me. This is the rtos_template_app_am65xx_a53.zip

  • Hi Jian,

    thank your for the response.
    I will test this in the next days and tell you then more.

    What I can say today is that in your app.c file one line is missing which sets the affinity.
    But this has maybe no impact on debugging, we will see.

    Regards, Ruediger

  • Hi again Jian,

    now I did a further test with exactly the project you sent me (without the affinity fix).
    It shows the same issues as I had before: When using "step over" the core keeps haltet.
    See this picture:

    Could you try again at your setup?

    If everything really works at your side, I think we need to compare our debug chain.
    Therefore I attach here in advance some information of my CodeComposer configuration:

    smp_ccs_configuration.zip

    smp_ccs_debugconf.zip

    There is also a (little) chance that I just make something wrong, when connecting/loading/debugging. If this is the case, a common debug session would help.

    Regards,
      Ruediger

  • Ruediger, 

    I think I can see what you observed - the disssembler view  must be open. I tried the following:

    1. with disassembler window open, "Move" to the next line after hlt #0, you can step further. 

    2. without disassembler window open, the source code can step over fine. 

    Could you confirm both 1 and 2 on your side?

    I assume the hlt instruction was added so RTOS can execute at next interrupt. I will check with my team on RTOS-aware debugging, and update late my Monday. 

    regards

    Jian 

  • Ruediger, 

    also can you confirm you grouped the cores as "Sync Group"?

    I also sent a invite to compare our CCS configurations. 

    Jian

  • Ruediger, 

    I had a debug session with the author of the the document:

    https://software-dl.ti.com/ccs/esd/documents/ccs_smp-debug.html#debugging-with-a-sync-group

    We noticed that the MSMC region, where the shared code space was loaded, was not setup for the debug. Thus the debugger does not know the memory is shared and cores are seeing different memory content. we saw the HLT instruction viewed from one core but not from the other. this issue will only show up when we set breakpoint or step through code.  

    Since the we use .js script to load sysfw, the recommended steps to add the shared memory for the debugger would be:

    1. create a GEL file that sets up the shared memory regions:


      GEL_TextOut("MSMC and DDR memory mapped as shared with cross triggering for SMP Configrations.\n");

      GEL_MapAddStr(0x70000000, 0, 0x00200000, "R|W|AS4|SH1C", 0); /* MSMC Area, no need to specify CACHE as MSMC is IO-coherent */
      GEL_MapAddStr(0x80000000, 0, 0x80000000, "R|W|AS4|SH2C", 0); /* DDR Area, no need to specify CACHE as DDR is IO-coherent */

    2. Add GEL file for both A53 CPUs in the target config file. Thus when both cores are connected, the GEL get executed. 

    Please give it a try. I will also try on my side. 

    regards

    Jian 

  • Jian,

    I tested your suggestions with the GEL files, but it doesn't help.
    The problem with the "hlt #0" instruction is still there.

    What I have done:

    1. for both A53 cores I added a separate GEL file and loaded them via CCS 'hotmenu' after loading the program, but before starting debugging
    2. for core 0, I added in M4_R5orA53_Startup.gel function Startup() the line
      GEL_LoadGel("$(GEL_file_dir)/../A53_debug_smp_c0.gel");
    3. for core 1, I added A53_debug_smp_c1.gel in the target configuration

    Here I attached the two GEL files (they are nearly identical)

    A53_debug_smp_c0.gelA53_debug_smp_c1.gel

    After program load, I selected each core and run the GEL function via CCS menu "Scripts" -> "default" -> "a53_debug_smp_c0/1"

    Console output is

    CortexA53_0_0: GEL Output: MSMC and DDR memory mapped as shared with cross triggering for SMP Configrations.
    CortexA53_0_1: GEL Output: MSMC and DDR memory mapped as shared with cross triggering for SMP Configrations.

    When I run these GEL functions BEFORE program load, the program doesn't run, it always gets stuck.
    So I run these AFTER program load, but like I said before, it didn't solve the problem.

    Any other ideas?
    Ruediger

  • Ruediger, 

    I did the test on my side using similar steps, where the GEL was ran before the program load for both cores. without enabling any breakpoint, the program runs properly, but once enable breakpoint (debugger started), the program got stuck in idle loop, similar to what you've seen. So I suspect additional memory configuration is needed. I have another work session with the CCS team tomorrow and will report back after that. We may need to describe full memory map for the debugger. 

    There is a working example for DRA7xx, at:

    C:\TI\ccs1031\ccs\ccs_base\emulation\gel\DRA75x_DRA74x\DRA7xx_cortexa15_cpu0_startup.gel

    in case you want to experiment more in the mean time. 

    regards

    jian

  • Ruediger, 

    Just wrapped up a work session with Ki-Soo from our CCS team. We still seeing two issues:

    1.  CCS->Tools->Memory Map menu does not show correct memory region we setup for the debug (this is likely a cosmetic issue that he will look into)

    2. core0 is suspended for "unknow reason" when core1 hit the breakpoint then resume. Though we no-longer seeing the unwanted HLT instructions, we still see an issue with the debugger.

    Ki-soo will get a board and do some digging on his side. In the mean time, can you add a GEL command:

        GEL_MapOn();

    in the end of GEL file, and give it a try, then confirm to me if you still see the HLT instruction on your side. 

    My complete GEL file is below. I broke up the memory regions matching to the linker command file we used. but it should not matter to designate the whole MSMC and DDR region in one block. 

    Thanks

    Jian

    hotmenu a53_debug_smp_c0()
    {
    GEL_TextOut("MSMC and DDR memory mapped as shared with cross triggering for SMP Configrations.\n");

    GEL_MapAddStr(0x41C00000, 0, 0x00080000, "R|W|AS4|SH1C|CACHE", 0); /* OCMC Area,  CACHE management is needed */
    GEL_MapAddStr(0x70000100, 0, 0x00000F00, "R|W|AS4|SH2C", 0); /* MSMC Area, no need to specify CACHE as MSMC is IO-coherent */
    GEL_MapAddStr(0x70001000, 0, 0x00001000, "R|W|AS4|SH3C", 0); /* MSMC Area, no need to specify CACHE as MSMC is IO-coherent */
    GEL_MapAddStr(0x70002000, 0, 0x001FE000, "R|W|AS4|SH4C", 0); /* MSMC general Area, no need to specify CACHE as MSMC is IO-coherent */
    GEL_MapAddStr(0x80000000, 0, 0x10000000, "R|W|AS4|SH5C", 0); /* DDR0 Area, no need to specify CACHE as DDR is IO-coherent */
    GEL_MapAddStr(0x90000000, 0, 0x10000000, "R|W|AS4|SH6C", 0); /* DDR1 Area, no need to specify CACHE as DDR is IO-coherent */
    GEL_MapAddStr(0xA0000000, 0, 0x60000000, "R|W|AS4|SH7C", 0); /* DDR2 Area, no need to specify CACHE as DDR is IO-coherent */
    GEL_MapOn();
    }

  • Jian,

    with the additional GEL_MapOn() I can confirm that we made now good progress!
    As long as I debug in core 0 everything is fine: HLT instruction is no more visible and I can "step-over" any source lines, even the Task_sleep().

    The only remaining topic seems now the issue when stepping in core1. Debugging here leads really to get stuck somehow.
    I am curious what Ki-Soo will find out.

    Regards, Ruediger

  • Ruediger, 

    Ki-soo and I had another work session seems we can step over on core 1 also. Can you confirm back which line the code get stuck when you were stepping on core1? 

    Also can you try the following steps to see if you can reproduce the same steps without any errors:

    1. Run .js, connect to the synchronous cores. Run the SMP GEL for core 0, then run the same SMP gel for core 1 (note you must run the GEL for both cores each);

    2. set three breakpoints in app.c, as shown below screenshot

    3. verify you can hit all three breakpoints repeatedly. Note that the second breakpoint is a scanf() thus you need to input some character in the UART window.

    3. Test Step Over at each breakpoints, by first hit a breakpoint upon run, then Step-Over. For UART_Printf(), you should see UART lines printed out; for UART_Scanfmt(), you will be able to see assembly step over, but to be able to type a character in the  UART window, you will need to run the program until the next breakpoint. 

    Not sure if I missed any descriptions, we can always do a webex session to look at your screen. 

    Jian

  • Jian,

    I tested again, following exactly your instructions and setting the breakpoints at the same lines you described:
    For a while it works fine, but after some stepping and setting other breakpoints I have strange effects:

    • When "stepping over" a UART_printf in the switch 'default' block, debugger stops inside (!) function UART_printf() at the beginning.
      Then, when clicking 'Step return' the system runs until it hits the next breakpoint.

    • When setting an additional breakpoint for core 0 in function 'gpio_toggle_led_task()' and running to alternating breakpoints core 0 / core 1, the core 1 gets suddenly stuck completely. I even cannot do assembly stepping on that core anymore, it just doesn't step.
      In this situation assembly stepping of core 0 works, but not for core 1.

    I attached pictures here.

    Maybe we could make a debug session, so that I can show the behavior to you?
    Next week I am on vacations and I am back at June 14.

    Regards, Ruediger

  • Ruediger, 

    Ki-soo and I have seen similar errors where "core get suspended due to reason-unknown". But we were not able to reproduce with the steps i described, so we thought the extra bottle of beer cured CCS :)  Likely we can following your additional steps to recreate the issue and no need for for a call. 

    We suspect the rest of SOC memory map need to be spelled out to the debugger, beyond MSMC and DDR. So likely when stepping across a peripheral operation, the same issue happened where a MMR address was not being setup as shared memory to both cores. - to be confirmed.

    regards

    Jian

  • Just a quick update: I can reproduce this issue locally and have escalated it to engineering. I filed a bug for this. Tracking ID: https://sir.ext.ti.com/jira/browse/EXT_EP-10453