MCU-PLUS-SDK-AM243X: RPRC-image-problem when loading from flash by Bootloader (part2)

Part Number: MCU-PLUS-SDK-AM243X

Hello,

as mentioned in https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1106819/mcu-plus-sdk-am243x-rprc-image-section-loading-problems/4103767#4103767

I still have a problem with loading our application from Flash. I finally got to debug it correctly.

When I flash our bootloader to the OSPI flash and it loads our application, the application won't run. If I then connect to it, the core is stuck in a data abort that is raised by the loaded application (the one located in flash). I can verify this: when I connect and load the symbols for the stored application from my ELF file (the same one used to create the appimage with RPRC), I end up exactly there and the symbols match. That means it is the application that was stored in flash and loaded correctly by the bootloader.

So it is not the bootloader which creates this abort and we can see via the connected UART that the bootloader behaves correctly and loads the correct image.

And finally I was able to set a HW-breakpoint at 0x00000000 where our application-vectors are located and where the entry of the application is.

So I reset our board and loaded the bootloader image directly to the board via CCS (it's exactly the same one stored in flash, including the debug symbols). I run it and it stops at 0x00000000, which is where our loaded application starts (after runCpus()...).

If I do debug like this and just let the application run from here, our application starts and runs as expected. I do not even need to load the symbols. It's exactly the application loaded by the Bootloader. So I cannot see where the abort happens. If I just let the Bootloader run from Flash without loading it via CCS, it crashes into an abort.

I have no idea how else to debug this, where to look, or what I am doing wrong here.

Best regards

Felix

  • Hi Felix,

    It looks like the appImage is what causes the data abort. Can you add a debug log at the beginning of main(), so that you can confirm whether the data abort happens before or after main()?

    If it is before main(), then it is most likely the MPU settings. If it is after main(), you can add more debug logs to pinpoint where the problem is.

    Can you single-step the assembly in the bootloading case, starting from 0x00000000?

    Best regards,

    Ming

  • Yes, that's what I did already. But I think at least we could identify the problem a bit better:

    It seems a third-party library just expects CCS to be connected and prints out a string. When CCS is connected to the device, i.e. when I am fast enough to connect to the freshly re-powered device and catch it before it starts our application, it starts fine and additionally prints something to the CCS console. The abort does not happen in this case, no matter whether symbols are loaded or not.

    I deactivated all debug logs via SysCfg, but it seems the lib just hard-prints something, and I think that this is the issue. Is there some kind of "connection" that the application on the device builds up when it is connected to CCS, and which can produce such an abort when you hard-write something to it while no CCS is attached?

  • Hi Felix,

    Which "third-party library" are you referring to?

    It sounds like the CCS console is required somehow. As far as I know, you can disable the Debug Log completely in example.syscfg.

    One more thing you can try is to use the memory log instead of CCS console or UART. This way, you can check what has been output (via CCS Runtime Object View) once the JTAG is connected.

    Best regards,

    Ming 
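
To illustrate the memory-log idea in general terms, here is a minimal hand-rolled sketch of a log kept in RAM that can be inspected once JTAG is attached. It is not the SDK's own memory log; all names (`MemLog`, `MemLog_write`, `MEMLOG_SIZE`, the magic value) are hypothetical, and on a real target the buffer would have to be placed at a known address through the linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-RAM log, NOT the SDK's memory log implementation. */
#define MEMLOG_SIZE  256u
#define MEMLOG_MAGIC 0x4C4F4721u

typedef struct {
    uint32_t magic;              /* lets a debugger tell valid data from noise */
    uint32_t head;               /* total number of bytes written so far */
    char     data[MEMLOG_SIZE];  /* ring buffer holding the newest bytes */
} MemLog;

/* volatile so that every write really reaches memory before a crash */
static volatile MemLog g_memLog;

void MemLog_init(void)
{
    g_memLog.magic = MEMLOG_MAGIC;
    g_memLog.head = 0u;
    for (uint32_t i = 0u; i < MEMLOG_SIZE; i++) {
        g_memLog.data[i] = '\0';
    }
}

/* Append a string; the oldest bytes are overwritten once the buffer is full. */
void MemLog_write(const char *msg)
{
    assert(msg != NULL);
    for (; *msg != '\0'; msg++) {
        g_memLog.data[g_memLog.head % MEMLOG_SIZE] = *msg;
        g_memLog.head++;
    }
}

/* Copy the newest bytes, oldest first, into out as a C string. */
size_t MemLog_read(char *out, size_t outSize)
{
    assert(out != NULL && outSize > 0u);
    uint32_t count = (g_memLog.head < MEMLOG_SIZE) ? g_memLog.head : MEMLOG_SIZE;
    if (count > outSize - 1u) {
        count = (uint32_t)(outSize - 1u);
    }
    uint32_t start = g_memLog.head - count;
    for (uint32_t i = 0u; i < count; i++) {
        out[i] = g_memLog.data[(start + i) % MEMLOG_SIZE];
    }
    out[count] = '\0';
    return count;
}
```

Calling `MemLog_write("main reached\n")` at the checkpoints of interest leaves a trail that survives until the debugger connects, as long as the RAM holding the buffer is not re-initialized in between.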

  • Hey Ming, I'm not sure if I can post it here because of NDA restrictions, but it seems that this output wasn't the problem.

    We arranged to remove that output, but the abort still happens. I got a bit further: the abort also happens when I'm connected to CCS, but only sometimes, not always.

    So if I connect fast enough, before the Bootloader loads the application and I have my breakpoint at 0x00000000, I can then run it and sometimes it runs fine, sometimes it runs into the abort.

    Now something interesting happens: I can set a SW or even a HW breakpoint in the abort functions of the SDK, but it does not trigger. What happens instead when the abort occurs: CCS shows me "A reset occurred on the target" for the core that throws the abort. I can then just connect to it, and it sits inside the abort function.

    This only happens sometimes. If I step through my code with some more SW breakpoints it runs fine every time.

    Additionally: if no symbols for that app are loaded in CCS, it somehow really does catch the HW breakpoint inside the abort handler. CCS tells me in the call stack that it may come from HW_WR_FIELD32_RAW, but all the arguments CCS shows for that function are 0 and nothing more.

    I had a look at the registers of the SoC but I can't figure out which ones give me more information. Inside the abort-registers I see the following:

    The value of R13 points into TCMB, which is where our .undefinedstack is located. R14 points to HW_WR_FIELD32_RAW.

    We link it like this:

        GROUP {
            .text.hwi: palign(8)
            .text.cache: palign(8)
            .text.mpu: palign(8)
            .text.boot: palign(8)
        } > R5F_TCMB

    But I don't think that this is a problem. What would you suggest as next steps to catch that issue?

    And yes, I could use the memory log, but it is a bit unclear to me how to use it. Where in memory is it logged? We do not use any example SysCfg or example linker script, and I'm not sure we would get reliable logs for this issue.
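
On the question of which registers carry more information: on a Cortex-R5 the Data Fault Status Register (DFSR, read with `mrc p15, 0, <Rt>, c5, c0, 0`) and the Data Fault Address Register (DFAR, `mrc p15, 0, <Rt>, c6, c0, 0`) are the usual starting point. The sketch below decodes a raw DFSR value on the host. The status encodings are the ARMv7-R ones as I recall them and should be checked against the Cortex-R5 TRM; the type and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t     status;   /* 5-bit fault status: DFSR[10] : DFSR[3:0] */
    bool        isWrite;  /* DFSR[11]: set when the aborted access was a write */
    bool        isAsync;  /* true for asynchronous (imprecise) aborts */
    const char *desc;     /* short human-readable description */
} DfsrInfo;

/* Decode a raw DFSR value (hypothetical helper, encodings to be verified). */
void Dfsr_decode(uint32_t dfsr, DfsrInfo *out)
{
    assert(out != NULL);
    out->status  = (uint8_t)((((dfsr >> 10) & 0x1u) << 4) | (dfsr & 0xFu));
    out->isWrite = ((dfsr >> 11) & 0x1u) != 0u;
    out->isAsync = false;

    switch (out->status) {
    case 0x00: out->desc = "background fault (no MPU region matched)"; break;
    case 0x01: out->desc = "alignment fault"; break;
    case 0x02: out->desc = "debug event"; break;
    case 0x08: out->desc = "synchronous external abort"; break;
    case 0x0D: out->desc = "MPU permission fault"; break;
    case 0x16: out->desc = "asynchronous external abort";
               out->isAsync = true; break;
    case 0x18: out->desc = "asynchronous parity/ECC error";
               out->isAsync = true; break;
    case 0x19: out->desc = "synchronous parity/ECC error"; break;
    default:   out->desc = "other or implementation defined"; break;
    }
}
```

If the decoded status turned out to be an asynchronous external abort, that would fit an abort that depends on timing, and in that case the DFAR and the reported return address would not necessarily identify the faulting access.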

  • Hi Felix,

    I am not sure how big your __SVC_STACK_SIZE is in linker.cmd. Can you increase it from 256 to 4096 (if it is not already that value)? What you described sounds like a stack overflow.

    HW_WR_FIELD32_RAW is the MMR device register write. I do not understand why it causes the undefined instruction error.

    Best regards,

    Ming
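
For readers unfamiliar with the field-write helpers: a field write is normally a read-modify-write, so it touches the register twice, once to read and once to write. The sketch below shows that pattern on the host with an ordinary variable standing in for the register. `fieldWrite32` is a hypothetical stand-in, not the SDK macro itself, and the mask and shift values in the usage note are made up rather than taken from the ADC header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read-modify-write of one bit field in a 32-bit register.
 * Hypothetical stand-in for a field-write helper; not the SDK macro.
 */
static inline void fieldWrite32(volatile uint32_t *reg, uint32_t mask,
                                uint32_t shift, uint32_t value)
{
    assert(reg != NULL);
    assert(shift < 32u);

    uint32_t regVal = *reg;             /* first bus access: read */
    regVal &= ~mask;                    /* clear the field */
    regVal |= (value << shift) & mask;  /* insert the new field value */
    *reg = regVal;                      /* second bus access: write */
}
```

With a made-up single-bit field at bit 4, `fieldWrite32(&reg, 0x10u, 4u, 1u)` sets only that bit and leaves the rest of `reg` untouched. On hardware, either of the two accesses can abort if the peripheral behind the address does not respond.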

  • It's already 4096 because we are running FreeRTOS.

    The abort ends up in the HwiP_data_abort_handler. Investigating this a bit further:

    This is the HW_WR_FIELD32_RAW disassembly. So I looked at the four registers where the arguments are stored:

    The address 0x28001040 is in the range of ADC0, and exactly this address seems to be the ADC control register.

    We are using the ADC so this could really be the issue but I do not understand the timing problem here. The call seems to be a call to ADCPowerUp().

    Seems like this one comes too early?

    I tracked it down by setting a HW breakpoint at the start of ADCPowerUp() and another at the end of the function (where the asm pop is executed). The start is reached, but if I continue, it runs into the data abort and never reaches the end of ADCPowerUp().

    Seems like I found the faulty one.

    But I really do not know what I can do here.

  • Hi Felix,

    Here is the ADC module initialization procedure:

        System_init
           PowerClock_init(void)
              Module_clockEnable();  <-- where ADC module clock is enabled

        adc_singleshot_main
           App_adcInit
              ADCClearIntrStatus
                 HW_WR_REG32(baseAddr + ADC_IRQSTATUS, intrMask);  <-- first ADC MMR write
              ADCPowerUp
                 HW_WR_FIELD32(baseAddr + ADC_CTRL, ADC_CTRL_POWER_DOWN, ADC_CTRL_POWER_DOWN_AFEPOWERUP);
              ClockP_usleep(5U);
              ADCInit(baseAddr, FALSE, 0U, 0U);

    I would put some delay between System_init() and adc_singleshot_main(), or at the beginning of App_adcInit().

    Best regards,

    Ming
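
The simplest form of the suggested delay is a single `ClockP_usleep()` call before the first ADC register access. As a variation on that, and only if some readiness indication is available, the fixed delay can be replaced by a bounded poll. The sketch below shows that swapped-in technique in a host-testable form: `waitUntilReady`, the callback types and the fake module are all hypothetical, and on the target the sleep callback would be something like `ClockP_usleep`, while the readiness check would have to come from whatever status the system actually offers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef bool (*ReadyFn)(void *ctx);      /* hypothetical readiness check */
typedef void (*SleepFn)(uint32_t usec);  /* e.g. ClockP_usleep on target */

/*
 * Poll isReady() until it reports true or timeoutUs has been slept.
 * Returns true if the module became ready, false on timeout.
 */
bool waitUntilReady(ReadyFn isReady, void *ctx, SleepFn sleepUs,
                    uint32_t stepUs, uint32_t timeoutUs)
{
    assert(isReady != NULL && sleepUs != NULL && stepUs > 0u);

    uint32_t remaining = timeoutUs;
    while (!isReady(ctx)) {
        if (remaining == 0u) {
            return false;                /* gave up: never became ready */
        }
        uint32_t step = (stepUs < remaining) ? stepUs : remaining;
        sleepUs(step);
        remaining -= step;
    }
    return true;
}

/* Host-side stand-ins so the sketch can be exercised off-target. */
typedef struct {
    uint32_t pollsUntilReady;            /* polls that still report "not ready" */
} FakeModule;

static uint32_t g_sleptUs;               /* total time the fake sleep was asked for */

static bool fakeIsReady(void *ctx)
{
    FakeModule *m = (FakeModule *)ctx;
    if (m->pollsUntilReady == 0u) {
        return true;
    }
    m->pollsUntilReady--;
    return false;
}

static void fakeSleepUs(uint32_t usec)
{
    g_sleptUs += usec;
}
```

A caller would skip the ADC initialization and report an error when `waitUntilReady()` returns false, instead of running into the abort.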