Gracefully Restarting AM335x + TI-RTOS Application -- Solution

Other Parts Discussed in Thread: SYSBIOS, AM3358

Hello!

I previously posted about this situation, but the whole thread seems to have gotten corrupted or something because it is no longer reaching the TI people I was discussing it with, and so I have to assume that the whole thread is also not available to the public who COULD USE this information, along with the SYSBIOS (TI-RTOS) team.  (And I believe the below is both a better and shorter explanation.)

The problem I was trying to solve:  I sometimes need to restart my AM335x + TI-RTOS application under the debugger over and over again (sometimes dozens of times) in order to track down and better understand a problem, but the CCS "Restart" button was not working.  As soon as I would step forward or let the (otherwise working) program run, CPU exceptions would be generated.  A classic example is when the ORIGINAL problem is a CPU exception that causes the call stack to go away, so I cannot just look at the call stack to find out how the application got to where it is, but rather have to trace the execution manually.  The CPU exceptions that appeared after pressing the "Restart" button, and that were not present after a fresh program load, were often caused by attempts to use NULL pointers and/or by wisely-placed assertions that check for NULL pointers.  And if I removed one CPU exception (e.g. by commenting out the code that generates it), another one would show up.  And so I started this search for a solution with a strong feeling that SOMETHING is different between these 2 cases:

A) fresh debug session and program load, vs

B) pressing the CCS "Restart" button.

In particular, this is for the TI-RTOS team that might want to pay particular attention because some design assumptions about TI-RTOS program start-up WERE NOT geared for using the CCS "Restart" button with the AM335x architecture!  I discovered the root cause of the problem, and also found an elegant solution that, if I were the TI-RTOS team, I would want to use (at least a variation of it) for the AM335x + TI-RTOS combination, which has some unique properties.

Additionally, this is for the TI team members I was previously discussing this with, whose user IDs are:  Ki-Soo Lee, ToddMullanix, and ScottG, each of whom contributed excellent ideas that helped me find the solution.

What I'm working with:  Win7-64-bit,

Dev Env:  CCS 6.1.2

Platform:  Custom board with MYIR brand MCC-AM335X-Y board with AM3358, 250MB RAM and other electronics that seem to be working perfectly.

Packages:  SYS/BIOS 6.45.1.29, UIA 2.0.5.50, AM335x PDK 1.0.3

 

Here is the root cause of the issue and its solution:

1.  In my environment I have an AM335x (a microPROCESSOR as opposed to a microCONTROLLER), which typically loads all of its program AND data into RAM before executing from RAM.  I cannot just "reset" the processor and expect the program to still be there, because it is located in DDR RAM (not SRAM where it would be preserved), and the silicon that drives the DDR and all of its refresh operations will have just been reset.  In this case, the CCS "Restart" button SHOULD work nicely, because it simply places the Program Counter (PC) back at the  _c_int00  entry point, which SHOULD shut off interrupts and re-initialize everything and allow my program to run correctly, right?  Well, not quite.  Because the program is entirely in DDR RAM, the linker produces a layout in which these symbols defined in the linker script have the same value:

     __data_load__ == __data_start__.

2.  At the end of executing   _c_int00,  it branches to the  gnu_targets_arm_rtsv7A_startupC  function in  <bios>/packages/gnu/targets/arm/rtsv7A/startup.c, which carries out 2 loops to initialize RAM (BSS section and DATA section) before calling Startup_exec().

The loop that initializes the DATA section is supposed to populate the global (and static) variables that have initializers (example:   int  my_global = 255;).  This includes all program sections named ".data" or that have sections that start with ".data" -- which includes all the TI-RTOS  ...__state__V  struct variables, which have such initializers.  This is KEY to the "Restart" failures as you will see shortly.

The source for that loop is this (as of SYSBIOS 6.45.01.29):

	/* relocate the .data section */
	dl = & __data_load__;
	ds = & __data_start__;
	de = & __data_end__;
	if (dl != ds) {
		while (ds < de) {
			*ds = *dl;
			dl++;
			ds++;
		}
	}

Pay careful attention to what happens when  __data_load__ == __data_start__:  the loop doesn't execute!  This is correct, because it would just be copying a block of RAM into itself, which isn't going to accomplish anything except a delay.

So when there is a NEW DEBUG SESSION or a FRESH PROGRAM LOAD, with  __data_load__ == __data_start__,  the linker itself has initialized this area of RAM and the FRESH PROGRAM LOAD writes this data in its pristine, initialized state, where each variable will have its initializer value.

When the program is allowed to execute past the TI-RTOS start-up initialization, then PAUSED and the CCS "Restart" button is pushed, the PC goes back to  _c_int00  routine, and comes forward and THAT LOOP doesn't execute (and indeed cannot do any good anyway, since there is no source available to copy the freshly-initialized variable data from, save doing a fresh PROGRAM LOAD), and so the program enters main() with these variables NOT RE-INITIALIZED, but instead IN THE STATE THEY WERE IN WHEN THE DEBUGGER WAS PAUSED.
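To make the difference between the two cases concrete, here is a small host-side C model of that relocation loop. It is only a sketch: the arrays stand in for the .out file image and for DDR RAM, and every `sim_` name is hypothetical rather than part of SYS/BIOS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DATA_WORDS 4

/* Hypothetical stand-ins: the initialized image in the .out file, and RAM. */
static const unsigned int out_file_data[DATA_WORDS] = { 255u, 1u, 2u, 3u };
static unsigned int ram_data[DATA_WORDS];

/* Models a fresh program load: the debugger writes .data into RAM. */
void sim_fresh_load(void)
{
    memcpy(ram_data, out_file_data, sizeof ram_data);
}

/* Same shape as the relocation loop in startup.c; dl/ds/de are parameters
 * here so that both linker layouts can be modeled. */
void sim_relocate_data(const unsigned int *dl, unsigned int *ds,
                       const unsigned int *de)
{
    if (dl != ds) {
        while (ds < de) {
            *ds++ = *dl++;
        }
    }
}

/* "Restart" in the RAM-resident layout, where load address == start address. */
void sim_restart_ram_model(void)
{
    sim_relocate_data(ram_data, ram_data, ram_data + DATA_WORDS);
}

/* "Restart" in a layout that keeps a separate load image (e.g. in flash). */
void sim_restart_separate_load(void)
{
    sim_relocate_data(out_file_data, ram_data, ram_data + DATA_WORDS);
}

unsigned int sim_read(size_t i)
{
    assert(i < DATA_WORDS);
    return ram_data[i];
}

void sim_write(size_t i, unsigned int v)
{
    assert(i < DATA_WORDS);
    ram_data[i] = v;
}
```

After `sim_fresh_load()`, writing a new value with `sim_write()` and then calling `sim_restart_ram_model()` leaves the modified value in place, whereas `sim_restart_separate_load()` puts the initializer back.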

There are at least 22 TI-RTOS state struct variables (ending in ...__state__V) that are in this situation.  One of them is  'xdc_runtime_Startup_Module__state__V'.  That struct has a field  'execFlag'  that contains the value 0 after a FRESH PROGRAM (RE)LOAD, but once TI-RTOS has initialized once and the "Restart" button is pushed, that field still contains the value 1.

So the next step in the start-up process is calling  xdc_runtime_Startup_exec__E().  This (through a function pointer) ends up in the  Startup_exec()  function in  <xdctools>\packages\xdc\runtime\Startup.c,  the code to which is:

/*
 *  ======== Startup_exec ========
 */
Void Startup_exec()
{
    Int i;

    if (module->execFlag) {  // <-- 'module' is a pointer to the xdc_runtime_Startup_Module__state__V struct,
        return;              //     so the function exits here, and all of the below does not execute!
    }

    module->execFlag = TRUE;

    for (i = 0; i < Startup_firstFxns.length; i++) {
        Startup_firstFxns.elem[i]();
    }

    (Startup_execImpl)();

    for (i = 0; i < Startup_lastFxns.length; i++) {
        Startup_lastFxns.elem[i]();
    }
}

Therefore:

1.  All the "firstFxns" do not run (HeapMem_init() is one of these -- this is what causes Memory_alloc() and Memory_calloc() to fail, which further down the line caused NULL pointers to be accessed as objects, or wisely-placed assertions to fail, generating exception messages and terminating the program).  This list can contain custom functions, as it does in my case, required to ensure DMTIMER3 (my source for my application's Clock module) has the correct input clock source.  I believe it is also here that the Interrupt Controller would be set so that all interrupts are disabled as a (correct) part of the start-up process.

2.  The Startup_startMods()  function doesn't run (most TI-RTOS system modules require this).

3.  All of the "lastFxns" do not run (which, like "firstFxns", can contain custom start-up functions).
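The effect of the stale flag can be reproduced in a few lines of host-side C. This is a simplified model with hypothetical `sim_` names, not the XDCtools source: one counter stands in for all of the start-up functions, and `sim_reload_data()` stands in for the debugger rewriting .data on a fresh load.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for xdc_runtime_Startup_Module__state__V. */
struct sim_startup_state {
    bool exec_flag;
};

/* In the real system this lives in .data: a fresh load rewrites it,
 * but "Restart" leaves whatever the previous run stored there. */
static struct sim_startup_state sim_state = { false };

/* Counts how many times the start-up functions actually ran. */
static int sim_heap_inits = 0;

static void sim_heap_init(void)
{
    sim_heap_inits++;
}

/* Same guard structure as Startup_exec().
 * Returns true if the start-up work ran, false if it was skipped. */
bool sim_startup_exec(void)
{
    if (sim_state.exec_flag) {
        return false;           /* stale flag: everything below is skipped */
    }
    sim_state.exec_flag = true;
    sim_heap_init();
    return true;
}

int sim_heap_init_count(void)
{
    return sim_heap_inits;
}

/* Models a fresh program load putting the initializer back. */
void sim_reload_data(void)
{
    sim_state.exec_flag = false;
}
```

Calling `sim_startup_exec()` twice in a row models a "Restart": the second call returns false and the heap init count stays at 1. Only after `sim_reload_data()` does the start-up work run again.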

=-=-=-=

So there you have it.  Either by design or by oversight, TI-RTOS was not DESIGNED to work with the CCS "Restart" button where __data_load__ == __data_start__.  A fix for this is not difficult to implement, though, nor would it be difficult to implement in an elegant way in the TI-RTOS design.

=-=-=-=

I'm including (attached here) some source code that has been tested and makes the CCS "Restart" button work reliably under the AM335x + TI-RTOS combination.  All someone needs to do is copy these 3 source files anywhere into their project.  The  _startup.c  file will require that the project have an additional include directory (if it is not already there) pointing into the XDCTOOLS installation that is in use by the project.  Example:  C:\ti\xdctools_3_32_00_06_core\packages.

4774.RestartAssistantSource.zip

Note that I have the following code in my  main()  routine -- before calling  BIOS_start(), I call this:

	/* Report on Restart Assistance data as soon as UART0 is ready for reporting. */
	RestartAssistant_ReportImportantDataBufferSizeDifferences();

and it reports:

INFO:  Size of DATA section storage array is larger than needed at [65536],
        whereas size of DATA section is [7116].
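The attached source is not reproduced in this post. As a rough illustration of the general idea that the message above hints at (a fixed-size storage array holding a pristine copy of the DATA section), here is a host-side C sketch. It is not the attached code: all names are hypothetical, and the marker-word scheme used to tell a fresh load from a restart is an assumption, as is the requirement that the backup array sit somewhere the start-up code does not zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_DATA_SIZE    16u          /* stands in for __data_end__ - __data_start__ */
#define SIM_BACKUP_SIZE  64u          /* fixed-size storage array, larger than needed */
#define SIM_FRESH_MARKER 0xC0FFEE01u  /* value only a fresh load can put back */

/* Stand-in for the live .data section; word 0 is the marker variable. */
static uint32_t sim_data[SIM_DATA_SIZE / 4u];

/* Stand-in for a backup array that start-up code must NOT clear
 * (assumption: on target it would be placed outside .bss and .data). */
static uint8_t sim_backup[SIM_BACKUP_SIZE];

/* Models the debugger's fresh load writing initializers into .data. */
void sim_fresh_load(void)
{
    sim_data[0] = SIM_FRESH_MARKER;
    sim_data[1] = 255u;               /* e.g. int my_global = 255; */
    sim_data[2] = 0u;
    sim_data[3] = 0u;
}

/* Call early in start-up, before any initialized variable is used.
 * Returns 1 if a snapshot was taken (fresh load), 0 if .data was restored. */
int sim_restart_assist(void)
{
    int fresh = (sim_data[0] == SIM_FRESH_MARKER);

    assert(sizeof sim_data <= sizeof sim_backup);
    if (fresh) {
        memcpy(sim_backup, sim_data, sizeof sim_data);  /* save pristine copy */
    } else {
        memcpy(sim_data, sim_backup, sizeof sim_data);  /* undo the earlier run */
    }
    sim_data[0] = 0u;   /* mark "already started" until the next fresh load */
    return fresh;
}

uint32_t sim_get(unsigned int i)
{
    assert(i < SIM_DATA_SIZE / 4u);
    return sim_data[i];
}

void sim_set(unsigned int i, uint32_t v)
{
    assert(i < SIM_DATA_SIZE / 4u);
    sim_data[i] = v;
}
```

The marker is cleared only after the snapshot is taken, so the saved copy still contains the "fresh" value. A restore therefore brings the marker back too, and it has to be cleared again each time.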

Note that in that code, SOLUTION NUMBER 1 is implemented, and under testing it has been working perfectly with the AM335x + TI-RTOS combination.  SOLUTION NUMBER 2 is not implemented; it is a suggestion as to how the TI-RTOS team could remedy this problem for good by implementing this design concept:

Instead of:

      type_specifier  module_xyz__state__V = {struct initializer values};

It could be:

      type_specifier  module_xyz__state__V;

      type_specifier  module_xyz__state__V__initial_values = {struct initializer values};

      // then at start-up time, regardless of whether the program was freshly loaded or not:

      module_xyz__state__V = module_xyz__state__V__initial_values;   // for each state struct


And I cannot attest (yet) whether those structs ending in ...__state__V   are the ONLY  ones that would need to be treated this way, but this could easily be determined by the TI-RTOS team.
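Written out as compilable C, the SOLUTION NUMBER 2 concept might look like the sketch below. The struct layout and all `module_xyz` names are hypothetical; real TI-RTOS state structs are generated by the configuration tooling and will look different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical module state type. */
struct module_xyz_state {
    bool exec_flag;
    int  ref_count;
};

/* Run-time state: no initializer, so it does not depend on the loader. */
static struct module_xyz_state module_xyz_state_v;

/* Pristine values: const, never written by the program itself. */
static const struct module_xyz_state module_xyz_state_initial = { false, 1 };

/* Called on every start-up, whether or not the program was freshly loaded. */
void module_xyz_reset_state(void)
{
    module_xyz_state_v = module_xyz_state_initial;   /* plain struct copy */
}

struct module_xyz_state *module_xyz_get_state(void)
{
    return &module_xyz_state_v;
}
```

The cost of this design is a second copy of each state struct in memory, which matters on targets where RAM is tight.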

I hope this helps a lot of people.

I know it has allowed me to save a TON of time (because my program is LARGE and requires quite a lag to re-load it) -- since I can now use my CCS "Restart" button and have it work as it was intended!

Kind regards,
Vic

  • Hi Vic,

    Thank you VERY much for your efforts to help improve our products!  And to help speed development time for others, letting them avoid running into hard to diagnose problems.

    I think what you describe boils down to a general constraint: CCS “restart” functionality does not work for programs built for RAM model (where the loader is responsible for initializing variables).  This is the case if only SRAM is used too; DDR reset is definitely bad, but modified variables in SRAM are bad too.  And this isn’t specific to TI-RTOS apps, but any app that uses standard runtime startup routines (that do not have “restart assistant” support like you’ve shown).

    Does this sound right to you?

    I will follow up internally with others on the TI-RTOS team, and also with Ki-Soo from the debugger side.  I’m thinking that maybe when a “restart” is triggered for a program, and the program is built for --ram_model, that a partial load (of just .data and .bss) should be performed by the debugger.  This should be fast, and allow restart to work properly.  This keeps the boot code lean, and common between debug and deploy environments.   

    Please let me know if I’m missing something here…

    And thanks again!

    Regards,
    Scott

  • Hi, Scott!

    Not only do I think you nailed it here, but your idea about added debugger functionality is even better than trying to design (or re-design) a program or complex library/system like TI-RTOS to survive the RAM model differences. This works to the advantage of situations where RAM is tightly constrained, and thus, storing "initializer" variables in RAM to ALSO make copies of them for the run-time variables they serve, could cost RAM space that just isn't available. I also think that this role (under CCS) is well placed, considering the "knowledge" about what the "Restart" button is and does lives in CCS, that CCS can also thus rightfully have the knowledge about the different program instruction/data models and perform an intelligent step to make "Restart" not have to rely on the models for it to work. I agree that that sphere of responsibility is well placed.

    You probably know this, but since you mentioned "partial load" in the same sentence as the .bss section: in TI-RTOS (and in the typical default standard C start-up steps), the .bss section is already taken care of by writing 0's (zeros) across that entire block. What TI-RTOS (and other systems) doesn't have is the ability to re-populate a fresh .data section by "reaching through the debugger" and grabbing the appropriate part of the .out file that contains the initialized .data section. I mention it in case any readers are not already familiar with the differences between the .bss and .data sections.

    Thank you for your time and thoughtful evaluation!

    Have a fun & productive rest of your week!!

    Kind regards,
    Vic
  • Hi Vic,

    I talked to Ki-Soo, and then filed a CCS bug for this (CCBT-1993).

    Thanks again for all your work on this.  I hope you have a fun and productive rest of your week too!

    Best regards,
    Scott

  • Outstanding, Scott!! Thanks for the nice wish! :-)
  • Hi Vic,
    I too had been struggling with this issue, so I downloaded the .zip and added the files to my project. When I tried to build the project, some .h files could not be found, such as hmAssertxx.h and hmGenericTypes.h. Should these have been included in the .zip file? I searched my entire TI installation folder and could not find them, so I assume they are your files rather than standard TI files.
    Regards,
  • Hi, Mike!

    My apologies.  You should be able to just delete the #include lines.  If it then doesn't compile because of a missing typedef (such as UINT32 or something similar), just substitute the <stdint.h> version (e.g. uint32_t) or something equivalent.
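One way to do that substitution without touching every use is a single typedef near the top of the affected file. This is a sketch: UINT32 is the example name from the reply above, and whether the attached sources need other typedefs is not known.

```c
#include <assert.h>
#include <stdint.h>

/* Replacement for a typedef that the missing header presumably provided. */
typedef uint32_t UINT32;

/* Compile-time check (C11) that the substitute has the expected width. */
static_assert(sizeof(UINT32) == 4, "UINT32 must be 32 bits wide");
```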

    Please note that these files are (in my opinion) not a perfect solution as they don't quite cover ALL contingencies.  I believe that a modification to CCS was in the works to implement a better solution by managing the issues I found through the debugger/emulator (my memory is kind of vague on this as I've been working on other projects for the last 6 months).  Either way, I'd recommend stepping through that start-up code carefully to confirm it handles what you're seeking to handle.

    Kind regards,

    Vic