CLA: nested function calls

Hi all,


I'm trying hard to get some decent code running in the CLA. It seems that I can only use one level of function call hierarchy.

The following CLA-code works, I can tell from my blinking LED as the task gets called via a timer interrupt:


float32 calc2(float32 argb){
    return argb+1.0;
}

float32 calc(float32 arga){
    return calc2(arga)+1.0;
}

__interrupt void cla_Task8(void){
    LEDstate = LEDstate^1;
    out = calc2(in);
}

Now if I add one more level of function call nesting it does not work anymore:

float32 calc2(float32 argb){
    return argb+1.0;
}

float32 calc(float32 arga){
    return calc2(arga)+1.0;
}

__interrupt void cla_Task8(void){
    LEDstate = LEDstate^1;
    out = calc(in); // calc2(in);
}

This behavior contradicts what the C/C++ compiler manual www.ti.com/lit/ug/spru514j/spru514j.pdf says:

The CLA compiler supports multiple nested levels of function calls. The restriction to two levels of function
calls has been removed.

 

So what is going wrong? I am using an F28379D and the TI 15.12.1 compiler, with the compiler option --cla_support=cla1.

Thanks, Tim

  • Hi Tim,

    Is the compiler generating an error for that nested call, or is the code just not working when you make that nested call? If it's the latter, please post the generated assembly for the code you have shown.
  • Hi Vishal,

    no, the compiler does not generate an error/warning, the program is just not working.
    The corresponding section from my .map file:
    0000a800 00000020 CLATest.obj (Cla1Prog:_calc)
    0000a820 0000001c CLATest.obj (Cla1Prog:_cla_Task8)
    0000a83c 0000000c CLATest.obj (Cla1Prog:_calc2)

    and the assembler:
    16 float32 calc(float32 arga){
    c, calc():
    00a800: 8902 AND ACC, @0x2
    00a801: 7740 NOP *-SP[0]
    00a802: 0000 ITRAP0
    00a803: 7FA0 MOV @AR0, AR7
    00a804: 8900 AND ACC, @0x0
    00a805: 74C0 SUB *+XAR0[0], AL
    17 return calc2(arga)+1.0;
    00a806: 0036 TRAP #22
    00a807: 799F MOV *+XAR7[AR1], AR1
    00a808: 0000 ITRAP0
    00a809: 7FA0 MOV @AR0, AR7
    00a80a: 0000 ITRAP0
    00a80b: 7FA0 MOV @AR0, AR7
    00a80c: 0000 ITRAP0
    00a80d: 7FA0 MOV @AR0, AR7
    00a80e: 3F80 MOV *XAR0++, P
    00a80f: 77C0 NOP *+XAR0[0]
    18 }
    00a810: 8902 AND ACC, @0x2
    00a811: 7700 NOP
    00a812: 0000 ITRAP0
    00a813: 7FA0 MOV @AR0, AR7
    00a814: 0000 ITRAP0
    00a815: 7FA0 MOV @AR0, AR7
    00a816: 0000 ITRAP0
    00a817: 7FA0 MOV @AR0, AR7
    00a818: 0000 ITRAP0
    00a819: 79AF MOV *BR0--, AR1
    00a81a: 0000 ITRAP0
    00a81b: 7FA0 MOV @AR0, AR7
    00a81c: 0000 ITRAP0
    00a81d: 7FA0 MOV @AR0, AR7
    00a81e: 0000 ITRAP0
    00a81f: 7FA0 MOV @AR0, AR7
    21 LEDstate = LEDstate^1;
    c, cla_Task8():
    00a820: 0000 ITRAP0
    00a821: 7841 MOV *-SP[1], AR0
    00a822: 8000 MOVZ AR7, @0x0
    00a823: 7580 SUB *XAR0++, AH
    00a824: 0001 ABORTI
    00a825: 7881 MOV *XAR1++, AR0
    00a826: 0004 PUSH RPC
    00a827: 7CA0 MOV @AR0, AR4
    00a828: 8000 MOVZ AR7, @0x0
    00a829: 75C0 SUB *+XAR0[0], AH
    22 out = calc(in); // calc2(in);
    00a82a: FFD6 LSR AH, 7
    00a82b: 799F MOV *+XAR7[AR1], AR1
    00a82c: 0000 ITRAP0
    00a82d: 7FA0 MOV @AR0, AR7
    00a82e: 0000 ITRAP0
    00a82f: 7FA0 MOV @AR0, AR7
    00a830: 8002 MOVZ AR7, @0x2
    00a831: 73C0 ADD *+XAR0[0], AH
    00a832: 8004 MOVZ AR7, @0x4
    00a833: 74C0 SUB *+XAR0[0], AL
    23 }
    00a834: 0000 ITRAP0
    00a835: 7FA0 MOV @AR0, AR7
    00a836: 0000 ITRAP0
    00a837: 7FA0 MOV @AR0, AR7
    00a838: 0000 ITRAP0
    00a839: 7FA0 MOV @AR0, AR7
    00a83a: 0000 ITRAP0
    00a83b: 7F80 MOV *XAR0++, AR7
    12 float32 calc2(float32 argb){
    c, calc2():
    00a83c: 8904 AND ACC, @0x4
    00a83d: 74C0 SUB *+XAR0[0], AL
    13 return argb+1.0;
    00a83e: 3F80 MOV *XAR0++, P
    00a83f: 77C0 NOP *+XAR0[0]
    14 }
    00a840: 0000 ITRAP0
    00a841: 79AF MOV *BR0--, AR1
    00a842: 0000 ITRAP0
    00a843: 7FA0 MOV @AR0, AR7
    00a844: 0000 ITRAP0
    00a845: 7FA0 MOV @AR0, AR7
    00a846: 0000 ITRAP0
    00a847: 7FA0 MOV @AR0, AR7
    00a848: 0000 ITRAP0
    00a849: 0000 ITRAP0
    00a84a: 0000 ITRAP0
  • The assembly shown here is C28x disassembly, not CLA. What you can do is right-click on the .cla file and go to properties. Under C2000 Compiler -> advanced properties -> assembler options, select --keep_asm, --c_src_interlist, and generate listing file (-al, I think). When you build the .cla file, it will dump the assembly file, of the same name, in the output folder.

    You can post the assembly from that file or attach the file itself.

  • Hi Tim,

    processors.wiki.ti.com/.../C2000_CLA_C_Compiler

    According to this wiki article

    "On CGT 6.2.x and older, the function call depth was 1 i.e. a task could call a function but a function could not call another. On CGT 6.4.0+ this restriction is no longer present, the call depth is infinite (as long as you have the memory for it). See 'Calling Conventions' below for details."


    What is the CGT version you are using?

  • Here is the .asm file of my CLATest.cla:

    CLATest.asm

    The CGT version is ti-cgt-c2000_15.12.1.LTS; at least that is what I take from cl2000.exe being located in a folder with that name. Also, the .lst states TMS320x280xx Control Law Accelerator C/C++ Codegen PC v15.12.1.LTS.
    I don't know if I am looking in the right place, as v15 seems to be way newer than 6.x.
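
A possible way to settle the version question from inside the build is the predefined macro __TI_COMPILER_VERSION__, which TI's compiler manual describes as an integer of the form major*1000000 + minor*1000 + patch (3.2.1 becomes 3002001). The sketch below is only an illustration: the helper names cgt_version and cla_nested_calls_supported are made up for this post, the 6.4.0 threshold is taken from the wiki quote above, and whether the macro is also visible inside .cla files is an assumption worth verifying.

```c
#include <assert.h>

/* Encode major.minor.patch the way TI documents __TI_COMPILER_VERSION__,
 * e.g. 3.2.1 -> 3002001. Hypothetical helper, not part of any TI header. */
long cgt_version(long major, long minor, long patch)
{
    assert(minor >= 0 && minor < 1000 && patch >= 0 && patch < 1000);
    return major * 1000000L + minor * 1000L + patch;
}

/* Per the wiki quote, nested CLA function calls need CGT 6.4.0 or newer. */
int cla_nested_calls_supported(long version)
{
    return version >= cgt_version(6, 4, 0);
}

/* Compile-time guard: only active when built with a TI compiler. */
#if defined(__TI_COMPILER_VERSION__) && (__TI_COMPILER_VERSION__ < 6004000)
#error "CGT older than 6.4.0: a CLA function cannot call another function"
#endif
```

With this encoding, 15.12.1 maps to 15012001, which is well above the 6.4.0 threshold, so the compiler version is not the limiting factor here.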

  • Tim,

    the assembly looks correct. Where is the variable "in" located? You can check this in the .map file. What is the value of "in" passed to calc()?

  • Vishal,

    in is declared as

    #pragma DATA_SECTION(in, "CLADataLS0");
    volatile float32 in;


    in my main.c.  The .map file says

    address     data page           name
    --------    ----------------    ----

    00008004     200 (00008000)     _in

    which resides in RAMLS0, as

    RAMLS0                00008000   00000800  00000008  000007f8  RWIX

    tells me. Just as it is supposed to be. RAMLS0 gets configured as CLA data ram via

    MemCfgRegs.LSxMSEL.bit.MSEL_LS0 = 1;
    MemCfgRegs.LSxCLAPGM.bit.CLAPGM_LS0 = 0;

    in main(), where "in" is also set to 2. This seems to work all right: if I call out = in+1.0; in my CLA task, I can confirm out==3.0 in the debugger.

    Tim

  • Tim,

    Would you be able to zip and send me your project? From the looks of it, you don't seem to have any IP in here and are just testing the CLA. If you do have sensitive IP, would you be able to strip it out and send me the bare-bones project that has the issue?

  • Vishal,

    here you go: CLATest.zip

    You are right, there is no IP in it. Neither Intellectual Property nor Internet Protocol :)

    Tim

  • OK, I see what is happening. You need to transfer ownership of the RAMLSx block containing the .scratchpad over to the CLA, which is RAMLS1 in this case.

    Each of the called functions, calc and calc2, saves the MSTF (status register) on entry; this register also holds the return PC (RPC). While I could see that the arithmetic portion was working out correctly, calc2 would not return to calc correctly: it pops the return address off the scratchpad (all zeros, because the CLA does not own that RAM) and jumps to the wrong location.

    I added the following to main.c and the code worked:

    	MemCfgRegs.LSxMSEL.bit.MSEL_LS1 = 1;
    	MemCfgRegs.LSxCLAPGM.bit.CLAPGM_LS1 = 0; // data memory
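
For anyone who wants the whole data RAM setup from this thread in one place, here is a minimal sketch. The register writes are the ones quoted in the thread; the function name cla_claim_data_rams is made up, the EALLOW/EDIS bracketing is an assumption based on how TI's CLA examples write these registers, and the mock register struct exists only so the snippet also compiles off-target. Which RAMLSx block holds .scratchpad depends on your linker command file (RAMLS1 in this project).

```c
#if defined(__TMS320C28XX__)
#include "F28x_Project.h"       /* real device headers: MemCfgRegs, EALLOW, EDIS */
#else
#include <assert.h>             /* only for off-target sanity checks */
/* Host-only stand-ins (assumption: just enough of the layout to compile). */
#define EALLOW ((void)0)
#define EDIS   ((void)0)
struct {
    struct { struct { unsigned MSEL_LS0 : 2;   unsigned MSEL_LS1 : 2;   } bit; } LSxMSEL;
    struct { struct { unsigned CLAPGM_LS0 : 1; unsigned CLAPGM_LS1 : 1; } bit; } LSxCLAPGM;
} MemCfgRegs;
#endif

/* Hand LS0 (user data) and LS1 (.scratchpad in this project) to the CLA
 * as data RAM. Hypothetical helper name; call it before triggering a task. */
void cla_claim_data_rams(void)
{
    EALLOW;
    MemCfgRegs.LSxMSEL.bit.MSEL_LS0 = 1;        /* LS0 shared by CPU and CLA */
    MemCfgRegs.LSxCLAPGM.bit.CLAPGM_LS0 = 0;    /* LS0 is CLA data RAM       */
    MemCfgRegs.LSxMSEL.bit.MSEL_LS1 = 1;        /* LS1 shared by CPU and CLA */
    MemCfgRegs.LSxCLAPGM.bit.CLAPGM_LS1 = 0;    /* LS1 is CLA data RAM       */
    EDIS;
}
```

Keeping all ownership writes in one function makes it harder to forget a block when a section such as .scratchpad is later moved to a different RAMLSx in the linker file.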

  • Vishal,

    thank you very much for your help! This helps a great deal because now I can move on with implementing my actual code.

    I have not delved into the thicket of memory allocation and linker scripts yet. I have three more questions regarding memory:

    - can I rearrange the sizes of the RAMLSs in my linker file?

    - how free am I in rearranging the memory layout of the entire device? I assume that there are certain hardware differences between RAMGS, RAMLS, and other RAM types but that within one particular RAM type I can freely redefine RAM slices as long as I respect the limits of that particular RAM type.

    - does the scratchpad need to reside within a RAMLS exclusively reserved for the scratchpad?

    Thanks again, Tim

  • Tim,

    Tim Hilden said:
    - can I rearrange the sizes of the RAMLSs in my linker file?

    Yes. The memory is unified, so you could combine physical RAMs into logical (I'm not sure that's the right term) RAMs. So I could club LS0 and LS1 into one and the linker won't care. BUT you have to be careful doing this for CLA sections. You could club RAMLS0 and RAMLS1 into one, but you would still need to assign both RAMs to the CLA and configure them as either program or data RAM. When you configure an LSx RAM as CLA program/data RAM, the whole block becomes either program or data for the CLA. You can't have code and data in the same block.

    Tim Hilden said:
    - how free am I in rearranging the memory layout of the entire device? I assume that there are certain hardware differences between RAMGS, RAMLS, and other RAM types but that within one particular RAM type I can freely redefine RAM slices as long as I respect the limits of that particular RAM type.

    You can create your own logical sections - they don't need to correspond to the physical RAM blocks, but the space does need to be accessible by the data or program buses. Again, this does not apply to the CLA - you need to transfer ownership of the physical block (as defined in the datasheet) over to the CLA. RAMGS are meant to be global shared space that can be assigned to either CPU1 or CPU2. When you assign ownership, you do it for each physical GS block, but that shouldn't stop you from, say, splitting GS0 into 3 sections or clubbing GS0, GS1, and GS2 into one. RAMLSx are to be shared between the CPU, CLA, and DMA.

    Tim Hilden said:
    - does the scratchpad need to reside within a RAMLS exclusively reserved for the scratchpad?

    No. As long as the RAMLSx block it resides in is configured as data space for the CLA, you can put other data sections in there too (e.g. .bss_cla, or your own data sections).
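
The placement rules from the three answers above can be written down as a small consistency check. This is only a host-side sketch under stated assumptions: every type and function name in it is hypothetical, it is not a TI tool or API, and it merely encodes what the thread says, namely that each physical RAMLSx block is CPU-only, CLA data, or CLA program as a whole, and that every CLA section must sit in a block of the matching kind.

```c
#include <assert.h>
#include <stddef.h>

/* How one physical RAMLSx block is configured (whole block, never mixed). */
enum ls_use { LS_CPU_ONLY, LS_CLA_DATA, LS_CLA_PROGRAM };

/* What a linker section contains from the CLA's point of view. */
enum sect_kind { SECT_CLA_CODE, SECT_CLA_DATA };

struct placement {
    const char    *section;  /* e.g. ".scratchpad" */
    enum sect_kind kind;
    int            block;    /* index x of the physical RAMLSx block */
};

/* Returns the index of the first section placed in a block of the wrong
 * kind, or -1 if every CLA section sits in a matching block. */
int first_bad_placement(const enum ls_use *blocks, size_t nblocks,
                        const struct placement *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        enum ls_use need;
        assert(p[i].block >= 0 && (size_t)p[i].block < nblocks);
        need = (p[i].kind == SECT_CLA_CODE) ? LS_CLA_PROGRAM : LS_CLA_DATA;
        if (blocks[p[i].block] != need)
            return (int)i;
    }
    return -1;
}
```

Fed with the original project (LS0 configured as CLA data, LS1 still CPU-only, .scratchpad placed in LS1), the check flags .scratchpad; once LS1 is also marked as CLA data, it passes, and .scratchpad can share that block with other CLA data sections.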