CLA: nested function calls

Tim Hilden

Hi all,

I'm trying hard to get some decent code running in the CLA. It seems that I can only call one level of function hierarchy.

The following CLA-code works, I can tell from my blinking LED as the task gets called via a timer interrupt:

float32 calc2(float32 argb){
   return argb+1.0;
}

float32 calc(float32 arga){
   return calc2(arga)+1.0;
}

__interrupt void cla_Task8(void){
   LEDstate = LEDstate^1;
   out = calc2(in);
}

Now if I add one more level of function call nesting it does not work anymore:

float32 calc2(float32 argb){
   return argb+1.0;
}

float32 calc(float32 arga){
   return calc2(arga)+1.0;
}

__interrupt void cla_Task8(void){
   LEDstate = LEDstate^1;
   out = calc(in); // calc2(in);
}

This behavior contradicts what the C/C++ compiler manual www.ti.com/lit/ug/spru514j/spru514j.pdf says:

The CLA compiler supports multiple nested levels of function calls. The restriction to two levels of function
calls has been removed.

So what is going wrong? I'm using an F28379D with the TI 15.12.1 compiler and the compiler option --cla_support=cla1.

Thanks, Tim

over 9 years ago

0 Vishal_Coelho over 9 years ago

TI__Mastermind 20850 points

Hi Tim,

Is the compiler generating an error for that nested call, or is the code just not working when you make that nested call? If it's the latter, please post the generated assembly for the code you have shown.

0 Tim Hilden over 9 years ago in reply to Vishal_Coelho

Prodigy 60 points

Hi Vishal,

no, the compiler does not generate an error/warning, the program is just not working.
The corresponding section from my .map file:
0000a800 00000020 CLATest.obj (Cla1Prog:_calc)
0000a820 0000001c CLATest.obj (Cla1Prog:_cla_Task8)
0000a83c 0000000c CLATest.obj (Cla1Prog:_calc2)

and the assembler:
16 float32 calc(float32 arga){
c, calc():
00a800: 8902 AND ACC, @0x2
00a801: 7740 NOP *-SP[0]
00a802: 0000 ITRAP0
00a803: 7FA0 MOV @AR0, AR7
00a804: 8900 AND ACC, @0x0
00a805: 74C0 SUB *+XAR0[0], AL
17 return calc2(arga)+1.0;
00a806: 0036 TRAP #22
00a807: 799F MOV *+XAR7[AR1], AR1
00a808: 0000 ITRAP0
00a809: 7FA0 MOV @AR0, AR7
00a80a: 0000 ITRAP0
00a80b: 7FA0 MOV @AR0, AR7
00a80c: 0000 ITRAP0
00a80d: 7FA0 MOV @AR0, AR7
00a80e: 3F80 MOV *XAR0++, P
00a80f: 77C0 NOP *+XAR0[0]
18 }
00a810: 8902 AND ACC, @0x2
00a811: 7700 NOP
00a812: 0000 ITRAP0
00a813: 7FA0 MOV @AR0, AR7
00a814: 0000 ITRAP0
00a815: 7FA0 MOV @AR0, AR7
00a816: 0000 ITRAP0
00a817: 7FA0 MOV @AR0, AR7
00a818: 0000 ITRAP0
00a819: 79AF MOV *BR0--, AR1
00a81a: 0000 ITRAP0
00a81b: 7FA0 MOV @AR0, AR7
00a81c: 0000 ITRAP0
00a81d: 7FA0 MOV @AR0, AR7
00a81e: 0000 ITRAP0
00a81f: 7FA0 MOV @AR0, AR7
21 LEDstate = LEDstate^1;
c, cla_Task8():
00a820: 0000 ITRAP0
00a821: 7841 MOV *-SP[1], AR0
00a822: 8000 MOVZ AR7, @0x0
00a823: 7580 SUB *XAR0++, AH
00a824: 0001 ABORTI
00a825: 7881 MOV *XAR1++, AR0
00a826: 0004 PUSH RPC
00a827: 7CA0 MOV @AR0, AR4
00a828: 8000 MOVZ AR7, @0x0
00a829: 75C0 SUB *+XAR0[0], AH
22 out = calc(in); // calc2(in);
00a82a: FFD6 LSR AH, 7
00a82b: 799F MOV *+XAR7[AR1], AR1
00a82c: 0000 ITRAP0
00a82d: 7FA0 MOV @AR0, AR7
00a82e: 0000 ITRAP0
00a82f: 7FA0 MOV @AR0, AR7
00a830: 8002 MOVZ AR7, @0x2
00a831: 73C0 ADD *+XAR0[0], AH
00a832: 8004 MOVZ AR7, @0x4
00a833: 74C0 SUB *+XAR0[0], AL
23 }
00a834: 0000 ITRAP0
00a835: 7FA0 MOV @AR0, AR7
00a836: 0000 ITRAP0
00a837: 7FA0 MOV @AR0, AR7
00a838: 0000 ITRAP0
00a839: 7FA0 MOV @AR0, AR7
00a83a: 0000 ITRAP0
00a83b: 7F80 MOV *XAR0++, AR7
12 float32 calc2(float32 argb){
c, calc2():
00a83c: 8904 AND ACC, @0x4
00a83d: 74C0 SUB *+XAR0[0], AL
13 return argb+1.0;
00a83e: 3F80 MOV *XAR0++, P
00a83f: 77C0 NOP *+XAR0[0]
14 }
00a840: 0000 ITRAP0
00a841: 79AF MOV *BR0--, AR1
00a842: 0000 ITRAP0
00a843: 7FA0 MOV @AR0, AR7
00a844: 0000 ITRAP0
00a845: 7FA0 MOV @AR0, AR7
00a846: 0000 ITRAP0
00a847: 7FA0 MOV @AR0, AR7
00a848: 0000 ITRAP0
00a849: 0000 ITRAP0
00a84a: 0000 ITRAP0

0 Vishal_Coelho over 9 years ago in reply to Tim Hilden

TI__Mastermind 20850 points

The assembly shown here is decoded as C28x, not CLA. What you can do is right-click on the .cla file and go to properties. Under c2000 Compiler -> advanced properties -> assembler options, select --keep_asm, --c_src_interlist, and generate listing file (-al, I think). When you build the .cla file, it will dump the assembly file, of the same name, in the output folder.

You can post the assembly from that file or attach the file itself.

0 Keyur Acharya over 9 years ago

Expert 1740 points

Hi Tim,

processors.wiki.ti.com/.../C2000_CLA_C_Compiler

According to this wiki article

"On CGT 6.2.x and older, the function call depth was 1 i.e. a task could call a function but a function could not call another. On CGT 6.4.0+ this restriction is no longer present, the call depth is infinite (as long as you have the memory for it). See 'Calling Conventions' below for details."

What is the CGT version you are using?

0 Tim Hilden over 9 years ago in reply to Vishal_Coelho

Prodigy 60 points

Here is the .asm file of my CLATest.cla:

CLATest.asm

The CGT version is ti-cgt-c2000_15.12.1.LTS, at least that is what I take from the cl2000.exe being located in a folder with that name. Also, the .lst states TMS320x280xx Control Law Accelerator C/C++ Codegen PC v15.12.1.LTS.
I don't know if I am looking at the right place, as v15 seems to be way newer than 6.x.

0 Vishal_Coelho over 9 years ago in reply to Tim Hilden

TI__Mastermind 20850 points

Tim,

the assembly looks correct. Where is the variable "in" located? You can check this in the .map file. What is the value of "in" passed to calc()?

0 Tim Hilden over 9 years ago in reply to Vishal_Coelho

Prodigy 60 points

Vishal,

in is declared as

#pragma DATA_SECTION(in, "CLADataLS0");
volatile float32 in;

in my main.c. The .map file says

address    data page           name
--------   ----------------    ----
00008004     200 (00008000)    _in

which resides in RAMLS0, as

RAMLS0 00008000 00000800 00000008 000007f8 RWIX

tells me. Just as it is supposed to be. RAMLS0 gets configured as CLA data ram via

MemCfgRegs.LSxMSEL.bit.MSEL_LS0 = 1;
MemCfgRegs.LSxCLAPGM.bit.CLAPGM_LS0 = 0;

in main(). The variable in is set to 2. This seems to work all right: if I call out = in+1.0; in my CLA task, I can confirm out==3.0 in the debugger.

Tim

0 Vishal_Coelho over 9 years ago in reply to Tim Hilden

TI__Mastermind 20850 points

Tim,

Would you be able to zip and send me your project? From the looks of it, it doesn't seem like you have any IP in here; you are just testing the CLA? If you do have sensitive IP, would you be able to strip it out and send me the bare-bones project that has the issue?

0 Tim Hilden over 9 years ago in reply to Vishal_Coelho

Prodigy 60 points

Vishal,

here you go: CLATest.zip

You are right, there is no IP in it. Neither Intellectual Property nor Internet Protocol :)

Tim

0 Vishal_Coelho over 9 years ago in reply to Tim Hilden

TI__Mastermind 20850 points

OK, I see what is happening. You need to transfer ownership of the RAMLSx containing the .scratchpad over to the CLA, RAMLS1 in this case.

Each of the functions, calc and calc2, saves off the MSTF (status register) on entry; this register also holds the return PC (RPC). While I could see that the arithmetic portion was working out correctly, calc2 would not return to calc correctly: it pops the return address off the scratchpad (all zeros) and jumps to the wrong location.
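To make the failure mode above concrete, here is a small host-side C model (not device code). It assumes, following the description above, that a function saves its return address to the scratchpad on entry and restores it on exit, and that a block the CLA does not own reads back as zeros. ClaModel, cla_save, cla_restore and return_address_after_call are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model of the scratchpad block, not device code. */
typedef struct {
    uint32_t scratchpad[4];   /* stands in for the .scratchpad section   */
    bool     cla_owns_block;  /* models MSEL_LSx == 1 for that RAM block */
} ClaModel;

/* Assumption: a CLA write to a block it does not own is silently dropped. */
static void cla_save(ClaModel *m, unsigned slot, uint32_t value)
{
    assert(slot < 4u);
    if (m->cla_owns_block) {
        m->scratchpad[slot] = value;
    }
}

/* Assumption: a CLA read from a block it does not own returns zero. */
static uint32_t cla_restore(const ClaModel *m, unsigned slot)
{
    assert(slot < 4u);
    return m->cla_owns_block ? m->scratchpad[slot] : 0u;
}

/* One function call: save the return address on entry, restore it on exit.
 * The result is the address the function would actually return to. */
static uint32_t return_address_after_call(ClaModel *m, uint32_t return_address)
{
    cla_save(m, 0u, return_address);  /* function entry */
    /* ... function body runs, possibly calling further functions ... */
    return cla_restore(m, 0u);        /* function exit  */
}
```

With cla_owns_block set to false, the restored address is 0 regardless of what was saved, which matches the "all zeros" return address seen here. With it set to true, the saved address comes back unchanged.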

I added the following to main.c and the code worked:

	MemCfgRegs.LSxMSEL.bit.MSEL_LS1 = 1;
	MemCfgRegs.LSxCLAPGM.bit.CLAPGM_LS1 = 0; // data memory
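
For anyone who wants to see what those two assignments do at the bit level, here is a host-side sketch that works on plain 32-bit values instead of the real registers. The field layout is an assumption, not taken from this thread: a 2-bit MSEL field per LS block in LSxMSEL and a 1-bit CLAPGM field per block in LSxCLAPGM. Please check it against the device's technical reference manual. MemCfgModel and assign_ls_to_cla_data are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two registers used above (host-side only). */
typedef struct {
    uint32_t LSxMSEL;    /* assumed: 2 bits per LS block, value 1 = shared with CLA */
    uint32_t LSxCLAPGM;  /* assumed: 1 bit per LS block, 0 = data, 1 = program     */
} MemCfgModel;

/* Give RAMLS<block> to the CLA and mark it as CLA data memory. */
static void assign_ls_to_cla_data(MemCfgModel *regs, unsigned block)
{
    assert(block < 6u);  /* assumption: blocks LS0..LS5 exist */

    uint32_t msel_shift = 2u * block;
    regs->LSxMSEL = (regs->LSxMSEL & ~(UINT32_C(3) << msel_shift))
                  | (UINT32_C(1) << msel_shift);   /* MSEL_LSn = 1   */
    regs->LSxCLAPGM &= ~(UINT32_C(1) << block);    /* CLAPGM_LSn = 0 */
}
```

Calling assign_ls_to_cla_data for block 0 and block 1 corresponds to the LS0 configuration from earlier in the thread plus the LS1 lines above. On the device itself these registers are, as far as I know, EALLOW-protected, so the real writes would normally sit between EALLOW and EDIS.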

0 Tim Hilden over 9 years ago in reply to Vishal_Coelho

Prodigy 60 points

Vishal,

thank you very much for your help! This helps a great deal because now I can move on with implementing my actual code.

I have not delved into the thicket of memory allocation and linker scripts yet. I have three more questions regarding memory:

- can I rearrange the sizes of the RAMLSs in my linker file?

- how free am I in rearranging the memory layout of the entire device? I assume that there are certain hardware differences between RAMGS, RAMLS, and other RAM types but that within one particular RAM type I can freely redefine RAM slices as long as I respect the limits of that particular RAM type.

- does the scratchpad need to reside within a RAMLS exclusively reserved for the scratchpad?

Thanks again, Tim

0 Vishal_Coelho over 9 years ago in reply to Tim Hilden

TI__Mastermind 20850 points

Tim,

Tim Hilden said:
- can I rearrange the sizes of the RAMLSs in my linker file?

Yes. The memory is unified, so you can combine physical RAMs into logical RAMs (I'm not sure that's the right term). So I could club LS0 and LS1 into one and the linker won't care. BUT you have to be careful doing this for CLA sections. You could club RAMLS0 and RAMLS1 into one, but you would still need to assign both RAMs to the CLA and configure them as either program or data RAM. When you configure an LSx RAM as CLA program/data RAM, the whole block becomes either program or data for the CLA; you can't have code and data in the same block.
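
As a rough aid for the point about clubbed RAMs, the following host-side sketch works out which physical LS blocks a logical linker region touches, so each of them can then be assigned to the CLA. The RAMLS0 origin (0x8000) and length (0x800) come from the .map excerpt earlier in this thread. That there are six contiguous, equally sized blocks LS0..LS5 is an assumption to verify against the datasheet. ls_blocks_spanned is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define LS_BASE   0x8000u  /* RAMLS0 origin, from the .map excerpt in this thread */
#define LS_SIZE   0x0800u  /* RAMLS0 length, from the .map excerpt in this thread */
#define LS_COUNT  6u       /* assumption: LS0..LS5, contiguous and equally sized  */

/* Returns a bitmask with bit n set when the logical region
 * [origin, origin + length) touches physical block RAMLSn. */
static unsigned ls_blocks_spanned(uint32_t origin, uint32_t length)
{
    assert(length > 0u);
    assert(origin >= LS_BASE);
    assert(origin + length <= LS_BASE + LS_COUNT * LS_SIZE);

    unsigned first = (unsigned)((origin - LS_BASE) / LS_SIZE);
    unsigned last  = (unsigned)((origin + length - 1u - LS_BASE) / LS_SIZE);

    unsigned mask = 0u;
    for (unsigned n = first; n <= last; ++n) {
        mask |= 1u << n;
    }
    return mask;
}
```

For example, a logical region starting at 0x8000 with length 0x1000 covers both LS0 and LS1, so both blocks need the MSEL/CLAPGM setup even though the linker sees a single region.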

Tim Hilden said:
- how free am I in rearranging the memory layout of the entire device? I assume that there are certain hardware differences between RAMGS, RAMLS, and other RAM types but that within one particular RAM type I can freely redefine RAM slices as long as I respect the limits of that particular RAM type.

You can create your own logical sections. They don't need to correspond to the physical RAM blocks, but the space does need to be accessible by the data or program buses. Again, this does not apply to the CLA: you need to transfer ownership of the physical block (as defined in the datasheet) over to the CLA. RAMGS are meant to be global shared space that can be assigned to either CPU1 or CPU2. When you assign ownership, you do it for each physical GS block, but that shouldn't stop you from, say, splitting GS0 into 3 sections or clubbing GS0, 1, 2 into one. RAMLSx are to be shared between the CPU, CLA and DMA.

Tim Hilden said:
- does the scratchpad need to reside within a RAMLS exclusively reserved for the scratchpad?

No. As long as the RAMLSx block it resides in is configured as data space for the CLA, you can put other data sections in there too (e.g. .bss_cla, or your own data sections).
