TMS320F28377D: Not enough LSxRAM for CLA tasks

Etiam

Part Number: TMS320F28377D

Hello

I would like to use CLA tasks to run jobs in parallel with the main C28x core, but I'm facing one problem: I don't have enough RAM for the CLA program. Is there any solution for this, for instance some kind of dynamic memory assignment for CLA instructions?

Thank you

Maite

over 5 years ago

0 Vivek Singh over 5 years ago

TI Guru 113755 points

Hi Maite,

The CLA has access only to LSx RAM, whereas the CPU has other RAMs available. I hope you are not also sharing the LSx RAM with the CPU?

Regards,

Vivek Singh

0 Etiam over 5 years ago in reply to Vivek Singh

Intellectual 730 points

Hi Vivek,

I'm using RAMLS0 to RAMLS4 for the CLA program and RAMLS5 for CLA data; the CPU is not using this memory. I've tried to optimize my code, but it still does not fit in the RAM.

Thanks

Maite

0 Vivek Singh over 5 years ago in reply to Etiam

TI Guru 113755 points

If you are writing C code, which compiler switch are you using for optimization? Also, how much more RAM do you need? If it's a small amount of data RAM, you could use the CLAtoCPU MSG RAM, provided it is not already being used in your application.

Regards,

Vivek Singh

0 Etiam over 5 years ago in reply to Vivek Singh

Intellectual 730 points

Hi Vivek,

It's much more RAM than the CLAtoCPU MSG RAM provides. I've tried all optimization levels, but the code still does not fit. I've checked that the same code compiled for CPU1 needs around 3 times less RAM than when compiled for the CLA. I've also checked in the assembly that the CLA compiler adds a lot of MNOP instructions.

What I'm trying to do is load only part of the code from flash to RAM, since, depending on my final device setup, not all of the code will be used. In the .cmd file I will use a UNION to assign the same RAM to two different CLA program sections. I think it could work...

Thank you

Maite
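The overlay idea described in this post can be sketched on a host PC as follows. This is only a minimal illustration under stated assumptions: the arrays `flash_image_a` and `flash_image_b`, the buffer `overlay_ram`, the type `device_setup_t`, and the function `load_cla_overlay` are all hypothetical stand-ins. On the real device the load address, run address, and size of each section would presumably come from symbols generated by the linker for the UNION members, and the LSx block would need to be writable by the CPU while the copy takes place.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for two CLA program images stored in flash. */
static const uint16_t flash_image_a[] = {0x1111, 0x2222, 0x3333};
static const uint16_t flash_image_b[] = {0xAAAA, 0xBBBB};

/* Hypothetical stand-in for the single LSx RAM run region that both
 * UNION members share. */
#define OVERLAY_WORDS 4
uint16_t overlay_ram[OVERLAY_WORDS];

/* Hypothetical device configuration that decides which code is needed. */
typedef enum { SETUP_A, SETUP_B } device_setup_t;

/* Copy the image selected by the device setup into the shared run region.
 * Returns the number of words copied, or 0 if the image does not fit. */
size_t load_cla_overlay(device_setup_t setup)
{
    const uint16_t *src;
    size_t words;

    if (setup == SETUP_A) {
        src = flash_image_a;
        words = sizeof flash_image_a / sizeof flash_image_a[0];
    } else {
        src = flash_image_b;
        words = sizeof flash_image_b / sizeof flash_image_b[0];
    }

    if (words > OVERLAY_WORDS) {
        return 0; /* image larger than the shared region */
    }

    memcpy(overlay_ram, src, words * sizeof overlay_ram[0]);
    return words;
}
```

Only one image occupies the run region at a time, so the RAM needed is the size of the largest image rather than the sum of all images. The CLA tasks that belong to the image that was not loaded must never be triggered.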

0 Etiam over 5 years ago in reply to Etiam

Intellectual 730 points

Hi Vivek

Loading only some parts of the code from flash to RAM using a UNION in the .cmd file is working. But now I'm facing another problem: the same code takes around 1.5 times longer to execute on CLA1 than on CPU1. As I mentioned in a previous post, the CLA .asm file has a lot of MNOP operations... could that be the reason?

I also see in the .asm files that the CLA instructions are tagged CPU_FPU (the CLA is, in fact, a floating-point unit), while the CPU1 instructions are tagged CPU_ALU.

This is an example of CLA .asm file lines:

	.dwpsn	file "../AFQ_CLA_funcs.cla",line 206,column 5,is_stmt,isa 0
        MMOVZ16   MR0,@_adapt+3         ; [CPU_FPU] |206| 
        MMOVIZ    MR1,#65535            ; [CPU_FPU] |206| 
        MLSL32    MR0,#16               ; [CPU_FPU] |206| 
        MMOVXI    MR1,#65535            ; [CPU_FPU] |206| 
        MASR32    MR0,#16               ; [CPU_FPU] |206| 
        MCMP32    MR1,MR0               ; [CPU_FPU] |206| 
        MNOP      ; [CPU_FPU] 
        MNOP      ; [CPU_FPU] 
        MNOP      ; [CPU_FPU] 
        MBCNDD    $C$L239,NEQ           ; [CPU_FPU] |206| 
        MNOP      ; [CPU_FPU] 
        MNOP      ; [CPU_FPU] 
        MNOP      ; [CPU_FPU]

The same code compiled for the CPU is:

        MOVW      DP,#_adapt+3          ; [CPU_ARAU] 
        CMP       @_adapt+3,#-1         ; [CPU_ALU] |649| 
        B         $C$L4,NEQ             ; [CPU_ALU] |649|

Is there a solution for this? Are these MNOPs needed?

Thank you
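Read instruction by instruction, the CLA listing in this post appears to zero-extend a 16-bit load, sign-extend it with a shift pair, build the constant 0xFFFFFFFF in two halves, and then compare. The host-side C model below is a sketch of that reading, assuming `adapt[3]` is a signed 16-bit value compared against -1 (the .cla source is not shown, so this is an inference). The function name `cla_branch_taken` is hypothetical, and the MNOPs around `MBCNDD` are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side model of the CLA listing. `raw` is the 16-bit word stored at
 * adapt+3. Returns true when the NEQ branch would be taken. */
bool cla_branch_taken(uint16_t raw)
{
    uint32_t mr0 = raw;                    /* MMOVZ16 MR0: zero-extended load  */
    uint32_t mr1 = 0xFFFF0000u;            /* MMOVIZ MR1,#65535: high half set,
                                              low half cleared                 */
    mr0 <<= 16;                            /* MLSL32 MR0,#16                   */
    mr1 = (mr1 & 0xFFFF0000u) | 0xFFFFu;   /* MMOVXI MR1,#65535: low half set  */

    /* MASR32 MR0,#16: arithmetic shift right, written with unsigned
     * arithmetic so the model does not depend on signed-shift behavior. */
    mr0 = (mr0 >> 16) | ((mr0 & 0x80000000u) ? 0xFFFF0000u : 0u);

    return mr1 != mr0;                     /* MCMP32 followed by MBCNDD ...,NEQ */
}
```

If this reading is right, part of the extra CLA code comes from widening a 16-bit value to 32 bits, work that the C28x `CMP` does in a single instruction. Whether declaring the variable as a 32-bit type shortens the sequence would need to be checked in the generated assembly.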

0 Vivek Singh over 5 years ago in reply to Etiam

TI Guru 113755 points

Hi,

Sorry for the late reply. Yes, each MNOP takes an additional cycle, so execution takes more time, and the MNOPs may be needed for correct operation.

Regards,

Vivek Singh
