Hello
I would like to use CLA tasks to run jobs in parallel with the main C28x core, but I'm facing one problem: I don't have enough RAM for the program. Is there any solution for this, for instance some kind of dynamic memory assignment for CLA instructions?
Thank you
Maite
Hi Maite,
The CLA has access only to LSxRAM, while the CPU has other RAMs as well. I hope you are not also sharing the LSxRAM with the CPU?
Regards,
Vivek Singh
Hi Vivek,
I'm using RAMLS0 to RAMLS4 for the CLA program and RAMLS5 for CLA data; the CPU is not using this memory. I've tried to optimize my code, but it still does not fit in the available RAM.
Thanks
Maite
If you are writing C code, which compiler switch are you using for optimization? Also, how much more RAM do you need? If it's a small amount of data RAM, you could use the CLAtoCPU MSG RAM, provided it's not already being used in your application.
Regards,
Vivek Singh
Hi Vivek,
It's much more RAM than the CLAtoCPU MSG RAM provides. I've tried all optimization levels, but the code still does not fit. I've checked that the same code compiled for CPU1 needs around 3 times less RAM than when compiled for the CLA. I've also seen in the assembly that the CLA compiler adds a lot of MNOP instructions.
What I'm trying to do now is to load only part of the code from flash to RAM, since, depending on my final device setup, not all of the code will be used. I will use a UNION in the .cmd file to assign the same RAM to two different CLA program sections. I think it could work...
Thank you
Maite
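The UNION idea above amounts to a code overlay: two load images sit in flash, share one run address in RAM, and only the image that matches the device setup is copied at startup. Below is a minimal, host-runnable C sketch of that selection step. The arrays, sizes, and the names `shared_run_ram`, `overlays`, and `load_overlay` are hypothetical stand-ins; in a real project the load and run addresses would come from linker-generated symbols rather than C arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RUN_WORDS 8u  /* hypothetical size of the shared run region */

/* Stand-in for the RAM region that both UNION members run from. */
static uint16_t shared_run_ram[RUN_WORDS];

/* Stand-ins for the two load images stored in flash. */
static const uint16_t variant_a_load[] = { 0xA001, 0xA002, 0xA003 };
static const uint16_t variant_b_load[] = { 0xB001, 0xB002, 0xB003, 0xB004 };

typedef struct {
    const uint16_t *load_start;  /* where the image is stored */
    size_t words;                /* image length in 16-bit words */
} overlay_image;

static const overlay_image overlays[] = {
    { variant_a_load, sizeof variant_a_load / sizeof variant_a_load[0] },
    { variant_b_load, sizeof variant_b_load / sizeof variant_b_load[0] },
};

/* Copy the image selected by `setup` into the shared run region.
   Returns the number of words copied. */
static size_t load_overlay(unsigned setup)
{
    assert(setup < sizeof overlays / sizeof overlays[0]);
    const overlay_image *img = &overlays[setup];
    assert(img->words <= RUN_WORDS);  /* image must fit the shared region */
    memcpy(shared_run_ram, img->load_start, img->words * sizeof(uint16_t));
    return img->words;
}
```

Calling `load_overlay(0)` or `load_overlay(1)` once during initialization selects which variant occupies the shared region. On the actual device the copy would typically have to happen while the CPU still owns the LSx block, before that block is assigned to the CLA as program memory.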
Hi Vivek
Loading only some parts of the code from flash to RAM using a UNION in the .cmd file is working. But now I'm facing another problem: the same code takes around 1.5 times longer to execute on CLA1 than on CPU1. As I mentioned in a previous post, the CLA .asm file has a lot of MNOP instructions... could that be the reason?
I also see in the .asm files that the CLA uses CPU_FPU (it is, in fact, a floating-point unit), while CPU1 uses CPU_ALU.
This is an example from the CLA .asm file:
        .dwpsn  file "../AFQ_CLA_funcs.cla",line 206,column 5,is_stmt,isa 0
        MMOVZ16   MR0,@_adapt+3         ; [CPU_FPU] |206|
        MMOVIZ    MR1,#65535            ; [CPU_FPU] |206|
        MLSL32    MR0,#16               ; [CPU_FPU] |206|
        MMOVXI    MR1,#65535            ; [CPU_FPU] |206|
        MASR32    MR0,#16               ; [CPU_FPU] |206|
        MCMP32    MR1,MR0               ; [CPU_FPU] |206|
        MNOP                            ; [CPU_FPU]
        MNOP                            ; [CPU_FPU]
        MNOP                            ; [CPU_FPU]
        MBCNDD    $C$L239,NEQ           ; [CPU_FPU] |206|
        MNOP                            ; [CPU_FPU]
        MNOP                            ; [CPU_FPU]
        MNOP                            ; [CPU_FPU]
The same code on the CPU is:
        MOVW      DP,#_adapt+3          ; [CPU_ARAU]
        CMP       @_adapt+3,#-1         ; [CPU_ALU] |649|
        B         $C$L4,NEQ             ; [CPU_ALU] |649|
Is there some solution for this? Are these MNOPs needed?
Thank you
Hi,
Sorry for the late reply. Yes, each MNOP takes an additional cycle, so execution will take more time, and the MNOPs may be needed for correct operation.
Regards,
Vivek Singh
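Looking at the CLA listing above, two things make it longer than the CPU version. Six of the instructions are MNOPs placed around the delayed branch MBCNDD, three before and three after. The MMOVZ16 / MLSL32 #16 / MASR32 #16 sequence sign-extends a 16-bit value to 32 bits before it is compared with -1. The C sketch below reproduces that sign-extension step on a host compiler and contrasts a 16-bit field with a 32-bit one. The struct layouts and the field name `flag` are hypothetical, since the real definition of `adapt` is not shown in the thread. Whether widening the field actually removes those instructions on the CLA is an assumption that would need to be checked in the generated .asm.

```c
#include <stdint.h>

/* Same steps as MMOVZ16 / MLSL32 #16 / MASR32 #16: take the zero-extended
   16-bit value, shift it to the top half of a 32-bit word, then shift it
   back arithmetically so the sign bit fills the upper half. */
static int32_t sign_extend16(uint16_t raw)
{
    uint32_t top = (uint32_t)raw << 16;
    return (int32_t)top >> 16;  /* assumes an arithmetic right shift */
}

/* Hypothetical layouts; the field name `flag` is illustrative only. */
struct adapt_narrow { int16_t flag; };
struct adapt_wide   { int32_t flag; };

/* 16-bit field: the value must be widened before the 32-bit compare. */
static int is_minus_one_narrow(const struct adapt_narrow *a)
{
    return sign_extend16((uint16_t)a->flag) == -1;
}

/* 32-bit field: the value can be compared directly. */
static int is_minus_one_wide(const struct adapt_wide *a)
{
    return a->flag == -1;
}
```

If the structure is shared with the C28x, changing a field width also changes the layout seen by the CPU, so both sides would need to be rebuilt against the same definition.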