Using Assembler to write my Algorithm

Hi, everybody,

I have run into a problem recently. I want to write my algorithm in assembly.

The original C implementation of my algorithm does not perform well. I tried raising the CCS optimization level to improve performance, but the result still was not good enough, so I decided to write the algorithm in assembly.

After finishing the assembly version, I used the CCS profiler to compare the CPU cycle counts of the C version and the assembly version. The result is that my assembly version takes many more CPU cycles than the C version built with O2. I hope someone can help me with the following questions:

First question: I thought the code below should take just one cycle, but when I measure it, the LDW instructions seem to take a lot of CPU cycles. Why?

  ADD   .L1    A1,A10,A11
||ADD   .L2    B1,B10,B11
||LDW   .D1    *A2++,A20
||LDW   .D2    *B2++,B20
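
For readers less familiar with the syntax, the intended effect of the packet above can be sketched in C. This is only a rough model, assuming A2 and B2 hold pointers to 32-bit words; the `regs_t` struct and the `execute_packet` function are hypothetical names introduced for illustration, and the model ignores all timing (load delay slots, memory stalls).

```c
#include <stdint.h>

/* Hypothetical model of the registers touched by the packet. */
typedef struct {
    int32_t a1, a10, a11, a20;   /* A-side registers */
    int32_t b1, b10, b11, b20;   /* B-side registers */
    const int32_t *a2;           /* A2: assumed to point at 32-bit words */
    const int32_t *b2;           /* B2: assumed to point at 32-bit words */
} regs_t;

/* Register-level effect of the four parallel instructions.
 * The hardware issues them together; since none of them reads a
 * register written by another, running them in sequence here
 * gives the same final state. Timing is not modelled. */
static void execute_packet(regs_t *r)
{
    r->a11 = r->a1 + r->a10;   /* ADD .L1 A1,A10,A11 */
    r->b11 = r->b1 + r->b10;   /* ADD .L2 B1,B10,B11 */
    r->a20 = *r->a2++;         /* LDW .D1 *A2++,A20 (load, then advance one word) */
    r->b20 = *r->b2++;         /* LDW .D2 *B2++,B20 (load, then advance one word) */
}
```

The sketch describes what the packet computes, not how many cycles it takes, which is what the profiler measurement is about.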

 

Second question: when I build with O2, I can't debug the algorithm.

Third question: if I write the algorithm in assembly, how can I make it faster than what O2 produces?