TMS320F280039C: How to reduce the execution time of complex logic operation code on the F280039C?

Part Number: TMS320F280039C
Other Parts Discussed in Thread: TMS320F280039

    Hello, please pay attention to this problem. Thank you very much!

    In a TMS320F280039 (C28x) project, how can I reduce the execution time of code that evaluates compound logical expressions? For example:

      PFC_Ctrl_DW.Delay3_DSTATE_f = ((((VePFC_V_PLLVltQ <= KePFC_V_QValDwnLim) || (VePFC_V_PLLVltQ >= KePFC_V_QValUpLim)) && rtb_LogicalOperator_i) || PFC_Ctrl_DW.Delay3_DSTATE_f);

      PFC_Ctrl_DW.Delay1_DSTATE_l = ((rtb_LogicalOperator_i && ((VePFC_Hz_PLLFreq <= VePFC_Hz_PLLFreqErrDwn) || (VePFC_Hz_PLLFreq >= VePFC_Hz_PLLFreqErrUp))) || PFC_Ctrl_DW.Delay1_DSTATE_l);

    Even with a higher compiler optimization level and with the FPU and TMU enabled, the 280039C executes the above code significantly more slowly than the Infineon TC233.
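
    To reproduce the timing outside the full project, the first statement can be isolated as in the sketch below. This is not the generated code: the function names `latch_short_circuit` and `latch_bitwise`, the parameter names, and the `bool` and `float` types are assumptions, since the thread does not show the declarations. The second variant replaces `||` and `&&` with `|` and `&`, which gives the same result only because every operand is already 0 or 1 and has no side effects. Whether the compiler then emits fewer branches on C28x is not tested here.

```c
#include <stdbool.h>

/* Sticky out-of-range latch in the short-circuit form used in the thread.
 * All names are hypothetical stand-ins for the generated identifiers. */
bool latch_short_circuit(float value, float lower, float upper,
                         bool enable, bool latched)
{
    return (((value <= lower) || (value >= upper)) && enable) || latched;
}

/* Same truth table using bitwise operators. Each operand is 0 or 1 and
 * side-effect free, so evaluating all of them unconditionally is safe. */
bool latch_bitwise(float value, float lower, float upper,
                   bool enable, bool latched)
{
    return (bool)((((value <= lower) | (value >= upper)) & enable) | latched);
}
```

    Timing both variants on the target with the same inputs would show whether the operator choice changes the generated branch structure at all.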

  • Hello,

    Can you please provide some additional details?

    - Was the code executed from Flash?

    - Are the structures and variables all globals? Can you try creating local variables to see if it improves performance?

    - Can you share the generated assembly?

    Thanks,

    Sira
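
    The suggestion to try local variables can be sketched as follows: read each global once into a local, evaluate the expression on the locals, and write the result back once. The struct layout and every name here (`DemoState`, `demo_state`, `demo_value`, `demo_lower_limit`, `demo_upper_limit`, `demo_update_latch`) are hypothetical, because the real declarations are not shown in the thread.

```c
#include <stdbool.h>

/* Hypothetical globals standing in for the generated model data. */
typedef struct {
    bool fault_latched;
} DemoState;

DemoState demo_state = { false };
float demo_value = 0.0f;
float demo_lower_limit = -1.0f;
float demo_upper_limit = 1.0f;

/* Copy the globals into locals once, compute, then store the result once. */
void demo_update_latch(bool enable)
{
    const float value = demo_value;
    const float lower = demo_lower_limit;
    const float upper = demo_upper_limit;
    bool latched = demo_state.fault_latched;

    latched = (((value <= lower) || (value >= upper)) && enable) || latched;

    demo_state.fault_latched = latched;
}
```

    An optimizing compiler may already keep these values in registers, so any gain from this change has to be measured on the target rather than assumed.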

  • Hello,

    - On both the 280039C and the TC233, the code executes from Flash.

    - All structures are globals; the other variables are a mix of globals and locals. Swapping between local and global variables gave no performance improvement.

    - The source code executed on the DSP and on the TC233 is identical, apart from fine-tuning for each environment. These compound logical expressions take longer to execute on the 280039C than on the TC233, although the DSP performs better in other respects. As a result, they reduce the overall execution efficiency of the 280039C.

    Looking forward to your reply. Thank you.

  • Tianze,

    Can you share the generated assembly for this code?

    Also, how many cycles does it take on C28 vs TC233?

    Thanks,

    Sira

  • Hello,

    - The generated assembly for the code in question is shown below:

       (1) PFC_Ctrl_DW.Delay3_DSTATE_f = ((((VePFC_V_PLLVltQ <= KePFC_V_QValDwnLim) || (VePFC_V_PLLVltQ >= KePFC_V_QValUpLim)) && rtb_LogicalOperator_i) || PFC_Ctrl_DW.Delay3_DSTATE_f);

    1008      PFC_Ctrl_DW.Delay3_DSTATE_f = ((((VePFC_V_PLLVltQ <= KePFC_V_QValDwnLim) ||
    082de2: E6940011 CMPF32 R1H, R2H
    082de4: AD14 MOVST0 NF,ZF
    082de5: 6509 SB $C$L78, LEQ
    082de6: 761F02A4 MOVW DP, #0x2a4
    082de8: E2AF021E MOV32 R2H, @0x1e, UNCF
    082dea: E6940011 CMPF32 R1H, R2H
    082dec: AD14 MOVST0 NF,ZF
    082ded: 6403 SB $C$L79, LT
                  $C$L78:
    082dee: 5200 CMPB AL, #0x0
    082def: 6006 SB $C$L80, NEQ
                 $C$L79:
    082df0: 761F02A1 MOVW DP, #0x2a1
    082df2: 932D MOV AH, @0x2d
    082df3: 56B100A6 MOVB @AR6, #0x00, EQ
                 $C$L80:
    082df5: 761F02A1 MOVW DP, #0x2a1

        (2) PFC_Ctrl_DW.Delay1_DSTATE_l = ((rtb_LogicalOperator_i && ((VePFC_Hz_PLLFreq <= VePFC_Hz_PLLFreqErrDwn) || (VePFC_Hz_PLLFreq >= VePFC_Hz_PLLFreqErrUp))) || PFC_Ctrl_DW.Delay1_DSTATE_l);

    1022     PFC_Ctrl_DW.Delay1_DSTATE_l = ((rtb_LogicalOperator_i && ((VePFC_Hz_PLLFreq <=
    082df7: 5200 CMPB AL, #0x0
    082df8: B601 MOVB XAR7, #0x01
    1008     PFC_Ctrl_DW.Delay3_DSTATE_f = ((((VePFC_V_PLLVltQ <= KePFC_V_QValDwnLim) ||
    082df9: 7E2D MOV @0x2d, AR6
    1022     PFC_Ctrl_DW.Delay1_DSTATE_l = ((rtb_LogicalOperator_i && ((VePFC_Hz_PLLFreq <=
    082dfa: 6113 SB $C$L81, EQ
    082dfb: 761F02A0 MOVW DP, #0x2a0
    082dfd: E2AF0124 MOV32 R1H, @0x24, UNCF
    082dff: 761F02A4 MOVW DP, #0x2a4
    082e01: E2AF0232 MOV32 R2H, @0x32, UNCF
    082e03: E6940011 CMPF32 R1H, R2H
    082e05: AD14 MOVST0 NF,ZF
    082e06: 650C SB $C$L82, LEQ
    082e07: E2AF0234 MOV32 R2H, @0x34, UNCF
    082e09: E6940011 CMPF32 R1H, R2H
    082e0b: AD14 MOVST0 NF,ZF
    082e0c: 6306 SB $C$L82, GEQ
                  $C$L81:
    082e0d: 761F02A1 MOVW DP, #0x2a1
    082e0f: 932E MOV AH, @0x2e
    082e10: 56B100A7 MOVB @AR7, #0x00, EQ

    - The system clocks differ (280039C at 120 MHz, TC233 at 200 MHz). After conversion, the two statements take 240 ns and 214 ns respectively on the 280039C, versus 90 ns and 26 ns on the TC233.

    Looking forward to your reply. Thanks,

    Tianze

  • - The TC233 assembly for the code in question, shared with my department's approval, is shown in the figure:

    Looking forward to your reply. Thanks,

    Tianze

  • Tianze,

    The measurements for C28 seem consistent with the generated assembly you posted.

    240ns x 120MHz = 28.8, so about 29 cycles

    214ns x 120MHz = 25.7, so about 26 cycles

    For the TC23x,

    90ns x 200MHz = 18 cycles

    26ns x 200MHz = 5 cycles (this part does not make sense to me, given the operation is very similar to the above; the picture also doesn't indicate this is done in 5 cycles)
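
    The conversion used above is cycles = time (ns) x clock (MHz) / 1000. A small helper makes the arithmetic explicit; the function name `ns_to_cycles` is made up for this illustration and is not part of any TI or Infineon library.

```c
/* Convert an execution time in nanoseconds to CPU cycles at a given clock.
 * ns * MHz = 1e-9 s * 1e6 1/s = 1e-3 cycles, hence the division by 1000. */
double ns_to_cycles(double time_ns, double clock_mhz)
{
    return time_ns * clock_mhz / 1000.0;
}
```

    For example, `ns_to_cycles(90.0, 200.0)` evaluates to 18 cycles, matching the TC23x figure for the first statement.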

    As such, for C28 I don't believe there is much room for further improvement.