TMS570LS1227 performance issues

Other Parts Discussed in Thread: TMS570LS1227, HALCOGEN

Dear TI,

The datasheet states that the TMS570LS1227 provides 1.6 DMIPS/MHz and has configurations that run at up to 180 MHz, giving up to 288 DMIPS. But when I benchmark it, I get only about 50 DMIPS. Could you explain this or help me solve it? Please see my test method below.
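As a quick check of the datasheet arithmetic, here is a minimal sketch, assuming the 1.6 figure is a per-MHz rating. The helper name `peak_dmips` is hypothetical, and the rating is passed in tenths to keep the math in integers.

```c
#include <stdint.h>

/* Hypothetical helper: peak DMIPS from a per-MHz rating and a clock.
 * dmips_per_mhz_x10 is the rating in tenths (16 means 1.6 DMIPS/MHz). */
static uint32_t peak_dmips(uint32_t dmips_per_mhz_x10, uint32_t clock_mhz)
{
    return (dmips_per_mhz_x10 * clock_mhz) / 10u;
}
```

With a rating of 16 (1.6 DMIPS/MHz) and a clock of 180 MHz, this gives the 288 DMIPS quoted above.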

I tried 3 methods to benchmark the _delay_cycles(1000) function and measured its execution time with each:

1. Pin toggle

2. RTI

3. CCS (count event feature)

All 3 test results come out at roughly 50 DMIPS.

Please see the attached file.

7144.PIN TOGGLE.pdf

  • Hi ShunFan,

    Please check your disassembly to see how many instructions _delay_cycles() uses. In my test the loop uses 2 instructions, and each instruction takes 1 CPU cycle.

    1. 0x00005254  SUBS   R12, R12, #1

    2. 0x00005258  BNE     xxxx

    The flash wrapper fetches 128 bits at a time. If the 2 instructions sit in the same 128-bit line (0x00005250, ...54, ...58, ...5C), they are fetched together and take 2 cycles to execute. If they are not in the same line, the CPU has to read the flash again to fetch the second instruction after running the first, and reading flash takes more cycles (HALCoGen configures the flash wait states to 3).
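To check whether two instruction addresses share a fetch line, a minimal sketch follows. It assumes only what is stated above, namely that one fetch returns an aligned 128-bit (16-byte) line. The names `FLASH_LINE_BYTES`, `flash_line_base`, and `same_flash_line` are hypothetical and not part of any TI library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one flash fetch returns an aligned 128-bit (16-byte) line. */
#define FLASH_LINE_BYTES 16u

/* Hypothetical helper: start address of the line that contains addr. */
static uint32_t flash_line_base(uint32_t addr)
{
    return addr & ~(FLASH_LINE_BYTES - 1u);
}

/* Hypothetical helper: true if both addresses fall in the same line. */
static bool same_flash_line(uint32_t a, uint32_t b)
{
    return flash_line_base(a) == flash_line_base(b);
}
```

For the two addresses listed above, 0x00005254 and 0x00005258, both map to the line starting at 0x00005250, so `same_flash_line` returns true. A pair such as 0x0000525C and 0x00005260 would straddle two lines.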

    Can you please execute your code in RAM and measure again?

    Regards,

    QJ 

  • Dear QJ,

    Thanks for your reply.

    I checked my disassembly and see that _delay_cycles(1000) executes 1000 instructions:

    main():

    0000610c:   E300C1F3  MOVW  R12, #499        Executes once

    00006110:   E340C000  MOVT  R12, #0          Executes once

    $1_$4:

    00006114:   E25CC001  SUBS  R12, R12, #1     Executes 499 times

    00006118:   1AFFFFFD  BNE   $1_$4            Executes 499 times

    Each instruction takes 16~20 CPU clocks. The whole _delay_cycles(1000) takes 3539 CPU cycles.
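The instruction count and the total cycle count can be combined into an average. The sketch below takes the figures above at face value (2 setup instructions, a 2-instruction loop run 499 times, 3539 cycles in total). The function names are hypothetical, and the average is returned in thousandths of a cycle to avoid floating point.

```c
#include <stdint.h>

/* Instructions executed by the listing: two setup instructions plus a
 * two-instruction loop body that runs loop_count times. */
static uint32_t delay_instruction_count(uint32_t loop_count)
{
    return 2u + 2u * loop_count;
}

/* Average cycles per instruction, in thousandths of a cycle.
 * The caller must pass a non-zero instruction count. */
static uint32_t milli_cycles_per_instruction(uint32_t total_cycles,
                                             uint32_t instructions)
{
    return (uint32_t)(((uint64_t)total_cycles * 1000u) / instructions);
}
```

With a loop count of 499 the listing executes 1000 instructions, and 3539 total cycles then averages to 3539 thousandths, about 3.5 cycles per instruction over the whole call. A per-instruction measurement may differ from this average.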

    BTW, I don't know how to execute my code in RAM. Could you show me how?

    Also, could you explain what wait states are?

    Please see my attached project, too.

    5670.20150320performance.rar