execution time

Hello everyone,

I am using CCS v5.5.

I have a program that I want to run on a DSP.
For now I am simulating it with the Texas Instruments simulators. How can I find out the execution time of the program on the DSP, that is, how long it will run or how many CPU cycles it will consume on the DSP processor? Can I calculate the exact time, or approximate it through simulation, without having the processor?

Thanks, greetings.
  • Hello,
    What exact simulator are you using?

    Thanks
    ki
  • C674x CPU Cycle Accurate Simulator, Little Endian.

    Thanks.
  • You can use the CCS Function Profiler on C6x simulators:
    processors.wiki.ti.com/.../Profiler

    There is also the profile clock to simply count cycles from point A to point B:
    processors.wiki.ti.com/.../Profile_clock_in_CCS

    Thanks
    ki
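
Once the profile clock reports a cycle count, turning it into an execution time only requires the CPU clock frequency. The sketch below shows that arithmetic. The function name `cycles_to_microseconds` and the 300 MHz figure in the usage note are assumptions for illustration; the real frequency depends on the device and its PLL settings.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a measured cycle count into microseconds.
 * cpu_mhz is supplied by the caller and is an assumption here:
 * the actual value depends on the target device and its PLL setup. */
double cycles_to_microseconds(uint64_t cycles, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    /* One cycle lasts 1/cpu_mhz microseconds. */
    return (double)cycles / cpu_mhz;
}
```

For example, under the assumed 300 MHz clock, `cycles_to_microseconds(300000, 300.0)` corresponds to 1000 microseconds, that is, 1 ms.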
  • Ok. thank you very much.

    With the second option, the profile clock, the CPU cycle count in Debug mode differs from the count in Release mode.

    When I am no longer simulating and have the actual DSP processor, how many cycles will the program consume: the Debug count, the Release count, or some other number?
    If it is some other number, how can I calculate it?

    Thanks, greetings
  • user4364201 said:
    With the second option, the profile clock, the CPU cycle count in Debug mode differs from the count in Release mode.

    This is common, since the Release configuration often has compiler optimization enabled, so the same code typically takes fewer cycles than it does in Debug.

    user4364201 said:
    When I am no longer simulating and have the actual DSP processor, how many cycles will the program consume: the Debug count, the Release count, or some other number?

    The profile clock can be used both on simulators and on actual hardware, so you can use the same profile clock method to measure cycles on the hardware as well.
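
As a complement to the profile clock (and not a replacement for it), cycles between point A and point B can also be counted from inside the program. The sketch below is a different technique from the CCS profile clock: on a C64x+/C674x target built with the TI compiler it reads the `TSCL` time-stamp register declared in `c6x.h`. The helper names (`cycle_counter_start`, `cycle_counter_read`, `elapsed_cycles`, `measure_cycles`) are hypothetical, and the host fallback based on `clock()` is an assumption added only so the code can be tried off-target; its ticks are not CPU cycles.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#ifdef _TMS320C6X
#include <c6x.h>   /* TI C6000 compiler header; declares the TSCL register */

/* Any write to TSCL starts the free-running time-stamp counter. */
static void cycle_counter_start(void) { TSCL = 0; }
static uint32_t cycle_counter_read(void) { return TSCL; }
#else
/* Host fallback (assumption, for off-target experiments only):
 * clock() ticks are NOT CPU cycles. */
static void cycle_counter_start(void) { }
static uint32_t cycle_counter_read(void) { return (uint32_t)clock(); }
#endif

/* Unsigned subtraction stays correct across one 32-bit wraparound. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: counter ticks spent in fn(), including the
 * small cost of the call itself and of the two counter reads. */
uint32_t measure_cycles(void (*fn)(void))
{
    uint32_t start, end;

    assert(fn != 0);
    cycle_counter_start();
    start = cycle_counter_read();   /* point A */
    fn();
    end = cycle_counter_read();     /* point B */
    return elapsed_cycles(start, end);
}
```

The result includes a few cycles of measurement overhead, so for very short functions it may be worth measuring an empty function first and subtracting that baseline.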

  • Ok. thank you very much.

    So, once I have actual hardware I will be able to measure cycles.

    But right now I don't have the hardware; my idea is to measure cycles before I get it. Will the cycle counts measured on the actual hardware be very similar to those measured in the simulator in Release mode?

    In other words, without the actual hardware, can I find out through the simulator how many cycles the hardware will consume?

    Sorry, I'm new at this.

    Thanks, greetings
  • user4364201 said:
    But right now I don't have the hardware; my idea is to measure cycles before I get it. Will the cycle counts measured on the actual hardware be very similar to those measured in the simulator in Release mode?


    It depends on the application. If you are just benchmarking an algorithm running on the core, the numbers may be similar. But anything with a larger scope will be off, because you are using a CPU simulator that assumes a flat memory system (it does not model cache) and does not model peripherals other than a few timers.

    A Device Cycle Accurate simulator models peripherals and cache while still being cycle accurate for profiling. That would be a better bet, but we only offer two Device Cycle Accurate simulators (C6745 and C6747).

    user4364201 said:
    In other words, without the actual hardware, can I find out through the simulator how many cycles the hardware will consume?

    As mentioned above, it depends on the application, on what you are trying to profile, and on whether you are using a CPU or a Device simulator. But you can't count on knowing what the numbers will be on real hardware until you actually profile on real hardware.


    Thanks

    ki
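
To see why a flat-memory CPU simulator tends to undercount, a rough first-order model can help: add an estimated memory stall term to the core cycle count the simulator reports. The sketch below is only an illustration of that idea. The function name `estimate_total_cycles`, the miss ratio, and the miss penalty are all assumptions, and real hardware can overlap or hide stalls in ways this model ignores.

```c
#include <assert.h>
#include <stdint.h>

/* First-order estimate (assumption, not a TI formula):
 *   total = core cycles + memory accesses * miss ratio * miss penalty
 * core_cycles    : count reported by the CPU Cycle Accurate simulator
 * mem_accesses   : hypothetical number of memory accesses in the code
 * miss_ratio     : hypothetical fraction of accesses that miss the cache
 * penalty_cycles : hypothetical stall cycles per miss */
double estimate_total_cycles(uint64_t core_cycles, uint64_t mem_accesses,
                             double miss_ratio, uint32_t penalty_cycles)
{
    assert(miss_ratio >= 0.0 && miss_ratio <= 1.0);
    return (double)core_cycles
         + (double)mem_accesses * miss_ratio * (double)penalty_cycles;
}
```

With made-up inputs of 10000 core cycles, 1000 accesses, a 0.25 miss ratio, and a 50-cycle penalty, the estimate is 22500 cycles, more than double the core-only figure. Only profiling on a Device Cycle Accurate simulator or on real hardware gives numbers that can be trusted.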

  • Hi,

    user4364201 said:
    Will the cycle counts measured on the actual hardware be very similar to those measured in the simulator in Release mode?

    The simulator you are using is CPU Cycle Accurate, which means it accounts for pipeline stalls and latencies, but not for cache and DMA latencies (check this wiki category). Therefore, you will get a fairly accurate cycle count for the core processing (an FIR or IIR filter, for example), but not for the data transfers in and out of the core (via the cache, DMA, external memory, etc.).

    As the same link mentions, if you use the Device Cycle Accurate simulator you will get a cycle count much closer to that of the actual device. The tradeoff is that the simulation will be a lot slower.

    Hope this helps,

    Rafael