C28 slower than expected

Other Parts Discussed in Thread: F28M36P63C2

Hi,

I am using the F28M36P63C2.
After running a servo algorithm on the C28, I realized that it was very slow. One instruction takes almost 50 ns, which corresponds to 20 MHz (the quartz system clock).

At this point in time, the main question is regarding the performance of the processor. We need a particular servo algorithm (that we already have written) to operate in well under 10 microseconds, and preferably under 5 microseconds. With our latest compiler settings and such, we've got it running down to around 8.5 microseconds, but we'd still like to see if it is possible to get better performance.

I'm configuring the M3 at 75MHz and the C28 at 150MHz.

Is there a way to make the C28 faster or do we have to change device?

Thank you,
Marc 

  • Hi Marc,

    How are you profiling the code to conclude that one instruction takes 50 ns? Also, have you measured the XCLKOUT frequency?

    Regards,

    Vivek Singh 

  • Hi Vivek,

    Thank you for your answer :).

    I just set a GPIO pin and clear it, and it takes 100ns. But even without it, the algorithm takes around 150 instructions to complete (just additions, multiplies and subtractions).

    I looked at XCLKOUT. With the divider set to 4 it is 37.5 MHz, but even with the divider set to 1 nothing changes.

    The timing I quoted for the entire algorithm is with optimization enabled. Without it, the algorithm takes more than 10 µs.
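    To put these timings in terms of CPU cycles, here is a minimal sketch that converts a duration into a cycle count. It assumes the C28 really runs at the configured 150 MHz; `cycles_in` is a hypothetical helper, not part of any TI library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of whole CPU cycles that fit in time_ns
 * when the CPU runs at clock_mhz.
 * ns * MHz yields thousandths of a cycle, hence the division by 1000. */
static uint32_t cycles_in(uint32_t time_ns, uint32_t clock_mhz)
{
    assert(clock_mhz != 0u);
    assert(time_ns <= UINT32_MAX / clock_mhz);  /* guard against overflow */
    return (time_ns * clock_mhz) / 1000u;
}
```

    Under that assumption, `cycles_in(8500, 150)` gives 1275 cycles for the optimized algorithm, or roughly 8.5 cycles for each of the ~150 operations, and `cycles_in(100, 150)` gives 15 cycles for the GPIO set/clear pair. The 5 µs target corresponds to `cycles_in(5000, 150)`, which is 750 cycles.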

    Thank you,
    Marc 

  • Hi Marc,

    This divider applies only to XCLKOUT and doesn't affect the CPU clock frequency. Since XCLKOUT is 37.5 MHz with *DIV_4, the C28x is running at 150 MHz.
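    The arithmetic behind that conclusion can be sketched as follows. It assumes XCLKOUT is derived from the C28 system clock through the divider, as described above; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: infer the CPU clock from a measured XCLKOUT
 * frequency, assuming XCLKOUT = CPU clock / divider. */
static uint32_t sysclk_khz_from_xclkout(uint32_t xclkout_khz, uint32_t divider)
{
    return xclkout_khz * divider;
}
```

    With a measured 37500 kHz and a divider of 4, this yields 150000 kHz, i.e. 150 MHz.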

    Regarding one instruction taking 50 ns: it looks like you were referring to a C statement and not an assembly instruction. Depending on what the C statement is, there can be multiple assembly instructions behind it. In your case, if you are setting the GPIO through the header file using ".bit", this is converted into a read-modify-write (RMW) instruction, which takes more cycles than using ".all". Also, read/write accesses to peripherals differ from reads, writes and fetches from memory. Peripheral accesses (here you are accessing GPIO) are not single-cycle accesses.

    You may want to look at the generated assembly code to check whether it is optimized properly. If you see an issue with optimization, we can ask the compiler team to look into it.
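    The ".bit" versus ".all" difference can be illustrated with a host-side mock. This is only a sketch: `gpio_reg`, `gpio_bits` and both functions are hypothetical stand-ins written in the style of the header files, not the actual device definitions, and the mock says nothing about real cycle counts.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit GPIO register: one union with a
 * bit-field view (.bit) and a whole-word view (.all). */
struct gpio_bits {
    uint32_t GPIO0 : 1;
    uint32_t GPIO1 : 1;
    uint32_t rest  : 30;
};

union gpio_reg {
    uint32_t         all;
    struct gpio_bits bit;
};

/* Bit-field write: the word has to be read, one bit changed, and the
 * word written back (read-modify-write). */
static void set_gpio1_bit(volatile union gpio_reg *dat)
{
    dat->bit.GPIO1 = 1u;
}

/* Whole-word write: a single store with no preceding read. This suits
 * a "set"-style register where writing 0 to a bit has no effect. */
static void set_gpio1_all(volatile union gpio_reg *set)
{
    set->all = (uint32_t)1 << 1;
}
```

    Comparing the disassembly of two such functions built for the C28x should show the extra read in the bit-field version.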

    Regards,

    Vivek Singh

     

  • Marc,

    To reinforce what Vivek has said, this wiki page discusses how read-modify-write and the CPU pipeline impact GPIO toggle speed:

    http://processors.wiki.ti.com/index.php/General_Purpose_IO_(GPIO)_FAQ_for_C2000#Q:_Toggling_of_the_GPIO_seems_slower_than_it_should_be

  • Devin,

    Thank you, it helped me understand the GPIO toggle speed :).

    But actually, I think my problem is more about the C code (for my servo algorithm). It takes too much time to do one operation like multiply, even with optimization. So with 150 operations...

    But thank you again, :)
    Marc 

  • Marc,

    Have you looked at the disassembly? Does it seem reasonable? Where is the data stored? If it is in an array, could that be adding significant indexing overhead? If it is in different data pages (every 64 words in memory), is the compiler having to swap the page pointer repeatedly?
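    As a rough way to reason about the data-page question, the sketch below counts how often the page changes across a sequence of word addresses, using the 64-word page size mentioned above. It is a simplified host-side model with made-up addresses and a hypothetical function name; it ignores accesses made through pointers, which may not need a page change at all.

```c
#include <stddef.h>
#include <stdint.h>

#define DP_PAGE_WORDS 64u  /* data page size in words, as quoted above */

/* Count how many times the data page changes when the given word
 * addresses are accessed in order. */
static size_t count_page_switches(const uint32_t *addrs, size_t n)
{
    size_t switches = 0;
    size_t i;

    for (i = 1; i < n; ++i) {
        if (addrs[i] / DP_PAGE_WORDS != addrs[i - 1] / DP_PAGE_WORDS) {
            ++switches;
        }
    }
    return switches;
}
```

    Alternating between two pages (for example 0x8000, 0x8040, 0x8001, 0x8041) costs three switches in this model, while grouping the same accesses by page (0x8000, 0x8001, 0x8040, 0x8041) costs one.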

  • Devin,

    Here is the situation:

    The data are stored in an array in shared memory. Some variables come from the M3; on the C28 side, I do operations with both M3 variables and C28 variables.

    Inside main, the function that calculates the servo algorithm is called in the "while" loop when a flag is set by the M3.
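    For reference, the structure described here might look roughly like the sketch below. Everything in it is an assumption made for illustration: the struct layout, the field names, and the placeholder `servo_step` are hypothetical. It also assumes the C28 is allowed to write the flag location, which may not hold for every shared memory block.

```c
#include <stdint.h>

/* Hypothetical shared-memory layout; names and fields are illustrative. */
struct shared_inputs {
    volatile uint16_t new_data;  /* set by the M3 side */
    volatile int32_t  setpoint;
    volatile int32_t  feedback;
};

/* Placeholder for the real servo algorithm. */
static int32_t servo_step(int32_t setpoint, int32_t feedback)
{
    return setpoint - feedback;
}

/* One pass of the "while" loop body. Returns 1 if the servo step ran. */
static int poll_once(struct shared_inputs *in, int32_t *out)
{
    int32_t sp;
    int32_t fb;

    if (!in->new_data) {
        return 0;
    }
    /* Copy the shared values into locals once, so the arithmetic works
     * on local copies instead of re-reading shared memory. */
    sp = in->setpoint;
    fb = in->feedback;
    in->new_data = 0u;  /* assumes the C28 may write this location */
    *out = servo_step(sp, fb);
    return 1;
}
```

    Whether reading the inputs into locals helps in practice is something the disassembly would have to confirm.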

    Marc