Is my OMAP L137 C6747 CPU not running full tilt?



Hi

I have an algorithm that I designed in Simulink and for which I generated code using Real Time Workshop Embedded Coder. The algorithm performs 64,000 multiplications and adds per second.

I'm trying to run the generated code on an OMAP L137 evaluation module. The problem is that the CPU load graph tells me this algorithm consumes 10% of the CPU, which seems exceedingly large to me. My OMAP device is clocked from an external 24MHz crystal and I set up the PLL to give me a x30 gain, so my C6747 CPU should be clocked internally at 720MHz. My code is probably not the most efficient, so I'm probably not doing more than 1 multiplication per clock cycle, but I would still expect an algorithm running 64,000 multiplications and adds per second to consume only about 64,000 / 720,000,000 ≈ 0.009% of the CPU time, far below 1%.
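The estimate above can be checked with a short back-of-the-envelope script. This is only a sketch using the figures quoted in this post; the one-cycle-per-multiply-add cost is an idealized assumption, not a measured property of the C6747, and the function names are made up for illustration. It also turns the question around and shows how many clock cycles per multiply-add a 10% load would imply.

```python
# Rough CPU load estimate from the numbers in the post.
# Assumption: an idealized cost of 1 clock cycle per multiply-add.

CRYSTAL_HZ = 24_000_000    # external crystal (from the post)
PLL_MULTIPLIER = 30        # x30 PLL gain (from the post)
MACS_PER_SECOND = 64_000   # multiply-adds per second (from the post)
MEASURED_LOAD = 0.10       # 10% reported by the CPU load graph


def cpu_clock_hz(crystal_hz, multiplier):
    """Clock after the PLL, assuming no pre- or post-division."""
    return crystal_hz * multiplier


def expected_load(macs_per_second, clock_hz, cycles_per_mac=1.0):
    """Fraction of CPU time used if each multiply-add costs cycles_per_mac."""
    return macs_per_second * cycles_per_mac / clock_hz


def implied_cycles_per_mac(measured_load, clock_hz, macs_per_second):
    """Cycles per multiply-add that would explain a measured load."""
    return measured_load * clock_hz / macs_per_second


clock = cpu_clock_hz(CRYSTAL_HZ, PLL_MULTIPLIER)
print(f"CPU clock:              {clock / 1e6:.0f} MHz")
print(f"Expected load:          {expected_load(MACS_PER_SECOND, clock):.4%}")
print(f"Implied cycles per MAC: "
      f"{implied_cycles_per_mac(MEASURED_LOAD, clock, MACS_PER_SECOND):.0f}")
```

If the clock really is 720MHz, a 10% load works out to roughly 1,125 cycles per multiply-add, which points to overhead somewhere other than the arithmetic itself, or to a clock that is slower than assumed.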

Is my understanding correct? In other words, should I expect better than 10% CPU load from the OMAP L137, or am I misunderstanding something about the part, and 10% CPU is really what it takes to compute 64,000 multiplications and adds per second?

I should also mention that I'm using the BSP BIOS to schedule everything, that I did not configure the PLL PREDIV and POSTDIV values because I assumed that they default to /1, and that my multiplications are all floating point. I’m also running everything (data & code) from the internal L2 RAM.
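To show why the divider assumption matters, here is a small sketch of how the CPU clock would change if PREDIV or POSTDIV were not /1. The divide-multiply-divide formula is a generic PLL model assumed for illustration; the actual register defaults and clock tree have to be checked against the device documentation, and the /2 value below is purely hypothetical.

```python
# Generic PLL model (an assumption, not taken from the device manual):
#   output = input / prediv * multiplier / postdiv

def pll_output_hz(oscin_hz, prediv=1, multiplier=1, postdiv=1):
    """Return the PLL output frequency for the given divider settings."""
    if min(prediv, multiplier, postdiv) < 1:
        raise ValueError("dividers and multiplier must be >= 1")
    return oscin_hz / prediv * multiplier / postdiv


OSCIN_HZ = 24_000_000  # external crystal (from the post)

# What the post assumes: both dividers at /1.
assumed = pll_output_hz(OSCIN_HZ, prediv=1, multiplier=30, postdiv=1)

# Hypothetical case: the post-divider actually sits at /2.
halved = pll_output_hz(OSCIN_HZ, prediv=1, multiplier=30, postdiv=2)

print(f"Dividers at /1: {assumed / 1e6:.0f} MHz")
print(f"POSTDIV at /2:  {halved / 1e6:.0f} MHz")
```

A divider left at anything other than /1 scales the real clock down by the same factor, so reading the PLL registers back at run time is a cheap way to confirm the frequency the load figure is based on.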

Thank you immensely for your help!

Sincerely,

Jean 

 

  • Hi Jean,

    I'm afraid I do not have an answer for you, but I do feel your pain. I, too, have had cause to question the speed of the C6747 core; see this post for details:

    http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/t/65412.aspx

    In summary: code that ran on an Intel platform under Linux ran 2.5 times slower on the C6747, and this was after using the fast RTS library and taking the difference in clock speeds into account. That doesn't sound like much of an issue, but given that the C6747 cannot be clocked as fast as the Intel, the real slowdown in my code was more like 14 times. The big stumbling block for me was the need to do double precision floating point divides: the Intel ate these for breakfast, but the C6747 insisted on making a full three course meal out of them. Does your code include any divisions? That may go some way to explaining your shortfall.

    Also, I notice that you clocked your C6747 at 720MHz, but the data sheet specifies a maximum clock speed of 456MHz for this core. Are you causing problems by clocking it too fast?

    Cheers,

    John.