Performance question: C6RunLib approach vs. codec approach vs. SysLink approach

Other Parts Discussed in Thread: SYSBIOS

We are developing a machine-vision algorithm on the DM8168.

Our application consists of ARM Linux code and DSP code.

As far as I know, there are three approaches for integrating ARM Linux code and DSP code:

1. C6RunLib approach

2. Codec approach

3. SysLink approach

Which approach gives the best performance when executing DSP code?

We have successfully integrated the ARM Linux code and the DSP code using the C6RunLib approach.

However, DSP performance is poor, so we have to improve it.

Our DSP code currently takes 30 ms to run, and we want to reduce its execution time to a few milliseconds.

How can we improve the DSP code performance?

 

 

  • Hello Seung,

    Yes, the C6000 DSP architecture requires some simple tuning to get the best performance. The document at http://www.ti.com/lit/pdf/sprabf2 should get you started. The techniques it describes apply even when you are using C6Run.
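    As a rough illustration of this kind of tuning, the sketch below applies two hints commonly used with the TI C6000 compiler: `restrict`-qualified pointers (a promise that the buffers do not overlap) and `#pragma MUST_ITERATE` (a promise about the loop trip count). The kernel `add_rows_sat`, its arguments, and the bounds 8 and 4096 are hypothetical and are not taken from the original poster's code. The pragma is guarded so the same file also builds with a host compiler.

```c
#include <stdint.h>

/* Hypothetical kernel: saturating add of two 8-bit image rows.
 * 'restrict' tells the compiler that a, b and out never alias,
 * which lets it schedule loads and stores more aggressively. */
void add_rows_sat(const uint8_t *restrict a,
                  const uint8_t *restrict b,
                  uint8_t *restrict out,
                  int n)
{
    int i;

#ifdef __TI_COMPILER_VERSION__
    /* TI compiler only. Assumption for this sketch: the caller always
     * passes n between 8 and 4096, and n is a multiple of 8. */
    #pragma MUST_ITERATE(8, 4096, 8)
#endif
    for (i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        out[i] = (uint8_t)(sum > 255u ? 255u : sum);  /* clamp to 255 */
    }
}
```

    The hints are only valid if the promises hold: overlapping buffers, or a row length that violates the stated trip-count bounds, can produce wrong results on the DSP.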

    The choice of the development method depends on what you eventually want to achieve:

    • C6Run -> The DSP is used primarily as a black-box accelerator that runs your own special code and does not access any peripherals.
    • Codec Engine -> The DSP runs many algorithms that need to coexist. The algorithms should be written to the xDAIS standard.
    • SysLink approach -> Maximum flexibility. The DSP can run your custom code and also access peripherals. This is the true DSP+ARM development model, but it requires you to learn SysLink and SYS/BIOS, and to split your code for execution between the ARM and the DSP.

     

    Also, you may already have considered some of the optimized code that TI provides. See here:

    http://processors.wiki.ti.com/index.php/Software_libraries

    You can use these libraries even when you are using C6Run.

     

    Please note that when using C6Run, every call has to cross the ARM->DSP boundary. This crossing has a large overhead (~100 µs), so you should try to get as much work as possible done each time you call into the DSP.
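    To make the cost of fine-grained calls concrete, here is a minimal sketch that contrasts a per-row entry point with a per-frame entry point. The function names, the pixel-inversion kernel, and the 480-row frame are hypothetical. The 100 µs constant is just the approximate per-call figure quoted above, not a measured value.

```c
#include <stdint.h>

/* Approximate per-call ARM->DSP overhead quoted in this thread. */
#define CALL_OVERHEAD_US 100L

/* Fine-grained entry point: if the ARM calls this once per row,
 * it pays one boundary crossing per row. */
void invert_row(uint8_t *row, int width)
{
    int x;
    for (x = 0; x < width; x++) {
        row[x] = (uint8_t)(255 - row[x]);
    }
}

/* Coarse-grained entry point: the ARM calls this once per frame,
 * and the row loop runs entirely on the DSP side. */
void invert_frame(uint8_t *frame, int width, int height)
{
    int y;
    for (y = 0; y < height; y++) {
        invert_row(frame + (long)y * width, width);
    }
}

/* Estimated time spent purely on boundary crossings, in microseconds. */
long boundary_overhead_us(long calls)
{
    return calls * CALL_OVERHEAD_US;
}
```

    Under these assumptions, calling `invert_row` from the ARM for each of 480 rows costs `boundary_overhead_us(480)`, about 48 ms in crossings alone, which is more than the whole 30 ms budget mentioned in the question. A single `invert_frame` call costs about 0.1 ms in crossings.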

    Also, C6Run supports asynchronous calling. Using that feature, you can do true ARM/DSP parallel programming and get the best performance out of your system. See here:

     

    http://processors.wiki.ti.com/index.php/C6RunLib_Documentation#Multi-threaded_Support_and_Asynchronous_Function_Calling_.28v0.98_or_higher.29

     

    Cheers,

    Gagan