Programming a parallel application using TMS320C6678

Other Parts Discussed in Thread: TMS320C6678

Hi,

Which steps are needed to program a parallel application on the TMS320C6678 (for example, memory configuration)?

I read an article about this problem, but I don't really understand it.

I want to parallelize an FFT application with a total data size of (256 * 2080) bytes. Each FFT has a size of 256, so I need to distribute the data across my 8 DSP cores.

Thank you for your help.

Lopez

  • Hello Lopez,

    Can I ask you a question? Which method do you use to implement the multicore FFT? OpenMP? And which example code (under which TI directory, specifically) did you use as a first step to get familiar with it?

    Thank you in advance.

    wendy

  • Hello Wendy,

    I use OpenMP; it is easy to use. First I ran the OpenMP hello world example, then the matrix-vector example with OpenMP. I then modified the code of the matrix-vector example to write my application.

    To find these examples:

    New CCS Project -> select OMP Examples under "Project templates and examples", then choose a project that uses OpenMP. Make sure that your OpenMP version is 1.2.0.05.
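
    For orientation, a generic OpenMP hello world looks roughly like the sketch below. This is not the code of the TI project template; the function name hello_from_all_threads is made up for illustration, and the #ifdef guard is only there so the sketch also builds without OpenMP support.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: every thread of the parallel region prints a
   greeting; the return value is the number of threads that ran. */
int hello_from_all_threads(void)
{
    int greeted = 0;

    #pragma omp parallel reduction(+:greeted)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();  /* index of the calling thread */
#endif
        printf("Hello World from thread %d\n", tid);
        greeted += 1;
    }
    return greeted;
}
```

    Calling hello_from_all_threads() from main should print one line per thread; on the C6678 that would be one line per core taking part in the OpenMP runtime.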

    For the FFT example: import the FFT project (FFT_Example_66_LE_COFF) into your workspace. This project is located at <install_Directory>/ti/dsplib_c66x_3_1_0_0/examples

    Best regards

    Lopez

  • Hello Lopez,

    Thank you very much for your response.

    I tried the hello world example based on the tutorial http://processors.wiki.ti.com/index.php/OpenMP_on_C6000, but every time I loaded my .out, it ran automatically, and the console showed [C66xx_1] ERROR: Ipc_start failed in OpenMP_masterTask.

    There is a similar problem elsewhere in the forum. Following the replies there, I changed my code generation tools to version 7.4.0, but the error is still there.

    Did you have this problem before? If so, could you please tell me how you solved it?

    Thanks a lot.

    Best wishes,

    Wendy

  • Hi Wendy,

    Please follow the steps below to avoid this issue:

    1. Launch your target configuration for your 6678 EVM in CCS Debug perspective

    2. 'Connect Target'

    3. For each core: select the core, then go to Tools -> Debugger Options -> Generic Debugger Options. This should bring up a window. Under 'Auto Run Options', uncheck the box that says 'On a program load or restart'.

    4. After completing step 3 for all cores, click on 'Remember my settings'.

    5. Now load your .out file on the cores. After loading has completed, the cores should be suspended at the symbol __c_int00().

    6. Now run the cores.

    Best Regards

    Lopez

  • Hello Lopez,

    Thank you so much for your patience and your response. After I set the auto run options as you described, the cores stopped at the symbol __c_int00 at the start, but the error was still there. :( When I used a simulator in my target configuration instead, it worked. I don't know what happened; my board works fine with other code, so the hardware shouldn't be the problem...

    My CCS is 5.3,

    XDCtools is 3.25.0.48,

    OpenMP is 1.1.3.02,

    SYS/BIOS is 6.35.1.29

    CGT is 7.4.0 

    If you remember any setting that might have influenced the result while you were testing, please tell me.

    Thank you

    Best regards,

    wendy

      

  • Hi Lopez,

    Sorry for taking your time, but I have one more question.

    I'm also working on a project related to multicore FFT parallelization, but I don't have a clear idea of which part of the FFT algorithm should be multithreaded. I think you cannot directly split the original input signal, right?

    I hope you can respond.

    Thank you

    Best regards,

    Wendy

  • Hello Wendy,

    In my case I want to run FFTs on 2080 * 256 points, so I distribute 260 * 256 points to each core. The parallelization comes from distributing the data across the cores: the input consists of 2080 independent 256-point FFTs, so no single FFT has to be split. Each core then runs the FFT algorithm on its own share of the data.
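
    A minimal sketch of this data distribution with OpenMP might look like the code below. It is not my actual project code. The names fft_kernel_t, blocks_for_core, placeholder_kernel and transform_all_blocks are made up for illustration, and the placeholder kernel only negates the samples so that the sketch runs without DSPLIB; a real application would pass a function that calls the actual 256-point FFT routine instead.

```c
#include <stddef.h>

#define NUM_CORES 8     /* C66x cores on the TMS320C6678 */
#define NUM_FFTS  2080  /* independent FFTs, from the thread */
#define FFT_SIZE  256   /* points per FFT, from the thread */

/* Hypothetical kernel signature: transforms one block of n values in place. */
typedef void (*fft_kernel_t)(float *block, int n);

/* Number of blocks a core gets under an even split; the first
   (num_blocks % num_cores) cores take one extra block. */
int blocks_for_core(int core, int num_cores, int num_blocks)
{
    int base  = num_blocks / num_cores;
    int extra = num_blocks % num_cores;
    return base + (core < extra ? 1 : 0);
}

/* Placeholder so the sketch runs without DSPLIB: negates each value.
   Replace it with a wrapper around the real FFT call. */
void placeholder_kernel(float *block, int n)
{
    int i;
    for (i = 0; i < n; i++)
        block[i] = -block[i];
}

/* Apply the kernel to every block. Each loop iteration touches only its
   own block, so the iterations can run on different cores without locking. */
void transform_all_blocks(float *data, int num_blocks, int block_len,
                          fft_kernel_t kernel)
{
    int b;
    #pragma omp parallel for schedule(static)
    for (b = 0; b < num_blocks; b++)
        kernel(data + (size_t)b * block_len, block_len);
}
```

    With 8 threads, a static schedule splits the 2080 iterations into roughly equal contiguous chunks, which matches the 260 blocks per core that blocks_for_core(core, NUM_CORES, NUM_FFTS) returns. A call would look like transform_all_blocks(data, NUM_FFTS, FFT_SIZE, placeholder_kernel); for interleaved complex input, block_len would be twice the number of points.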

    Best regards

    Lopez