DSPLIB Parallelization



I'm working on some HPC stuff on the EVMK2H12 board with MCSDK HPC 3.00.01.08 and DSPLIB C66x 3.4.0.0. My programs currently just benchmark the DSPs to find out how well the TI DSP libraries perform. I am basing my program on the OpenMP example dsplib_fft. My question is: if I call the DSPLIB functions that way, does OpenMP actually parallelize their execution? Going through the library source code, I don't see any "#pragma omp" directives, so I can only assume that no parallel processing occurs. Of course, I don't have much knowledge of OpenMP or high performance computing in general, so I may simply be missing something. I just don't see how calling a function through OpenMP could magically parallelize it when nothing in the function's code creates any threads.

Also, is there any way to check how many threads/cores are active? I'm not using CCS for any code development. All my compilation is done on the EVM.

EDIT: I checked the number of threads and processors. The thread count shows only 1 (but that's expected, as the call is outside any omp pragmas), while the processor count shows 4. Does this mean that my program is executing on the ARM cores and not the DSPs?
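
For reference, a check like the one described in the EDIT can be sketched with the standard OpenMP runtime calls omp_get_num_threads() and omp_get_num_procs(). The struct and function names below (omp_counts, query_omp_counts) are made up for illustration and are not part of MCSDK or DSPLIB. The sketch runs entirely on the host side, so on this board it would report on the ARM cores; it does not say anything about what happens inside an offloaded region.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper type: the three numbers discussed in the post. */
typedef struct {
    int outside; /* threads seen outside any parallel region */
    int inside;  /* threads seen inside a parallel region    */
    int procs;   /* processors visible to the OpenMP runtime */
} omp_counts;

omp_counts query_omp_counts(void)
{
    /* Defaults used when the file is compiled without OpenMP support. */
    omp_counts r = {1, 1, 1};
#ifdef _OPENMP
    r.procs = omp_get_num_procs();

    /* Outside a parallel region the team consists of one thread. */
    r.outside = omp_get_num_threads();

    #pragma omp parallel
    {
        /* Let exactly one thread record the size of the team. */
        #pragma omp single
        r.inside = omp_get_num_threads();
    }
#endif
    return r;
}
```

Calling query_omp_counts() from main and printing the three fields should reproduce the observation above: 1 thread outside the parallel region, and a processor count that reflects the host. The 66AK2H12 has four ARM Cortex-A15 cores and eight C66x DSP cores, so a host-side count of 4 would be consistent with the ARM cores rather than the DSPs.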


And one more thing: I am booting from an NFS server. CCS doesn't seem to like that, because I can't run any programs on the board directly through CCS.

  • Hi Ankit,
    Moving this thread over to the MCSDK HPC forum for a faster response. Thank you for your patience.
  • Ankit,


    The dsplib_fft example demonstrates using the OpenMP target pragma to offload computation from ARM to DSP. In this case, the offloaded code happens to be a call to a function from the dsplib library. There is no parallelism in the offloaded code.
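
    As a rough sketch of that pattern (not the actual dsplib_fft source), the code below offloads a single serial function call with the target pragma. scale_block is a hypothetical stand-in for a DSPLIB kernel, and offload_scale is likewise an invented name. The map clauses assume the standard OpenMP 4.0 array-section syntax.

    ```c
    /* Functions called from a target region are marked for the device. */
    #pragma omp declare target
    /* Hypothetical stand-in for a DSPLIB kernel: plain serial C, no OpenMP. */
    static void scale_block(const float *in, float *out, int n, float gain)
    {
        for (int i = 0; i < n; i++)
            out[i] = gain * in[i];
    }
    #pragma omp end declare target

    void offload_scale(const float *in, float *out, int n, float gain)
    {
        /* The target pragma moves this block to the device.
         * Nothing here creates a team of threads, so the call
         * executes serially on the device. */
        #pragma omp target map(to: in[0:n]) map(from: out[0:n])
        {
            scale_block(in, out, n, gain);
        }
    }
    ```

    The offload changes where the function runs, not how it runs: without a parallel construct inside the target region, the work is not split across cores.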


    The vecadd example demonstrates using the target pragma for offload and the 'parallel for' pragma to parallelize the offloaded loop across the DSP cores. Wiki page with more detail: processors.wiki.ti.com/.../MCSDK_HPC_3.x_OpenMP
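
    A minimal sketch of the combination described for vecadd, written from the description above rather than copied from the shipped example. The function name and signature are assumptions for illustration.

    ```c
    /* Hypothetical vector add; not the source of the MCSDK vecadd example. */
    void vecadd(const float *a, const float *b, float *c, int n)
    {
        /* Offload the block to the device and describe the data movement. */
        #pragma omp target map(to: a[0:n], b[0:n]) map(from: c[0:n])
        {
            /* Split the loop iterations across the threads of a team. */
            #pragma omp parallel for
            for (int i = 0; i < n; i++)
                c[i] = a[i] + b[i];
        }
    }
    ```

    Here the 'parallel for' inside the target region is what distributes the work. If a compiler has no OpenMP support enabled, the pragmas are ignored and the loop simply runs serially on the host.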

    Are you using CCS to load/run programs on the DSP?

    Ajay

  • Thank you for that information Ajay. That helps me understand the structure better.

    I am not using CCS to load programs. I am compiling natively on the board. I tried to use CCS for the simple Hello World program, but I kept getting an error about CCS not being able to access the memory. I need to investigate that issue further, though.