66AK2H14: Issues about the performance of OpenCL on 66AK

Part Number: 66AK2H14

Hi, I am evaluating the performance of OpenCL on the 66AK platform and have run into a problem. I wrote a kernel that simply calls an FFT function provided by TI, and I measured the time taken for different FFT sizes. I then compared this with running the same function directly on the DSP cores without OpenCL, and found that the performance with OpenCL is much lower. I am puzzled, because the function is the same and the input and output data are all allocated in L3. I hope somebody can help me. Thanks!

  • Hi,

    Can you take a look at the dsplib_fft example in /usr/share/ti/examples/opencl? If you measure the performance of the same FFT function, the performance should be comparable whether it is called within the kernel or from a standalone C application on DSP. Are you measuring on the DSP side? Is your kernel work group size (1,1,1)? We can help you if you can give us more information.

    - Yuan
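
One way to check whether the gap comes from the FFT itself or from kernel dispatch is to separate the two on the host using standard OpenCL event profiling. The sketch below is only an illustration and is not taken from the dsplib_fft example. It assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE and that the four timestamps were read with clGetEventProfilingInfo (CL_PROFILING_COMMAND_QUEUED, _SUBMIT, _START, _END). The type and function names (kernel_timestamps, kernel_breakdown, split_kernel_time) are hypothetical, and how closely START/END bracket the actual DSP execution on this platform is also an assumption.

```c
#include <stdint.h>

/* Hypothetical helper type: the four nanosecond timestamps that
 * clGetEventProfilingInfo reports for one kernel event. */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} kernel_timestamps;

/* Hypothetical result type: time split into dispatch and execution. */
typedef struct {
    uint64_t dispatch_ns;  /* queued -> start: host and runtime overhead */
    uint64_t execute_ns;   /* start  -> end:   time spent on the device  */
    uint64_t total_ns;     /* queued -> end                              */
} kernel_breakdown;

/* Split one kernel event into dispatch overhead and execution time.
 * Returns all zeros if the timestamps are not in increasing order. */
kernel_breakdown split_kernel_time(kernel_timestamps t)
{
    kernel_breakdown b = {0, 0, 0};
    if (t.queued > t.submit || t.submit > t.start || t.start > t.end) {
        return b;  /* inconsistent profiling data, report nothing */
    }
    b.dispatch_ns = t.start - t.queued;
    b.execute_ns  = t.end - t.start;
    b.total_ns    = t.end - t.queued;
    return b;
}

/* Fraction of the total time that was dispatch overhead, in [0, 1]. */
double dispatch_fraction(kernel_breakdown b)
{
    if (b.total_ns == 0) {
        return 0.0;
    }
    return (double)b.dispatch_ns / (double)b.total_ns;
}
```

If execute_ns is close to the standalone DSP measurement while dispatch_ns dominates for small FFT sizes, the difference is more likely dispatch overhead than the FFT function running slower inside the kernel.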