Questions on C67x DSP FastRTS library

Hi there,

  I am using the C67x DSP FastRTS library for its single-precision sine and cosine functions. We are trying to speed up the execution of the sine/cosine functions, so we'd like to use the "FastRTS (Inlining) Pipelining w/128 Calls" configuration.

  According to the FastRTS library benchmark (c67xfastRTS_Benchmarking.pdf), the "FastRTS (Inlining) Pipelining w/128 Calls" configuration increases the processing speed significantly.

  For example, for the sine function, FastRTS needs 69 cycles, while FastRTS (Inlining) Pipelining w/128 Calls needs only 17 cycles.

  However, when I implemented it on the DSP and measured the processing time, I found that FastRTS (Inlining) Pipelining w/128 Calls took much longer than plain FastRTS. The function calls in my DSP code are below.

 (1). test_a = sinsp(value_a);     // processing time is about 70 cycles;

 (2). test_b = sinsp_i(value_b);     // processing time is about 130 cycles;
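
To make clear how I read the "Pipelining w/128 Calls" label, here is a minimal sketch of a loop over 128 inputs instead of a single isolated call. This is only my assumption about the benchmark setup, not something the PDF confirms. The names sin_inline_standin, sin_block, cycles_per_call and N_CALLS are hypothetical, and the stand-in body is a crude polynomial so the sketch compiles anywhere; on the DSP it would be replaced by sinsp_i.

```c
#include <assert.h>
#include <stddef.h>

#define N_CALLS 128  /* assumed to match the "w/128 Calls" benchmark label */

/* Hypothetical stand-in for the FastRTS inline routine sinsp_i().
 * The polynomial is a crude Taylor approximation for small |x| only,
 * NOT the FastRTS algorithm; it just gives the loop something to inline. */
static inline float sin_inline_standin(float x)
{
    float x2 = x * x;
    return x * (1.0f - x2 / 6.0f + (x2 * x2) / 120.0f);
}

/* One inlined call per iteration with no loop-carried dependency, so an
 * optimizing compiler is free to overlap successive iterations. */
void sin_block(const float *restrict in, float *restrict out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = sin_inline_standin(in[i]);
    }
}

/* Amortized cost per call: total cycles for the whole loop divided by
 * the number of calls in it. */
double cycles_per_call(unsigned long total_cycles, size_t n)
{
    return (double)total_cycles / (double)n;
}
```

If the 17-cycle figure is an average over such a loop, it would correspond to about 17 x 128 = 2176 cycles for the whole block, whereas my measurements in (1) and (2) time a single call each.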

Could you please tell me what is preventing FastRTS (Inlining) Pipelining w/128 Calls from performing as the benchmark states? How should I invoke the FastRTS (Inlining) Pipelining w/128 Calls function so that its processing time is 17 cycles?

Thank you.