C6748 Benchmarking: Run-Time Support Libraries

Clayton Gilmore

I am evaluating several candidate DSPs, and the C6748 is running surprisingly slow on the floating point math benchmarks, compared to other candidates. I have narrowed the issue down to the floating point divide operations, and therefore I have several questions regarding floating point support in the RTS lib:

1.) Does rts6740.lib use a fixed point algorithm for single- and double-precision divides (_divf and _divd, respectively)? I poked through the RTS code, and that's what it looks like.

2.) The fastRTS library (sprc060) uses floating point instructions, and in fact performs much closer to my expectations. Is fastRTS my only option for true floating point divides?

3.) Will I suffer any precision loss by using the fastRTS floating point divide functions (_divsp and _divdp)? The C6000 Optimizing Compiler's Guide (spru187q) says "these functions gain speed improvements at the cost of accuracy in the result," but the TMS320C67x FastRTS Library Programmer’s Reference (spru100a) says nothing about this. If there is a loss of precision, how can I quantify it?

4.) The product page says fastRTS is active, but it hasn't been updated since 2002. This page implies that there will be a new fastRTS release. If so, when? Are there any bug fixes in the new release? Is there any risk in using the old one vs. the new one?

5.) Do the fastRTS _divsp and _divdp functions handle NaN, infinity, and divide-by-zero?

6.) Other than what I've asked about, are there any downsides to using fastRTS on the C6748?

For reference, I am developing in C++ using CCS v3.3 and Code Gen Tools v7.0.4.

Thanks,

Clayton Gilmore

Software Engineer

Rockwell Collins, Inc.

over 15 years ago

0 Gagan Maur over 15 years ago

TI__Expert 8150 points

Clayton. I provided some information to you on a different thread also. Let me try and answer some of your questions here:

1) Your understanding is correct. See here: C:\Program Files\Texas Instruments\C6000 Code Generation Tools 6.1.15\lib\rtssrc\MATH\divf.c

2) Yes, the fastRTS implementation is what you should consider for achieving good performance. I provided you with the performance comparison in my last post.
Single precision divide performance:
* RTS = 540 clocks
* fastRTS = 37 clocks
* fastRTS with inlining (or vector invocation) = 3 clocks

3) You won't suffer any detectable precision loss. We have verified that the results match to within 0.000001 percent. As I mentioned in my previous post, we are close to releasing an updated fastRTS library. The updated release comes with a test bench that you can use to do this kind of analysis yourself.

4) Per my other comments, fastRTS is definitely an actively supported product, and we have completed a significant feature update to it that we will post in October. If you need an early copy, please let us know and we will provide one.

5) We have avoided special checks in the code. However, we do provide the entire fastRTS software (in the updated release) in optimized C, so such special checks are extremely easy to add if needed. I do recommend that such checks be added during the development phase of the project and removed for production.

6) As I said, some constraints are added, but in the majority of cases these are don't-cares. For example, the divsp function has the constraint below:
Special Cases:
* If |y| < 1.1755e-38 = 2^-126, then the return value is NaN = Not-a-Number (exponent and mantissa are all ones) > +/- 3.402823e+38 = +/- 1 * 2^+128 (largest single-precision floating-point number), with the sign of x.
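
To illustrate items 5 and 6, here is a minimal sketch of a development-time wrapper that screens the operands before calling a fast divide. Everything in it is an assumption made for illustration: the names (div_status, classify_div_operands, checked_div, reference_div, FASTDIV_NO_CHECKS) are hypothetical and are not part of fastRTS, the fast routine is passed in as a function pointer rather than naming a specific library call, and it relies on C99 isnan/isinf, which an older toolchain may not provide.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical status codes; not part of fastRTS. */
typedef enum {
    DIV_OK,
    DIV_NAN_OPERAND,
    DIV_INF_OPERAND,
    DIV_BY_ZERO,
    DIV_TINY_DIVISOR   /* 0 < |y| < FLT_MIN, the special case quoted above */
} div_status;

typedef float (*div_fn)(float, float);

/* Slow path: the compiler's ordinary divide. */
static float reference_div(float x, float y)
{
    return x / y;
}

/* Report the first operand condition a check-free fast divide may mishandle. */
static div_status classify_div_operands(float x, float y)
{
    if (isnan(x) || isnan(y)) return DIV_NAN_OPERAND;
    if (isinf(x) || isinf(y)) return DIV_INF_OPERAND;
    if (y == 0.0f)            return DIV_BY_ZERO;
    if (fabsf(y) < FLT_MIN)   return DIV_TINY_DIVISOR;
    return DIV_OK;
}

/* Use the fast routine only for ordinary operands; otherwise fall back to
 * the reference divide. The status pointer may be null if not needed. */
static float checked_div(div_fn fast, float x, float y, div_status *status)
{
    div_status s = classify_div_operands(x, y);
    assert(fast != 0);
    if (status != 0) {
        *status = s;
    }
    return (s == DIV_OK) ? fast(x, y) : reference_div(x, y);
}

/* Define FASTDIV_NO_CHECKS in production builds to drop the checks. */
#ifdef FASTDIV_NO_CHECKS
#define CHECKED_DIV(fast, x, y, st) ((fast)((x), (y)))
#else
#define CHECKED_DIV(fast, x, y, st) checked_div((fast), (x), (y), (st))
#endif
```

During development, call sites would go through CHECKED_DIV and log any status other than DIV_OK. Building with FASTDIV_NO_CHECKS then removes the overhead, in line with the recommendation in item 5.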

Regards,
Gagan
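
On question 3 (quantifying any precision loss), one way to put a number on it without waiting for the updated test bench is to compare the fast divide against a double-precision reference over many operand pairs. The sketch below does that on a host PC. Note the assumptions: fast_div_standin is a hypothetical stand-in (a linear reciprocal seed refined by Newton-Raphson steps) and is not TI's _divsp or divsp code, and max_rel_error_percent is an illustrative name. On the target, the function pointer would be swapped for the real fast divide.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fast single-precision divide (NOT TI's code):
 * approximate 1/|y| with a linear seed, refine it with Newton-Raphson,
 * then multiply by x. */
static float fast_div_standin(float x, float y)
{
    int e;
    int i;
    float m = frexpf(fabsf(y), &e);                 /* |y| = m * 2^e, m in [0.5, 1) */
    float r = 48.0f / 17.0f - (32.0f / 17.0f) * m;  /* seed for 1/m */

    for (i = 0; i < 3; ++i) {
        r = r * (2.0f - m * r);                     /* Newton-Raphson refinement */
    }
    r = ldexpf(r, -e);                              /* r is now about 1/|y| */
    return (y < 0.0f) ? -(x * r) : (x * r);
}

/* Largest relative error, in percent, of div_fn against a double-precision
 * quotient over n pseudo-random operand pairs in [-1000, 1000]. */
static double max_rel_error_percent(float (*div_fn)(float, float),
                                    unsigned n, unsigned seed)
{
    double worst = 0.0;
    unsigned i;

    assert(div_fn != 0 && n > 0);
    srand(seed);
    for (i = 0; i < n; ++i) {
        float x = (float)rand() / (float)RAND_MAX * 2000.0f - 1000.0f;
        float y = (float)rand() / (float)RAND_MAX * 2000.0f - 1000.0f;
        double ref;
        double err;

        if (fabsf(y) < 1e-3f) {
            continue;                               /* skip near-zero divisors */
        }
        ref = (double)x / (double)y;
        if (ref == 0.0) {
            continue;                               /* relative error undefined */
        }
        err = fabs(((double)div_fn(x, y) - ref) / ref) * 100.0;
        if (err > worst) {
            worst = err;
        }
    }
    return worst;
}
```

A call such as max_rel_error_percent(fast_div_standin, 100000u, 1u) returns the worst-case error in percent for that sample. The operand range here deliberately avoids zero, denormal, and very large divisors, so those special cases need to be tested separately.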

0 Clayton Gilmore over 15 years ago in reply to Gagan Maur

Prodigy 90 points

Thanks! That's exactly what I was looking for.

Clayton Gilmore

Software Engineer

Rockwell Collins, Inc.

0 Ian Guest49803 over 15 years ago in reply to Gagan Maur

Prodigy 180 points

Good day Gagan

I am struggling to get fastRTS to compile and run on the C6748. I have compiled version 1.02 with the command mk6x -mv6740 fastrts67x.src -l fastrts67x.lib. When I run the code, the sin function triggers an HWI_NMI and hangs.

Is it possible to get the new code that will run on the C6748 as indicated in the post?

I would also like to know the compiler/linker settings for inlining or vector invocation.

Thanks

Ian

0 Gagan Maur over 15 years ago in reply to Ian Guest49803

TI__Expert 8150 points

Ian, definitely. Please drop a note to the e-mail list indicated here:
http://processors.wiki.ti.com/index.php?title=Software_libraries#Developer_Mailing_List

We'll provide you with the software ASAP.

Thanks,
Gagan
