TMDSLCDK138: Doubts with DSPLIB

Part Number: TMDSLCDK138

Hi, I am using the TMDSLCDK138 for my application. So far, I managed to transfer samples from the codec to the external memory using EDMA3 and McASP and then send them back to the codec without doing any processing.

I would like to apply a symmetric FIR filter to the samples, and for that I had a look at DSPLIB. Specifically, I downloaded the package dsplib_c674x_3_4_0_0.

While it has a function for generic FIR filtering, there is no function for applying a symmetric FIR. Why? My guess is that the reason is related to the architecture and that there is no advantage in using this type of FIR implementation.

I am not using any OS or anything; it's all code from scratch, but I would like to optimize the DSP operations.

Sorry if this is not the appropriate forum, and thanks in advance.

  • Hi Fidel,

    I've forwarded your query to the software experts. Their feedback should be posted here.

    BR
    Tsvetolin Shulev
  • Fidel,

    Are you looking for a floating-point or fixed-point implementation of the symmetric FIR filters? There may be a fixed-point implementation of the FIR filter in the C64x+ DSPLIB. I am also checking internally if we can share something for your development effort.

    software-dl.ti.com/.../index_FDS.html

    The C674x is compatible with C64x+ architecture so the code can be re-used on the C674x DSP.

    Regards,
    Rahul
  • Fidel,

    My belief (FWIW) is that you are correct in thinking "with the architecture ... there is no advantage in using this type of FIR implementation". The computation advantage of a symmetric FIR comes from not having to load as many coefficients. But the C674x memory architecture was designed specifically for being able to keep the processing pipeline filled for multiply-accumulate operations, specifically the FIR.

    You could conceivably develop a symmetric FIR that has some available D-unit slots that could be used in some other way. But this would require zippering in another algorithm with the FIR code, and that is not something TI could try to develop. And it would not get the FIR result any faster, either.

    Regards,
    RandyP
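To make the comparison concrete, here is a minimal sketch in portable C (not DSPLIB code) of a direct-form FIR next to the folded form that a symmetric filter allows. The function names `fir_direct` and `fir_symmetric` are hypothetical, and the indexing convention (the output `r[j]` uses inputs `x[j]` through `x[j + nh - 1]`) is an assumption chosen for illustration. The folded loop halves the multiplies and coefficient reads, but it still reads every input sample and adds one extra addition per coefficient pair.

```c
#include <stddef.h>

/* Direct form: one multiply-accumulate per tap.
 * x must hold nr + nh - 1 samples (assumed convention). */
void fir_direct(const float *x, const float *h, float *r,
                size_t nh, size_t nr)
{
    for (size_t j = 0; j < nr; j++) {
        float sum = 0.0f;
        for (size_t i = 0; i < nh; i++)
            sum += x[j + i] * h[i];
        r[j] = sum;
    }
}

/* Folded form, valid only when h[i] == h[nh - 1 - i].
 * Sample pairs are added first, then multiplied by the shared
 * coefficient. Only the first half of h is read (plus the middle
 * tap when nh is odd). */
void fir_symmetric(const float *x, const float *h, float *r,
                   size_t nh, size_t nr)
{
    size_t half = nh / 2;
    for (size_t j = 0; j < nr; j++) {
        float sum = 0.0f;
        for (size_t i = 0; i < half; i++)
            sum += (x[j + i] + x[j + nh - 1 - i]) * h[i];
        if (nh & 1)                      /* unpaired middle tap */
            sum += x[j + half] * h[half];
        r[j] = sum;
    }
}
```

Both functions produce the same output for a symmetric `h`; whether the folded form is any faster depends on which resource (loads, multiplies, or additions) limits the loop on the target.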
  • Rahul Prabhu

    The 24-bit samples from the codec are saved in an Int32 buffer, which I later convert to floating point for the processing. After doing the processing, I round the floats to the nearest integer using _spint. I assume that will help to reduce the quantization noise.

    So in principle, I am looking for a floating point implementation.

    RandyP

    Either way, I have realized I need my own implementation using circular buffering. The document www.ti.com/.../spra645a.pdf presents an interesting concept and I think I am going to try to exploit it.

    My application requires a fractional delay and I use a structure with FIR differentiators to implement it, based on a letter by Soo-Chang Pei and Chien-Cheng Tseng. The differentiators need to be anti-symmetric with a zero in the middle and an odd number of coefficients. Therefore, I can't make use of dsplib.
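As an illustration of the coefficient structure described above (odd length, anti-symmetric, zero in the middle), here is a minimal portable C sketch. It is not the structure from the Pei and Tseng letter, only a single differentiator-style FIR; the name `fir_antisymmetric` and the indexing convention (`r[j]` uses `x[j]` through `x[j + nh - 1]`) are assumptions.

```c
#include <stddef.h>

/* FIR with nh odd, h[i] == -h[nh - 1 - i] and h[nh / 2] == 0.
 * Each coefficient multiplies the difference of a sample pair,
 * and the middle tap contributes nothing, so it is skipped.
 * x must hold nr + nh - 1 samples (assumed convention). */
void fir_antisymmetric(const float *x, const float *h, float *r,
                       size_t nh, size_t nr)
{
    size_t half = nh / 2;
    for (size_t j = 0; j < nr; j++) {
        float sum = 0.0f;
        for (size_t i = 0; i < half; i++)
            sum += (x[j + i] - x[j + nh - 1 - i]) * h[i];
        r[j] = sum;
    }
}
```

With `h = {1, 0, -1}` this reduces to a two-sample difference, which is a convenient sanity check.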

    What I think I can do is use circular addressing to implement the circular buffers efficiently and then write my own asm version of an asymmetrical FIR. That is quite some work, considering that I know very little about the internal architecture of the processor, but I will give it a try.
  • Fidel,

    You will notice that the Circular Buffering on TMS320C6000 application note was released in 2001, when the C6000 devices were first released. At that time, most DSP code was written in assembly, and most of our training was at the processor level, looking at every assembly instruction. Once the C compiler started working more efficiently, we quit teaching assembly coding and started teaching compiler optimization.

    The circular addressing modes are retained in the C6748 core for backward compatibility. But they are not recommended for anyone writing in C. If your whole application will be written in assembly and is only this task, you might not have too many problems. Otherwise, you will have to protect the circular addressing code from being interrupted by anything that uses C code, including TI-RTOS. And you will have to fully restore the circular addressing to standard addressing once the FIR has executed, and you are ready to allow C code to operate.

    I recommend against this, in favor of using C coding and DSPLIB to implement what you need to do. The generic FIR implementation will give you an efficient and much easier implementation. Use EDMA3 to move data efficiently to implement any buffer construction that you need.

    Those are only my own recommendations. You can assess the task ahead and find the best way to get it done. Please let us know how your project completes, and share what you have learned on the forum.

    And please click the Green Resolved button on your thread above, to show you are satisfied.

    Regards,
    RandyP
  • Hello Randy, thanks for your advice.

    I am currently writing a routine in assembly that performs two FIR filters at the same time, one with each register bank and functional unit block. I have spent the last two days optimizing the code, and now I use four accumulators and double-word loads for maximum efficiency. I have looked at the C code of the function DSPF_sp_fir_gen that comes with DSPLIB, and I suspect it does pretty much the same, except for a single FIR.
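The assembly routine itself is not shown in the thread. The following portable C sketch only illustrates the general idea of running two filters in one loop with four independent accumulators. It assumes both filters read the same input, that `nh` is even, and that `r[j]` uses `x[j]` through `x[j + nh - 1]`; the name `fir_dual` is hypothetical.

```c
#include <stddef.h>

/* Two FIR filters (coefficients ha and hb) over the same input.
 * Each filter has two accumulators, so four independent sums
 * are in flight and no addition waits on the previous one.
 * nh must be even; x must hold nr + nh - 1 samples. */
void fir_dual(const float *x, const float *ha, const float *hb,
              float *ra, float *rb, size_t nh, size_t nr)
{
    for (size_t j = 0; j < nr; j++) {
        float a0 = 0.0f, a1 = 0.0f;   /* accumulators, filter A */
        float b0 = 0.0f, b1 = 0.0f;   /* accumulators, filter B */
        for (size_t i = 0; i < nh; i += 2) {
            float x0 = x[j + i];      /* one pair of samples,   */
            float x1 = x[j + i + 1];  /* shared by both filters */
            a0 += x0 * ha[i];
            a1 += x1 * ha[i + 1];
            b0 += x0 * hb[i];
            b1 += x1 * hb[i + 1];
        }
        ra[j] = a0 + a1;
        rb[j] = b0 + b1;
    }
}
```

Reading two adjacent samples per iteration is the C-level counterpart of a double-word load; whether a compiler actually maps the two filters onto separate register banks is not something this sketch can guarantee.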

    About circular addressing and interrupts, I need to study the question a bit more, but here is what I had in mind before your post:

    1) Implement a circular buffer in C, but with assembly routines and circular addressing to move the pointer around instead of using C "if" statements to check the bounds. I assumed that would be more efficient. Of course, I need to disable interrupts and change the addressing mode every time I enter the assembly code section, and do the reverse when I exit.

    2) Use my assembly "double FIR" to apply several FIRs to the samples, reading from the memory region of the circular buffer using circular addressing. What I said about interrupts and addressing mode changes also applies here, of course.
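For comparison with the two steps above, a circular buffer can also avoid the bounds-checking "if" in plain C, without touching the addressing mode, when its length is a power of two. This is a minimal sketch of that alternative, not the approach from the application note; `CB_SIZE`, `circ_buf`, `cb_push` and `cb_get` are hypothetical names.

```c
#define CB_SIZE 8u                 /* must be a power of two */
#define CB_MASK (CB_SIZE - 1u)

typedef struct {
    float    data[CB_SIZE];
    unsigned head;                 /* index of the next write */
} circ_buf;

/* Store a sample and advance the write index. The AND wraps the
 * index back to 0 after CB_SIZE - 1, so no comparison is needed. */
static inline void cb_push(circ_buf *cb, float v)
{
    cb->data[cb->head] = v;
    cb->head = (cb->head + 1u) & CB_MASK;
}

/* Read the sample written `age` pushes ago (0 = newest).
 * Unsigned arithmetic wraps modulo 2^N, so the mask still
 * yields the right index when the subtraction underflows. */
static inline float cb_get(const circ_buf *cb, unsigned age)
{
    return cb->data[(cb->head - 1u - age) & CB_MASK];
}
```

Because this uses ordinary addressing, it needs no interrupt protection of its own; the cost is the power-of-two constraint on the buffer length.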

    What exactly is the problem, other than the added complexity? Perhaps the whole process of changing memory addressing modes and disabling interrupts isn't worth it? I should say that I have only one interrupt, EDMA3 transfer complete, and since EDMA3 transfers a fixed-size buffer, I need to finish processing the samples before the next interrupt anyway.

    If you still recommend that I write everything in C, I will look at the code of the function DSPF_sp_fir_gen and make my own function from there, though I don't know to what degree the compiler will make full use of all the registers and units.

    By the way, I can confirm that there is no point in implementing a symmetric FIR. There is no advantage; in fact, it's worse. You can check it using loop unwinding.

    Regards,

    Fidel

  • Fidel,

    Since a post was marked Answered, this thread was closed. I saw an email of a post from you but I do not see that post here. I do not know if it was blocked. Did you delete it?

    If it was blocked, you may need to start a new thread or else reject the Answered flag. I am not sure of those restrictions one way or the other.

    Regards,
    RandyP
  • Hi,

    Yes, I deleted the post. It was an ASM function, but I realized it had a couple of errors, and the performance test was not done using L2 memory, so the results were not suitable for comparison with the DSPLIB benchmarks. Using L2 memory, the performance is awesome. Either way, my original question is more than solved at this point.

    Regards,

    Fidel.