This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

TMS320F28379D: Looking for reference paper or book of algorithm in RFFT_adc_f32.asm file.

Part Number: TMS320F28379D

Hello. 

I'm investigating RFFT DSO library for TMS320F28379.

I thought I understood the why the 2N real input RFFT is implemented for base algorithm of the MCU. 

But I couldn't understand the algorithm used in RFFT_adc_f32.asm file.  

For example, the code explanation,

;; 1) Bit reverse input data and calculate stages 1, 2 & 3:
;;
;;  In Buf (read in bit reverse order)                             Out Buf
;;  +----+                                                          +----+
;;  | I1 | (((I1 + I2) + (I3 + I4)) + ((I5 + I6) + (I7 + I8)))/8 -> | I1'|
;;  | I2 | ((I1 - I2) + COS*((I5 - I6) + (I8 - I7))          )/8 -> | I2'|
;;  | I3 | ((I1 + I2) - (I3 + I4)                            )/8 -> | I3'|
;;  | I4 | ((I1 - I2) - COS*((I5 - I6) + (I8 - I7))          )/8 -> | I4'|
;;  | I5 | (((I1 + I2) + (I3 + I4)) - ((I5 + I6) + (I7 + I8)))/8 -> | I5'|
;;  | I6 | (COS*((I8 - I7) - (I5 - I6)) - (I4 - I3)          )/8 -> | I6'|
;;  | I7 | ((I7 + I8) - (I5 + I6)                            )/8 -> | I7'|
;;  | I8 | (COS*((I8 - I7) - (I5 - I6)) + (I4 - I3)          )/8 -> | I8'|
;;     .
;;     .
;;    \|/
;; Repeat above FFTSize/8 (i.e. if FFTSize = 1024, Repeat = 128 times)

I thought this part exaplains 8 point DFT, but it couldn't match the result I expected(2N real DFT using N complex DFT). 

If there is a reference paper or book, please guide me to understand the code. 

I'm beginner to use DSP scheme with MCU. I was wondering if you please gently guide the way to find the goal. 

Thank you.

Best regards. 

  • Hello Miles,

    You can reference the FPU Software Library Guide below, although unfortunately I do not believe it offers much in terms of explanation of how these functions work in the backend (for that you will have to read the comments in the assembly files).

    I thought this part exaplains 8 point DFT, but it couldn't match the result I expected

    This looks to be only the first step to set up to do stages 1-3, whereas by the end of stages 4 and up you should get the results. I have not gone too in depth on the algorithm before, but if your application requires you to know this thoroughly I can try to find out the exact algorithm step-by-step.

    Best regards,

    Omer Amir

    FPU_SW_LIB_UG.pdf

  • Thank you for the reply.

    The reason I'm looking for the reference is the derivation URL link from "FPU_SW_LIB_UG.pdf" which is  "">www.engineeringproductivitytools.com/.../PT10.HTM", is no longer available to access. 

    I think it is the one of the key to understand the algorithm used in "FPU_SW_LIB_UG.pdf"

    Is there any way to find the contents of the link? 

    If it so, I might understand more easily. 

    Also Thank you and please to hear this comment. 

    I have not gone too in depth on the algorithm before, but if your application requires you to know this thoroughly I can try to find out the exact algorithm step-by-step.

    Thank you. 

  • Hello Miles,

    I was able to use the Wayback Machine to find an archived version of the webpage: https://web.archive.org/web/20180312110051/http://www.engineeringproductivitytools.com/stuff/T0001/PT10.HTM

    Let me know if you still require me to break down the algorithm in the rfft_adc_f32 function.

    Best regards,

    Omer Amir

  • I'm sorry. 

    The link you recovered doesn't explain about the algorithm I think. 

    It is just theoretical matters explained in spra291.pdf. 

    Please break down the algorithm used in the rfft_adc_f32 function. 

    Especially the code for 1/2/3 stage I mentioned and so on. 

    Thank you. 

    Best regards, 

  • Hello Miles,

    I'll go through the algorithm and try to come up with a sufficient explanation. Please allow some delay, as it may take some time to also contact some experts who maintain the software.

    Best regards,

    Omer Amir

  • Hello Miles,

    I was not able to get in contact with any of the experts who were involved in writing the algorithm in the DSP library, but from what I looked over in the assembly, the FFT algorithm seems to be following the one mentioned on this site:

    https://www.dspguide.com/ch12/2.htm#:~:text=The%20FFT%20operates%20by%20decomposing,into%20a%20single%20frequency%20spectrum.

    Was there a particular reason you needed to know each step of the algorithm itself? If you could tell me more details, such as you need to know the timing or memory requirements, I might be able to provide a better answer.

    Best regards,

    Omer Amir

  • Thank you for the kind reply. 

    First of all, goal to achieve is performing the RFFT in real-time using SDFM result with TMS320F28379D. 

    Some ot the experts in this forum recommended that it is good way to start from RFFT example. 

    After I have investigated the examples, all the example uses the RFFT_adc_f32.asm file as a building block for the FFT library. 

    So I thought without understanding the code including theory based on FFT, I couldn't manage it. 

    Unforunately, I'm just a beginner at assembly language, so that I couldn't understand the assembly code. 

    Thererfore, I would like to start from theoritical equation and then I finally can match the assembly code with equation. 

    But anyway, if it is not able to find any of the experts, I could start from the base.

    Was there a particular reason you needed to know each step of the algorithm itself? If you could tell me more details, such as you need to know the timing or memory requirements, I might be able to provide a better answer.

    1. My ADC results data is from the SDFM. so the data format is "signed 16-bit integer". 

    -> But I think the input format for the  RFFT_adc_f32.asm is "unsigned 12-bit integer" 

    2. Some of the remarks said that input data must be aligned. But I couldn't understant the difference between algned and unaligned data

    3. Input data must be bit-reversed. so it might be performed at the starting point of the "_rfft_adc_f32_Stages1and2and3andBitReverse:" 

    _rfft_adc_f32_Stages1and2and3andBitReverse:
    ;----------------------------------------------------------------------
    ;     Save all save-on-entry registers used
    ;----------------------------------------------------------------------
            PUSH     XAR1
            PUSH     XAR2
            PUSH     XAR3
            MOV32    *SP++,R4H
            MOV32    *SP++,R5H
            MOV32    *SP++,R6H
            MOV32    *SP++,R7H
            ADDB     SP,#14h
    
            MOVL     XAR2,*+XAR4[0]         ; &Inbuf
            MOVL     XAR5,*+XAR4[2]         ; &Outbuf
            SUBB     XAR5, #2
            MOVL     XAR4, XAR5
            MOVL     XAR5,*+XAR4[2]         ; &Outbuf
            MOVL     XAR3,*+XAR4[2]
            ADDB     XAR3,#8                ; &Outbuf[4]
            MOVL     XAR7,*+XAR4[4]         ; &CosSinbuf
    
            MOV      AR0,#0Ah
            MOV      AH,*+XAR4[AR0]         ; FFT SIZE
            LSR      AH,1                   ; FFT SIZE/2 - for 16-bit input data
            MOV      AR0,AH
    ;       LSR      AH,3
            LSR      AH,2                   ; for 16-bit input data
            SUBB     AH,#1                  ; (Size / 8) - 1
            MOVL     XAR1,#0000h            ; index if memory is not aligned
    
            RPTB    _rfft_32_Last, AH

    From the line 16 to 19, what is the purpose ot the "SUBB     XAR5, #2"?

    Thank you. 

  • Hello Miles,

    After I have investigated the examples, all the example uses the RFFT_adc_f32.asm file as a building block for the FFT library.

    Do you mean to say that all examples use the rfft_adc_f32 function as a building block for the FFT library? Because this is not the case, there are many examples that use different versions of the FFT function (complex, real, windowed, ADC input, etc.), this is just the only one that converts a fixed-point input to a floating-point output

    1. My ADC results data is from the SDFM. so the data format is "signed 16-bit integer". 

    -> But I think the input format for the  RFFT_adc_f32.asm is "unsigned 12-bit integer"

    The rfft_adc_f32 function is built to process input data from a 12-bit ADC, so I believe you would need to either convert it into this format or convert your data into floating-point. Although technically the FFT handle uses a uint16_t pointer, this is likely only because there is no uint12_t datatype. It's not immediately obvious when I look through the source code that it would not accept a 16-bit input, but there should be no harm in testing this out on your device.

    2. Some of the remarks said that input data must be aligned. But I couldn't understant the difference between algned and unaligned data

    This essentially refers to the requirement by some FFT functions that utilize some space optimizations to speed up the FFT itself. To align your data input, you need to allocate it on a boundary in RAM or Flash that is appropriate for the size of data your are processing. For example, if you are processing 1024 16-bit elements in an array, the array and its elements must be allocated at a memory location such as 0xFC00 or 0x5800, where essentially the last 10 bits (2^10=1024) are 0s. This would be a 1K boundary. For a floating-point input of size 32-bits per element, this would need to be a 2K boundary, and so forth. I believe for memory aligned functions, any arrays must be memory aligned to be used properly.

    However, you can forgo using the memory aligned versions of the functions if you wish, simply use the functions with a 'u' at the end of its name.

    3. Input data must be bit-reversed. so it might be performed at the starting point of the "_rfft_adc_f32_Stages1and2and3andBitReverse:"
    From the line 16 to 19, what is the purpose ot the "SUBB     XAR5, #2"

    This is done within the FFT function itself, there is nothing you need to do. This note is simply a description of what is happening in the source file. You are not required to call it yourself. If you still want me to try and figure out the answer, let me know.

    You can look at how the examples utilize the function to see what you need to initialize and what functions to call beforehand. If you have any questions about steps done in the example, I can help clarify.

    Best regards,

    Omer Amir

  • Thank you for the reply. 

    I've been searching for the algorithm, I think it is very similar to Discrete Hartley Transform and Fast Hartley Transform. 

    I'll investigate more and make another thread for further questions. 

    Thank you. 

    Best regards. 

  • Your welcome Miles. I'll mark this thread as resolved for now, but feel free to reply if you have further questions.

    Best regards,

    Omer Amir