Scaling issue of RFFT and IRFFT

Keyur Acharya

Expert 1740 points


Hi all,

I am calculating Real FFT and Real IFFT with the help of VCU2.

Device: TMS320F28377D

There are separate examples given for the FFT and the IFFT.

What I am doing is:

Step 1) Take a real signal in the time domain

Step 2) Calculate the RFFT with VCU instructions

Step 3) Calculate the IRFFT

Details:

Step 2)

CFFT.init(handleCFFT);
CFFT.run(handleCFFT);
CFFT_unpack(handleCFFT);

Step 3)

CFFT_pack(handleCFFT);
CFFT.run(handleCFFT);
CFFT_conjugate(handleCFFT->pOutBuffer, handleCFFT->nSamples);

But there is an issue with the scaling: the output comes out scaled down by a factor of approximately 200.

What should I do?

over 9 years ago

0 Vishal_Coelho over 9 years ago

TI__Mastermind 20850 points

Hi,

If you look at the pack function, you will see these two lines at the top

    VSETSHR   #17                                ; VSTATUS.SHIFTR = RIGHT_SHIFT, scale by 4
    VSETSHL   #15                                ; VSTATUS.SHIFTL = LEFT_SHIFT

Basically, it's the first line that sets the overall scaling to /4. This algorithm was written for the power line communications application, where this scaling was needed to prevent overflow. You could change the VSETSHR operand to #16 or #15 and see whether that reduces the scaling issue without causing overflows.
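To make the effect of the shift value concrete, here is a minimal fixed-point sketch in C. It is not TI code and does not model the VCU exactly: it assumes that a Q15 x Q15 product is right-shifted by the SHIFTR amount, so that a shift of 15 keeps Q15 scaling and each extra bit halves the result. That assumption agrees with the "#17, scale by 4" comment above but is not confirmed in this thread. The function name `q15_mul_shr` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not the VCU implementation: multiply two Q15
 * values and right-shift the 32-bit product by `shr` bits.
 * shr = 15 keeps Q15 scaling; each extra bit divides the result by 2.
 * No saturation is applied, and an arithmetic right shift is assumed
 * for negative products. */
static int16_t q15_mul_shr(int16_t a, int16_t b, unsigned shr)
{
    assert(shr >= 15 && shr <= 30);
    int32_t product = (int32_t)a * (int32_t)b;   /* Q30 product */
    return (int16_t)(product >> shr);
}
```

With both inputs at 0.5 in Q15 (16384), a shift of 15 gives 0.25 (8192), a shift of 16 gives half of that, and a shift of 17 gives a quarter of it. Under this model, moving from #17 to #15 would raise the output by a factor of 4 and reduce the overflow headroom by the same factor.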

The forward FFT divides by 2 at each stage. You can see that from the #1 at the end of each of the VCFFT instructions, for example:

VCFFT9    VR5, VR4, VR3, VR2, VR1, VR0, #1

By changing that to #0 you don't divide by 2. You can adjust the scaling this way, but you would have to know the range of the input and the amount of scaling required to prevent overflow.

0 Keyur Acharya over 9 years ago in reply to Vishal_Coelho

Expert 1740 points

Hi Vishal,

Thanks for the help.

Vishal_Coelho said:

    VSETSHR   #17                                ; VSTATUS.SHIFTR = RIGHT_SHIFT, scale by 4
    VSETSHL   #15                                ; VSTATUS.SHIFTL = LEFT_SHIFT

I have checked this suggestion, and the output scaling changes with the value of VSETSHR.

But for the second method:

Vishal_Coelho said:
VCFFT9 VR5, VR4, VR3, VR2, VR1, VR0, #1

I tried changing #1 to #0 in every location where VCFFT is used, but I don't see any change in the scaling.

_CFFT_run128Pt_stages1and2Combined:

;; local defines
S12_NBFLY           .set (NSAMPLES / (2*2))      ; Number of 2x2 butterflies
S12_LOOP_COUNT      .set S12_NBFLY - 2           ; Stage 1/2 loop count

    MOVZ      AR0,  *+XAR4[ARG_NSAMPLES]         ; AR0 := bit-reversed index 1
    MOVL      XAR2, *+XAR4[ARG_INBUFFER]         ; XAR2 -> input buffer
    MOVL      XAR1, *+XAR4[ARG_OUTBUFFER]        ; XAR1 -> output buffer

    .lp_amode                                    ; override assembler mode to C28x + C2xLP syntax
    SETC      AMODE                              ; set AMODE to C2xLP addressing

    NOP       *,ARP2                             ; ARP -> XAR2
    VMOV32    VR0, *BR0++                        ; VR0 := *(AR2 bradd AR0++) | VR0 := I0:R0
    VMOV32    VR1, *BR0++                        ; VR1 := *(AR2 bradd AR0++) | VR1 := I1:R1
    VCFFT7    VR1, VR0, #0                       ; VR2 = I2:R2 <- XAR1
 || VMOV32    VR2, *BR0++                        ;[VR0H:VR0L] := [R0 - R1:R0 + R1] := [VR0L - VR1L:VR0L + VR1L]
                                                 ;[VR1H:VR1L] := [I0 - I1:I0 + I1] := [VR0H - VR1H:VR0H + VR1H]

    VMOV32    VR3, *BR0++                        ; VR3 := I3:R3 <- XAR1
    VCFFT8    VR3, VR2, #0                       ;[VR2H:VR2L] := [R2 - R3:R2 + R3] := [VR2L - VR3L:VR2L + VR3L]
                                                 ;[VR3H:VR3L] := [I2 - I3:I2 + I3] := [VR2H - VR3H:VR2H + VR3H]

    VCFFT9    VR5, VR4, VR3, VR2, VR1, VR0, #0   ;[VR4H:VR4L] := [I0':R0'] := [(I0+I1) + (I2+I3):(R0+R1) + (R2+R3)] := [VR1L + VR3L:VR0L + VR2L]
                                                 ;[VR5H:VR5L] := [I2':R2'] := [(I0+I1) - (I2+I3):(R0+R1) - (R2+R3)] := [VR1L - VR3L:VR0L - VR2L]

    .align    2                                  ; align at 32-bit boundary to remove penalty
    RPTB      _CFFT_run128Pt_stages1and2CombinedLoop, #S12_LOOP_COUNT

    VCFFT10   VR7, VR6, VR3, VR2, VR1, VR0, #0   ; VR0 := I0:R0 <- *(AR2 bradd AR0++)
 || VMOV32    VR0, *BR0++                        ;[VR6H:VR6L] := [I1':R1'] := [(I0-I1) - (R2-R3):(R0-R1) + (I2-I3)] := [VR1H - VR2H:VR0H + VR3H]
                                                 ;[VR7H:VR7L] := [I3':R3'] := [(I0-I1) + (R2-R3):(R0-R1) - (I2-I3)] := [VR1H + VR2H:VR0H - VR3H]

    VMOV32    VR1, *BR0++                        ; VR1 := I1:R1 <- *(AR2 bradd AR0++)
    VCFFT7    VR1, VR0, #0                       ; VR2 := I2:R2 <- *(AR2 bradd AR0++)
 || VMOV32    VR2, *BR0++                        ;[VR0H:VR0L] := [R0 - R1:R0 + R1] := [VR0L - VR1L:VR0L + VR1L]
                                                 ;[VR1H:VR1L] := [I0 - I1:I0 + I1] := [VR0H - VR1H:VR0H + VR1H]

    VMOV32    VR3, *BR0++                        ; VR3 := I3:R3 <- *(AR2 bradd AR0++)
    VCFFT8    VR3, VR2, #0                       ; Save I0':R0' -> XAR1
 || VMOV32    *XAR1++, VR4                       ;[VR2H:VR2L] := [R2 - R3:R2 + R3] := [VR2L - VR3L:VR2L + VR3L]
                                                 ;[VR3H:VR3L] := [I2 - I3:I2 + I3] := [VR2H - VR3H:VR2H + VR3H]

    VMOV32    *XAR1++, VR6                       ; Save I1':R1' -> XAR1
    VCFFT9    VR5, VR4, VR3, VR2, VR1, VR0, #0   ; Save I2':R2' -> XAR1
 || VMOV32    *XAR1++, VR5                       ;[VR4H:VR4L] := [I0':R0'] := [(I0+I1) + (I2+I3):(R0+R1) + (R2+R3)] := [VR1L + VR3L:VR0L + VR2L]
                                                 ;[VR5H:VR5L] := [I2':R2'] := [(I0+I1) - (I2+I3):(R0+R1) - (R2+R3)] := [VR1L - VR3L:VR0L - VR2L]

    VMOV32    *++, VR7, ARP2                     ; Save I3':R3' -> XAR1 | ARP -> XAR2
    ;VMOV32    *XAR1++, VR7, ARP2                ; Save I3':R3' -> XAR1 | ARP -> XAR2
                                                 ;this form causes ARP to be XAR1 not XAR2

_CFFT_run128Pt_stages1and2CombinedLoop:

    VCFFT10   VR7, VR6, VR3, VR2, VR1, VR0, #0   ;[VR6H:VR6L] := [I1':R1'] := [(I0-I1) - (R2-R3):(R0-R1) + (I2-I3)] := [VR1H - VR2H:VR0H + VR3H]
                                                 ;[VR7H:VR7L] := [I3':R3'] := [(I0-I1) + (R2-R3):(R0-R1) - (I2-I3)] := [VR1H + VR2H:VR0H - VR3H]

    VMOV32    *XAR1++, VR4                       ; Save I0':R0' -> XAR1
    VMOV32    *XAR1++, VR6                       ; Save I1':R1' -> XAR1
    VMOV32    *XAR1++, VR5                       ; Save I2':R2' -> XAR1
    VMOV32    *XAR1++, VR7                       ; Save I3':R3' -> XAR1

_CFFT_run128Pt_stages1and2CombinedEnd:
    .c28_amode                                   ; change the assembler mode back to C28x
    CLRC      AMODE                              ; set AMODE back to C28x addressing

Here is an example of one stage of the FFT calculation that I modified.

Can you please check whether I am doing something wrong?

If you want, I can give you the complete asm file.

Thanks,

0 Vishal_Coelho over 9 years ago in reply to Keyur Acharya

TI__Mastermind 20850 points

Try cleaning the library project before building it.

0 Keyur Acharya over 9 years ago in reply to Vishal_Coelho

Expert 1740 points

Dear Vishal,
I have tried that, but the issue is still not resolved.
Regards,
K3y4r

0 Vishal_Coelho over 9 years ago in reply to Keyur Acharya

TI__Mastermind 20850 points

Ok. Let me get some details. What is the size of the RFFT you are running?

0 Keyur Acharya over 9 years ago in reply to Vishal_Coelho

Expert 1740 points

Some details
Number of points: 128
Controller: TMS320F28377D

Regards,
Keyur

0 Vishal_Coelho over 9 years ago in reply to Keyur Acharya

TI__Mastermind 20850 points

OK, a 128-point real FFT actually treats the 128-point real data as 64 complex points, runs the 64-point CFFT, and then "unpacks" the spectrum.

You need to edit the 64-point CFFT; it looks like you were editing the 128-point CFFT.

0 Keyur Acharya over 9 years ago in reply to Vishal_Coelho

Expert 1740 points

Hi Vishal,
I made a mistake in my last answer.
I am calling the 128-point CFFT function from inside the rfft_256 project,
so the number of samples is 256.

Regards,
Keyur

0 Vishal_Coelho over 9 years ago in reply to Keyur Acharya

TI__Mastermind 20850 points

Ah OK, never mind. For the inverse RFFT, you call pack and then the ICFFT routine. You can only change the scaling in pack; the ICFFT routines don't divide by 2 at each stage, i.e. all of their VCFFT instructions have a #0 operand at the end.

0 Keyur Acharya over 9 years ago in reply to Vishal_Coelho

Expert 1740 points

Dear Vishal,
I have tried the change you suggested, but it is not working for me :(
Can you send me the .asm routines or a library in which no scaling is done at each stage?

My need is simple: the input and output should be on the same scale.

Regards,
Keyur
