Information about sincos function from C28x FPU fastRTS Library and sin cos functions using TMU

user4565873

Expert 1250 points

Other Parts Discussed in Thread: TMS320F28335, CONTROLSUITE

Hello all,

I'm looking for documentation on the sincos function from the C28x FPU fastRTS Library, to be used on the TMS320F28335.

Is the source code available?

I would like to know how it works: table, function... Is it necessary for the angle to be between 0 and 2*Pi?

And the same question for the sine and cosine functions of the TMU module, to be used on the TMS320F28377X.

Any benchmark or documentation would be highly appreciated.

Thank you!

PA Nicoletti.

over 8 years ago

0 user4565873 over 8 years ago

Expert 1250 points

Hello all,

Can somebody help me with the topic above?

I need information on the sincos function on the TMS320F28335 (C28x FPU fastRTS Library), and on the sine/cosine functions of the TMS320F28377D using the TMU.

Is there any technical data on those?

Thank you,
PA Nicoletti.

0 Disona over 8 years ago in reply to user4565873

Genius 3050 points

Hello

I've found source code for FAST RTS in control suite (C:\ti\controlSUITE\libs\math\FPUfastRTS\V100\source):

_sincos:                             ; On entry: R0H = Radian, XAR4 = PtrSin, XAR5 = PtrCos
        MOVIZ     R1H,#0x42A2        ; R1H = 512/(2*pi) = 512/6.28318531 = 81.4873309
        MOVXI     R1H,#0xF983

        MPYF32    R0H, R0H, R1H      ; R0H = Radian * 512/(2*pi)
     || MOV32     *SP++, R4H         ; store R4H on the stack

        MOVIZ     R2H,#0x3C49        ; R2H = (2*pi)/512 = 6.28318531/512
        MOVXI     R2H,#0x0FDB        ;     = 0x3C490FDB or 0.012271846644531
        F32TOI32  R1H, R0H           ; R1H = int(Radian * 512/(2*pi))
        MOVL      XAR6,#_FPUcosTable
        MOVL      XAR7,#_FPUsinTable
        MOV32     ACC, R1H           ; ACC = int(Radian *512/(2*pi))
        AND       @AL,#0x1FF
        LSL       AL,#1
        MOVZ      AR0,@AL            ; AR0 = Index into "sin/cos" table = k
        FRACF32   R0H, R0H           ; R0H = fract(Radian*512/(2*pi))
        MOVIZ     R1H,#0x3E2A        ; R1H = 0.166667 (0x3E2AAAAB)
        MOVXI     R1H,#0xAAAB

        MPYF32    R0H, R0H, R2H      ; R0H = x = fract(Radian*512/(2*pi)) * (2*pi)/512
     || MOV32     R3H, *+XAR6[AR0]   ; R3H = C(k)

        MPYF32    R2H, R1H, R3H      ; R2H = 0.166667*C(k)
     || MOV32     R4H, *+XAR7[AR0]   ; R4H = S(k)

        MPYF32    R1H, R1H, R4H      ; R1H = 0.166667*S(k)
        MPYF32    R2H, R0H, R2H      ; R2H = x*0.166667*C(k)

        MPYF32    R4H, -0.5, R4H     ; R4H = -0.5*S(k)
        MPYF32    R3H, -0.5, R3H     ; R3H = -0.5*C(k)

        MPYF32    R1H, R0H, R1H      ; R1H = x*0.166667*S(k)
     || SUBF32    R2H, R4H, R2H      ; R2H = -0.5*S(k) - x*0.166667*C(k)

        MOV32     R4H, *+XAR6[AR0]   ; R4H = C(k)

        ADDF32    R1H, R3H, R1H      ; R1H = -0.5*C(k) + x*0.166667*S(k)
     || MOV32     R3H, *+XAR7[AR0]   ; R3H = S(k)

        MPYF32    R2H, R0H, R2H      ; R2H = x*(-0.5*S(k) - x*0.166667*C(k))
        MPYF32    R1H, R0H, R1H      ; R1H = x*(-0.5*C(k) + x*0.166667*S(k))
        ADDF32    R2H, R4H, R2H      ; R2H = C(k) + x*(-0.5*S(k) - x*0.166667*C(k))
        SUBF32    R1H, R1H, R3H      ; R1H = -S(k) + x*(-0.5*C(k) + x*0.166667*S(k))

        MPYF32    R2H, R0H, R2H      ; R2H = x*(C(k) + x*(-0.5*S(k) - x*0.166667*C(k)))
     || MOV32     R4H, *+XAR7[AR0]   ; R4H = S(k)

        MPYF32    R1H, R0H, R1H      ; R1H = x*(-S(k) + x*(-0.5*C(k) + x*0.166667*S(k)))
     || MOV32     R3H, *+XAR6[AR0]   ; R3H = C(k)

        ADDF32    R2H, R4H, R2H      ; R2H = S(k) + x*(C(k) + x*(-0.5*S(k) - x*0.166667*C(k)))
     || MOV32     R4H, *--SP         ; Restore R4H from the stack

        ADDF32    R1H, R3H, R1H      ; R1H = C(k) + x*(-S(k) + x*(-0.5*C(k) + x*0.166667*S(k)))

        MOV32     *XAR4, R2H         ; *PtrSin = R2H
        MOV32     *XAR5, R1H         ; *PtrCos = R1H

        LRETR

0 Lori Heustess over 8 years ago in reply to Disona

TI__Guru* 91125 points

I'm glad you found the source code. In addition to the comments, there is documentation in the following directory:
C:\ti\controlSUITE\libs\math\FPUfastRTS\V100\doc

Regards,
Lori

0 Disona over 8 years ago

Genius 3050 points

Regarding the TMU: sine and cosine are each calculated with a single instruction, SINPUF32 and COSPUF32.

For use in C code there are compiler intrinsics, for example "sinA = __sin(angle);". The function "__sin()" is not declared in any header; it is built right into the compiler. My test showed the TMU to be about 2 times faster than the FPU fast supplement library. The code below runs the "ipark" function 10 times. In one case the function is implemented with the FPU supplement, in the other case with the TMU. Cycles: 966 (FPU) against 438 (TMU).

        timePrev = CpuTimer2Regs.TIM.all;
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
            iparkf.calc(&iparkf);
        timePost = CpuTimer2Regs.TIM.all;
        timeFPU = timePrev - timePost;

        timePrev = CpuTimer2Regs.TIM.all;
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
            iparkf.tmuCalc(&iparkf);
        timePost = CpuTimer2Regs.TIM.all;
        timeTMU = timePrev - timePost;

Functions are executed in RAM (but called from FLASH).
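
As a small worked example of the arithmetic behind those numbers, the sketch below computes elapsed ticks from a down-counting timer (which is why the code above takes timePrev - timePost) and the average per call. The helper names are hypothetical, and the wraparound handling assumes a 32-bit counter that reloads from 0xFFFFFFFF; the figures also include the timer reads and call overhead, so they are only approximate.

```c
#include <stdint.h>

/* Elapsed ticks for a 32-bit down-counting timer: start value minus end value.
 * Unsigned arithmetic also covers a single wrap, assuming the period is 0xFFFFFFFF. */
static uint32_t elapsed_down_counter(uint32_t start, uint32_t end)
{
    return (uint32_t)(start - end);
}

/* Average cycles per call for a block of back-to-back calls. */
static float cycles_per_call(uint32_t total_cycles, uint32_t calls)
{
    return (float)total_cycles / (float)calls;
}
```

With the totals quoted above, cycles_per_call(966, 10) is about 96.6 and cycles_per_call(438, 10) is about 43.8, a ratio of roughly 2.2.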

void Tiparkf_Calc (Tiparkf* p){
	float sin_a;
	float cos_a;
	sincos(p->angle, &sin_a, &cos_a);
    p->ds = p->de * cos_a - p->qe * sin_a;
    p->qs = p->de * sin_a + p->qe * cos_a;
}



void Tiparkf_TMU_Calc (Tiparkf* p){
    float sin_a = __sin(p->angle);
    float cos_a = __cos(p->angle);

    p->ds = p->de * cos_a - p->qe * sin_a;
    p->qs = p->de * sin_a + p->qe * cos_a;
}

!!! P.S.: it looks like "sincos()" from the FPU.h library doesn't work; it gives wrong results. I suspect this is because of the tables. I need to investigate further.

Yes, this is confirmed in this thread:

F28377D FPU Tables - C2000 microcontrollers forum - TI E2E support forums (e2e.ti.com)

We are porting F28335 code to a F28377D that makes use of the FPU Fast RTS library and associated FPU tables in the Boot ROM. I can't seem to find any reference

TI says that the FPU tables were removed from the boot ROM, so sincos() from FPU.h no longer works as it did before. If you want to use it, you have to load the tables yourself. This was done because the MCU has the TMU, which is faster.

0 user4565873 over 8 years ago in reply to Disona

Expert 1250 points

Thank you very much Disona,

I'm of course using the TMU on the 28377, exactly for the Park computation and so on.

I needed the source code to check that sincos performs the modulo 2Pi itself, for a project on a TMS320F28335.

Since you are very good with the FPU, could you suggest the fastest way to perform a modulo 2Pi using the Fast RTS library?

I don't know whether "fmod" is part of the library.

This is very helpful,

Thank you.

PA .

0 user4565873 over 8 years ago in reply to Lori Heustess

Expert 1250 points

Thank you Lori,

I'll look in the doc for further answers.

Regards,
PA.

0 Disona over 8 years ago in reply to user4565873

Genius 3050 points

I'm glad I could help. To be honest, I'm not an expert in the FPU or the TMU. Everything in the previous post was discovered 15 minutes before writing it =)
About the function "fmod()", the documentation says: the standard "fmod" uses division, and if you add the Fast RTS library with its fast division, then "fmod" will use that fast division. So I think you can use this library together with the standard fmod. BUT, since the TMU can perform division with one instruction, it's better not to use the fast library in that case. Instead, the compiler suggests switching the FPU mode from "strict" to "relaxed". This should let the compiler insert TMU instructions where possible. So fmod MAY end up using the TMU fast division.

But I guess there's another way.

For example, an angle a = 154.213 radians is the same as 3.4165 radians (modulo 2Pi). In per-unit (PU) that is 3.4165/2Pi = 0.543. Since the TMU sin/cos instructions work in PU, that comes in handy.
But we can get the same result in a different order: 154.213 / 2Pi = 24.543 PU. Now take it "modulo 1" and you get 0.543, the same result. The TMU has a 3-cycle instruction to divide a number by 2Pi; it also has a 3-cycle instruction to multiply a number by 2Pi. The FPU has an instruction to take the fractional part.

So maybe we can use something like the code below. But I really doubt that it will be faster than "fmod". Also, all of this is just my speculation, and it must be verified.

float angle = 22.15f;       // Initial angle, radians
angle = __div2pif32(angle); // Convert to PU
angle = __fracf32(angle);   // Take fractional part (FPU FRACF32)
angle = __mpy2pif32(angle); // Convert back to radians