TMS320F28035: IQmath sine/cosine run time twice as long as documented

Part Number: TMS320F28035
Other Parts Discussed in Thread: CONTROLSUITE, C2000WARE

Hi

I'd like to calculate a sine and a cosine value with the IQmath library on an F28035.

But when I measure the run time of my calculation, it takes twice as long as the documentation states.

    // optimization 4, Speed/Size 4: 365 ticks (running from RAM)
    _iq29 iq29PosRad = _IQ29mpyI32(IQ29_INC_TO_RAD, u16Pos);                //  4
    _iq29 iq29Sin = _IQ29sin(iq29PosRad);                                   // 46
    _iq29 iq29Cos = _IQ29cos(iq29PosRad);                                   // 44
    _iq29 iq29Sinus = _IQ29mpy(iq29Sin, _IQ29(fAmpl));                      //  6
    _iq29 iq29Cosinus = _IQ29mpy(iq29Cos, _IQ29(fAmpl));                    //  6
    _iq29 iq29PwmA = _IQ29mpyI32int(iq29Sinus + _IQ29(1), PWM_PERIOD/2);    // 22
    _iq29 iq29PwmB = _IQ29mpyI32int(iq29Cosinus + _IQ29(1), PWM_PERIOD/2);  // 22 => total 150 cycles
    set_pwm(iq29PwmA, iq29PwmB);

I measured the runtime with a timer and with a GPIO.

    GpioDataRegs.GPBTOGGLE.bit.GPIO34 = 1;

    u32Start = CpuTimer0.RegsAddr->TIM.all;

    calc_sincos(fAmplTest, u16PosTest);

    u32Laufzeit = u32Start - CpuTimer0.RegsAddr->TIM.all;

    if(u32MaxLaufzeit < u32Laufzeit)
        u32MaxLaufzeit = u32Laufzeit;

    GpioDataRegs.GPBTOGGLE.bit.GPIO34 = 1;

The code runs inside the PWM ISR, which is copied into RAM for execution.
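
The measurement above subtracts the second timer read from the first because CpuTimer0 counts down. A minimal host-side sketch of that arithmetic follows. It assumes the counter reloads from its period register after reaching zero, so one full period is `period + 1` ticks, and that at most one reload happens between the two reads. `elapsed_ticks` and `ticks_to_ns` are hypothetical helper names, not part of the TI headers.

```c
#include <stdint.h>

/* Elapsed ticks between two reads of a down-counting timer.
 * Assumption: the counter runs from `period` down to 0 and then reloads,
 * and at most one reload happened between the two reads. */
uint32_t elapsed_ticks(uint32_t start, uint32_t now, uint32_t period)
{
    if (now <= start) {
        return start - now;               /* no reload in between */
    }
    return start + (period + 1u) - now;   /* counter reloaded once */
}

/* Convert ticks to nanoseconds for a given timer clock in MHz. */
uint32_t ticks_to_ns(uint32_t ticks, uint32_t clock_mhz)
{
    return (uint32_t)(((uint64_t)ticks * 1000u) / clock_mhz);
}
```

With the full 32-bit period (0xFFFFFFFF), the plain unsigned subtraction in the original code already wraps correctly, so the explicit branch only matters for shorter periods. Assuming a 60 MHz timer clock, `ticks_to_ns(365, 60)` gives 6083 ns.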

Kind regards

Rene

  • Rene,

    The most likely explanation is the IQmath routines are running from flash.  You should easily be able to verify where these functions are by looking at the .map file for your project. If you scroll right to the bottom of that file you should see the sine & cosine function symbols allocated to addresses in the ROM. If they're not there, look for them higher up and identify which type of memory they are physically in.  You can check the addresses against the memory map on p.50 of the F28035 datasheet.

    Could you try this and let me know what you find?

    If they're in flash addresses, you have two alternatives.  You can either load the IQ functions into RAM in the linker command file, or make use of the built-in IQ functions in ROM. 

    The F28035 has sine & cosine routines in internal ROM, which is zero wait-state. You can make use of these by linking the boot ROM symbol library into your project. See p.17-18 of the IQmath user's guide for more information on how to do this.  Unfortunately, looking at the ROM files in C2000Ware and controlSUITE, I can't find the IQmath symbol file, so I need to ask a colleague about this.  I'll follow up on this post when I have more information.

    Regards,

    Richard

  • Rene,

    Please find attached the .lib file described in the IQmath documentation which you should be able to link into your project.  It should have been in the C2000Ware download.  We'll add it to the next release.

    Hopefully this will solve your problem.  Please let us know how you get along.

    Regards,

    Richard

    2803x_IQmath_BootROMSymbols.zip

  • Hi Richard

    Thanks for the answer.

    I've already linked that lib into my project.
    (from here C:\ti\c2000\C2000Ware_1_00_03_00\libraries\math\IQmath\c28\examples\bootROM_symbols)

    And when I look at the map file, I find this:

    abs 003ff3f3 __IQ29cos
    0 003f6145 __IQ29mpyI32int
    abs 003ff32e __IQ29sin

    Regards
    Rene
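
Checking map-file addresses against the memory map can be sketched as a small range test. The address ranges below are assumptions (flash at 0x3E8000 to 0x3F7FFF, boot ROM at 0x3FE000 to 0x3FFFFF) and should be verified against the memory map in the F28035 datasheet. `classify_address` and `mem_region_t` are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef enum { MEM_FLASH, MEM_BOOT_ROM, MEM_OTHER } mem_region_t;

/* Assumed F28035 address ranges; verify against the datasheet memory map. */
#define FLASH_START    0x3E8000ul
#define FLASH_END      0x3F7FFFul
#define BOOT_ROM_START 0x3FE000ul
#define BOOT_ROM_END   0x3FFFFFul

/* Return the memory region that a map-file address falls into. */
mem_region_t classify_address(uint32_t addr)
{
    if (addr >= FLASH_START && addr <= FLASH_END) {
        return MEM_FLASH;
    }
    if (addr >= BOOT_ROM_START && addr <= BOOT_ROM_END) {
        return MEM_BOOT_ROM;
    }
    return MEM_OTHER;
}
```

Under these assumed ranges, 0x3ff32e (`__IQ29sin`) and 0x3ff3f3 (`__IQ29cos`) would fall in boot ROM, while 0x3f6145 (`__IQ29mpyI32int`) would fall in flash.
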
  • Hi Rene,

    Ok, good. I see you are also using a CPU timer to measure the time (u32Laufzeit) - is that where the 365 ticks comes from in the comment? Does that measurement accord with the I/O pin measurement on the scope?

    I'm wondering if the device clock is not correct.

    Regards,

    Richard
  • Yes. The timer measures 365 ticks (u32Laufzeit), and with an oscilloscope I measure 6.232 us (376 ticks @ 60 MHz) at GPIO34.

    I used the FLASH example (Example_28035_Flash) as the starting point for the test and enabled only PWM interrupt 1.

    (C:\ti\controlSUITE\device_support\f2803x\v130\DSP2803x_examples_ccsv5\flash_f28035)

  • Rene,

    Can you paste the IQmath allocation from your linker command file into a post so I can see how it is linked please?

    I am seeing 176 cycles for the above code, which accounting for function calling is about what I'd expect. Also, can you let me know how "fAmpl" is defined please?

    Regards,

    Richard
  • From the map file:

     IQTABLES              003fe000   00000b50  00000000  00000b50  RWIX

     IQTABLES2             003feb50   0000008c  00000000  0000008c  RWIX

     IQTABLES3             003febdc   000000aa  00000000  000000aa  RWIX

    And from the linker command file:

       /* Allocate IQ math areas: */
       IQmath              : > FLASHA      PAGE = 0            /* Math Code */
       IQmathTables        : > IQTABLES,   PAGE = 0, TYPE = NOLOAD
    
      /* Uncomment the section below if calling the IQNexp() or IQexp()
          functions from the IQMath.lib library in order to utilize the
          relevant IQ Math table in Boot ROM (This saves space and Boot ROM
          is 1 wait-state). If this section is not uncommented, IQmathTables2
          will be loaded into other memory (SARAM, Flash, etc.) and will take
          up space, but 0 wait-state is possible.
       */
       /*
       IQmathTables2    : > IQTABLES2, PAGE = 0, TYPE = NOLOAD
       {
    
                  IQmath.lib<IQNexpTable.obj> (IQmathTablesRam)
    
       }
       */
        /* Uncomment the section below if calling the IQNasin() or IQasin()
           functions from the IQMath.lib library in order to utilize the
           relevant IQ Math table in Boot ROM (This saves space and Boot ROM
           is 1 wait-state). If this section is not uncommented, IQmathTables2
           will be loaded into other memory (SARAM, Flash, etc.) and will take
           up space, but 0 wait-state is possible.
        */
        /*
        IQmathTables3    : > IQTABLES3, PAGE = 0, TYPE = NOLOAD
        {
    
                   IQmath.lib<IQNasinTable.obj> (IQmathTablesRam)
    
        }
        */

    And the variables are defined as globals outside the interrupt routine.

    They are unchanged for the test and still zero.

    uint16_t u16PosTest = 0;
    float fAmplTest = 0;

    Map file:

    00008c04     230 (00008c00)     _u16PosTest

    00008c0e     230 (00008c00)     _fAmplTest

  • The linker command file

    F28035.zip

  • Is "u16PosTest" the same as "u16Pos"?
  • Rene,

    What I'm seeing is the lines containing the float to IQ format conversions are where the cycles are going. The _IQ29(fAmpl) routine takes 136 cycles on my machine, which is a lot of the difference between your expected and actual counts. I think the optimizer is helping by re-using this result in the two adjacent code lines. There is also some function calling overhead which is not accounted for in your cycle budget.

    Can you measure the execution of just this line?
    _iq29 v = _IQ29(fAmpl);

    Regards,

    Richard
  • Hi Richard

    You are right, _IQ29(...) takes a lot of time. I ran the test with this line, once with a float and once with a uint16 as the parameter, and once without this line.

    As you can see, it takes 158 ticks.

    Do you have any advice to fix or avoid that?

        // runtime with constant
        // optimization 4, Speed/Size 4: 207 ticks (running from RAM)
        _iq29 iq29PosRad = _IQ29mpyI32(IQ29_INC_TO_RAD, u16Pos);                //  4
        _iq29 iq29Sin = _IQ29sin(iq29PosRad);                                   // 46
        _iq29 iq29Cos = _IQ29cos(iq29PosRad);                                   // 44
        _iq29 x = 1073741824;
        _iq29 iq29Sinus = _IQ29mpy(iq29Sin, x);                                 //  6
        _iq29 iq29Cosinus = _IQ29mpy(iq29Cos, x);                               //  6
        _iq29 iq29PwmA = _IQ29mpyI32int(iq29Sinus + _IQ29(1), PWM_PERIOD/2);    // 22
        _iq29 iq29PwmB = _IQ29mpyI32int(iq29Cosinus + _IQ29(1), PWM_PERIOD/2);  // 22 => total 150 cycles
        set_pwm(iq29PwmA, iq29PwmB);
    
        // runtime with float
        // optimization 4, Speed/Size 4: 365 ticks (running from RAM)
        _iq29 iq29PosRad = _IQ29mpyI32(IQ29_INC_TO_RAD, u16Pos);                //  4
        _iq29 iq29Sin = _IQ29sin(iq29PosRad);                                   // 46
        _iq29 iq29Cos = _IQ29cos(iq29PosRad);                                   // 44
        _iq29 x = _IQ29(fAmpl);
        _iq29 iq29Sinus = _IQ29mpy(iq29Sin, x);                                 //  6
        _iq29 iq29Cosinus = _IQ29mpy(iq29Cos, x);                               //  6
        _iq29 iq29PwmA = _IQ29mpyI32int(iq29Sinus + _IQ29(1), PWM_PERIOD/2);    // 22
        _iq29 iq29PwmB = _IQ29mpyI32int(iq29Cosinus + _IQ29(1), PWM_PERIOD/2);  // 22 => total 150 cycles
        set_pwm(iq29PwmA, iq29PwmB);
    
        // runtime with uint16
        // optimization 4, Speed/Size 4: 364 ticks (running from RAM)
        _iq29 iq29PosRad = _IQ29mpyI32(IQ29_INC_TO_RAD, u16Pos);                //  4
        _iq29 iq29Sin = _IQ29sin(iq29PosRad);                                   // 46
        _iq29 iq29Cos = _IQ29cos(iq29PosRad);                                   // 44
        _iq29 x = _IQ29(u16Pos);
        _iq29 iq29Sinus = _IQ29mpy(iq29Sin, x);                                 //  6
        _iq29 iq29Cosinus = _IQ29mpy(iq29Cos, x);                               //  6
        _iq29 iq29PwmA = _IQ29mpyI32int(iq29Sinus + _IQ29(1), PWM_PERIOD/2);    // 22
        _iq29 iq29PwmB = _IQ29mpyI32int(iq29Cosinus + _IQ29(1), PWM_PERIOD/2);  // 22 => total 150 cycles
        set_pwm(iq29PwmA, iq29PwmB);

    Thanks for your help

    Rene
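
The numbers in this post can be put side by side with a little arithmetic. The sketch below only uses figures quoted above: the per-call cycle counts from the code comments and the measured 207 and 365 ticks. The function names are hypothetical.

```c
/* Documented per-call cycle counts quoted in the code comments. */
static const unsigned int documented_cycles[] = { 4, 46, 44, 6, 6, 22, 22 };

/* Sum of the documented cycle counts. */
unsigned int documented_total(void)
{
    unsigned int sum = 0;
    unsigned int i;
    for (i = 0; i < sizeof documented_cycles / sizeof documented_cycles[0]; ++i) {
        sum += documented_cycles[i];
    }
    return sum;
}

/* Measured tick counts reported in the post. */
enum { TICKS_WITH_CONSTANT = 207, TICKS_WITH_FLOAT = 365 };

/* Ticks attributable to the run-time _IQ29(fAmpl) conversion. */
int conversion_cost(void)
{
    return TICKS_WITH_FLOAT - TICKS_WITH_CONSTANT;
}

/* Ticks left over once the documented budget is subtracted. */
int residual_overhead(void)
{
    return TICKS_WITH_CONSTANT - (int)documented_total();
}
```

This gives a documented total of 150, a conversion cost of 158, and a residual of 57 ticks. The residual presumably covers function call overhead, `set_pwm()` and the timer reads, none of which appear in the documented figures.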

  • Hi Rene,

    Can "fAmpl" be defined as _IQ24? That would avoid having to do the type conversion which is chewing up all those cycles.

    Alternatively, is it feasible to migrate to a floating-point device? The F28069 is architecturally very similar and would bring the benefit of an FPU32. That core has float type conversion operations built into the instruction set so data can be converted over in a couple of cycles.

    Regards,

    Richard
  • Correction: "Can "fAmpl" be defined as _IQ29?".
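
The suggestion above amounts to keeping the amplitude in IQ29 format so that the ISR never converts from float. A minimal host-side sketch of that pattern follows. On the device, the conversion would use `_IQ29()` and the multiply `_IQ29mpy()`; the helpers here (`float_to_iq29`, `iq29_mpy`, `set_amplitude`, `scale_in_isr`) are hypothetical stand-ins that emulate the fixed-point arithmetic with standard C types.

```c
#include <stdint.h>

typedef int32_t iq29_t;   /* hypothetical stand-in for _iq29 */

/* Float to IQ29: scale by 2^29. This is the step that is expensive on a
 * core without an FPU. Valid only for inputs in roughly [-4.0, 4.0). */
iq29_t float_to_iq29(float x)
{
    return (iq29_t)(x * 536870912.0f);
}

/* IQ29 multiply emulated with a 64-bit intermediate product.
 * Assumes an arithmetic right shift for negative values. */
iq29_t iq29_mpy(iq29_t a, iq29_t b)
{
    return (iq29_t)(((int64_t)a * (int64_t)b) >> 29);
}

static iq29_t g_ampl_iq29;   /* cached amplitude, already in IQ29 */

/* Call from background code whenever the amplitude changes. */
void set_amplitude(float ampl)
{
    g_ampl_iq29 = float_to_iq29(ampl);
}

/* What the ISR would do: integer-only scaling of a sine value. */
iq29_t scale_in_isr(iq29_t sin_iq29)
{
    return iq29_mpy(sin_iq29, g_ampl_iq29);
}
```

The conversion then runs once per amplitude change instead of once per PWM interrupt. For reference, the constant 1073741824 used in the test above is 2^30, which represents 2.0 in IQ29.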