F28335 execution speed differs with/without JTAG

Other Parts Discussed in Thread: TMS320F28335

Hello,

I'm measuring the execution time of my CCS 3.3 non-BIOS application on my TMS320F28335-based device, and the result differs depending on whether JTAG is connected or not. What could be the reason for this?

Thanks,

Roman

  • Roman,

    You're saying that for the SAME code (I assume running from flash), code execution differs depending on whether you are running standalone, or have CCS connected and click the run button?

    Is the execution faster or slower with CCS/JTAG connected? Maybe the CCS .gel file is configuring the clock/PLL for you and you did not do this in your code. Take a look in the .gel you are using with your CCS connection and see if it is doing anything to configure the chip.

    - David
  • Hello David!

    Thank you for your answer.

    We are using the standard .gel file for this MCU, so I don't think it is the cause. We checked this by excluding the .gel file from the project; the result is the same.

    Execution is about 25% faster with JTAG connected. The effect remains if the connection with CCS is closed but the MCU is not power cycled. After a power cycle of the MCU, the effect disappears until CCS is connected again.

    We have also found that the effect disappears if we exclude one function from our code. The function computes a matrix inverse in floating point.

    Thanks  

    Roman

  • Roman,

    Let's think fundamentally here. It sounds like you have used CCS to flash a program into the device. You can power-cycle, and the code runs from flash. It takes a certain amount of time to execute. If you now connect with CCS (but don't do anything in terms of flash programming or program loading; this is important! Do you do anything other than just connect with CCS?) and click run, the code runs faster. I assume you have a GPIO pin toggling so you can measure execution speed? (Otherwise, how are you determining the faster execution speed?)

    Execution speed could be affected by: (1) flash wait-states, (2) PLL setting, (3) Code isn't really running correctly in one of the cases and you just think it is running faster.

    So, for (1) and (2), you can use CCS to examine the flash registers and PLL/clock setup registers.  Compare the two cases and look for differences.  Any differences would come back to the .gel file as the likely root, although you've already removed the .gel file and that doesn't seem to be the problem.  It is a quick effort to compare the registers however and rule these things out.
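
    The register comparison for (1) and (2) could be made repeatable with a small helper. The following is a minimal sketch in C, not code from this thread: the `SpeedConfig` struct, its field names, and `compare_speed_config` are hypothetical. It assumes the PLL and flash wait-state settings have been copied into one such struct for each case (standalone and with CCS connected); how the values are read from the device is left out.

    ```c
    #include <stdio.h>

    /* Hypothetical snapshot of the settings that influence execution speed.
       Each field is assumed to be filled from the matching device register. */
    typedef struct {
        unsigned pll_multiplier;   /* assumed: PLL multiplier setting */
        unsigned pll_divider;      /* assumed: clock divider setting */
        unsigned flash_page_wait;  /* assumed: flash paged-access wait states */
        unsigned flash_rand_wait;  /* assumed: flash random-access wait states */
        unsigned flash_pipeline;   /* assumed: flash pipeline enable (0 or 1) */
    } SpeedConfig;

    /* Prints one line if the two values differ; returns 1 if they differ. */
    int report_if_different(const char *name, unsigned a, unsigned b)
    {
        if (a != b) {
            printf("%s: standalone=%u, with JTAG=%u\n", name, a, b);
            return 1;
        }
        return 0;
    }

    /* Compares two snapshots field by field; returns the number of differences. */
    int compare_speed_config(const SpeedConfig *standalone,
                             const SpeedConfig *with_jtag)
    {
        int diffs = 0;
        diffs += report_if_different("pll_multiplier",
                                     standalone->pll_multiplier,
                                     with_jtag->pll_multiplier);
        diffs += report_if_different("pll_divider",
                                     standalone->pll_divider,
                                     with_jtag->pll_divider);
        diffs += report_if_different("flash_page_wait",
                                     standalone->flash_page_wait,
                                     with_jtag->flash_page_wait);
        diffs += report_if_different("flash_rand_wait",
                                     standalone->flash_rand_wait,
                                     with_jtag->flash_rand_wait);
        diffs += report_if_different("flash_pipeline",
                                     standalone->flash_pipeline,
                                     with_jtag->flash_pipeline);
        return diffs;
    }
    ```

    A return value of zero would suggest that neither the PLL nor the flash configuration explains the speed difference, which leaves case (3).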

    As for (3), your statement about removing a single function and the effect disappears, hmmmm.  This is suspicious.  Is there anything unusual about this function?  Does it execute from RAM?  What if you leave this function in the build, but gut it (remove the contents, leaving the function empty).  Does the effect disappear still?

    Regards,

    David

  • Hello David

    Thank you for your answer.

    We measure time with timer 0. We read the TIMER0TIM and TIMER0TIMH values at the start and the end of the program cycle and then calculate the difference. To make sure the result is correct, we also applied the GPIO toggling method yesterday, as you advised. The result is about the same: with JTAG connected and the program started from CCS, execution of one program cycle takes about 50% (55.2% ± 0.1%) of the cycle time.
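
    The timer arithmetic described above can be sketched as follows. This is a minimal illustration, assuming the two 16-bit reads form one 32-bit count, that the timer counts down and reloads from its period value on reaching zero, and that at most one reload occurs between the two readings. The function names are hypothetical, and the period value in the note below is an assumption, not a figure from this thread.

    ```c
    #include <stdint.h>

    /* Joins the high and low 16-bit halves of the counter into one 32-bit value. */
    uint32_t timer_join(uint16_t tim_high, uint16_t tim_low)
    {
        return ((uint32_t)tim_high << 16) | (uint32_t)tim_low;
    }

    /* Elapsed ticks between two readings of a down-counter.
       period is the assumed reload value; the counter runs period, period-1,
       ..., 1, 0 and then reloads. At most one reload may occur between the
       two readings. */
    uint32_t timer_elapsed(uint32_t start, uint32_t end, uint32_t period)
    {
        if (start >= end) {
            /* No reload: the counter simply ran down from start to end. */
            return start - end;
        }
        /* One reload: start ticks down to zero, one tick for the reload,
           then (period - end) ticks down from the period value to end. */
        return start + (period - end) + 1u;
    }
    ```

    For example, with an assumed period value of 149999, a start count of 100 and an end count of 149900 give 200 elapsed ticks, because the counter passed through zero once. Reading the two halves separately can also give an inconsistent pair if the low half rolls over between the reads, so a single 32-bit read is preferable where the device headers allow it.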

    After power cycling, execution time became unstable and could take up to 80% of the cycle (see the pictures).

    In the pictures above, P1 is the total cycle frequency (cycle time is 1 ms), P2 is the positive pulse width (the part of the cycle used for computations), and P3 is the negative pulse width (the idle time).

    So it looks like the code isn't really running correctly without CCS, and now we need to understand why.

    We measured the XCLKOUT frequency with and without JTAG and it is the same, so the PLL settings are not the cause of the problem. We think flash wait states do not affect it either, because in that case the execution time would become longer but would not become unstable.

    We know the effect disappears if we remove the matrix inversion function. This function is executed from RAM.

    This is the header of this function:

    #pragma CODE_SECTION(CalcInvMatr_dim3, "ramfuncs");

    // M2 = (M1)^-1
    void CalcInvMatr_dim3(float* pfM1, float* pfM2)
    {
        ……….
    }

    We also found that the effect disappears if we keep the matrix inversion function but replace some of its input data with zeros.

    My idea is that in some cases, with real input data, the matrix becomes ill-conditioned and conditions like division by zero may occur during the computation. Is there any exception processing in the 28335 FPU? What is the difference between the C28x and C27x modes?

     

    I have attached the .gel file used in the project, but I think it is the standard one: 6011.f28335.gel

    Thanks,

     Roman

  • Roman,

    >> My idea is that in some cases with real input data the matrix becomes ill-conditioned and during computation some division-by-zero-like conditions may occur.

    >> Is there any exception processing in 28335 FPU?

    There are underflow (LUF) and overflow (LVF) flags in the FPU.  But, software must utilize the flags.  I don't think the floating point routines typically check this.  It is left to the user to be certain the computation they are doing is valid.
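
    Since the flags are not checked automatically, one option is to validate the computation in software. The sketch below is a hypothetical guarded 3x3 inverse in portable C99. It is not the `CalcInvMatr_dim3` from this thread and it does not read the LUF/LVF flags themselves: it rejects the input when the determinant magnitude is below a caller-chosen threshold or when any result is not finite. The name `inv3_checked`, the row-major layout, and the threshold parameter are assumptions.

    ```c
    #include <math.h>   /* isfinite (C99) */

    /* Hypothetical guarded inverse: out = m^-1 for row-major 3x3 matrices.
       Returns 1 on success, 0 if |det| < min_abs_det, det is NaN,
       or any element of the result is not finite. */
    int inv3_checked(const float *m, float *out, float min_abs_det)
    {
        /* Cofactors of the first row, reused for the determinant. */
        float c00 = m[4] * m[8] - m[5] * m[7];
        float c01 = m[5] * m[6] - m[3] * m[8];
        float c02 = m[3] * m[7] - m[4] * m[6];
        float det = m[0] * c00 + m[1] * c01 + m[2] * c02;
        float abs_det = (det < 0.0f) ? -det : det;
        float inv_det;
        int i;

        /* Written as a negated >= so that a NaN determinant is rejected too. */
        if (!(abs_det >= min_abs_det)) {
            return 0;
        }
        inv_det = 1.0f / det;

        /* Adjugate (transposed cofactor matrix) scaled by 1/det. */
        out[0] = c00 * inv_det;
        out[1] = (m[2] * m[7] - m[1] * m[8]) * inv_det;
        out[2] = (m[1] * m[5] - m[2] * m[4]) * inv_det;
        out[3] = c01 * inv_det;
        out[4] = (m[0] * m[8] - m[2] * m[6]) * inv_det;
        out[5] = (m[2] * m[3] - m[0] * m[5]) * inv_det;
        out[6] = c02 * inv_det;
        out[7] = (m[1] * m[6] - m[0] * m[7]) * inv_det;
        out[8] = (m[0] * m[4] - m[1] * m[3]) * inv_det;

        /* Catch overflow to infinity or NaN in any element of the result. */
        for (i = 0; i < 9; ++i) {
            if (!isfinite(out[i])) {
                return 0;
            }
        }
        return 1;
    }
    ```

    A caller could count how often the function returns 0 with real input data. A suitable threshold depends on the scale of the data and would have to be chosen for the application.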

    >> What is the difference between C28x and C27x Modes?

    The C27x was an older CPU.  The C28x has a compatibility mode that allowed running C27x object code (OBJMODE bit in ST1 reg).  Don't touch it.  You want to be running in C28x mode (OBJMODE=1), which is where the boot ROM will leave the device before starting execution of your code.

    -----------------

    I'm not sure I agree with your theory of an ill-conditioned math computation causing the problem.  Why would it cause a problem only when running stand-alone, and not with JTAG connected?  If you had a math issue, the problem would occur regardless of standalone or JTAG/CCS operation.  Wouldn't you agree?

    Well, good luck.  It sounds like you're getting there, hopefully.

    Regards,

    David