Order of reads for _itoll intrinsic

I have found it very convenient to read TSCL and TSCH and put them into a 64-bit variable using the _itoll intrinsic. But I would like to know if an implied read order is a sure thing or not.

The prototype for _itoll is

long long _itoll (unsigned src2, unsigned src1);

This implies to me that src1 will be read first and then src2. Is that the defined behavior of this intrinsic?

My experimentation has found that with either Debug or Release configurations, I get that order. It allows me to use

long long llTime = _itoll ( TSCH, TSCL );

and I get the right order of reads. If they are read in the opposite order, I can get an incorrect value in TSCH. That is why I am asking whether this is defined and assured behavior.

Regards,
RandyP

  • No.  No particular ordering can be inferred for the arguments to an intrinsic function (or any other function).  To get the guaranteed order, you must have a "sequence point" between the two reads, as follows:

    unsigned int tscl_val = TSCL;
    unsigned int tsch_val = TSCH;
    long long llTime = _itoll(tsch_val, tscl_val);

    This only works because the built-in identifiers TSCL and TSCH are considered volatile.
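The ordering hazard can be reproduced on a host machine with a small model. This is only a sketch: `sim_counter`, `read_tscl`, `read_tsch`, and `itoll` are hypothetical stand-ins for the counter, TSCL, TSCH, and `_itoll`, and the model assumes that a read of the low half snapshots the high half. It is not TI code.

```c
#include <stdint.h>

/* Hypothetical host-side model of the 64-bit time stamp counter. */
static uint64_t sim_counter;    /* free-running count */
static uint32_t sim_tsch_snap;  /* high half captured when the low half is read */

/* Stand-in for reading TSCL: snapshots the high half, returns the low half. */
uint32_t read_tscl(void)
{
    uint32_t lo = (uint32_t)sim_counter;
    sim_tsch_snap = (uint32_t)(sim_counter >> 32);
    sim_counter += 1;  /* the counter keeps running after the read */
    return lo;
}

/* Stand-in for reading TSCH: returns the snapshot from the last TSCL read. */
uint32_t read_tsch(void)
{
    return sim_tsch_snap;
}

/* Stand-in for _itoll(hi, lo). */
uint64_t itoll(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Separate statements: the low read is complete before the high read starts. */
uint64_t read_timestamp(void)
{
    uint32_t lo = read_tscl();
    uint32_t hi = read_tsch();
    return itoll(hi, lo);
}

/* Opposite order: the high half comes from an older snapshot. */
uint64_t read_timestamp_wrong_order(void)
{
    uint32_t hi = read_tsch();
    uint32_t lo = read_tscl();
    return itoll(hi, lo);
}
```

In this model the wrong order only goes visibly wrong when the low half has wrapped since the previous low read, which is why casual testing may never show the problem.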

  • I liked the elegance of a single line, so I am trying to brainstorm for a clean and simple alternative.

    Would this also qualify as a sequence point?

    case 1 said:
    unsigned int tscl_val = TSCL;
    long long llTime = _itoll(TSCH, tscl_val);

    I tried both of the following, and they did not work in some cases, with or without optimization.

    case 2 said:
    long long llTime = (((long long)TSCH)<<16) + TSCL;

    case 3 said:
    long long llTime = TSCL + (((long long)TSCH)<<16);

    Is a series of separate statements the only way to create a sequence point?

    Can a compound statement result in evaluation as an expression? For example,

    case 4 said:
    #define TimerMacro {unsigned int tscl_val = TSCL;unsigned int tsch_val = TSCH;_itoll(tsch_val, tscl_val);}


    long long llTime = TimerMacro;

    Maybe the ";" would be a problem, but that might be solvable, too, if this would work. I am getting over my head in C syntax, though.

    I could try compiling and testing all of these, but the true expected compiler behavior is what really matters.

    Regards,
    RandyP

  • RandyP said:

    Would this also qualify as a sequence point?

    unsigned int tscl_val = TSCL;
    long long llTime = _itoll(TSCH, tscl_val);

    Yes.  Roughly speaking, sequence points occur at semicolons. 

    RandyP said:
    I tried both of the following, and they did not work in some cases, with or without optimization.

    long long llTime = (((long long)TSCH)<<16) + TSCL;

    case 3 said:
    long long llTime = TSCL + (((long long)TSCH)<<16);

    There is no sequence point in either of these until the very end, when it's too late.

    RandyP said:
    Is a series of separate statements the only way to create a sequence point?

    No, there are a few other places where sequence points occur, such as during a function call (but this is complicated; be sure you understand it before relying on it). See the Wikipedia entry on sequence points.
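One of those other places is the comma operator, which puts a sequence point between its left and right operands. Note that the commas separating function arguments are not comma operators and impose no order. The following is a minimal host-side sketch; `read_lo`, `read_hi`, `combine`, and the `trace` buffer are hypothetical helpers that stand in for the register reads and `_itoll` and record the order of the reads.

```c
#include <stdint.h>

/* Hypothetical trace buffer recording the order in which the reads happen. */
static char trace[8];
static int trace_len;

/* Stand-in for a low-half register read. */
uint32_t read_lo(void)
{
    trace[trace_len++] = 'L';
    return 0x89ABCDEFu;
}

/* Stand-in for a high-half register read. */
uint32_t read_hi(void)
{
    trace[trace_len++] = 'H';
    return 0x01234567u;
}

/* Stand-in for _itoll(hi, lo). */
uint64_t combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

uint64_t read_with_comma(void)
{
    uint32_t lo;

    trace_len = 0;
    /* The comma operator finishes "lo = read_lo()" before it starts
     * evaluating the call to combine(), so the low read always comes first. */
    return (lo = read_lo(), combine(read_hi(), lo));
}
```

Writing `combine(read_hi(), read_lo())` instead would leave the order of the two reads up to the compiler.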

    RandyP said:
    Can a compound statement result in evaluation as an expression? For example,

    #define TimerMacro {unsigned int tscl_val = TSCL;unsigned int tsch_val = TSCH;_itoll(tsch_val, tscl_val);}

    long long llTime = TimerMacro;

    Standard C does not allow statements where expressions are expected.  However, if you enable GCC mode, you can have "statement expressions" with almost exactly the syntax you have above:

    #define TimerMacro ({unsigned int tscl_val = TSCL; \
                         unsigned int tsch_val = TSCH; \
                         _itoll(tsch_val, tscl_val);})
    long long llTime = TimerMacro;

    You should consider writing this as an inline function instead, which will work even when GCC mode is not used:

    static __inline long long TimerMacro(void) 
    { unsigned int tscl_val = TSCL; 
      unsigned int tsch_val = TSCH; 
      return _itoll(tsch_val, tscl_val); }
    long long llTime = TimerMacro();
  • Archaeologist,

    For some reason, whenever the TSCL or TSCH registers are read in discrete statements that end with ';', there is an apparently unnecessary NOP added after each MVC.

    So I have settled on two choices that minimize the cycles for the routine: either a single line that uses the comma operator or a #define macro. It is odd that either method takes 4-6 cycles and 5-6 instructions; I have no idea why it varies or why some of the code is there. But these are the most consistent between Debug and Release, and the 4-6 cycle variation is in Release.

    Comma operator said:
    // this requires uTsclTemp to be declared as a local variable in the scope of this line
    llStartTime   = (uTsclTemp=TSCL,_itoll( TSCH, uTsclTemp ));

    GetTSC macro said:
    #define GetTSC(x) {unsigned int uTsclTemp=TSCL;(x)=_itoll( TSCH, uTsclTemp );}

    GetTSC( llStartTime );

    They seem to optimize down a little more if llStartTime is a local variable. I guess the compiler knows it does not need to save it into memory when it is local.

    Thanks for the pointers on sequence points. Let me know if you want to look at the optimization issues.

    Regards,
    RandyP

  • I'm not sufficiently familiar with TSCL, TSCH, and MVC to be able to diagnose the performance issues.  If you want anyone to look at it, you should submit a ClearQuest enhancement request.

  • Sorry for commenting such an old post but...

    Is this really a reliable way to read the TSC registers?

    Are you guaranteed NOT to be interrupted between these two reads of TSCL and TSCH?

    According to SPRUFE8B (TMS320C674x DSP CPU and Instruction Set) section 2.9.14.4 Reading the Counter, TI provides two assembler examples of reading the registers and avoiding being interrupted.

    Aren't we forced to use one of them, or to manually code something in C that disables interrupts?

  • Mads Lind Christiansen said:
    Are you guaranteed NOT to be interrupted between these two reads of TSCL and TSCH?

    No.

    Mads Lind Christiansen said:
    Aren't we forced to use one of them, or to manually code something in C that disables interrupts?

    To have the interrupts disabled, you have to take explicit steps.  Because I am interested in how the compiler handled this, I wrote this macro ...

    // Presumes "result" is of type "long long"
    #define READ_TIMER(result)                        \
       do                                             \
       {                                              \
          unsigned int isr_tmp;                       \
          unsigned int lo_tmp;                        \
                                                      \
          isr_tmp = _disable_interrupts();            \
          lo_tmp = TSCL;                              \
          result = _itoll(TSCH, lo_tmp);              \
          _restore_interrupts(isr_tmp);               \
       } while (0)
    

    The code generated for it is a few cycles shy of perfect.  But it is probably good enough for most situations.

    BTW, I would NOT use the DINT/RINT method shown in the CPU manual.  That re-enables interrupts no matter what.  Not always what you want.  The macro above restores interrupts to the previous setting.  That's what you want.
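The difference between restoring the previous setting and re-enabling unconditionally shows up when critical sections nest. The sketch below models it on a host machine. Everything in it is hypothetical: `sim_gie` stands in for the global interrupt enable bit, and the `sim_*` functions mimic the save-and-restore pattern of the macro above and an unconditional re-enable as described in this post. It does not touch real hardware.

```c
#include <stdbool.h>

/* Hypothetical model of the global interrupt enable bit. */
static bool sim_gie = true;

/* Disable interrupts and return the previous setting. */
unsigned int sim_disable_interrupts(void)
{
    unsigned int prev = sim_gie;
    sim_gie = false;
    return prev;
}

/* Put the enable bit back to a previously saved setting. */
void sim_restore_interrupts(unsigned int prev)
{
    sim_gie = (prev != 0);
}

/* Re-enable interrupts no matter what the setting was before. */
void sim_enable_interrupts(void)
{
    sim_gie = true;
}

/* Inner critical section that restores the caller's setting. */
void section_with_restore(void)
{
    unsigned int prev = sim_disable_interrupts();
    /* ... protected work, e.g. the two timer reads ... */
    sim_restore_interrupts(prev);
}

/* Inner critical section that always re-enables on exit. */
void section_with_enable(void)
{
    (void)sim_disable_interrupts();
    /* ... protected work ... */
    sim_enable_interrupts();
}

/* Runs inner() inside an outer critical section and reports whether
 * interrupts are still disabled when inner() returns. */
bool outer_still_protected(void (*inner)(void))
{
    sim_gie = true;
    unsigned int prev = sim_disable_interrupts();
    inner();
    bool still_disabled = !sim_gie;
    sim_restore_interrupts(prev);
    return still_disabled;
}
```

With `section_with_restore` the outer section stays protected; with `section_with_enable` interrupts come back on in the middle of the outer section.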

    Thanks and regards,

    -George

  • Mads,

    If you never read TSCL or call CLK_gethtime() from within an interrupt context, then it does not matter if an interrupt occurs between the reads of TSCL and TSCH. The correct value of TSCH is latched when TSCL is read, so TSCH could be read a long time later and would still be correct. This is true as long as no read of TSCL occurs between them.
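The latching behavior described above can be modeled on a host machine to see when an interrupt matters. This is a sketch under stated assumptions: `sim_counter`, `sim_read_tscl`, `sim_read_tsch`, and `sim_advance` are hypothetical names, and the model simply assumes that every TSCL read re-latches TSCH, as the paragraph describes.

```c
#include <stdint.h>

/* Hypothetical host-side model of the counter and the TSCH latch. */
static uint64_t sim_counter;
static uint32_t sim_tsch_snap;

/* Reading TSCL returns the low half and latches the high half. */
uint32_t sim_read_tscl(void)
{
    sim_tsch_snap = (uint32_t)(sim_counter >> 32);
    return (uint32_t)sim_counter;
}

/* Reading TSCH returns whatever the most recent TSCL read latched. */
uint32_t sim_read_tsch(void)
{
    return sim_tsch_snap;
}

/* Let some cycles pass, e.g. while an interrupt is being serviced. */
void sim_advance(uint64_t cycles)
{
    sim_counter += cycles;
}

/* Read TSCL, get delayed, optionally let an "ISR" read TSCL, then read TSCH. */
uint64_t sim_read_pair(uint64_t delay_between, int isr_reads_tscl)
{
    uint32_t lo = sim_read_tscl();
    sim_advance(delay_between);
    if (isr_reads_tscl) {
        (void)sim_read_tscl();  /* the ISR's own read replaces the latch */
    }
    uint32_t hi = sim_read_tsch();
    return ((uint64_t)hi << 32) | lo;
}
```

A plain delay between the two reads is harmless in this model. An intervening TSCL read gives a wrong result only if the low half wrapped in the meantime, which makes the failure rare and hard to reproduce.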

    Since these reads are used for benchmarking, it may be safe to expect that no benchmarking would be done in the ISRs. Your system design may vary.

    Regards,
    RandyP

  • Hi George

    Actually the reason for me inquiring was also to let others know that reading of TSCL and TSCH should be done in a protected context.
    As Randy replied, they are often used for benchmarking, although I could also find other uses for them in our DSP control code.

    So if you use them for more than benchmarking, leaving the reads unprotected implies that you

    * only read these registers in one ISR or thread

    * are sure that no one will ever use these registers somewhere else in your application (in another thread or ISR)

    * do not make use of any library that uses these registers (like the PWRM CPU load measurement or CLK_gethtime from DSP/BIOS)

    But many thanks for the snippet. 

    Best wishes,

    Mads

  • George Mock said:

    #define READ_TIMER(result)                        \
       do                                             \
       {                                              \
          unsigned int isr_tmp;                       \
          unsigned int lo_tmp;                        \
                                                      \
          isr_tmp = _disable_interrupts();            \
          lo_tmp = TSCL;                              \
          result = _itoll(lo_tmp, TSCH);              \
          _restore_interrupts(isr_tmp);               \
       } while (0)

    The _itoll arguments in the quoted macro are swapped. That line should be:

          result = _itoll(TSCH, lo_tmp);              \

    [Mod Ed: Thank you for catching this. The code has been corrected in the previous post.]