TMS320F28069: The processor halts after upgrading the compiler from TI v6.1.3 to TI v20.2.4.LTS

Part Number: TMS320F28069

Our project has aggressive interrupt timing. ADCINT1 and ADCINT2 each have a sampling period of 10us. The counter phase of ADCINT2 is set to 5us, which is half of the 10us period. Both interrupts point to the same interrupt routine, so that routine is entered every 5us. There are also other interrupt sources, such as HRCAP4_INT.

The project builds with compiler v6.1.3 without problems and works well. However, when it is built with compiler TI v20.2.4.LTS, the processor halts and the code jumps into TINT0_ISR().

The following is the question I submitted before; I am still having this issue.

e2e.ti.com/.../3695953

  • Andy,

    Compiler v6.1.3 dates back to March 2013. There are about 8 years worth of bug fixes, optimizations, and general evolution between the two compiler versions. I don't think that we would be able to easily pinpoint the exact change(s) that might be causing the difference in behavior. Sometimes a minor compiler bug can even mask an equally minor issue with the project configuration.

    If pinpointing the exact change is required, I recommend experimenting with some archived CGT versions to see if you can identify the first CGT version that produces a problem. Then it will be easier to reference the release notes and defect list for the specific differences.

    If the goal is to arrive at a working version of the project using an updated LTS CGT version, we can try to assist with debugging the issue as if it were a program under development.

    -Tommy

  • Hello Tommy,

    Thanks for your reply. We do not really need to pinpoint which compiler change causes the difference. What we need is help getting a working version of the project using the latest LTS CGT version. Do you know of anyone who has had a similar issue? Besides reducing the interrupt rate, what suggestions do you have for this issue?

    Andy

  • Andy,

    I am not aware of any reports for similar issues. I will try to assist with getting your project working again. Can you provide some more background information?

    Are you using the native C28x hardware-based interrupt handler or do you have a custom software-based interrupt handler?

    Is TINT0 intentionally enabled such that branching to TINT0_ISR() is expected? The ISR includes a for-loop trap so I can see why the CPU would be stuck there once it branches to it. It might help to inspect the PIECTRL register when stuck in the ISR to confirm the interrupt source and the vector table contents in case another interrupt is unintentionally pointing to TINT0_ISR().

    As far as the interrupt pacing goes, if the hardware could keep up with the interrupts in v6.x, there should be a way to keep up with v20.x. It might help to profile the interrupt servicing activity by toggling a GPIO when entering and exiting the ISR. If there is significant difference in the servicing latency or execution time between the CGT versions, it may point to a difference in pragma expectations or levels of optimization.

    -Tommy

  • Tommy,

    There are two ADC interrupts, ADCINT1 and ADCINT2, for sampling the AC voltages, and each has a time-base period (TBPRD) of 10us. ADCINT2 has a time-base phase of half the period, which is 5us. So these two interrupts are staggered by 5us.

    There is also HRCAP2_INT, which is triggered by the rising and falling edges of a pulse to measure the pulse width. If a pulse with a 30us period and a 10us width is fed into this HRCAP, the CPU gets stuck (no problem with compiler v6.1.3). This interrupt does some width calculation, so it takes some execution time.

    The project also includes ECAP2_INT, which is triggered by CTR_EQ_PRD and CTR_EQ_CMP. ECap2Regs.CAPx basically holds the pulse width and period mentioned above. I believe this interrupt routine exits quickly and I do not expect it to have much impact.

    Thanks for the suggestions to inspect the PIECTRL register and to profile the interrupt servicing activity by toggling a GPIO. I will gather that information and follow up.

    Thanks,

    Andy

  • Hi Andy, Tommy is out of office but should be able to reply by Tuesday next week.

  • Andy,

    It seems like the worst case alignment of interrupt triggers would potentially require 4 ISRs (ADCINT1, ADCINT2, HRCAP2_INT, and ECAP2_INT) to complete execution within 10us. So at 90MHz, this would be roughly 900 CPU cycles.

    If you reserve about 15% overhead (arbitrarily) for context switching and margin, it would be about 750 cycles. It should be fairly straightforward to measure the ISR execution cycles using the CPU timer to see if you are in the ballpark.

    Am I correct in assuming that the goal is to get the project back up and running with as little modification as possible? If so, I will try to limit my suggestions to this scope of work.

    -Tommy
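
The cycle-budget arithmetic above can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, the 90MHz clock, 10us window and 15% margin are the figures quoted in the post, and `elapsed_cycles` assumes a spare free-running 32-bit down-counter whose period register is set to 0xFFFFFFFF (the project's 1ms Timer0 would not qualify as-is).

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles available in a window of window_ns nanoseconds at sysclk_hz. */
static uint32_t cycles_in_window(uint32_t sysclk_hz, uint32_t window_ns)
{
    /* 64-bit intermediate avoids overflow of hz * ns. */
    return (uint32_t)(((uint64_t)sysclk_hz * window_ns) / 1000000000u);
}

/* Cycles left for ISR bodies after reserving margin_percent as overhead. */
static uint32_t isr_cycle_budget(uint32_t total_cycles, uint32_t margin_percent)
{
    assert(margin_percent <= 100u);
    return (total_cycles * (100u - margin_percent)) / 100u;
}

/* Elapsed cycles between two reads of a free-running 32-bit down-counter.
 * Assumes the counter reloads to 0xFFFFFFFF, so unsigned subtraction stays
 * correct across a single wrap. */
static uint32_t elapsed_cycles(uint32_t count_at_entry, uint32_t count_at_exit)
{
    return count_at_entry - count_at_exit;
}
```

With the numbers from the post, `cycles_in_window(90000000u, 10000u)` gives 900 and `isr_cycle_budget(900u, 15u)` gives 765, which is in line with the "about 750" estimate. Summing `elapsed_cycles` over the four ISRs and comparing against that budget would show whether the worst-case alignment fits.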

  • Hello Tommy,

    ADCINT1 and ADCINT2 keep triggering every 5us because they are staggered 5us apart.

    I would say the goal is to get a working version of the project using the latest compiler with as little modification as possible.

    Thanks,

    Andy

  • Hello Tommy,

    The PIECTRL register shows the value 0x0D27 when stuck in the ISR.

    I measured the execution time of ADCINTx by toggling a GPIO, on the project compiled with v6.1.3 and on the project compiled with TI v20.2.4.LTS. I didn't notice much time difference between the two. The timing varies between 2.x us and 4.x us for both builds. I have occasionally seen 6.x us.

    Thanks,

    Andy

  • The PIECTRL register shows the value 0x0D27 when stuck in the ISR.

    0x0D27 maps to the ILLEGAL operation interrupt, which is usually caused by an invalid instruction.

    You can look at the stack memory to determine the Return address for the ISR context save. This should point to the local memory section with the instruction that caused the ILLEGAL trap. I'm a little fuzzy as to whether it will point to the exact address of the instruction or if it is offset by a few words; it could very well be obvious like all 0s or Fs in memory where instructions are expected.

    I measured the execution time of ADCINTx by toggling a GPIO, on the project compiled with v6.1.3 and on the project compiled with TI v20.2.4.LTS. I didn't notice much time difference between the two. The timing varies between 2.x us and 4.x us for both builds. I have occasionally seen 6.x us.

    2-4 us seems like a considerable amount of variation for an ISR that is executing every 5us. Is the source of variation understood?
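
For readers wondering how 0x0D27 turns into "ILLEGAL", here is a small decoding sketch. It assumes the usual C28x PIECTRL layout (bit 0 is ENPIE, bits 15:1 hold the address of the fetched vector) and a PIE vector table starting at 0x0D00 with two 16-bit words per vector; the helper names are hypothetical and the layout should be checked against the device reference manual.

```c
#include <assert.h>
#include <stdint.h>

#define PIE_VECT_BASE 0x0D00u  /* assumed start of the PIE vector table */

/* Bit 0 of PIECTRL is the ENPIE flag. */
static unsigned pie_enabled(uint16_t piectrl)
{
    return piectrl & 1u;
}

/* Bits 15:1 give the address of the vector that was fetched. */
static uint16_t pie_vector_addr(uint16_t piectrl)
{
    return (uint16_t)(piectrl & 0xFFFEu);
}

/* Each vector is 32 bits (two words), so the ID is the word offset / 2. */
static unsigned pie_vector_id(uint16_t piectrl)
{
    uint16_t addr = pie_vector_addr(piectrl);
    assert(addr >= PIE_VECT_BASE);
    return (unsigned)(addr - PIE_VECT_BASE) / 2u;
}
```

For 0x0D27 this yields ENPIE = 1, vector address 0x0D26 and vector ID 19, the slot normally listed as ILLEGAL. Under the same assumptions, a genuine TINT0 (INT1.7) fetch would be expected to read back as 0x0D4D instead.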

  • Andy,

    Another place to check for ILLEGAL instructions would be at the RPC address.

    -Tommy

  • Hello Tommy,

    I checked the RPC address when the CPU is halted. It points to somewhere in the initialization function, which only executes at power-up.

    I also tried increasing the stack size from 1.2k to 4k, but it didn't help.

  • Andy,

    Can you check to see if the SP is still in a valid memory range when the CPU is halted? I just worked on another issue where nested interrupts ran out of control and caused the SP to overflow, which ultimately caused the CPU to fetch an illegal instruction.

    Can this issue be reproduced on a controlCARD or LaunchPad? If so, we may be able to provide some more hands-on assistance.

    -Tommy

  • The stack is assigned to the following location:

    RAMM0       : origin = 0x000050, length = 0x000500

     .stack              : > RAMM0,      PAGE = 1

    When the CPU is halted, SP points to 0x6E. I believe it still points to valid memory.

    I caught another CPU halt with PC = 0x00000000 and SP = 0x808. That doesn't seem normal to me.

    I tried setting a higher optimization level (opt_level=3 and opt_for_speed=5) when building a standalone project with compiler TI v20.2.4.LTS. It worked without the CPU halt. When building the project with compiler v6.1.3, the optimization level is opt_level=off and opt_for_speed=2, and it works without the CPU halt.
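
The two SP readings can be checked against the RAMM0 range with a tiny helper. This is a sketch under assumptions: it treats the whole RAMM0 region quoted above as the stack (the real `.stack` bounds should be taken from the .map file), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_ORIGIN 0x000050u  /* RAMM0 origin from the linker command file */
#define STACK_LENGTH 0x000500u  /* RAMM0 length from the linker command file */

/* True if sp lies inside [origin, origin + length). Anything at or past the
 * end is treated as suspect. */
static bool sp_in_stack(uint32_t sp)
{
    return sp >= STACK_ORIGIN && sp < (STACK_ORIGIN + STACK_LENGTH);
}

/* Words currently in use; the C28x stack grows toward higher addresses. */
static uint32_t stack_words_used(uint32_t sp)
{
    assert(sp_in_stack(sp));
    return sp - STACK_ORIGIN;
}
```

With these bounds the region ends at 0x550, so `sp_in_stack(0x6E)` is true (0x1E words in use), while `sp_in_stack(0x808)` is false: that SP is well past the end of RAMM0.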

  • Andy,

    I would agree that the SP at 0x808 doesn't seem normal. I think the most common scenario would be overflow. Can you clarify if you are using nested interrupts (like this)?

    The optimization level is an interesting experiment. I would generally expect opt_level=off to be the more reliable setting. I suppose that the higher optimization level would help if the problem is related to the ISRs executing too slowly. Do you see faster or more consistent ISR execution times with optimization?

    CCS allows for different optimization levels at the file level so it might be possible to experiment with opt_level=off on individual source files to see if you are able to identify which file is benefiting from optimization. If that works, you could then split out individual functions until you find the problem function.

    -Tommy

  • Hello Tommy,

    Yes, nested interrupts are used.

    With opt_level=3 and opt_for_speed=5, although the CPU halt does not happen, I found that some of the application's functionality breaks.

    I tried creating a separate file, moving HRCAP2_INT and ADCINTx into it, and setting opt_level=3 and opt_for_speed=5 on this file only. However, the CPU halt still happens.

    Andy

  • Andy,

    I think it would make sense to stay with opt_level=off for debug.

    Can you try logging the ISR nesting levels during execution?

    For example, declare a global array of volatile Uint16 where each array element acts as a counter for each ISR in the system. In the ISRs, increment the counter variable before EINT is executed (preemption enabled), and decrement the counter after DINT is executed (preemption disabled). With no active interrupts, the counters should all be 0.

    When you end up in a CPU halt condition, you can inspect the counter values to see if the nesting may have run away.

    -Tommy
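
One possible refinement of the counter idea, sketched here with hypothetical names: keep a peak value next to each live counter. The live counter unwinds as ISRs return, so it may read low by the time the debugger halts, whereas the peak records the deepest nesting ever reached. The array size of 3 is an assumption for illustration.

```c
#include <stdint.h>

#define NEST_NUM_ISR 3u  /* hypothetical: one slot per nesting-enabled ISR */

static volatile uint16_t nest_depth[NEST_NUM_ISR]; /* live nesting level */
static volatile uint16_t nest_peak[NEST_NUM_ISR];  /* deepest level seen */

/* Call with interrupts still disabled, just before EINT. */
static void nest_enter(uint16_t isr)
{
    uint16_t d = (uint16_t)(nest_depth[isr] + 1u);
    nest_depth[isr] = d;
    if (d > nest_peak[isr]) {
        nest_peak[isr] = d;
    }
}

/* Call just after DINT, before the ISR returns. */
static void nest_exit(uint16_t isr)
{
    nest_depth[isr] = (uint16_t)(nest_depth[isr] - 1u);
}
```

Because both calls run with interrupts disabled, the read-modify-write on each counter is not interrupted. After a halt, a peak much larger than 1 for an ISR that should never preempt itself would point at runaway nesting.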

  • Hello Tommy,

    I created an array called iArray to hold the counter in the ADC interrupt, which allows nested interrupts. The pseudocode is shown below.

    After the CPU halt, all values in the array are zero.

    __interrupt void SubSourceADCFastISR(void)
    {

    // local variables initialization here

       oldIER = IER;
       IER = M_INT12 | M_INT5;
       PieCtrlRegs.PIEACK.all = PIEACK_GROUP1;
       asm(" NOP"); 
       iArray[j]++;
       EINT;

       IER = oldIER;

    // Interrupt process here

       DINT;

       iArray[j]--;
       if(j<1000)
       {
          j++;
       }
       else
       {
          j=0;
       }

    }

  • Andy,

    I was thinking more along the lines of:

    #define IARR_ADC1_NDX   0
    #define IARR_ADC2_NDX   1
    #define IARR_HRCAP_NDX  2
    #define IARR_NUM_ISR    3

    volatile Uint16 iArray[IARR_NUM_ISR];

    __interrupt void ADC1_ISR(void)
    {
        // Code

        iArray[IARR_ADC1_NDX]++;
        EINT;

        // Code

        DINT;
        iArray[IARR_ADC1_NDX]--;

        // Code
    }

    __interrupt void ADC2_ISR(void)
    {
        // Code

        iArray[IARR_ADC2_NDX]++;
        EINT;

        // Code

        DINT;
        iArray[IARR_ADC2_NDX]--;

        // Code
    }

    -Tommy

  • Hello Tommy,

    I created the array for logging the ISR nesting levels as shown below. These two interrupts are the only ones that allow nested interrupts.

    The period of SubSourceADCFastISR is 5us and the period of Timer0_ISR is 1ms.

    I ran in debug mode and checked the values in the array once the program halted.

    iArray[0] = 1

    iArray[1] = 32

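    #define IARR_ADC1_NDX   0
    #define IARR_TIMER0_NDX 1
    #define IARR_NUM_ISR    2

    volatile Uint16 iArray[IARR_NUM_ISR];

    // 5us ADC interrupt (shared by ADCINT1 and ADCINT2)
    __interrupt void SubSourceADCFastISR(void)
    {

       //code

       iArray[IARR_ADC1_NDX]++;    // count this entry before nesting is allowed
       EINT;

       // code

       DINT;
       iArray[IARR_ADC1_NDX]--;    // entry finished, nesting disabled again

    }

    //1ms timer interrupt

    __interrupt void Timer0_ISR(void)
    {

       //code

       iArray[IARR_TIMER0_NDX]++;  // count this entry before nesting is allowed
       EINT;

       // code

       DINT;
       iArray[IARR_TIMER0_NDX]--;  // entry finished, nesting disabled again

    }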

  • Andy,

    A nesting level of 32 for the Timer0 interrupt seems high to me. That would at the very least take a big portion of your stack memory.

    Is the Timer0 interrupt allowed to preempt its own ISR?  If so, I would guess that at some point in time, the Timer0_ISR execution takes longer than 1ms to complete (especially with other ISRs preempting the execution) and subsequent Timer0 interrupts are spawning additional Timer0_ISR instances before the prior instances can complete.

    Is the timing and tick count of Timer0 critical?  It would be interesting to see if the situation improves by slowing down the Timer0 rate (maybe try 10ms), and/or by disabling new Timer0 interrupts between EINT to DINT in Timer0_ISR (either by stopping / resuming the Timer decrement, or through IER masking).

    -Tommy
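
A note on the masking option, with a sketch. IER works per PIE group, and on this device ADCINT1, ADCINT2 and TINT0 all sit in group 1, so clearing INT1 in IER would also block the ADC interrupts. A variation (not something proposed in the thread) is to mask only TINT0 at the PIEIER1 level while the Timer0 ISR has interrupts enabled. The code below is a host-side model: `PIEIER1_model` stands in for `PieCtrlRegs.PIEIER1.all`, the helper names are hypothetical, and the bit positions assume INT1.1 = ADCINT1, INT1.2 = ADCINT2 and INT1.7 = TINT0. The reference manual's rules for modifying PIEIER at run time should be checked before using this on the target.

```c
#include <stdint.h>

/* Host-side stand-in for PieCtrlRegs.PIEIER1.all (assumption for this sketch). */
static volatile uint16_t PIEIER1_model;

#define PIEIER1_ADCINT1 (1u << 0)  /* INT1.1 */
#define PIEIER1_ADCINT2 (1u << 1)  /* INT1.2 */
#define PIEIER1_TINT0   (1u << 6)  /* INT1.7 */

/* Before EINT in the Timer0 ISR: block TINT0 itself and leave the rest of
 * group 1 enabled. Returns the previous mask so it can be restored. */
static uint16_t timer0_block_self(void)
{
    uint16_t saved = PIEIER1_model;
    PIEIER1_model = (uint16_t)(saved & ~PIEIER1_TINT0);
    return saved;
}

/* After DINT in the Timer0 ISR: restore the group 1 enables. */
static void timer0_restore(uint16_t saved)
{
    PIEIER1_model = saved;
}
```

The timer itself keeps counting in this scheme, so the tick timing is not disturbed; a tick that arrives while masked should stay latched in the PIE flag register and be serviced once the mask is restored, although a second tick arriving in the same window would be lost.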

  • Hello Tommy,

    I tried stopping and resuming Timer0 in Timer0_ISR between EINT and DINT. It does help with this issue: there is no CPU halt. However, it causes another issue in our application. We run multiple F28069 devices, and Timer0 on all of them has to be synchronized, so stopping and resuming Timer0 breaks the synchronization.

    Also, we have to keep the Timer0 rate at 1ms.

    Andy 

  • Andy,

    The experiment is good supporting evidence that the Timer0_ISR execution time (and subsequent nesting) is a major contributor to the system crashes. Would you happen to know the Timer0_ISR nesting level for the v6.1.3 compiler?

    I suppose that you have two strategies to choose from to resolve the issue:

    1. Try to find the difference in code generation between compilers:
      • Are some variables / functions placed in Flash instead of RAM (the .MAP file can be useful)?
      • Are there dependency differences between compilers (ex: FPU library location or version)?
      • Is there a significant difference in generated ISR code (especially with Timer0_ISR)?
    2. Try to further optimize the system / code so that Timer0_ISR can complete within 1ms

    -Tommy

  • Hello Tommy,

    I compiled the code (5us ADCINT rate) with v6.2.11 and v6.4.0.

    Compiled with v6.2.11 (released Feb-05-2015), it runs without issue.

    Compiled with v6.4.0 (released Nov-19-2014), the CPU halts.

    Is there any critical difference between these two versions that is relevant to this issue?

    I am also not sure why v6.4.0 (Nov-19-2014) was released earlier than v6.2.11 (Feb-05-2015).

    Thanks,

    Andy

  • Andy,

    Can you double-check the compiler version numbers? I do not see exact matches for them...

    Did you mean v6.2.10 and v6.4.11?

    -Tommy

  • Hello Tommy,

    The compiler versions are v6.2.11 and v6.4.0. I downloaded them from

    http://software-dl.ti.com/codegen/non-esd/downloads/download_archive.htm

    They are the first entry under "6.2.x Release" and the last entry under "6.4.x Release".

    Thanks,

    Andy 

  • Andy,

    I'm glad that you were able to find that site. I was not sure if it was visible to the public.

    It looks like 6.4.x introduced a number of performance enhancements over 6.2.x so there was probably overlap between 6.4.x preliminary releases vs 6.2.x bugfix releases.

    Would you be able to profile the execution times of the ISRs without interrupt nesting to see if there is a noticeable difference between the two compilers?

    It will be much easier for the compiler team to help if you are able to narrow down the area of interest to a single function (or code snippet) that can be submitted as a compiler test case to the CCS Forum.

    -Tommy

  • Hello Tommy,

    Without interrupt nesting, the application won't work.

    For the code snippet, I have included three interrupt routines. Please let me know if that is good enough.

    Thanks,

    Andy

    // HRCAP2 INT, interrupted by a pulse which has 10us on time and 20us off time
    __interrupt void TimeMonitorPulse(void)
    {
    
    	if(HRCap2Regs.HCIFR.bit.RISE)
    	{
    		// Code
    	}
    	else if(HRCap2Regs.HCIFR.bit.FALL)
    	{
    		// Code
    	}
    	else if(HRCap2Regs.HCIFR.bit.COUNTEROVF)
    	{
    		// Code
    	}
    
    	// Code
    	
    	EALLOW;
    	HRCap2Regs.HCICLR.all = 0x1F;
    	PieCtrlRegs.PIEACK.all = PIEACK_GROUP4;
    	EDIS;
    
    }
    
    // 5us ADC INT
    __interrupt void SubSourceADCFastISR(void)
    {
    	DINT;
    
    	unsigned int oldIER;
    
    	oldIER = IER;
    	// allow XINT3 and HRCAP4 to interrupt
    	IER = M_INT12 | M_INT5;
    	PieCtrlRegs.PIEACK.all = PIEACK_GROUP1;
    	asm(" NOP"); //Wait for PIEACK to exit the pipeline
    	EINT;
    
    	if(1 == AdcRegs.ADCINTFLG.bit.ADCINT1)
    	{
    		// grab reading from AdcResult.ADCRESULTx
    		AdcRegs.ADCINTFLGCLR.bit.ADCINT1 = 1;		//Clear ADCINT1 flag reinitialize for next SOC
    	}
    	if(1 == AdcRegs.ADCINTFLG.bit.ADCINT2)
    	{
    		// grab reading from AdcResult.ADCRESULTx
    		AdcRegs.ADCINTFLGCLR.bit.ADCINT2 = 1;		//Clear ADCINT2 flag reinitialize for next SOC
    	}
    
    	// Code
    	
    	IER = oldIER;
    
    	DINT;
    }
    
    
    // 1ms Timer0 INT
    __interrupt void SubSourceSafetyAndChannelStateSlowISR()
    {
    	// manage interrupts to allow select ones to interrupt SlowISR
    	oldIER = IER;
    	// always allow INT1 (ADC) and INT11 (CLA) and HRCAP4 to interrupt and timer 1
    	IER = M_INT1 | M_INT5 | M_INT13;
    	// only allow INT4 (ECAP2) to interrupt if it's already enabled
    	if(oldIER & M_INT4)
    		IER |= M_INT4;
    	PieCtrlRegs.PIEACK.all = PIEACK_GROUP1;
    	asm(" NOP"); //Wait for PIEACK to exit the pipeline
    	EINT;
    
    
    	// Code
    
    	IER = oldIER;
    
    	DINT;
    }
    

  • Tommy,

    I attached two screenshots of the GPIO toggling in the 1ms Timer0 interrupt. The high time indicates the execution time of the interrupt. The first was captured while running the code compiled with v6.2.11. The second was captured while running the code compiled with v6.4.0, before the CPU halted. In the second screenshot, some of the high times are longer than 1ms.

    Thanks,

    Andy 

  • Andy,

    The compiler team will only be interested in specific differences in code generation, so whether the overall application works or not is irrelevant to them.

    I see that there is about 20us of time difference for the TIMER0 ISR execution between compilers. That should be good enough for a start.

    -Tommy

  • Hello Tommy,

    Would you still like me to provide a code snippet? If so, please tell me how to provide one that shows specific differences in code generation.

    Thanks,

    Andy

  • Andy,

    At this point, you will want to engage with the compiler team directly on the CCS E2E Forum.

    They will most likely ask for you to submit a compiler test case using these instructions.

    -Tommy

    PS. I am out of the office until 8/16.

  • Andy,

    Yes, please start a new question in the CCS forum and let them know that you are seeing different execution times between the compiler versions. They will be able to assist with debug.

    It's strange that you are not able to see the instructions. I am attaching a PDF capture.

    -Tommy