TMS320F28069: Processor halts after upgrading the compiler from TI v6.1.3 to TI v20.2.4.LTS

Andy Fung

Part Number: TMS320F28069

We have aggressive interrupt timing in the project. ADCINT1 and ADCINT2 each have a sampling period of 10us. The counter phase of ADCINT2 is set to 5us, which is half of the 10us period. Both interrupts point to the same interrupt routine, so that routine runs every 5us. There are also other interrupt sources, such as HRCAP4_INT.

The project builds with compiler v6.1.3 without problems and works well. However, when it is built with compiler TI v20.2.4.LTS, the processor halts and the code jumps into TINT0_ISR().

The following is the question I submitted before and I am still having this issue.

e2e.ti.com/.../3695953

over 2 years ago

0 tlee over 2 years ago

TI__Guru 62975 points

Andy,

Compiler v6.1.3 dates back to March 2013. There are about 8 years worth of bug fixes, optimizations, and general evolution between the two compiler versions. I don't think that we would be able to easily pinpoint the exact change(s) that might be causing the difference in behavior. Sometimes a minor compiler bug can even mask an equally minor issue with the project configuration.

If pinpointing the exact change is required, I recommend experimenting with some archived CGT versions to see if you can identify the first CGT version that produces a problem. Then it will be easier to reference the release notes and defect list for the specific differences.

If the goal is to arrive at a working version of the project using an updated LTS CGT version, we can try to assist with debugging the issue as if it were a program under development.

-Tommy

0 Andy Fung over 2 years ago

Intellectual 875 points

Hello Tommy,

Thanks for your reply. We do not really need to pinpoint which compiler change causes the failure. What we need is help finding a way to make the project work with the latest LTS CGT version. Do you know of anyone who has had a similar issue? Besides reducing the interrupt rate, what suggestions do you have for this issue?

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

I am not aware of any reports for similar issues. I will try to assist with getting your project working again. Can you provide some more background information?

Are you using the native C28x hardware-based interrupt handler or do you have a custom software-based interrupt handler?

Is TINT0 intentionally enabled such that branching to TINT0_ISR() is expected? The ISR includes a for-loop trap so I can see why the CPU would be stuck there once it branches to it. It might help to inspect the PIECTRL register when stuck in the ISR to confirm the interrupt source and the vector table contents in case another interrupt is unintentionally pointing to TINT0_ISR().

As far as the interrupt pacing goes, if the hardware could keep up with the interrupts in v6.x, there should be a way to keep up with v20.x. It might help to profile the interrupt servicing activity by toggling a GPIO when entering and exiting the ISR. If there is significant difference in the servicing latency or execution time between the CGT versions, it may point to a difference in pragma expectations or levels of optimization.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Tommy,

There are two ADC interrupts, ADCINT1 and ADCINT2, for sampling the AC voltages, and each has a time base period (TBPRD) of 10us. ADCINT2 has a time base phase of half the period, which is 5us, so the two interrupts are staggered by 5us.

There is also HRCAP2_INT, which is triggered by the rising and falling edges of a pulse in order to measure the pulse width. If a pulse with a period of 30us and a width of 10us is fed into this HRCAP, the CPU gets stuck (there is no problem with compiler v6.1.3). This interrupt does some width calculation, so it takes some execution time.

The project also includes ECAP2_INT, which is triggered by CTR_EQ_PRD and CTR_EQ_CMP. ECap2Regs.CAPx basically holds the pulse width and period mentioned above. I believe this interrupt routine exits quickly and I do not expect it to have much impact.

Thanks for the suggestions to inspect the PIECTRL register and to profile the interrupt servicing activity by toggling a GPIO. I will gather that information next.

Thanks,

Andy

0 Joe Prushing over 2 years ago in reply to Andy Fung

TI__Genius 11971 points

Hi Andy, Tommy is out of office but should be able to reply by Tuesday next week.

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

It seems like the worst case alignment of interrupt triggers would potentially require 4 ISRs (ADCINT1, ADCINT2, HRCAP2_INT, and ECAP2_INT) to complete execution within 10us. So at 90MHz, this would be roughly 900 CPU cycles.

If you reserve about 15% overhead (arbitrarily) for context switching and margin, it would be about 750 cycles. It should be fairly straightforward to measure the ISR execution cycles using the CPU timer to see if you are in the ballpark.
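As a rough check of the arithmetic above, the cycle budget can be computed as follows. This is only a minimal sketch: the function names are hypothetical, and the elapsed-cycle helper assumes a free-running, down-counting 32-bit CPU timer whose period is 0xFFFFFFFF, which is an assumption and not something stated in this thread.

```c
#include <stdint.h>

/* Cycles available in a time window: MHz * us = cycles. */
uint32_t cycle_budget(uint32_t cpu_mhz, uint32_t window_us)
{
    return cpu_mhz * window_us;
}

/* Budget left after reserving margin_pct percent for context
 * switching and general margin. */
uint32_t budget_after_margin(uint32_t cycles, uint32_t margin_pct)
{
    return cycles - (cycles * margin_pct) / 100u;
}

/* Elapsed cycles between two reads of a down-counting timer.
 * ASSUMPTION: the timer free-runs with a period of 0xFFFFFFFF, so
 * unsigned subtraction also covers one wrap between the two reads. */
uint32_t elapsed_cycles(uint32_t start_count, uint32_t end_count)
{
    return start_count - end_count;
}
```

With the numbers in this thread, cycle_budget(90, 10) gives 900 and budget_after_margin(900, 15) gives 765, which is in line with the "about 750 cycles" estimate. The sum of the measured ISR cycle counts can then be compared against that figure.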

Am I correct in assuming that the goal is to get the project back up and running with as little modification as possible? If so, I will try to limit my suggestions to this scope of work.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

ADCINT1 and ADCINT2 together trigger an interrupt every 5us because they are staggered 5us apart.

I would say the goal is to get a working version of the project using the latest compiler with as little modification as possible.

Thanks,

Andy

0 Andy Fung over 2 years ago in reply to Andy Fung

Intellectual 875 points

Hello Tommy,

The PIECTRL register shows a value of 0x0D27 when stuck in the ISR.

I measured the execution time of ADCINTx by toggling a GPIO, on the project compiled with v6.1.3 and on the one compiled with TI v20.2.4.LTS. I did not notice much difference between the two. The timing varies between 2.x us and 4.x us for both projects, and I have occasionally seen 6.x us.

Thanks,

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy Fung said:
The PIECTRL register shows a value of 0x0D27 when stuck in the ISR.

0x0D27 maps to the ILLEGAL Operation interrupt, which is usually caused by an invalid instruction. Bit 0 of PIECTRL is ENPIE and the remaining bits give the address of the fetched vector, 0x0D26.
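As a rough illustration of how 0x0D27 maps to that vector, here is a minimal sketch. The function and macro names are hypothetical. The layout it assumes (bit 0 is ENPIE, bits 15:1 hold the fetched vector address, the table starts at 0x0D00 with two 16-bit words per vector) reflects my reading of the F2806x PIE documentation and should be checked against the reference manual.

```c
#include <stdint.h>

#define PIE_VECT_BASE    0x0D00u  /* assumed start of the PIE vector table */
#define PIE_VECT_ILLEGAL 0x0D26u  /* assumed ILLEGAL operation vector address */

/* Bits 15:1 of PIECTRL hold the address the vector was fetched from;
 * the least significant address bit is not reported. */
uint16_t piectrl_vector_addr(uint16_t piectrl)
{
    return (uint16_t)(piectrl & 0xFFFEu);
}

/* Bit 0 of PIECTRL is the ENPIE (PIE enabled) flag. */
int piectrl_enpie(uint16_t piectrl)
{
    return (int)(piectrl & 0x0001u);
}

/* Each vector is 32 bits wide (two 16-bit words), so the table index
 * is the word offset from the table base divided by two. */
uint16_t pie_vector_index(uint16_t vector_addr)
{
    return (uint16_t)((vector_addr - PIE_VECT_BASE) / 2u);
}
```

For 0x0D27 this yields ENPIE = 1, vector address 0x0D26, and table index 19. A genuine TINT0 fetch would be expected to report a different address, since TINT0 is a PIE group 1 interrupt with its own vector.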

You can look at the stack memory to determine the Return address for the ISR context save. This should point to the local memory section with the instruction that caused the ILLEGAL trap. I'm a little fuzzy as to whether it will point to the exact address of the instruction or if it is offset by a few words; it could very well be obvious like all 0s or Fs in memory where instructions are expected.

Andy Fung said:
I measured the execution time of ADCINTx by toggling a GPIO, on the project compiled with v6.1.3 and on the one compiled with TI v20.2.4.LTS. I did not notice much difference between the two. The timing varies between 2.x us and 4.x us for both projects, and I have occasionally seen 6.x us.

2-4 us seems like a considerable amount of variation for an ISR that is executing every 5us. Is the source of variation understood?
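To put those numbers in perspective, here is a rough sketch of the load arithmetic. The helper name is hypothetical, and it assumes the GPIO high time is wall-clock time, which would include any preemption by nested interrupts.

```c
#include <stdint.h>

/* Share of one interrupt period consumed by the ISR, in percent.
 * Times are given in nanoseconds so that fractional microseconds
 * can be expressed with integers. period_ns must be non-zero. */
uint32_t isr_load_percent(uint32_t exec_ns, uint32_t period_ns)
{
    return (exec_ns * 100u) / period_ns;
}
```

Against a 5us period, 2us is 40% of the period, 4us is 80%, and the occasional 6us reading is 120%, meaning the next ADC interrupt is already pending before that ISR instance has finished.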

0 tlee over 2 years ago in reply to tlee

TI__Guru 62975 points

Andy,

Another place to check for ILLEGAL instructions would be at the RPC address.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

I checked the RPC address when the CPU is halted. It points to somewhere in the initialization function, which only executes at power-up.

I also tried increasing the stack size from 1.2k to 4k, but it did not help.

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

Can you check whether the SP is still in a valid memory range when the CPU is halted? I just worked on another issue where nested interrupts ran out of control and caused the SP to overflow, which ultimately caused the CPU to fetch an illegal instruction.

Can this issue be reproduced on a controlCARD or LaunchPad? If so, we may be able to provide some more hands-on assistance.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

The stack is assigned as the following location:

RAMM0 : origin = 0x000050, length = 0x000500

.stack : > RAMM0, PAGE = 1

When the CPU is halted, SP points to 0x6E. I believe it still points to valid memory.

I caught another CPU halt where PC = 0x00000000 and SP = 0x808. That does not seem normal to me.
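As a quick sanity check of those two SP values against the linker settings above, here is a minimal sketch. The helper name is hypothetical, and it assumes .stack occupies the whole RAMM0 region, which the linker fragment by itself does not confirm.

```c
#include <stdint.h>

#define STACK_ORIGIN 0x000050u  /* RAMM0 origin from the linker command file */
#define STACK_LENGTH 0x000500u  /* RAMM0 length from the linker command file */

/* Returns 1 if sp lies inside [origin, origin + length), else 0.
 * ASSUMPTION: the stack section fills the whole RAMM0 region. */
int sp_in_stack(uint32_t sp)
{
    return sp >= STACK_ORIGIN && sp < (STACK_ORIGIN + STACK_LENGTH);
}
```

Under that assumption the region spans 0x50 to 0x54F, so 0x6E is inside it and 0x808 is well past its end. Because the C28x stack grows toward higher addresses, an SP above the region is what a stack overflow would look like.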

I tried setting a higher optimization level (opt_level=3 and opt_for_speed=5) when building a standalone project with compiler TI v20.2.4.LTS. It worked without the CPU halt. When building the project with compiler v6.1.3, the optimization settings are opt_level=off and opt_for_speed=2, and it works without the CPU halt.

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

I would agree that the SP at 0x808 doesn't seem normal. I think the most common scenario would be overflow. Can you clarify if you are using nested interrupts (like this)?

The optimization level is an interesting experiment. I would generally expect opt_level=off to be the more reliable setting. I suppose that the higher optimization level would help if the problem is related to the ISRs executing too slowly. Do you see faster or more consistent ISR execution times with optimization?

CCS allows for different optimization levels at the file level so it might be possible to experiment with opt_level=off on individual source files to see if you are able to identify which file is benefiting from optimization. If that works, you could then split out individual functions until you find the problem function.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

Yes, nested interrupts are used.

With opt_level=3 and opt_for_speed=5, although no CPU halt occurs, I found that some of the application's functionality breaks.

I tried creating a separate file, moving HRCAP2_INT and ADCINTx into it, and setting opt_level=3 and opt_for_speed=5 on this file only. However, the CPU halt still happens.

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

I think it would make sense to stay with opt_level=off for debug.

Can you try logging the ISR nesting levels during execution?

For example, declare a global array of volatile Uint16 where each array element acts as a counter for each ISR in the system. In the ISRs, increment the counter variable before EINT is executed (preemption enabled), and decrement the counter after DINT is executed (preemption disabled). With no active interrupts, the counters should all be 0.

When you end up in a CPU halt condition, you can inspect the counter values to see if the nesting may have run away.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

I created an array called iArray to hold the counter in the ADC interrupt, which allows nested interrupts. The pseudocode is shown below.

After CPU halt, all values in the array are zero.

__interrupt void SubSourceADCFastISR(void)
{

// local variables initialization here

oldIER = IER;
IER = M_INT12 | M_INT5;
PieCtrlRegs.PIEACK.all = PIEACK_GROUP1;
asm(" NOP");
iArray[j]=iArray[j]++;
EINT;

IER = oldIER;

// Interrupt process here

DINT;

iArray[j] = iArray[j]--;
if(j<1000)
{
j++;
}
else
{
j=0;
}

}

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

I was thinking more along the lines of:

#define IARR_ADC1_NDX   0
#define IARR_ADC2_NDX   1
#define IARR_HRCAP_NDX  2
#define IARR_NUM_ISR    3

volatile Uint16 iArray[IARR_NUM_ISR];

__interrupt void ADC1_ISR(void)
{
    // Code

    iArray[IARR_ADC1_NDX]++;
    EINT;

    // Code

    DINT;
    iArray[IARR_ADC1_NDX]--;

    // Code
}

__interrupt void ADC2_ISR(void)
{
    // Code

    iArray[IARR_ADC2_NDX]++;
    EINT;

    // Code

    DINT;
    iArray[IARR_ADC2_NDX]--;

    // Code
}

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

I created the array for logging the ISR nesting levels as shown below. These two interrupts are the only ones that allow nested interrupts.

The period of SubSourceADCFastISR is 5us and the period of Timer0_ISR is 1ms.

I ran in debug mode and checked the values of the array once the program halted.

iArray[0] = 1

iArray[1] = 32

#define IARR_ADC1_NDX 0
#define IARR_ADC2_NDX 1
#define IARR_NUM_ISR 2

volatile Uint16 iArray[IARR_NUM_ISR];

__interrupt void SubSourceADCFastISR(void)
{

//code

iArray[IARR_ADC1_NDX]++;
EINT;

// code

DINT;
iArray[IARR_ADC1_NDX]--;

}

//1ms timer interrupt

__interrupt void Timer0_ISR(void)
{

//code

iArray[IARR_ADC2_NDX]++;
EINT;

// code

DINT;
iArray[IARR_ADC2_NDX]--;

}

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

A nesting level of 32 for the Timer0 interrupt seems high to me. That would at the very least take a big portion of your stack memory.

Is the Timer0 interrupt allowed to preempt its own ISR? If so, I would guess that at some point in time, the Timer0_ISR execution takes longer than 1ms to complete (especially with other ISRs preempting the execution) and subsequent Timer0 interrupts are spawning additional Timer0_ISR instances before the prior instances can complete.
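The runaway described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions that are not from the thread: every Timer0_ISR instance needs the same amount of CPU time, a new instance always preempts the previous one, and nothing else uses the CPU. The function name and MAX_DEPTH are hypothetical.

```c
#define MAX_DEPTH 64u  /* hypothetical cap standing in for stack exhaustion */

/* Returns the deepest nesting level reached after `ticks` timer periods.
 * Each tick starts a new ISR instance that needs isr_work_us of CPU time;
 * the CPU then spends one period working on the newest instance first. */
unsigned simulate_nesting(unsigned period_us, unsigned isr_work_us, unsigned ticks)
{
    unsigned remaining[MAX_DEPTH];  /* CPU time still owed per nested instance */
    unsigned depth = 0;
    unsigned max_depth = 0;

    for (unsigned t = 0; t < ticks; t++) {
        if (depth == MAX_DEPTH) {
            return MAX_DEPTH;       /* model limit reached */
        }
        remaining[depth++] = isr_work_us;   /* new instance preempts */
        if (depth > max_depth) {
            max_depth = depth;
        }

        unsigned budget = period_us;        /* CPU time until the next tick */
        while (budget > 0 && depth > 0) {
            unsigned *top = &remaining[depth - 1];
            unsigned spent = (*top < budget) ? *top : budget;
            *top -= spent;
            budget -= spent;
            if (*top == 0) {
                depth--;                    /* instance finished, unwind */
            }
        }
    }
    return max_depth;
}
```

In this model, simulate_nesting(1000, 900, 32) never exceeds a depth of 1, while simulate_nesting(1000, 1100, 32) reaches a depth of 32: once an instance overruns the period, the newest instance always runs first and the older ones never get to finish, so each nested level keeps its stack frame.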

Is the timing and tick count of Timer0 critical? It would be interesting to see if the situation improves by slowing down the Timer0 rate (maybe try 10ms), and/or by disabling new Timer0 interrupts between EINT to DINT in Timer0_ISR (either by stopping / resuming the Timer decrement, or through IER masking).

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

I tried stopping and resuming Timer0 in Timer0_ISR between EINT and DINT. It does help with this issue and there is no CPU halt. However, it causes another issue in our application. We run multiple F28069 devices, and Timer0 in all of them has to be synchronized, so stopping and resuming Timer0 breaks the synchronization.

Also, we have to keep the Timer0 rate at 1ms.

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

The experiment is good supporting evidence that the Timer0_ISR execution time (and subsequent nesting) is a major contributor to the system crashes. Would you happen to know the Timer0_ISR nesting level for the v6.1.3 compiler?

I suppose that you have two strategies to choose from to resolve the issue:

1. Try to find the difference in code generation between compilers:
   - Are some variables / functions placed in Flash instead of RAM? (The .MAP file can be useful.)
   - Are there dependency differences between compilers (ex: FPU library location or version)?
   - Is there a significant difference in generated ISR code (especially with Timer0_ISR)?
2. Try to further optimize the system / code so that Timer0_ISR can complete within 1ms:
   - File-level compiler optimization of specific functions for speed
   - Hand-optimize code for speed: https://software-dl.ti.com/C2000/docs/optimization_guide/index.html
   - Use DMA to create a buffer of batched ADC conversions for less frequent CPU involvement
   - Use CLA to handle tasks in parallel with CPU

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

I compiled the code (5us ADCINT rate) with v6.2.11 and v6.4.0.

Compiled with v6.2.11 (released Feb-05-2015), it runs without issue.

Compiled with v6.4.0 (released Nov-19-2014), the CPU halts.

Is there any critical difference between these two versions that relates to this issue?

Also, I am not sure why v6.4.0 (Nov-19-2014) was released earlier than v6.2.11 (Feb-05-2015).

Thanks,

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

Can you double-check the compiler version numbers? I do not see exact matches for them...

Did you mean v6.2.10 and v6.4.11?

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

The compiler versions are v6.2.11 and v6.4.0. I downloaded them from

http://software-dl.ti.com/codegen/non-esd/downloads/download_archive.htm

They are the first entry under "6.2.x Release" and the last entry under "6.4.x Release".

Thanks,

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

I'm glad that you were able to find that site. I was not sure if it was visible to the public.

It looks like 6.4.x introduced a number of performance enhancements over 6.2.x so there was probably overlap between 6.4.x preliminary releases vs 6.2.x bugfix releases.

Would you be able to profile the execution times of the ISRs without interrupt nesting to see if there is a noticeable difference between the two compilers?

It will be much easier for the compiler team to help if you are able to narrow down the area of interest to a single function (or code snippet) that can be submitted as a compiler test case to the CCS Forum.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

Without interrupt nesting, the application won't work.

For the code snippet, I inserted three INT routines. Please let me know if it is good enough.

Thanks,

Andy

// HRCAP2 INT, interrupted by a pulse which has 10us on time and 20us off time
__interrupt void TimeMonitorPulse(void)
{

	if(HRCap2Regs.HCIFR.bit.RISE)
	{
		// Code
	}
	else if(HRCap2Regs.HCIFR.bit.FALL)
	{
		// Code
	}
	else if(HRCap2Regs.HCIFR.bit.COUNTEROVF)
	{
		// Code
	}

	// Code
	
	EALLOW;
	HRCap2Regs.HCICLR.all = 0x1F;
	PieCtrlRegs.PIEACK.all = PIEACK_GROUP4;
	EDIS;

}

// 5us ADC INT
__interrupt void SubSourceADCFastISR(void)
{
	DINT;

	unsigned int oldIER;

	oldIER = IER;
	// allow XINT3 and HRCAP4 to interrupt
	IER = M_INT12 | M_INT5;
	PieCtrlRegs.PIEACK.all = PIEACK_GROUP1;
	asm(" NOP"); //Wait for PIEACK to exit the pipeline
	EINT;

	if(1 == AdcRegs.ADCINTFLG.bit.ADCINT1)
	{
		// grab reading from AdcResult.ADCRESULTx
		AdcRegs.ADCINTFLGCLR.bit.ADCINT1 = 1;		//Clear ADCINT1 flag reinitialize for next SOC
	}
	if(1 == AdcRegs.ADCINTFLG.bit.ADCINT2)
	{
		// grab reading from AdcResult.ADCRESULTx
		AdcRegs.ADCINTFLGCLR.bit.ADCINT2 = 1;		//Clear ADCINT2 flag reinitialize for next SOC
	}

	// Code
	
	IER = oldIER;

	DINT;
}


// 1ms Timer0 INT
__interrupt void SubSourceSafetyAndChannelStateSlowISR()
{
	// manage interrupts to allow select ones to interrupt SlowISR
	oldIER = IER;
	// always allow INT1 (ADC) and INT11 (CLA) and HRCAP4 to interrupt and timer 1
	IER = M_INT1 | M_INT5 | M_INT13;
	// only allow INT4 (ECAP2) to interrupt if it's already enabled
	if(oldIER & M_INT4)
		IER |= M_INT4;
	PieCtrlRegs.PIEACK.all = PIEACK_GROUP1;
	asm(" NOP"); //Wait for PIEACK to exit the pipeline
	EINT;


	// Code

	IER = oldIER;

	DINT;
}

0 Andy Fung over 2 years ago in reply to Andy Fung

Intellectual 875 points

Tommy,

I attached two screenshots of the GPIO toggling in the 1ms Timer0 INT. The high time indicates the execution time of the interrupt. The first was captured while running the code compiled with v6.2.11. The second was captured while running the code compiled with v6.4.0, before the CPU halted. In the second screenshot, some of the high times are longer than 1ms.

Thanks,

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

The compiler team will only be interested in specific differences in code generation, so whether the overall application works or not is irrelevant to them.

I see that there is about 20us of time difference for the TIMER0 ISR execution between compilers. That should be good enough for a start.

-Tommy

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Hello Tommy,

Would you still like me to provide a code snippet? If so, please tell me how to provide one that shows specific differences in code generation.

Thanks,

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

At this point, you will want to engage with the compiler team directly on the CCS E2E Forum.

They will most likely ask for you to submit a compiler test case using these instructions.

-Tommy

PS. I am out of the office until 8/16.

0 Andy Fung over 2 years ago in reply to tlee

Intellectual 875 points

Tommy,

So should I start a new question on https://e2e.ti.com/support/tools/code-composer-studio-group/ccs/f/code-composer-studio-forum?

Also the link http://software-dl.ti.com/ccs/esd/documents/sdto_cgt_How-to-Submit-a-Compiler-Test-Case.html doesn't work. Can you send me the instruction again?

Thanks

Andy

0 tlee over 2 years ago in reply to Andy Fung

TI__Guru 62975 points

Andy,

Yes, please start a new question in the CCS forum and let them know that you are seeing different execution times between the compiler versions. They will be able to assist with debug.

It's strange that you are not able to see the instructions. I am attaching a PDF capture.

-Tommy
