I have a situation where my firmware is not working, and when I step through the code I see that T3 will sometimes be trashed (usually loaded with 0, but sometimes with a large number like 0x7FFE). At first, I dug deep into the called subroutine, but after turning up nothing I noticed that T3 might be trashed by a different subroutine each time.
The big picture: this is a C5506 on a custom board that runs other C55 test firmware perfectly.
- No DSP/BIOS; I am only using CSL.
- The code is a mix of C and assembly, including some routines from dsplib 2.40.
- My interrupts are all very short, written entirely in C, call no assembly, and do not write to arrays. Basically, the interrupts are pure C code that updates global volatile variables, some 32-bit, some 16-bit.
- All data is in DARAM, except for one read-only 8K DMA buffer in SARAM. All code is in SARAM.
- I build with the large memory model and function-level optimization.
- The linker command file skips the memory-mapped registers, so T3 is probably not being clobbered through memory.
- One of the first things my C startup code does is clear C54CM via a custom assembly routine that I wrote.
- Everything compiles and assembles without warnings; I chased down each warning (e.g. CPU_116) until the build was clean.
The problem is that one of my critical non-interrupt functions has a for loop from 0 to 7, and the C compiler uses T3 as the trip counter for that loop. When T3 gets trashed with a value like 0x7FFE, the function takes several thousand cycles longer to execute, and possibly trashes memory. This critical function is called approximately twice per 1 ms USB frame, so it runs at about 2 kHz.
I call a number of Texas Instruments assembly subroutines from this non-interrupt function, such as rfft(), plus a few assembly subroutines of my own, such as a square-and-sum vector function and a 32-bit sqrtv() function. But each of these assembly subroutines pushes T2 and T3 onto the stack if it alters those registers.
This tells me that either I am trashing my stack, or the interrupts are trashing the stack, or the interrupts are trashing T3 directly.
When stepping through my critical function I usually step over the fft or sqrt calls, and I can see in the debugger that T3 changes while stepping over a single assembly CALL instruction. That really makes no sense to me.
One thing I tried was moving my stack allocation to the opposite end of DARAM. The idea was that if any of my regular C arrays were overflowing (out-of-bounds writes), then originally the next object in memory was the stack; with the stack moved very high in memory, I should see other data get trashed before the stack does. But this changed nothing. That seems to leave the interrupts as the culprit.
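Concretely, the relocation was just a linker-command-file change along these lines (the region names and sizes here are illustrative, not my actual board file):

```
/* excerpt from my .cmd file -- region names/sizes illustrative */
-stack    0x0800            /* primary (data) stack size  */
-sysstack 0x0200            /* system stack size          */

MEMORY
{
    DARAM_LO: origin = 0x000100, length = 0x7F00
    DARAM_HI: origin = 0x008000, length = 0x8000
}

SECTIONS
{
    .stack    > DARAM_HI    /* moved from DARAM_LO to the far end */
    .sysstack > DARAM_HI
}
```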
Although I have seen T3 change when stepping over a CALL, I have also stepped into the assembly functions, and one time I caught the following instruction altering T3: OR #0120h, mmap(ST1_55). I had assumed that was equivalent to BSET SXMD, BSET SATD, but writing out the mask, 0x0120 sets bit 8 and bit 5, and as I read the register map, bit 8 is SXMD while bit 5 is C54CM (SATD is bit 9, i.e. 0x0200). Reading about the status register mode bits, it seems that T3 may be affected by the C54CM mode bit, but I am unclear whether this could affect my program. As I mentioned, I clear C54CM right away at startup, so C54xx mode should never be enabled, and yet I still have lingering suspicions that T3 is somehow being affected by C54CM or some other mode.
Although I have never caught my code jumping into invalid memory (does the C55xx actually use the stack for return addresses?), it does seem that if I let the code run long enough, halting it in the debugger shows the Program Counter off in the middle of nowhere. That could be a false reading, though, if the debugger has somehow been confused by the register trashing.
Note: if I comment out the single call to rfft() inside my for (c = 0; c < 8; c++) loop, everything runs fine (except that I am processing time-domain samples instead of frequency-domain ones). That makes it look like cfft(), cbrev(), or unpack() is trashing T3 directly, yet they all properly save T3 on the stack. Surprisingly, I have also caught sqrt_16() killing T3, and the story there is the same: the assembly appears to save T3 on the stack correctly.
Anyone have any clues? Is there anything special about T3 with regard to C54xx Compatibility Mode, or any other mode for that matter?
Can I rely on the compiler to save every register that a C interrupt function uses, provided I use both #pragma INTERRUPT and the compiler's interrupt keyword on the function? The assembly output of the C compiler certainly looks like valid interrupt routines that save context on the stack, so I don't really see how the interrupts could be trashing registers.
One possibility is that I have an out-of-bounds array write which is trashing the stack, but I have reviewed nearly all of my code and have not found anything obvious.
I guess what I am looking for here is whether anyone is aware of something I might be overlooking. I have studied reams of TI DSP documentation, and I am quite familiar with the rules for mixing assembly and C, particularly that T0 and T1 are not preserved across a subroutine CALL, but T2 and T3 are supposed to be. As I said, I have watched T3 get trashed while stepping over a number of different CALL instructions in this loop, so what do I suspect next? The debugger? The compiler? Interrupt context? Something else entirely?