Hello,
I am working on a single core of a custom TMS320C6474 board (rev 2.1 silicon). (The other two cores are being held suspended by the CCS v4 debugger.)
I am running DSP/BIOS 5 with two TSK objects and one SWI, plus the ethernet stack and whatever it adds in the way of SWIs, HWIs and TSKs.
I am having problems with internal exceptions. These are mostly Resource Conflict "RCX" exceptions, but occasionally Opcode "OPX" or Instruction Fetch "IFX" exceptions.
My application runs various DSP algorithms on data held in DDR2 RAM. One TSK contains the DSP code to do this; the other just initialises and monitors the SWI that copies dummy data into an input buffer. (The SWI would normally take real ADC data from SRIO, but enabling SRIO causes even more exceptions, so it is currently disabled.)
I previously had an internal resource conflict exception "RCX" running the same code but on the C6748. This was due to bugs in the TI fast math libraries. http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/t/133991.aspx#481095
In this case I have removed the fastrts64x.lib library but my problems have not gone away.
I have a breakpoint on EXC_dispatch that catches the exceptions when they occur. In my current build, the exception seems to always be RCX with NRP/ERP pointing to the same address - a CALL instruction. The call is to a function in another C file - not to a TI library. It takes on the order of a minute for the exception to occur, so I am certain that the CALL instruction has already run many times (and hence I don't want to put a breakpoint on it). I think that something else must be happening to trigger the exception.
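To make the cause bits easier to read at the breakpoint, the value of the internal exception report register (IERR) could be decoded with something like the sketch below. This is only a host-side illustration: the bit positions are assumed from the C64x+ CPU and Instruction Set Reference Guide and should be checked against it, and `ierr_decode` is a hypothetical helper name, not part of DSP/BIOS.

```c
#include <stdio.h>
#include <string.h>

/* ASSUMPTION: IERR bit layout for C64x+ (bit 0 = IFX ... bit 8 = MSX).
 * Verify against the CPU reference guide before relying on it. */
static const char *const ierr_names[] = {
    "IFX", "FPX", "EPX", "OPX", "RCX", "RAX", "PRX", "LBX", "MSX"
};

/* Hypothetical helper: writes a space-separated list of the cause bits
 * set in 'ierr' into 'out' and returns how many names were written. */
int ierr_decode(unsigned int ierr, char *out, size_t out_size)
{
    int count = 0;
    size_t used = 0;
    unsigned int bit;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    for (bit = 0; bit < sizeof ierr_names / sizeof ierr_names[0]; bit++) {
        if (ierr & (1u << bit)) {
            int n = snprintf(out + used, out_size - used, "%s%s",
                             count ? " " : "", ierr_names[bit]);
            if (n < 0 || (size_t)n >= out_size - used) {
                out[used] = '\0';   /* drop the partially written name */
                break;
            }
            used += (size_t)n;
            count++;
        }
    }
    return count;
}
```

Under the assumed layout, calling `ierr_decode(0x10, buf, sizeof buf)` fills `buf` with "RCX", which would match the resource conflict case described above.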
I have been working on this for a long time now, and with previous RCX exceptions I have found by trial and error that making very small changes to the code will cure the current exception and move the problem elsewhere.
Things I am not sure about:
- the NRP/ERP point to the instruction where the exception happened, and this is a CALL instruction. Has the CALL executed yet? i.e. did the exception occur before or after the function was called? (If it is after, then I need to consider the called function.)
- what are the possible causes of the RCX exception? The problem with the TI library code was that it used a register and did not allow enough cycles before returning - so I know that re-using registers can be an issue. Are there other causes? The documentation is not very clear about the exceptions.
- could interrupts be causing the exceptions? Previously this code ran OK on the same silicon in a loopback mode where I did not have an SWI feeding in data. (Note that this loopback test used ethernet to send and receive data OK).
- generally - how am I supposed to go about debugging this?
Looking forward to some helpful suggestions - Regards Paul