TMS570LC4357: Code to enable Cross Trigger Interface causing Data Abort exception

Marcio Matsunaga

Part Number: TMS570LC4357
Other Parts Discussed in Thread: HALCOGEN

Hello,

I have some code, extracted from a .gel script posted here on the TI forum, to enable the CTIs.

But sometimes (not always!) when running the line of code shown below, I get a Data Abort exception (a Synchronous External Abort).

When that happens, if I simply turn off and then on the power supply of the dev. board, it all starts working perfectly (i.e. no exceptions and the CTI is enabled).

The code that causes the exception is:

*(int *) 0xffa07fb0 = 0xc5acce55; /* Unlock writes to CTI */

Any thoughts on what could be the source of this issue?

Thank you,

Marcio

over 7 years ago

0 Chuck Davenport over 6 years ago

TI__Guru 59540 points

Hello Marcio,

I am checking with some of our experts and will get back with you shortly.

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Thanks Chuck.

Just to add some more info: today I ran into this problem again and managed to get a bit more data.

When the Data Abort Exception happened, I tried the "System Reset Switch" button (S3 on the Hercules Dev. Kit board), but it didn't solve the problem. The TMS570LC still got the Data Abort Exception.

So I needed to use PORRST (Power-On Reset, the S4 button on the kit) to stop getting the Data Abort Exception, which makes sense since disconnecting the power supply also solves the problem.
It also explains why I can't solve this problem via the debugger reset.

That means a POR is necessary to remove the TMS570LC from the state that causes the Data Abort Exception. It still seems mysterious to me, but I hope this info can help you and your team a bit more.

Just a quick copy and paste from the Hercules Dev. Kit Guide regarding the 2 resets available:

======================================================
2.6 S4, Power On Reset Switch
TMS570LC43 MCU has two resets: warm reset (nRST) and power-on reset (nPORRST). Switch S4 is a
momentary switch that asserts power on reset to the TMS570LC4357 device. The nPORRST condition is
intended to reset all logic on the device including the test and emulation circuitry.

2.7 S3, System Reset Switch
Switch S3 is used to assert a warm reset the TMS570LC4357 device. Warm reset does not reset any test
or emulation logic. The reset signal from window watchdog will also assert a warm reset to the MCU. The
warm reset can be invoked by pushing nRST button, or by RESET signals from XDS100 CPLD, ARM
JTAG SREST.
======================================================

Best Regards,
Marcio

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Hello Marcio,

My apologies for the delay in getting back with you as I have been doing a little bit of research on the ECT IP you are using. Note that there is significantly more information on this IP in the CoreSight Components TRM on the ARM website (ARM Information Center).

For my general information, can you provide a link to the post from which your code originated so I can have a closer look at it? Based on an initial look, the use of the keyed write to the register seems inconsistent with the CoreSight Components TRM.

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Hello Chuck,

I was out of office last week, which is why it took me a while to answer.

Here's the link to the post with the script I mentioned before:

https://e2e.ti.com/support/microcontrollers/hercules/f/312/p/517887/1882641#1882641

It's a .gel script. Since I'm not using Code Composer Studio (CCS), I just used the code to enable the CTIs during the initialization of the TMS570.

Regards,

Marcio

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Hello Marcio,

I spoke to our R5 expert about this topic (sorry for the delay; he was out of office for a while). He mentioned that you will need to add the CTI frame to the MPU configuration in HALCoGen (this will require a new block definition) in order to be certain no aborts are triggered by an illegal access.

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Hi Chuck,

So does it mean that a nRESET (Warm Reset) does not set the MPU to a known state?

Which other components are not reset by the warm reset?

Regards,

Marcio

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Marcio,

MPU configuration is done by SW. It is my understanding that the MPU is configured by default within HALCoGen and that the init is called as part of the startup code. The address range for the CTI is not necessarily included in this default configuration and will need to be added.
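
As an illustration of what adding the CTI frame involves, here is a minimal sketch in C of the arithmetic behind one extra MPU region. It assumes the ARMv7-R (PMSA) region size encoding, where a region spans 2^(RSize+1) bytes and bit 0 of the size/enable register is the enable bit. It also assumes a 4 KB frame starting at 0xFFA07000, which is inferred from the 0xffa07fb0 address in this thread and not confirmed by TI. The function names are hypothetical; HALCoGen would normally generate the actual register writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CTI frame base and size, inferred from the address in this thread. */
#define CTI_FRAME_BASE 0xFFA07000u
#define CTI_FRAME_SIZE 0x1000u

/* Returns the RSize field N such that the region spans 2^(N+1) bytes. */
uint32_t mpu_rsize_field(uint32_t bytes)
{
    /* Regions must be a power of two and at least 32 bytes. */
    assert(bytes >= 32u && (bytes & (bytes - 1u)) == 0u);

    uint32_t n = 4u; /* 2^(4+1) = 32 bytes, the smallest region */
    while ((1u << (n + 1u)) != bytes) {
        n++;
    }
    return n;
}

/* Value for the region size/enable register: RSize in bits [5:1], enable in bit 0. */
uint32_t mpu_size_enable_value(uint32_t bytes)
{
    return (mpu_rsize_field(bytes) << 1) | 1u;
}

/* A region base must be aligned to the region size. */
bool mpu_base_is_aligned(uint32_t base, uint32_t bytes)
{
    return (base & (bytes - 1u)) == 0u;
}
```

For a 4 KB region this gives an RSize field of 11 and a size/enable value of 0x17. The memory type and access permissions for the region still have to be chosen separately.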

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Hi Chuck,

Yes, the MPU is configured by SW, but at reset, normally, all internal modules are set back to their reset condition (i.e. a deterministic state). So even before any instruction is executed, in my understanding, all modules should be in a known, documented configuration.

Doing a quick search here on the TI forum I found, for example, this post by Jean-Marc:
e2e.ti.com/.../304680

Extract:
"Any write to these bits other than "01" will cause a software reset.
The CPU will jump back to the reset vector, and all internal modules will be back to their reset condition."

And this other post by Charles Tsai:
e2e.ti.com/.../1974415

Extract:
"System reset is a warm reset that pretty much reset the entire device except a few analog components."

So then, my question would be: what is the reset state of the MPU module?
And why does the MPU sometimes cause an exception after a warm reset and sometimes not, when the CPU accesses the same address?

Regards,
Marcio

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Hello Marcio,

Excuse my confusion about your question. For sure the MPU is reset by the nRST/warm reset to its default state. As a default, it should be disabled prior to any code to configure and enable it.

If you are receiving intermittent aborts from this data access, I don't currently have a good explanation. Can you provide more details on the status of the debug connection with respect to the resets under which the abort occurs?

My theory is that this is related to debug signal propagation in the device. When you perform a warm reset, the debug connection isn't lost or reset, but when you perform a hard reset (POR), it is. That would mean that after a POR the CTI is enabled while no debug signal is present, whereas after a warm reset it is enabled while one is.

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Hi Chuck,

The debugging is being done through a JTAG connection. The debug connection is never lost. From the debugger I don't see any difference when I perform a warm reset or a POR.

But also, it seems like the problem is not related to the activation of the CTI itself. It seems related to the write to address 0xffa07fb0, since the exception occurs at this point (i.e. before any command to actually enable the CTI).

I'm not even sure this problem was caused by the MPU, since I got a Synchronous External Abort and the MPU can only generate three types of faults: Background, Permission and Alignment.

So even with the MPU enabled, if the address 0xffa07fb0 weren't mapped into an MPU region, I'd get a Background fault, and that was not the case.

What could be causing a Synchronous External Abort when writing to a memory address? It would have to be something that isn't cleared by a warm reset but is cleared by a POR.
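
To make the distinction between MPU faults and external aborts concrete, here is a minimal sketch in C of decoding a Data Fault Status Register (DFSR) value. It assumes the fault status encodings documented for Cortex-R4/R5 class cores (status in bits [10] and [3:0], write-not-read flag in bit 11); these should be checked against the TRM for the exact core revision. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum fault_kind {
    FAULT_BACKGROUND,
    FAULT_ALIGNMENT,
    FAULT_PERMISSION,
    FAULT_DEBUG_EVENT,
    FAULT_SYNC_EXTERNAL,
    FAULT_ASYNC_EXTERNAL,
    FAULT_SYNC_ECC,
    FAULT_ASYNC_ECC,
    FAULT_OTHER
};

struct dfsr_info {
    enum fault_kind kind;
    bool from_mpu;   /* Background, Permission or Alignment */
    bool was_write;  /* DFSR bit 11: access was a write */
};

void dfsr_decode(uint32_t dfsr, struct dfsr_info *out)
{
    assert(out != NULL);

    /* The 5-bit status is split: bit 10 is the top bit, bits [3:0] the rest. */
    uint32_t status = (((dfsr >> 10) & 1u) << 4) | (dfsr & 0xFu);

    switch (status) {
    case 0x00u: out->kind = FAULT_BACKGROUND;     break;
    case 0x01u: out->kind = FAULT_ALIGNMENT;      break;
    case 0x0Du: out->kind = FAULT_PERMISSION;     break;
    case 0x02u: out->kind = FAULT_DEBUG_EVENT;    break;
    case 0x08u: out->kind = FAULT_SYNC_EXTERNAL;  break;
    case 0x16u: out->kind = FAULT_ASYNC_EXTERNAL; break;
    case 0x19u: out->kind = FAULT_SYNC_ECC;       break;
    case 0x18u: out->kind = FAULT_ASYNC_ECC;      break;
    default:    out->kind = FAULT_OTHER;          break;
    }

    out->from_mpu = (out->kind == FAULT_BACKGROUND) ||
                    (out->kind == FAULT_ALIGNMENT) ||
                    (out->kind == FAULT_PERMISSION);
    out->was_write = ((dfsr >> 11) & 1u) != 0u;
}
```

With this decoding, a synchronous external abort on a write does not report as an MPU fault, which matches the reasoning above that the abort originates on the bus rather than in the MPU.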

Thanks,

Marcio

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Hi Marcio,

Generally speaking, synchronous aborts are not cleared by a warm reset so that they can be processed after a fresh reset, in case a transient error caused them. To clear it, you would need to service the synchronous abort by reading the FSR and clearing the synchronous abort flag, and clear the captured error address by reading it.

Once you have read the captured error address, you can dig deeper into why the abort occurred, e.g. unimplemented memory, an uncorrectable error or bus error, etc. If the captured error address corresponds to 0xffa07fb0, it is in the CTI1 frame. I haven't looked at the CTI register definitions in the ARM documentation, but it is possible that this is an unimplemented address location and therefore results in an abort when accessed.
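
A minimal sketch in C of capturing the fault status and fault address inside a data abort handler follows. It assumes that the "FSR" and captured address correspond to the CP15 DFSR and DFAR registers of the Cortex-R5, read here with GCC-style inline assembly; other compilers need their own intrinsics. The 4 KB CTI1 frame at 0xFFA07000 is an assumption inferred from the address in this thread, and all names are hypothetical. The non-ARM branch is only a stand-in so the logic can be exercised on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: CTI1 frame location, inferred from 0xffa07fb0. */
#define CTI1_FRAME_BASE 0xFFA07000u
#define CTI1_FRAME_SIZE 0x1000u

struct abort_record {
    uint32_t dfsr; /* Data Fault Status Register */
    uint32_t dfar; /* Data Fault Address Register */
};

#if defined(__arm__) && defined(__GNUC__)
static inline uint32_t read_dfsr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c5, c0, 0" : "=r"(v));
    return v;
}

static inline uint32_t read_dfar(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c6, c0, 0" : "=r"(v));
    return v;
}
#else
/* Host stand-ins with made-up values, for experimenting off-target only. */
static inline uint32_t read_dfsr(void) { return 0x00000808u; }
static inline uint32_t read_dfar(void) { return 0xFFA07FB0u; }
#endif

/* Call from the data abort handler, before anything else can fault. */
void capture_abort(struct abort_record *rec)
{
    assert(rec != NULL);
    rec->dfsr = read_dfsr();
    rec->dfar = read_dfar();
}

/* True if the faulting address lies in the assumed CTI1 frame. */
bool addr_in_cti1_frame(uint32_t addr)
{
    /* Unsigned subtraction wraps, so addresses below the base also fail. */
    return (addr - CTI1_FRAME_BASE) < CTI1_FRAME_SIZE;
}
```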

So to summarize, the warm reset vs. POR behavior you are seeing regarding the abort being cleared is expected. What is a little puzzling is why you don't get a new abort after a POR from the access to the same register address. This might be explained by the theory in my last post: when you perform a POR the device disconnects from the debugger, and when a warm reset occurs it does not. One way to test this is to perform the POR, connect to the device (or re-establish the connection by doing something in the debugger, like a halt), and then, after reconnecting, perform another soft/warm reset and see if the abort now shows up with the debugger connected.

Note that even with the debugger physically connected to the board, a POR will briefly disconnect the device from the emulator during power-on reset. In CCS (I know you're not using that tool) this shows up as a message that the device has gone through a reset.

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Hi Chuck,

That makes sense about synchronous aborts not being cleared by a warm reset.

Last time I had this problem I looked into the error address and it corresponded to a write to the exact address 0xffa07fb0 (the CTI one). It may look like this is an unimplemented address. But after a POR, the write to this address works (i.e. doesn't generate an abort) and I see the effect of the CTI being enabled (meaning the RTI counter freezes when the CPU halts. Just to be sure, I also disabled the CTI, and then the counter doesn't freeze. So I'm positive the CTI is indeed working).

So from your theory, the only way to actually clear the abort and properly enable the CTI (in case the abort exception happens) is to disconnect the device from the debugger, which can only be achieved either by physically disconnecting the debugger or by a POR (which in turn can only be done by physically pushing the POR button on the dev. board).

Note: just to highlight a point, the write access to address 0xffa07fb0 does not actually enable the CTI. It only unlocks writes to the CTI (so it can then be enabled by writing to another address). So it could be that the CTI has nothing to do with this issue.
I believe that if the TMS570 were having problems unlocking writes to the CTI, it wouldn't raise a data abort exception on the write to that address. In fact, if the TMS570 were having issues unlocking writes to the CTI, that would mean the write to address 0xffa07fb0 was itself successful.
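
The two-step sequence (unlock first, then enable through a different register) can be sketched in C as below. This is only an illustration under stated assumptions: it uses the standard CoreSight layout with the Lock Access Register at offset 0xFB0, the Lock Status Register at 0xFB4, and CTICONTROL at offset 0x000 with the global enable in bit 0, and it takes the CTI base at 0xFFA07000 from the address in this thread. The function names are hypothetical. Unlike the original one-liner, it uses a volatile unsigned pointer so the compiler cannot drop or reorder the accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTI_CONTROL_OFFSET 0x000u      /* CTICONTROL */
#define CTI_CONTROL_GLBEN  0x1u        /* global enable bit */
#define CTI_LAR_OFFSET     0xFB0u      /* Lock Access Register */
#define CTI_LSR_OFFSET     0xFB4u      /* Lock Status Register */
#define CTI_LSR_LOCKED     0x2u        /* set while writes are still locked */
#define CTI_UNLOCK_KEY     0xC5ACCE55u /* key used in the original post */

static inline void reg_write(volatile uint32_t *base, uint32_t off, uint32_t v)
{
    base[off / 4u] = v;
}

static inline uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4u];
}

/* Returns true if the frame reports unlocked and the enable bit was written. */
bool cti_unlock_and_enable(volatile uint32_t *base)
{
    assert(base != NULL);

    /* Step 1: unlock writes. This is the access that aborts in this thread. */
    reg_write(base, CTI_LAR_OFFSET, CTI_UNLOCK_KEY);

    /* Step 2: confirm the lock is released before touching anything else. */
    if ((reg_read(base, CTI_LSR_OFFSET) & CTI_LSR_LOCKED) != 0u) {
        return false;
    }

    /* Step 3: set the global enable in the control register. */
    reg_write(base, CTI_CONTROL_OFFSET,
              reg_read(base, CTI_CONTROL_OFFSET) | CTI_CONTROL_GLBEN);
    return true;
}
```

On the target this would be called as `cti_unlock_and_enable((volatile uint32_t *)0xFFA07000u)`. Passing the base as a parameter also lets the sequence be tried against a plain array on a host.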

Anyway, this is concerning: if we run a batch of tests, we may eventually hit an exception that can't be cleared by software, which would require someone to manually reset the device.
I've never seen a situation like this, especially considering we're not running anything very complex at this stage of development.

Are there any workarounds to avoid having to physically go to the dev. board and push the POR button?

Thanks,
Regards,
Marcio

0 Marcio Matsunaga over 6 years ago in reply to Marcio Matsunaga

Prodigy 120 points

Sorry, just to clarify a point in my last message: we can indeed reset this data abort exception by software.

Like you mentioned, I can write code to clear this error flag and reset the CPU again.

But even if I clear this data abort exception flag by software and execute a warm reset, the data abort exception will probably happen again, right? Clearing the flag doesn't actually solve the root cause of the abort. And apparently the only way to "solve" the root cause of the abort is by physically going to the board.

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Hello Marcio,

You are correct in that SW can disable the abort by clearing the flag and reading the FSR data. I believe the second part is also required to read the address from the FSR that resulted in the abort.

In regard to the issue happening again, I am not certain whether it really would or not. It might be worthwhile to discuss this directly with ARM, since it is their IP that is causing the issue. One other possibility is that some time lag is required after writing to the write enable register. Maybe this is why the issue you are seeing is intermittent? This is a bit of a reach, so it may not have any impact; it is just a suggestion to put some NOPs after the write to the CTI register. Also, please check what mode you are in when you make this write (supervisor mode, privileged mode, user mode, etc.) to ensure you are in the correct access mode for this register.
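
The mode check can be sketched in C as a test on the CPSR mode field. This assumes the ARMv7-R encoding, where bits [4:0] of the CPSR hold the mode and User mode (0b10000) is the only unprivileged one. Reading the CPSR itself (for example with an MRS instruction) is left to the caller, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu /* mode field, bits [4:0] */
#define CPSR_MODE_USER 0x10u /* the only unprivileged mode */

/* True for every mode except User (Supervisor, System, IRQ, FIQ, ...). */
bool cpsr_is_privileged(uint32_t cpsr)
{
    return (cpsr & CPSR_MODE_MASK) != CPSR_MODE_USER;
}

/* Guard to place before the CTI register writes in a debug build. */
void require_privileged(uint32_t cpsr)
{
    assert(cpsr_is_privileged(cpsr));
}
```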

0 Marcio Matsunaga over 6 years ago in reply to Chuck Davenport

Prodigy 120 points

Hi Chuck,

Thanks for your effort. I understand it's hard to debug something we can't replicate and have so little information about.

I'll look into the flag-clearing code and possible timing issues when accessing the CTI registers.

Anyways, it's important to leave this issue registered here in the forum in case someone else happens to run into this same issue.

Best Regards,

Marcio

0 Chuck Davenport over 6 years ago in reply to Marcio Matsunaga

TI__Guru 59540 points

Marcio,

I agree 100% that it is important to keep the issues/troubles that we find listed here in the Forum. It serves as a handy reference for anyone experiencing the same issue as well as a data repository for us to use for areas of improvement in either performance, execution or simply documentation. It truly is beneficial for both TI and our customers. Thank you for discussing your issue here.

0 Etienne Alepins over 6 years ago in reply to Marcio Matsunaga

Intellectual 775 points

Hi Chuck,

Please note that a debugger reset does not clear the CTI error because, on the TI TMS570LC HDK, the SRST JTAG signal is routed to the nRST (warm reset) pin of the TMS570LC4357. The cold reset pin (POR_RESET) of the CPU is only activated by a real power cut or by pressing the POR_RESET push button. Why does the HDK connect the JTAG SRST to the warm reset rather than to the cold reset? It has the effect that it is not possible to perform a cold reset remotely: we need to physically get to the board to do a cold reset.

Thanks.

0 Etienne Alepins over 6 years ago in reply to Etienne Alepins

Intellectual 775 points

Any answer?

0 Chuck Davenport over 6 years ago in reply to Etienne Alepins

TI__Guru 59540 points

Hello Etienne,

Etienne Alepins said:
Why does the HDK connect the JTAG SRST to the warm reset rather than to the cold reset? It has the effect that it is not possible to perform a cold reset remotely: we need to physically get to the board to do a cold reset.

This is the recommended and intended hookup. The normal use case is to have the EVM sitting next to you at a bench, not located remotely. Certainly, if you wish to have a power-on reset each time you select the system reset in the debugger, you could make a small modification to the board to enable this configuration.

0 Etienne Alepins over 6 years ago in reply to Chuck Davenport

Intellectual 775 points

Hi,

Thanks. Do you also recommend that, in our final product, the JTAG reset be connected to the cold reset of the CPU? The warm reset would then not be connected.

0 Chuck Davenport over 6 years ago in reply to Etienne Alepins

TI__Guru 59540 points

Hello Etienne,

I cannot make such a recommendation. You would need to identify the specific needs of your application. In general, the JTAG should be protected during application runtime, and there are safety mechanisms describing how to do this in the safety manual.

Warm reset is a critical feature in our safety implementation. It allows the restart of the application without loss of historical reference as to what caused the reset in the event it was caused by an error condition. As such, on a restart/warm reset, you would check the ESR to determine the reset source and then check for captured error conditions if the reset was a warm reset. In some of our example applications where we are connected to the TPS65381 PMIC, the nRST is connected to the reset of the TPS where the TPS can assert reset upon exceeding the nERROR count or a watchdog failure. nRST is also a bidirectional pin where an external device can check the status of the devices to see if it is through reset or being held in reset. In short, there are many uses of the nRST pin/warm reset feature that should be considered relative to your system needs.
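
The startup check described here can be sketched in C as follows. The only register detail assumed is that the power-on reset flag sits in bit 15 of the exception status register (ESR); that bit position is an assumption and should be verified against the device TRM or the HALCoGen headers. The type and function names are hypothetical, and the ESR value is passed in so the logic stays independent of how the register is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: power-on reset flag position in the ESR. Verify in the TRM. */
#define ESR_PORST (1u << 15)

struct startup_plan {
    bool cold_start;              /* power-on reset: no history to inspect */
    bool inspect_captured_errors; /* warm reset: captured error state survives */
};

/* Decide what the startup code should do based on the reset source flags. */
void plan_startup(uint32_t esr, struct startup_plan *plan)
{
    assert(plan != NULL);

    plan->cold_start = (esr & ESR_PORST) != 0u;

    /* Any other flag set without PORST is treated as a warm reset. */
    plan->inspect_captured_errors = !plan->cold_start && (esr != 0u);
}
```

On a warm reset the startup code would then go on to read the captured fault status and address before clearing them, so that the cause of the reset is not lost.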
