unable to access the DAP - system reset does not help

Hello,

We are working with a custom board around the TMS570LC4357. 

The board worked fine for quite some time, but after trying to launch a debug session from a Linux workstation (CCS 6), we are no longer able to operate the board properly.

A short time after powering up the board, we can connect to the target just fine, but after a few seconds the debugger disconnects from the target with the "unable to access the DAP" error message. We found that cooling the board lengthens the time during which we have access to the processor. We never managed to erase the flash in that window: CCS crashes when we try to erase the flash, and UniFlash stops with an error message reporting that it is unable to read from address 0x08000000.

Using the AHB-AP, we still have access to the system memory map. Through this interface we tried to perform a system reset using the SYSECR register. The system reset is executed (SYSESR is updated accordingly), but we still cannot connect to the Cortex-R5.
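
To confirm which reset actually took place, the raw SYSESR value read back through the AHB-AP can be decoded offline. The following is only a sketch: the flag positions are an assumption based on the `resetSource_t` values in HALCoGen-generated headers for this device family and should be checked against the TMS570LC43x TRM. The names `SYSESR_FLAGS` and `decode_sysesr` are hypothetical helpers, not part of any TI tool.

```python
# Assumed SYSESR flag positions (from HALCoGen's resetSource_t values as
# remembered; verify against the device TRM before relying on them).
SYSESR_FLAGS = {
    0x8000: "PORST",    # power-on reset
    0x4000: "OSCRST",   # oscillator failure reset
    0x2000: "WDRST",    # watchdog reset
    0x0020: "CPURST",   # CPU reset
    0x0010: "SWRST",    # software (system) reset
    0x0008: "EXTRST",   # external reset (nRST pin)
}


def decode_sysesr(value):
    """Return the names of all assumed reset flags set in a raw SYSESR value."""
    return [name for mask, name in SYSESR_FLAGS.items() if value & mask]


if __name__ == "__main__":
    # Example: a value with only the assumed software-reset flag set.
    print(decode_sysesr(0x0010))
```

If a power-on reset were occurring behind the scenes, the value read after the failure would be expected to show the power-on flag rather than only the software-reset flag.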

We noticed that after powering up the board we are able to access the debug APB bus (reading the debug ROM for instance), but it stops working after some time.

After power up the software is running (the EMIF has been configured, we can see the PC changing). After a system reset, the EMIF is not reinitialized.

We've checked all the power supplies to the processor and the OSCIN input. 

We've read most of the threads on the forum regarding this issue without finding a solution. Any help would be greatly appreciated.

In particular, is there a way to clear the flash using only the AHB-AP? 

It would seem that the debug APB is locked somehow. Any idea what could trigger this?

Best regards,

Mathieu 

 

  • Do you have an external watchdog or another chip asserting nPORRST?
  • We have a voltage supervisor. We checked it, it does not trigger a PORRST.
  • Have you been keeping CCS updated? Early on there was an issue with caching that could prevent erase, but honestly this doesn't sound like it.

    I'm not sure I've seen what you're describing before. How much time are you talking about with regard to 'cooling down', and how hot is the device actually getting? Are you working in some sort of environmental chamber, for example?

    Do you have the JTAG clock frequency set high? It needs to be << 10MHz.

    How long are you talking about when it comes to not being able to erase the flash?
    It shouldn't take long at all to erase the flash, maybe about 30 seconds. Are you actually not able to run that long without crashing?
  • We are currently using CCS 6.0.1.00040.
    The device is not really hot. We cool it down by blowing cold air onto it.
    We tried setting the JTAG clock as low as 0.5MHz without success.
    We never managed to start the flash erase procedure. In the few seconds we can connect to the target after power up, we can only read the processor internal registers. As soon as we try to dump the content of the SRAM memory using the debugger, we get the disconnect message.

    Mathieu
  • Mathieu

    So forget flash erase.

    Can you simply connect to the micro and keep it connected?
  • For a few tens of seconds in ambient conditions. It disconnects afterwards. An early disconnection can be triggered by trying to dump an area of memory.

  • It shouldn't disconnect like that without a power on reset.
    You said you probed power on reset and don't see any pulses? Maybe you can double check.
    The only other thought is that maybe you are triggering the VMON internally if the power supply is dipping. Has to be something like this.
  • We will double check. But after the connection fails, we cannot reconnect without power cycling the board. If there were a PORST, I would expect us to be able to reconnect straight away, right?
    Is there a register reporting VMON status?
  • Not really. If it's causing a power-on reset, the reset status registers get cleared on power-on reset.
    You could look at the nRST pin, though, to see if it is ever pulsed low during this sequence.

    I can't think of anything else, though, that would cause the sort of instability you are seeing, unless your software is doing something like turning the MCU clocks off, and even then it shouldn't happen if you just connect and halt the processor.

    What does the symbolization on the front of the part say, by the way (part number and the other numbers)?
    I want to make sure of the device you are using.
  • Here are the markings of the chip we are using:

    TMP570

    4357BZWTQQI

    YFB-59C0QHW

    I agree that once the processor is halted, it should be stable! 

    Mathieu

  • That looks ok.

    Do you think you could use the debugger to dump memory at 0xF0080160 for maybe 3-4 words...

    There were some misprogrammed TMP4357s that got out, and the symptom, I'm told, is that these may have non-working EMIFs, but maybe there is some other relation here. It would be good to know what the data is at that address and the couple of addresses around it.

    Also how many parts act like this? [just one, or many?]

    -Anthony
  • Hello, I have dumped the memory at 0xF0080160:

    There is only one part like that.

    Best regards,

    Mario

  • Hi Mario,

    That's interesting ... it's a partial match with the 63443; the next word would need to have a 23 or 24 in order for it to be the known issue. Still interesting that it's only one part behaving like this ..

    Will need to think this through and maybe it would be good to get the same dump from a few of the parts that are behaving correctly for you.
  • Hi Anthony,

    Here is the content of the memory at the same address on another board, which is working.

    I hope this helps to find the source of the error.

    Best regards,

    Mario

  • Hello Anthony,

    I did not receive any answer to my last message. Do you have more information about this issue?
    I now have a second board which shows similar behavior. While programming the flash with CCS I started getting some isolated errors, but after a retry the flash was programmed correctly. A few days later it was impossible to reprogram the flash memory, or even to erase it with CCS.
    It was also impossible to connect to the processor for several hours.

    The difference with this board is that a day later I tried to erase it with UniFlash, and it worked.
    Then I tried to program it with CCS, and it worked right after being erased by UniFlash, but once it has been written it is impossible to reprogram it again. Thus I am forced to erase with UniFlash and then program with CCS (not the most comfortable way to work).

    Here is the content of the memory at the address requested earlier:

    Start address 0xf0080160

    77169302 00063443 7A180004 00350018 22222A2A
    030221A1 02540200 0200F019 01000020 00FF0014
    07D00010 00000FA0 019501C0 0081071B 09040BFF
    E14CA104 FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    5CFCE33C 120F012C 46F20B21 2C0000FF 00120012
    0012FFFF 0002FFFF 0014FFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF 544D5035 37304C43 34333537
    425A5754 51513100 00000000 00000000 00000000
    05560522 0514FFFF 36733193 2550FFFF 05190546
    0000FFFF 35273562 0000FFFF 05380529 FFFFFFFF
    35953438 FFFFFFFF 05350525 FFFFFFFF 34593347
    FFFFFFFF 05670633 FFFFFFFF 39524605 FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    FFFFFFFF FFFFFFFF FFFFFFFF 0777012F 05CD00E9
    09A6018E FFFFFFFF 07BF012F 062300E9 09E1018E
    FFFFFFFF 076C012F 05E100E9 0987018E FFFFFFFF
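
One quick sanity check on a dump like this is to look for readable text in it. The sketch below treats each 32-bit word as four bytes, most significant byte first (an assumption about how the debugger displayed the words), and extracts runs of printable ASCII. The helper name `ascii_runs` is hypothetical; the input words are copied from the dump above.

```python
import re


def ascii_runs(words, min_len=4):
    """Interpret hex words as big-endian bytes and return printable ASCII runs."""
    raw = b"".join(int(w, 16).to_bytes(4, "big") for w in words)
    # Match runs of printable ASCII (space through tilde) of at least min_len bytes.
    pattern = rb"[ -~]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, raw)]


if __name__ == "__main__":
    # Eight consecutive words taken from the dump in this post.
    words = ("FFFFFFFF FFFFFFFF 544D5035 37304C43 "
             "34333537 425A5754 51513100 00000000").split()
    print(ascii_runs(words))  # ['TMP570LC4357BZWTQQ1']
```

The string recovered from these words reads as a part identifier and lines up with the package marking quoted earlier in the thread.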

    I hope you can help me clarify the reason for these failures. I have 3 other boards and I don't want them to break down one after the other, otherwise my development schedule will be impossible to keep.

    Mario.
  • Mario,

    These memory dumps are OK; they do not indicate any misprogrammed units.

    What exactly is the current state? Are you still seeing the 'disconnect' issue originally reported, and have just found that using UniFlash is a workaround? Or has the issue evolved into just the CCS vs. UniFlash one?

    If you're getting different results with CCS versus UniFlash, it makes sense to me to check the installed component versions, since they should be just two different GUI tools on top of the same components. Maybe one installation is out of date or configured differently.

    First I'd compare how the 'On Chip Flash' and 'Flash Settings' are configured between the two tools. (First screenshot)

    Then I'd check the "Installation Details" under the About menu and compare the versions of the critical components such as debug server flash, Hercules emulation, TI emulators, ... (Second screenshot).

    If CCS is showing out-of-date components, I'd try either updating it or reinstalling it.

    Good luck with this one.

    -Anthony

    Note: the flash settings below are not ideal for the TMS570LC4357; they are just what I am working with. The point is to compare your UniFlash settings, which let you erase, against your CCS settings, which do not.

  • Hi Anthony,

    No, the "disconnect" problem on the first board was impossible to solve. We were unable to erase or reprogram the flash with either UniFlash or CCS.

    The behaviour described in my previous message was observed on a different board. As I said, we manufactured 5 boards. The current status is: the first board has the "disconnect" problem; the second one has the flash reprogramming problem described above; the third board also started failing on Friday evening and now has the flash programming problem as well; the other two are working (we are keeping our fingers crossed).

    I will check the versions of both tools and also compare the configuration parameters. I will come back to you once I have this information. Thank you for your advice.

    Best regards,

    Mario

  • Hello Anthony,

    We are still fighting off this issue. 

    We solved the problem of the board behaving differently when programmed with UniFlash or CCS: updating to the latest version fixed it.

    However, we now have another board with a problem very similar to the one described in the original post. 

    We can connect to the DAP and use the system view to access the full chip memory space. We can then connect to the CPU, set the PC, step the processor, and inspect the registers, but as soon as we try to display the contents of the flash or the SRAM using the memory browser, the debugger disconnects from the CPU. We are not able to reconnect to the CPU without power cycling the board. A system reset does not help (performed by writing SYSECR and checked with SYSESR). We noticed that after the crash the debug APB bus is no longer accessible via the DAP (APB_view): we get only '?' characters in the memory browser, which would explain the debugger disconnection. We are also able to trigger the crash by reading the debug APB memory space using the DAP system view (reading at address 0xFFA0_0000).

    Our current train of thought is that when the debug APB bus is "stressed" by large accesses (either dumping part of memory using the CPU debugger or dumping the debug APB memory area using the DAP system view), the APB bus enters a deadlock and is no longer responsive. The weird thing is that this board worked fine for a few months before failing.

    Any suggestion/ideas would be greatly appreciated.

    Mathieu 

  • Have you looked at the routing of the JTAG signals? If the total signal length from the scan controller to the IC is long (>6") and not properly terminated, reflections on TCK or RTCK can cause this type of intermittent problem. Also, crosstalk from TDI, TMS or TDO onto TCK or RTCK can have the same effect.
  • Hello Bob,

    We thought about that, but since we are able to operate the DAP flawlessly in system view (excluding the debug APB memory area), we ruled it out. I would also expect to see the same problem on our other boards.
    We also tried operating the JTAG at 1MHz instead of 10MHz and didn't see a difference.

    Mathieu
  • OK, but operating slower does not solve the problem of reflections or crosstalk generating a glitch on the TCK/RTCK lines. The fact that you can access the DAP continuously without issue, though, does rule out a JTAG hardware problem. It does seem like the APB bus is hung. I will pass this thread around to some additional colleagues here at TI.
  • Hello Bob,

    Any update on this point?

    Mathieu
  • No, Thursday and Friday are holidays at TI here in the US.
  • Hello,

    We continued our investigation of this problem and can report the following observations:

    • When performing a CPURST by writing 0x1 to CPURSTCR, the system enters a state in which we are able to set and read the core registers using the debugger, but no other registers, as shown in the attached screenshot.
    • When the CPU is running (looping in the internal SRAM), we are able to display the memory contents of the debug APB via the DAP system view without crashing the system.
    • Using the expression analysis window to dump part of the internal memory does not crash the system, but using the memory browser does.

    It looks like our system is no longer tolerant to "fast" debug actions... Could it be that when using the "stall" function of the debug interface, the APB bus ends up in a deadlock? 

    Mathieu

  • I am anxious to understand this problem. What do you mean by "stall" function of the debug interface?
  • Glad I got your interest piqued! :)
    I am referring to the mechanism described in the Cortex-R5 TRM (ARM DDI 0460D), section 12.4.4. To speed up data transfers between the CPU and the debugger, the CPU debug interface can be configured to operate in "stall" mode, in which the APB bus is stalled until data is available in the debug RX (or TX) register. My idea is that something went wrong in the CPU during a debug operation in "stall" mode, leaving the APB bus "hung" on the last transfer.
    It's just an idea....
  • Interesting. Do you change the DBGDSCR register in your code? When I read this register in one of my projects (using CCS v6.1 and an XDS110 scan controller), the ExtDCCmode bits were still at the default, 00b, non-blocking mode.
  • We don't change the DBGDSCR in our code (in fact, we were mostly unaware of these mechanisms before starting the investigation).

    From what I gather from the documentation, I expect that the debugger only enables the "fast" or "stall" mode for brief periods of time during a memory analysis command.
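
To check which mode is active at a given moment, a raw value read from the debug status and control register (DBGDSCR in ARM's naming) can be decoded as below. This is a sketch that assumes the ExtDCCmode field sits in bits [21:20] with the ARMv7 debug encoding (00b non-blocking, 01b stall, 10b fast); the bit position should be confirmed in the Cortex-R5 TRM. The name `ext_dcc_mode` is a hypothetical helper.

```python
# Assumed ARMv7 debug encoding of DBGDSCR.ExtDCCmode, bits [21:20].
EXT_DCC_MODES = {
    0b00: "non-blocking",
    0b01: "stall",
    0b10: "fast",
    0b11: "reserved",
}


def ext_dcc_mode(dbgdscr):
    """Return the DTR access mode encoded in a raw 32-bit DBGDSCR value."""
    return EXT_DCC_MODES[(dbgdscr >> 20) & 0b11]


if __name__ == "__main__":
    # A value with bits [21:20] = 01b decodes as stall mode.
    print(ext_dcc_mode(0x0010_0000))
```

Reading the register through the DAP right before and right after a memory browser access would show whether the debugger leaves the interface in stall or fast mode when the bus hangs, although reading the debug APB space may itself trigger the failure described above.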