TM4C1290NCPDT: Supporting trace on the TM4C129

Part Number: TM4C1290NCPDT
Other Parts Discussed in Thread: EK-TM4C123GXL, SEGGER

I am working on updating a TM4C123 design to a TM4C129 and trying to make as many improvements as possible in the process.  To date we have used the ICDI debugger from an EK-TM4C123GXL to load software and debug (single-step, set breakpoints, view memory, etc.).  We use a Tag-Connect cable with a customized pinout which includes the UART0 signals, reset, and power for the target.  It has served us well, but I wonder what we are leaving on the table with regard to debugger speed and trace capabilities.  So I am trying to figure out whether we should switch to or add a different debug connector and which debugger hardware would let us take advantage of the new capabilities.

I have been looking at the XDS110 and XDS200.  It looks like they use the 0.05” pitch 20-pin TI CTI connector and pinout, with adapters for other connectors.  Most of the examples I have seen for TM4C129 boards seem to use the 0.05” pitch 10-pin Cortex debug connector, which omits a number of the signals.  What would I give up by using the smaller connector with either of those debuggers?  It looks like you can access the Embedded Trace Macrocell (ETM) with any XDS debugger (though not with ICDI).  [Correction: The XDS110 can access the ETB, not ETM, on MCUs which have one; the TM4C129 does not.]  Does the ETM give backtrace capabilities (after a crash) with CCS? 

Does the XDS200 provide significant advantages over the XDS110 when used with the TM4C129?  I see that it has active JTAG clocking while the XDS110 does not, but I don’t think the TM4C129 supports that anyway.  Is the XDS200 faster even without active JTAG clocking?  Is that a big advantage when loading software, or is the load time already dominated by the time it takes to program the MCU’s flash after it is transferred?  Are there functional advantages of the XDS200?

If we took the leap to the XDS560v2, would the 20-pin connector support all of the additional features it would provide, or does that require a 60-pin connector?  Or maybe the TM4C129 doesn’t support those features at all?

 Thanks,

Steve

  • Hello Steve,

    I'm going to answer this a little backwards because I don't know all the answers here, but for starters:

    If we took the leap to the XDS560v2, would the 20-pin connector support all of the additional features it would provide, or does that require a 60-pin connector?  Or maybe the TM4C129 doesn’t support those features at all?

    There are no added benefits provided by using a larger than 10-pin header. The TM4C1290NCPDT does not have additional JTAG connections to plug into with more advanced headers.

    Our JTAG User's Guide walks through the header pin outs and TM4C12x connections in Sections 3.1 and 3.2 and you'll see that all the TM4C12x connections are accounted for with the 10-pin header: https://www.ti.com/lit/an/spma075/spma075.pdf?ts=1636579798982

    There is a note about EMU0/EMU1 being dependent on target device and I am not sure why it is mentioned like that as those pins are not used for TM4C devices.

    Most of the examples I have seen for TM4C129 boards seem to use the 0.05” pitch 10-pin Cortex debug connector, which omits a number of the signals.  What would I give up by using the smaller connector with either of those debuggers?  It looks like you can access the Embedded Trace Macrocell (ETM) with any XDS debugger (though not with ICDI).

    As alluded to above, no functionality would be lost.

    I see that it has active JTAG clocking while the XDS110 does not, but I don’t think the TM4C129 supports that anyway.  Is the XDS200 faster even without active JTAG clocking?  Is that a big advantage when loading software, or is the load time already dominated by the time it takes to program the MCU’s flash after it is transferred

    I believe the flash programming is the main bottleneck, but the SW Tools team can comment if I am wrong.

    Does the ETM give backtrace capabilities (after a crash) with CCS? 
    Does the XDS200 provide significant advantages over the XDS110 when used with the TM4C129?
    Are there functional advantages of the XDS200?

    I will need to defer to our SW Tools experts for these questions. I will ask them to comment on this thread.

    Best Regards,

    Ralph Jacobi

  • It looks like you can access the Embedded Trace Macrocell (ETM) with any XDS debugger (though not with ICDI)

    I think the only XDS debugger which supports ETM is the XDS560v2 PRO TRACE Receiver & Debug Probe. However, looking at the product page Cortex-M devices are *not* listed in the supported devices for ARM core pin trace (ETM).

    The XDS110 and XDS200 do support SWO Trace for Cortex-M devices. SWO Trace is low-bandwidth trace compared to ETM, but does allow tracing without having to instrument the code.

    SEGGER has the J-Trace PRO Cortex-M, but according to the "Full J-Link/J-Trace Support" page, CCS doesn't support SWO or ETM trace with a J-Trace.

    Trace Analyzer User’s Guide has some information about trace in CCS, albeit it was last updated in 2014.

  • J-Trace that Chester mentioned is a good option for using ETM.   There is a version called J-Trace Pro Cortex M.  You would not be able to display the trace data inside CCS but SEGGER provides a tool that can be used for that called Ozone.

    You would need to put the appropriate header on your board.  Typically the Arm Cortex 20pin header would be what you would use.  Basically, it is the Arm Cortex 10pin header with extra pins for trace.  You could connect an XDS110 to the debug side of the header and do the majority of your debugging in CCS.  Then when you want to do trace you would disconnect the XDS110 and connect your Segger J-Trace.

    For TM4C I would not bother with an XDS200; the XDS110 is good enough.

    Regards,

    John

  • Thanks for the follow up, Ralph.  I'm still trying to sort through things.

    Ralph Jacobi said:

    There are no added benefits provided by using a larger than 10-pin header. The TM4C1290NCPDT does not have additional JTAG connections to plug into with more advanced headers.

    I agree that there are no more JTAG connections, but for trace it might be a different story...

    Ralph Jacobi said:

    There is a note about EMU0/EMU1 being dependent on target device and I am not sure why it is mentioned like that as those pins are not used for TM4C devices.

    System Design Guidelines for the TM4C129x Family of Tiva C Series Microcontrollers (SPMA056) says:

    The TM4C129x family of microcontrollers includes ARM's Embedded Trace Macrocell (ETM) for instruction trace capture. Trace data is output on pins TRD0-3 and clocked with TRCLK.

    Tiva TM4C1290NCPDT Microcontroller DATA SHEET (DS-TM4C1290NCPDT-15863.2743 SPMS429B) Table 25-3 shows that pin 44 (PF2) can be assigned to function TRD0, with the description "Trace Data 0".  TRD1, TRD2 and TRD3 are there as well.

    So perhaps the trace data is available, but on pins named TRD0-3 rather than EMU...?  

  • Thanks, Chester.  I conflated the ETM and an ETB and ended up thinking that the TM4C129 had both.  It appears to me now that it does have an ETM but does not have an ETB.  So the info about the XDS110 being able to access the ETB isn't relevant.  I'm thinking that the ETM is what puts trace info on the TRD0-3 pins (see my response to Ralph Jacobi).

    The TM4C1290NCPDT does support "Serial Wire Trace" (SWO), through the ATB (ARM trace bus) interface.  It appears to use a subset of the same pins as JTAG, so it should be supported using the 10-pin connector.  As you noted, it should be readable using an XDS110.  It looks like the XDS debugger creates a virtual COM port which you then tell your debug software to read from (per the Trace Analyzer User’s Guide, SPRUHM7B).  That guide suggests that SWO trace supports four configurations in CCS:

    • Statistical function profiling
    • Data variable tracing
    • Interrupt profiling
    • Custom core trace

    It says that the PC trace configuration, which it suggests helps with figuring out what happened leading up to a crash, is not supported.  But in section 3.3.12 it talks about the custom core trace configuration which is supported.  It isn't clear to me how it is different (other than supporting only a single core).  Does it also provide some sort of backtrace capability?  That is what I am most interested in.  If SWO trace will do it (even if with a limited buffer size and slow speed), the 10-pin connector and XDS110 will be enough for what I care about.  Otherwise, I think I'll keep pursuing ETM support as a way to get there.

  • Thanks, JohnS, for the info on J-Trace and Ozone.  The price for the hardware is pretty steep, but I've burned that much in time searching for a bad bug, so it probably makes sense.  At least Ozone is included free.

    Regarding connectors, for space reasons I would like to stay with a 0.05" pitch.  The J-Trace PRO comes with a Cortex-M Trace Reference Board which appears to have such a connector (https://www.segger.com/products/debug-probes/j-trace/).  I presume its pinout is the one shown here: https://www.segger.com/products/debug-probes/j-link/accessories/adapters/19-pin-cortex-m-adapter/.  But that pinout doesn't show the extra (besides JTAG/SWD) pins I expected to be needed for full hardware trace.  I must be missing something.

    I found a similar pinout in Figure 26 of https://www.ti.com/lit/an/spma056/spma056.pdf, and it appears that TI suggests connecting the trace signals on pins 14, 16, 18 and 20.  Are they not used by the J-Trace debugger?  Maybe the adapter I found just isn't connecting them, but the debugger's included 0.05" pitch cable does.

  • That adapter looks like it is for connecting an old style ARM20pin to the newer Cortex 20/19 header.  The ARM20 does not support trace so those pins would not be connected.

    The J-Trace probe has both Cortex 20 and ARM20 connectors and cables.

    In the picture above you can see the small cortex20 (0.05 pitch) connector on the left and then the old style ARM20 (0.1 pitch) on the right.  You don't want to be using an adapter.  You want to put the 0.05 Cortex20 header on your board and then connect it to the J-Trace with the supplied cable.  No need for an adapter.

    Regards,

    John

  • Hi Steve,

    I was looking through the JTAG section, and for some reason the trace pins aren't even mentioned there. You have to find them in the pinout like you did, which is why I did not realize that.

    I did some more digging and this prior post from Bob Crosby gives some detail about what to expect with these pins: https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/642010/tm4c123gh6pm-about-the-embedded-trace-macrocell-and-jtag-swd/2370532#2370532

    I don't really have more details myself as I've never tried to use that interface and it's not really well documented. Since it is ETM Trace macro-cell perhaps Arm would have more documentation on the use cases.

    Best Regards,

    Ralph Jacobi

  • Thanks again, guys, for your help figuring all of this out. I'm going to try to summarize things based on what you have written and what I have been able to dig out of the various docs. Please correct me if I get it wrong. I left a few questions, but I may have to wait and test to get answers. If anyone out there has the ability to update documentation, this seems like a good place to start :)

    • It seems that the 10-pin 0.05" pitch Cortex Debug Connector is very commonly used and handles everything except ETM.
    • The 20-pin 0.05" pitch Cortex + ETM connector adds the connections needed for full external trace, which can be utilized with a J-Trace or maybe XDS560v2 PRO TRACE (check whether it is supported). You can plug into just half of it with a 10-pin cable. It has lots of ground connections arranged so the ribbon cable will have ground between the high speed signals, which probably helps reduce crosstalk and EMI.
    • Both of those connectors (and only those) are documented in "System Design Guidelines for the TM4C129x Family of Tiva C Series Microcontrollers" (SPMA056) from 2013. Perhaps TI should have stopped there...
    • "Using TM4C12x Devices Over JTAG Interface" (SPMA075) shows JTAG connections for four different debug connectors including the 10-pin Cortex Debug Connector. But even though it was updated later (2016) it doesn't include the Cortex + ETM connector, which is arguably the next most useful. And on the two 20-pin connectors it does include, it omits the trace connections, which may lead to someone using one of those connectors but failing to realize the potential benefits (trace) enabled by the extra pins (the omission does clarify which pins are essential for _JTAG_ specifically, which might be their point).
    • The XDS110 debugger natively uses the CompactTI 20-pin connector (CTI-20), which uses a different pinout but the same physical connector (keyed differently) as the Cortex + ETM connector. The XDS110 includes adapters for both of the Cortex connectors. The CTI-20 pinout includes four GPIO pins, but no pins for trace, which makes some sense on the XDS110 since it supports GPIO and not trace. Perhaps it would make sense to connect those GPIO lines to the pins on the TM4C129 which do support trace so you could make an adapter and use a J-Trace if you ever wanted to (the included adapter would not work for that, per Table 4 in sprui94). Or just use the Cortex + ETM connector instead and get compatibility with the 10-pin connector as a bonus.

    Regarding a UART connection to the target:

    • The ICDI debuggers built into the TM4C123 and TM4C129 Launchpads support a UART connection. The target MCU's UART0 is connected to the debug MCU which provides access to it using a USB-based virtual COM port. In my experience, this UART is much faster (at 115200 baud) than the default method which printf() uses to transfer console I/O. And it is possible to set it up to be interrupt driven and use a large SRAM buffer, in which case it can be used while causing only minimal delays in program execution (search for "UART_BUFFERED" in uartstdio.h). I have used it extensively and found it surprising that UART connections aren't included on every debug connector.
    • SPMA056 mentions that "On some Tiva C microcontroller development kits, the 2x10 0.05 in pitch connector is used, however PA1 (U0TX) is connected to pin 14 (TRD0) and PA0 (U0RX) is connected to pin 16 (TRD1) in order to provide a debug UART interface to TI's on-board ICDI." I think the result is that the first 10 pins are compatible with the Cortex Debug Connector, and the second half adds UART0 connections instead of the trace signals, making it incompatible with the standard Cortex + ETM connector (a couple of jumpers could make it switchable). I don't know of an external debugger which uses that pinout, but in theory it would provide debug+UART.
    • The XDS110 does not have UART connections on its native 20-pin connector, but it does on a separate 12-pin connector (along with the four GPIO signals and some other stuff). The example I saw using it utilized jumper wires.
    • It seems to me that two pins could be added to the Cortex headers (in positions 0 and -1) with the UART signals. Future debuggers could utilize 12 or 22 pin cables rather than 10 or 20 and connect to them, while existing debuggers could still connect normally. The key would still work to prevent incorrect connections. I'm going to arbitrarily suggest that pin 0 be UART data transmitted by the target MCU to the host and pin -1 be from the host to the target MCU.

    There are apparently several other alternatives to using UART0 for console and debug messages. Some info about these came from https://mcuoneclipse.com/2016/10/17/tutorial-using-single-wire-output-swo-with-arm-cortex-m-and-eclipse/ and the pages it links to. I haven't used any of them:

    • If the target supports USB and your software works enough to set it up, it may be possible to send console/debug messages that way.
    • If using a SEGGER J-Link (or presumably J-Trace), you can do bi-directional I/O using their Real Time Transfer (RTT) feature.
    • If using SWD rather than JTAG for debugging, debug messages can be output through SWO. This should work with the TM4C129 and XDS110. If this works well enough, I probably won't care much about having a UART connection, although this method is output-only (OK for debug messages, not for a command-line interface). I think it might support outputting different kinds of messages on separate "channels" which would be handy. Can anyone speak to how this method compares with using UART0, specifically regarding transfer speed and the ability to make it run without blocking (interrupt driven output from a buffer)? Looking at the implementation of function port_wait() here, https://software-dl.ti.com/ccs/esd/documents/xdsdebugprobes/emu_swo_trace.html, it seems clear that it could be polled. But does it have a hardware FIFO? Is there an interrupt which can fire when the FIFO empties (or the current transfer finishes)? I can't seem to find much in the MCU data sheet about the ITM (Instrumentation Trace Macrocell) hardware which issues the logging messages. Is there another reference document for it somewhere? Has someone written a modified printf (similar in concept to UARTprintf for the UART) which utilizes it? In the sample code, the format for calling function ITM_put_string is rather ugly, but perhaps it would be straightforward to wrap in something nicer.
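    For readers wondering what a "nicer wrapper" around polled ITM output might look like, here is a minimal sketch. It is not the TI sample code: the names itm_port_t, itm_putc and itm_puts are hypothetical, and the register pointers are passed in so the logic can also be exercised on a host. The three addresses are the architectural Cortex-M3/M4 ITM locations. The sketch assumes that the debug probe (or startup code) has already enabled the ITM and the chosen stimulus port, and that a stimulus port reads back as nonzero when it can accept data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural ITM register addresses on Cortex-M3/M4 (ARMv7-M). */
#define ITM_STIM0_ADDR 0xE0000000u /* stimulus port 0; port n is at +4*n */
#define ITM_TER_ADDR   0xE0000E00u /* trace enable register, one bit per port */
#define ITM_TCR_ADDR   0xE0000E80u /* trace control register, bit 0 = ITMENA */

/* Hypothetical handle: pointers are injected so the code can run off-target. */
typedef struct {
    volatile uint32_t *stim; /* stimulus port register for this channel */
    volatile uint32_t *ter;  /* trace enable register */
    volatile uint32_t *tcr;  /* trace control register */
    unsigned channel;        /* stimulus port number, 0..31 */
} itm_port_t;

/* Send one character by polling. Returns 1 if written, 0 if dropped. */
int itm_putc(const itm_port_t *p, char c, uint32_t max_spins)
{
    assert(p != NULL && p->channel < 32u);

    /* Drop the character if the ITM or this port is not enabled,
     * for example when no debugger is attached. */
    if ((*p->tcr & 1u) == 0u || (*p->ter & (1u << p->channel)) == 0u)
        return 0;

    /* Assumption: the port reads as zero while it cannot accept data.
     * Give up after max_spins polls instead of blocking forever. */
    while (*p->stim == 0u) {
        if (max_spins-- == 0u)
            return 0;
    }

    /* A byte-wide write sends a single payload byte. */
    *(volatile uint8_t *)p->stim = (uint8_t)c;
    return 1;
}

/* Send a NUL-terminated string. Returns the number of characters sent. */
size_t itm_puts(const itm_port_t *p, const char *s, uint32_t max_spins)
{
    size_t sent = 0;
    while (*s != '\0' && itm_putc(p, *s, max_spins)) {
        ++s;
        ++sent;
    }
    return sent;
}
```

    On a target, the handle would presumably be filled in with (volatile uint32_t *)ITM_STIM0_ADDR, (volatile uint32_t *)ITM_TER_ADDR and (volatile uint32_t *)ITM_TCR_ADDR for channel 0. Using a different channel number per message type is one way to get the separate "channels" mentioned above.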

    So it seems to me that the primary reason one might choose the CTI-20 connector is that you don't need an adapter to use it with the XDS110 (and similar) debuggers. But most folks use the 10-pin Cortex Debug Connector, and if you want trace, you can use the 20-pin Cortex + ETM connector and either plug into just half of it to get the same debug interface as the 10-pin version, or you can plug into the whole thing for debug + trace (with a debugger which supports both). None of those (except a non-standard modification to the Cortex + ETM pinout) support a UART connection for debug/console messages, but there are other options for that (separate wiring or several options using the debug hardware interface).

    Whatever trace features are available with SWO, which is completely separate from the external hardware-based trace mentioned above, can be accessed using any of the debugger connectors. You have to use SWD rather than JTAG mode and a debugger which supports SWO trace (such as the XDS110).

    Did I get all of that right? Did I miss anything important? Is anyone using SWO trace to output debug messages? Does anyone know if the "custom core trace" configuration in CCS which is supported through SWO supports backtrace after a crash, either with CCS or some other debugger software?

    My current plan is to implement the 20-pin Cortex + ETM connector (perhaps with added UART pins) and use an XDS110 with an adapter. Since the SWD/SWO signals are relatively slow, I think the adapter will be OK. I'll try SWO trace and see what it gives me. If I ever need hardware-based trace, I should be able to get a J-Trace and plug it in directly. Does that make sense? Would I be better off with a low-end J-Link than the XDS110 for debugging, either because the hardware is different or because of the associated software capabilities?

    Thanks again for your help.

  • Steve,

    The XDS560v2 PRO TRACE does not support the ETM on TM4C. 

    My personal opinion would be to start with the XDS110.  It is cheap.  With your plan for the 20pin cortex you can just connect the 110 to the debug half of the header.  The 110 comes with a cTI20 to Cortex 10 adapter.  I use those all the time.  Then if the need arises you have the option of purchasing the SEGGER J-Trace Cortex M and connecting it to the full 20pin header.

    As you found the standalone XDS110 has a little breakout board that has pins that you can connect a UART to.  So you could just expose a couple pins on your board and then connect wires to the breakout board and not have to worry about messing with pins on the standard Cortex 20.

    Regards

    John

  • Can anyone speak to how this method compares with using UART0, specifically regarding transfer speed and the ability to make it run without blocking (interrupt driven output from a buffer)? Looking at the implementation of function port_wait() here

    The thread "CCS/MSP-EXP432E401Y: Statistical Function Profiling using a XDS110 causes CCS to hang if try any select a Sampling Interval of 832 cycles (or lower)" has some information about the SWO baud rates seen in previous tests. With an XDS110 and a TM4C129 device:

    • Under Windows the SWO baud rate was set to 15 Mbaud
    • Under Linux the SWO baud rate was set to 4 Mbaud

    Not sure if anything has changed since that previous investigation to increase the available baud rate of the XDS110.

    The referenced thread also highlighted that the XDS110 had higher SWO trace performance than an XDS200.
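    To put those baud rates next to the 115200 baud UART0 link discussed earlier, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: SWO is running in UART (NRZ) mode with 10 line bits per byte, each ITM stimulus write is sent as a 1-byte header followed by a 1, 2 or 4 byte payload, and timestamps, sync packets and any probe-side limits are ignored. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Characters per second over a plain 8N1 UART: 10 line bits per character. */
uint32_t uart_chars_per_sec(uint32_t baud)
{
    return baud / 10u;
}

/*
 * Approximate characters per second over SWO in UART (NRZ) mode.
 * payload_bytes is the size of each stimulus port write (1, 2 or 4).
 * Assumption: every write costs one header byte plus the payload,
 * and every byte on the wire costs 10 bit times.
 */
uint32_t swo_chars_per_sec(uint32_t baud, uint32_t payload_bytes)
{
    assert(payload_bytes == 1u || payload_bytes == 2u || payload_bytes == 4u);
    uint32_t packet_bits = (1u + payload_bytes) * 10u;
    return (uint32_t)(((uint64_t)baud * payload_bytes) / packet_bits);
}
```

    Under those assumptions, byte-wide writes at 15 Mbaud give about 750,000 characters per second and 4 Mbaud gives about 200,000, compared with 11,520 for the 115200 baud UART. Packing four characters into each 32-bit write would raise the 15 Mbaud figure to about 1,200,000.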

    As for whether SWO can be interrupt driven for sending messages, according to "Cortex M3 ITM trace" on the ARM forum the answer is no.

  • Thanks, Chester, for following up on that.  It is kind of a shame that there isn't an interrupt for it.  Perhaps 15Mbaud is fast enough for me to not care that it blocks a bit.  And when something crashes, it is nice if all of the messages which have been printed have actually been output and aren't just sitting in a buffer waiting to be sent.  If it is really important for it to be non-blocking, I suppose a timer interrupt could pull from a buffer and feed the ITM; it might just have to fire really often (assuming there is little/no hardware fifo).  If it fires often enough, it will essentially become blocking (except to higher-priority interrupts).  Fun!
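    The timer-driven idea above could be sketched roughly as follows. This is an illustration only, with hypothetical names (log_ring_t, log_write, log_drain, capture_sink): the application queues bytes into a ring buffer without blocking, and a periodic tick drains a bounded number of bytes into whatever sink is available. It assumes a single producer and a single consumer and a power-of-two buffer size. The capture_sink function is a host-side stand-in for the ITM so the logic can be tried off-target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_BUF_SIZE 256u /* must be a power of two */
_Static_assert((LOG_BUF_SIZE & (LOG_BUF_SIZE - 1u)) == 0u,
               "LOG_BUF_SIZE must be a power of two");

typedef struct {
    uint8_t data[LOG_BUF_SIZE];
    volatile uint32_t head; /* advanced only by the producer */
    volatile uint32_t tail; /* advanced only by the consumer */
} log_ring_t;

/* Hypothetical sink: returns true if the byte was accepted. */
typedef bool (*log_sink_fn)(uint8_t byte, void *ctx);

/* Producer: queue as many bytes as fit and never block.
 * Returns the number of bytes queued; the rest are dropped. */
size_t log_write(log_ring_t *r, const char *s, size_t len)
{
    assert(r != NULL && s != NULL);
    size_t n = 0;
    while (n < len && (uint32_t)(r->head - r->tail) < LOG_BUF_SIZE) {
        r->data[r->head & (LOG_BUF_SIZE - 1u)] = (uint8_t)s[n++];
        r->head = r->head + 1u;
    }
    return n;
}

/* Consumer: call from a periodic tick. Sends at most 'budget' bytes and
 * stops early if the sink is busy. Returns the number of bytes sent. */
size_t log_drain(log_ring_t *r, log_sink_fn sink, void *ctx, size_t budget)
{
    assert(r != NULL && sink != NULL);
    size_t n = 0;
    while (n < budget && r->tail != r->head) {
        if (!sink(r->data[r->tail & (LOG_BUF_SIZE - 1u)], ctx))
            break;
        r->tail = r->tail + 1u;
        n++;
    }
    return n;
}

/* Host-side stand-in for the ITM: appends to a capture buffer until full. */
typedef struct {
    uint8_t out[64];
    size_t len;
} capture_t;

bool capture_sink(uint8_t byte, void *ctx)
{
    capture_t *c = (capture_t *)ctx;
    if (c->len >= sizeof c->out)
        return false;
    c->out[c->len++] = byte;
    return true;
}
```

    The budget parameter is the knob for the trade-off described above: a small budget keeps each tick short but needs frequent ticks, while a large budget behaves more like blocking output. Anything still queued at the moment of a crash is lost unless a fault handler drains the buffer.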

  • Thanks, JohnS.  Going that way leaves my options open; I can spend more money for trace if/when it is justified.  

    Since I last wrote, I found a bunch of YouTube videos which pretty much confirm the things we figured out earlier in this thread.  They were more straightforward than all of the written documentation I found.  The videos from Lauterbach were especially good (I'll provide a link in another message).  I think TI would do well to update their existing documentation (not just create something additional) to clarify things.  A table showing what is and isn't supported by various debuggers and connection types would be a good starting point.  At a minimum, it should include info like this:

    • ICDI debuggers (like the discontinued XDS100 and the debugger built into the TM4C12x Launchpads):  Very slow.  Support JTAG debugging, but not SWD and therefore not SWO trace.  Requires at least the four JTAG pins, which are most commonly made accessible using the 10-pin 0.05" Cortex debug connector (but see the info below on the 20-pin Cortex+ETM connector).
    • SWD debuggers (like the XDS110 and XDS200).  These also support JTAG debugging, but are commonly used in SWD mode which provides similar debugging functionality using only two of the four pins that JTAG uses.  Either of the connectors mentioned above can support both JTAG and SWD.
      • Some Cortex-M3/M4 MCUs (but not the TM4C12x) provide an ETB (embedded trace buffer) which stores (a limited amount of) instruction trace information in the MCU's internal memory.  It can be used for things like tracing back from a crash to find which code path was taken immediately before the crash.  The ETB info can be read using the SWD pins, so no additional hardware is necessary.  Do not confuse the ETB with the ETM (more on that below).
    • The TM4C129 supports SWO trace, which provides a number of useful features (unfortunately not including instruction trace).  It is available only when using SWD (not JTAG) mode for debugging because it repurposes one of the JTAG signals which is not needed for SWD to be the SWO output.  So if you wire the debug connector for JTAG, you can use SWD and SWO.  The XDS110 and XDS200 both support SWO; the XDS100 and the TM4C12x Launchpad's ICDI debuggers do not.  For more info about what you can do with SWO trace, see https://www.youtube.com/watch?v=HG_i_uln6Es
    • The TM4C129 does support instruction trace using five additional signals from the MCU's Embedded Trace Macrocell (ETM), namely TRD0-3 and TRCLK.  There is no room for them on the 10-pin Cortex debug connector, but there is on the 20-pin Cortex+ETM connector.  See SPMA056 for information about wiring either connector.  Conveniently, half of the 20-pin connector is exactly the same as the 10-pin connector, so you can plug a 10-pin debug connector into one side of the 20-pin header if you aren't using the ETM trace.

    If using the 10-pin connector (or half of the 20-pin), the XDS110 debugger is probably the best choice from TI, perhaps even preferable to the (more expensive) XDS200.  A SEGGER J-Link debugger is another popular choice.  Pros and cons...???

    If using the 20-pin Cortex+ETM connector, the above debugger options will work for debugging but will not provide instruction trace. 

    • TI apparently doesn't sell a debugger which will do trace, so you have to look to a third party for that.  One advantage of using a third-party debugger is that it may work with other brands of Cortex-M MCUs as well.  Some companies also seem to sell debuggers which work on both Cortex-M and other (perhaps multi-core) MCUs.
    • The SEGGER J-Trace Pro Cortex M will provide trace and is seemingly very common.  While basic debugging should work in CCS, to take advantage of trace you may need to use SEGGER's Ozone software (no cost for that if you have the debugger).
    • Keil similarly makes a debugger specifically for Cortex-M MCUs (the ULINK-Pro).  They also make a toolchain (IDE, compiler, etc.) which is an alternative to CCS.  I don't know if CCS works with their debugger.
    • Lauterbach makes the uTrace debugger for Cortex-M MCUs.  It works with their TRACE32 software.  I don't know if it works with CCS.
    • There are probably others.  Sorry if I missed you.

    Note that the XDS110 uses a 20-pin connector which has a different pinout than the Cortex+ETM connector.  It does not generally include the ETM signals, so to support trace you should use one of the Cortex connectors and pinouts instead.  The XDS110 does come with adapters which will work with either of the Cortex connectors.

    It would be nice to have a speed comparison of all of the debugger options, but I don't have that info.

  • For anyone trying to dig deeper, retrace my steps, or update the docs (please!), these URLs lead to most of the TI info I found:

    These were referenced by others in this thread:

    And I just recently found these:

  • Does anyone know if the "custom core trace" configuration in CCS which is supported through SWO supports backtrace after a crash, either with CCS or some other debugger software?

    The limitation of the Cortex-M ITM, which outputs SWO messages, is that the only automated PC sampling operation is "statistical" sampling, where the current PC is sampled at a regular interval. This is suitable for profiling to see the percentage of time spent in different functions, but it is not sufficient to determine a backtrace because it may miss some branches.

    I found Cortex-M Trace Training and ARM-ETM Training from Lauterbach with more information on tracing which explains this.
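    To illustrate the difference, here is a small sketch of what a host tool can do with periodic PC samples: bin each sample into the function whose address range contains it and count the hits. The names (func_range_t, find_function, build_profile) are hypothetical, and this is not how CCS implements its profiler. The result is a histogram of where time was spent. Nothing in it records which branches were taken between two samples, which is why it cannot reconstruct the path that led to a crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a (hypothetical) symbol table. */
typedef struct {
    const char *name;
    uint32_t start; /* first address of the function */
    uint32_t end;   /* one past the last address of the function */
} func_range_t;

/* Return the index of the function containing pc, or -1 if none matches. */
int find_function(const func_range_t *funcs, size_t nfuncs, uint32_t pc)
{
    assert(funcs != NULL);
    for (size_t i = 0; i < nfuncs; i++) {
        if (pc >= funcs[i].start && pc < funcs[i].end)
            return (int)i;
    }
    return -1;
}

/* Count how many PC samples fall inside each function.
 * hits must have room for nfuncs counters.
 * Returns the number of samples that matched some function. */
size_t build_profile(const func_range_t *funcs, size_t nfuncs,
                     const uint32_t *samples, size_t nsamples,
                     uint32_t *hits)
{
    assert(samples != NULL && hits != NULL);
    size_t matched = 0;
    for (size_t i = 0; i < nfuncs; i++)
        hits[i] = 0;
    for (size_t i = 0; i < nsamples; i++) {
        int f = find_function(funcs, nfuncs, samples[i]);
        if (f >= 0) {
            hits[f]++;
            matched++;
        }
    }
    return matched;
}
```

    With five samples of which three land in one function, the tool can report "60% of samples in that function", but the order of calls and the branches between samples are simply not in the data. Instruction trace from an ETM or ETB is what fills in that gap.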

  • Thanks.  It all makes sense once you sort through it, but it is rather confusing until then:

    • If the MCU has an ETB, it can output instruction trace info with just the two SWD pins.  But the TM4C129 doesn't have an ETB.
    • Using the SWO pin, you can do "SWO Trace", which can sample the PC but only periodically so it isn't useful for instruction trace.
    • If the MCU has an ETM, it can drive an additional clock line and four trace lines with info which allow an external debugger to do instruction trace.  But TI's debuggers don't support it, the 20-pin TI CTI connector which TI's XDS110 debugger uses doesn't support those signals, and the 60-pin connector used by TI's debuggers which do support trace (for other MCUs) isn't commonly supported by the third-party debuggers which do support trace on the TM4C129.

    Thanks for helping me figure it all out!

    Steve