CCS/DRA750: DSP crashes if trying to start Hardware Trace Analyzer

Part Number: DRA750

Tool/software: Code Composer Studio

A customer has problems using the Hardware Trace Analyzer to collect function profiling information for DSP1 of the DRA75x. The customer's full application is running on the target system when CCS v8.3.0 connects to DSP1. The CCS target configuration was set up for the DRA75x_DRA74x, and the DSP1 initialization GEL file was removed. When the Hardware Trace Analyzer is then started, CCS reports the failure shown below. The error message pops up only some of the time.

In any case the DSP crashed afterwards. Console prints of GEL functions and the final error messages are captured in this file:

PCtrace_failure.zip

I could reproduce this problem with my CCS installations (version 8.3.0 as well as 9.0.1) on the customer HW, but everything works fine for me on the J6EVM using GEL files for the board initialization. My current assumption is that some initialization may be missing on the customer board that's required for starting the Hardware Trace Analyzer.

Can anybody clarify what may cause this 'EnablePort failed' error when starting HTA? Probably the DSP crash is just the final result of some failing DSP accesses.

Best regards,

Manfred

  • Manfred,

    Thanks for reporting this. Can you or your customer capture the Trace logs as mentioned in the reference below?
    processors.wiki.ti.com/.../Troubleshooting_CCSv7

    This may shed some additional light on the problem.

    The error message is consistent with a scenario in which the device's debug subsystem is powered down in the middle of a debug session.

    One additional question: when the GEL files were removed, did you keep the ones in the ICEPICK and DAP cores? These will be necessary for a proper Trace session.

    Regards,
    Rafael
  • Rafael,

    we kept all GEL files of the DRA75x_DRA74x target configuration except those of the DSP1 core. I will provide the trace logs after Easter.

    Regards,
    Manfred
  • Rafael,

    trace logs of this error case are attached.

    TI-trace.zip

    Regards,

    Manfred

  • Manfred,

    Thanks for sending the logs. I will provide them to the Emulation team for analysis.

    It seems we have a similar scenario: I removed the C66xx_DSP1 GEL file (which also loads two others) and ran a PC Trace capture without problems. I am using the Vayu EVM and the same XDS560STM Traveller.

    One detail that may differ on your customer's board is power. I am powering the EVM with a bench power supply that also measures current. When running the GEL and enabling the different subsystems on the device, I notice its ammeter showing spikes here and there during the various initialization routines. A long shot, I know, but it may be one difference that is isolated to the custom board.

    One additional idea: can you try to experiment with a lower TCLK speed? A custom board may present a slight variance in the JTAG data path and potentially cause some issues with the data traffic. That is a bit harder to verify, but it is an easy parameter to change.

    I will keep trying to "break" my board here and report back any findings.

    Regards,
    Rafael
  • Manfred,

    Two additional details that could be useful to try and isolate the issue:
    - Do you have the Device ID of the device present on the custom board? That can be obtained by reading the first register when the ICEPICK core is connected.
    - I suspect the .ccxml file you are using is similar to the one attached, is that so?

    One additional detail: could you perhaps ask your customer to close CCS and wipe the temporary trace data files? Or perhaps even try with a different workspace? That sometimes helps with some of these hard to track issues (although you are experiencing the same issues as well). Details at the section "Debug" of the Troubleshooting page below:
    software-dl.ti.com/.../ccs_troubleshooting.html

    One last note: between tests, are you power cycling the board? I sometimes run into issues when I attempt a Trace connection which fails and keeps failing until the board is fully reset.

    Regards,
    Rafael
  • Rafael,

    attached below is a new trace file, recorded after all temporary trace files had been deleted.

    TI-trace(2).zip

    I asked the customer to test with a reduced JTAG clock frequency of 5 MHz, but the results are still pending.

    When comparing the customer's trace logs with my own from a J6EVM, I always see the same kind of problems. The first error is a failing write access to unlock CS_TF. Afterwards I also see errors when reading from address 0x01BC013C.

    M     17:58:14:544 | Kelvin export: Start EnablePort()

    M     17:58:14:544 | Kelvin export: Start ConfigureTraceFunnel()

    M     17:58:14:544 | cTools: Protected Target Adapter memory write pushed to Q - page 0 address 0x54164fb0 length 4 value 0x0.

    M     17:58:14:544 | cTools: Protected Target Adapter - WaitUntilProcessed.

    M     17:58:14:544 | cTools: Protected Target Adapter [Spectrum Digital XDS560V2 STM TRAVELER Emulator_0/C66xx_DSP1] processing request pReq = 0x60a7b1f0, CurrentThread = 33980.

    M     17:58:14:544 | cTools: TargetAdapter vptr = 0x5180ec2c

    M     17:58:14:544 | cTools: Protected Target Adapter processing write memory request- page 0 address 0x54164fb0.

    M     17:58:14:544 | cTools: TargetAdapter vptr[eMemWrite] = 0x5159f520

    M     17:58:15:065 | cTools: Protected Target Adapter processing write memory request- page 0 address 0x54164fb0 done status -2.

        E 17:58:15:065 | Kelvin export: Unable to unlock CS_TF

    M     17:58:15:065 | cTools: Protected Target Adapter memory read pushed to Q - page 0 address 0x1bc013c length 4.

    M     17:58:15:065 | cTools: Protected Target Adapter - WaitUntilProcessed.

    M     17:58:15:065 | cTools: Protected Target Adapter [Spectrum Digital XDS560V2 STM TRAVELER Emulator_0/C66xx_DSP1] processing request pReq = 0x60a7b5e0, CurrentThread = 33980.

    M     17:58:15:065 | cTools: TargetAdapter vptr = 0x5180ec2c

    M     17:58:15:065 | cTools: Protected Target Adapter processing read memory request- page 0 address 0x1bc013c.

    M     17:58:15:065 | cTools: TargetAdapter vptr[eMemRead] = 0x5159f470

    M     17:58:16:117 | cTools: Protected Target Adapter processing read memory request- page 0 address 0x1bc013c value[0] 0x6080b878 done status -2.

        E 17:58:16:117 | cTools2000: Memory read failed

    I think the most important point is to understand why this write access to unlock CS_TF fails.
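A minimal sketch of how the failing accesses could be pulled out of such a trace log automatically. It only relies on the "done status" completion lines visible in the excerpt above. Treating status 0 as success is an assumption (the excerpt only shows -2 for failures), and `failed_accesses` and `SAMPLE` are hypothetical names used for illustration.

```python
import re

# Matches the completion lines written by the cTools Protected Target Adapter,
# e.g. "... processing write memory request- page 0 address 0x54164fb0 done status -2."
DONE_RE = re.compile(
    r"processing (?P<kind>read|write) memory request- page (?P<page>\d+) "
    r"address (?P<addr>0x[0-9a-fA-F]+).*?done status (?P<status>-?\d+)"
)


def failed_accesses(log_lines):
    """Return (kind, address, status) for each completed access with a non-zero status.

    Assumption: a status of 0 means success; the excerpt only shows -2 for failures.
    """
    failures = []
    for line in log_lines:
        match = DONE_RE.search(line)
        if match and int(match.group("status")) != 0:
            failures.append(
                (
                    match.group("kind"),
                    int(match.group("addr"), 16),
                    int(match.group("status")),
                )
            )
    return failures


# Three lines copied from the excerpt above; the first has no "done status"
# suffix and is therefore ignored.
SAMPLE = [
    "cTools: Protected Target Adapter processing write memory request- page 0 address 0x54164fb0.",
    "cTools: Protected Target Adapter processing write memory request- page 0 address 0x54164fb0 done status -2.",
    "cTools: Protected Target Adapter processing read memory request- page 0 address 0x1bc013c value[0] 0x6080b878 done status -2.",
]

for kind, address, status in failed_accesses(SAMPLE):
    print(f"{kind:5s} 0x{address:08X} status {status}")
```

Run over a full log, this gives a short list of the addresses that the debugger could not reach through the DSP, which is easier to compare against the device memory map than the raw log.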

    Best regards,

    Manfred

  • The customer has now reported that reducing the JTAG clock frequency to 5 MHz did not change the behavior.
  • Manfred,

    Thanks for the details. The Emulation team has not gotten back to me yet.

    However, one critical piece of information still missing is the Device ID - in the past we were caught by different Device IDs that caused issues with the tool's functionality, especially in this case where the devices on the development kits work.

    Would you mind sending this information?

    Also, are you able to use Trace in another core of the customer's device?

    Thank you,
    Rafael
  • Rafael,

    below is the DRA75x Device ID information you asked for.

    DIE_ID[31:0]   - 0x1500C005

    ID_CODE        - 0x2B99002F

    DIE_ID[63:32]  - 0x016B10B0

    DIE_ID[95:64]  - 0x094E0906

    DIE_ID[127:96] - 0x3FE80122

    PROD_ID        - 0x2E6408F0
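For readers who want to compare ID_CODE values between boards, here is a small sketch that splits the value into fields. It assumes ID_CODE follows the standard IEEE 1149.1 IDCODE layout (version in bits 31:28, part number in bits 27:12, manufacturer in bits 11:1, bit 0 fixed at 1). The default mask in `same_device_family` and the second ID value in the usage line are hypothetical and only illustrate the idea of matching an ID while ignoring the version field.

```python
def decode_idcode(idcode):
    """Split a 32-bit JTAG IDCODE into its IEEE 1149.1 fields."""
    return {
        "version": (idcode >> 28) & 0xF,         # bits 31:28
        "part_number": (idcode >> 12) & 0xFFFF,  # bits 27:12
        "manufacturer": (idcode >> 1) & 0x7FF,   # bits 11:1
        "marker_bit": idcode & 0x1,              # bit 0, always 1 for a valid IDCODE
    }


def same_device_family(idcode_a, idcode_b, mask=0x0FFFFFFF):
    """Compare two IDCODEs while ignoring the version field.

    The default mask is an assumption chosen for illustration; it is not
    taken from the CCS device database.
    """
    return (idcode_a & mask) == (idcode_b & mask)


fields = decode_idcode(0x2B99002F)  # ID_CODE value quoted in the post above
print({name: hex(value) for name, value in fields.items()})

# 0x1B99002F is a made-up comparison value that differs only in the version field.
print(same_device_family(0x2B99002F, 0x1B99002F))
```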

    When trying to enable PC Trace for DSP2, the same error message is shown: "Could not run analyzer on C66xx_DSP2. Cause: EnablePort failed."

    The customer couldn't stop one of the A15 cores and try to enable PC Trace on this core because in that case the complete system is always restarted. Looks like there's still some watchdog mechanism which wasn't disabled for testing.

  • Manfred,

    I apologize for the delay; releases prevented me from working on this.

    I haven't yet been able to source a SiRev 2.0 board to test this definitively at my desk, but the tools are certainly configured to support this ID_CODE via masks - otherwise we would have heard from other groups already.

    In your tests are you using a board with the same ID_CODE as the customer?

    Regards,
    Rafael
  • Manfred,

    Thanks again for the patience. I was able to source a suitable board and run Trace on the C66x DSP.

    However, make sure the Trace temporary files are completely erased, so you can start from a clean state. The temporary locations are:

    %HOMEPATH%/.TI-trace

    %TEMP%/.TI
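A minimal sketch of a helper that clears these two locations. It takes the environment as a mapping, so the paths can be inspected before anything is deleted. The function names are hypothetical, and `dry_run=True` is the default so that nothing is removed by accident. CCS should be closed before running it for real.

```python
import shutil
from pathlib import Path


def trace_temp_dirs(env):
    """Build the two trace temp locations from an environment mapping.

    Note: on Windows, HOMEPATH normally carries no drive letter, so the
    resulting path is interpreted relative to the current drive.
    """
    return [Path(env["HOMEPATH"]) / ".TI-trace", Path(env["TEMP"]) / ".TI"]


def wipe_trace_temp(env, dry_run=True):
    """Return the existing trace temp directories; delete them unless dry_run is set."""
    found = []
    for directory in trace_temp_dirs(env):
        if directory.is_dir():
            if not dry_run:
                shutil.rmtree(directory)
            found.append(directory)
    return found
```

Typical use would be `wipe_trace_temp(os.environ)` to list what would be removed, followed by `wipe_trace_temp(os.environ, dry_run=False)` once the list looks right.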

    I am using CCSv9.0.1 with all stock components, the DRA7x EVM Rev H (Si Rev 2.0), and an XDS ProTrace.

    This rules out any inherent issue between CCS and the Si Revision 2.0. 

    With this, I am not entirely sure how to proceed from here, unless the project is shared by the customer (and it runs on known good hardware such as our EVMs).

    Regards,

    Rafael

  • Rafael,

    thanks for your additional tests and attempts to reproduce the problem.

    Meanwhile I suspect that the problem with enabling PC Trace for the DSP is related to the use of MMU0 inside the DSP subsystem. The customer uses this MMU to prevent the DSP from accessing memory ranges that are not allocated to it. As a test, the MMU was disabled before enabling PC Trace. Trace could then be started, but the DSP crashed some time after resuming program execution. I think that crash may be caused by missing address translations while MMU0 is inactive.

    Could that mean CCS needs to execute some register accesses via DSP1 in order to enable trace for this core? If that's really the case, could you provide the information which additional address ranges should be enabled in MMU0 to not block needed register accesses for enabling trace functionality of the DSP?

    Regards, Manfred

  • Manfred,

    I still couldn't find a reliable way to reproduce this, but your theory makes a lot of sense: the ability to have access to the TBR may have been impaired by the enabling of the MPAX.

    Looking at table 2.2 of the DRA74x TRM you can see the address of the Debug Subsystem registers, including the CT_TBR.  

    How is the MPAX configured in the customer code? This may bring some light to the specific root cause.

    Regards,

    Rafael

  • Manfred,

    Sorry about the delay; I was out last week but got a reply from the developer.

    The address ranges that they would need to open up on the MMU would be:

     C66xx_0 or C66xx_1

    1] CSTF : 0x5416_4000 – 0x5416_4FFF

    2] Trace Buffer : 0x5416_7000  - 0x5416_7FFF

    3] TPIU  : 0x5416_3000  - 0x5416_3FFF

    4] PLL (for TPIU Sink) : 0x0231_0000 – 0x0231_1000

     

    In addition to the above, the following are used but part of the DSP private memory space:

    1] ADTF : 0x01D0_F000 – 0x01D0_FFFF  (this is part of the DSP private memory space)

    If your customer still runs into a problem after opening up the ranges above, please have them create a debug server log file so we can see what accesses are failing.
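To relate the failing accesses from the earlier trace log to this list, a small lookup can help. The sketch below copies the ranges exactly as quoted above (including the PLL end address as written) and treats both bounds as inclusive, which is an assumption. `region_of` is a hypothetical helper name.

```python
# Address ranges as listed above; start and end are treated as inclusive.
DEBUG_RANGES = {
    "CSTF": (0x5416_4000, 0x5416_4FFF),
    "Trace Buffer": (0x5416_7000, 0x5416_7FFF),
    "TPIU": (0x5416_3000, 0x5416_3FFF),
    "PLL (for TPIU Sink)": (0x0231_0000, 0x0231_1000),
    "ADTF": (0x01D0_F000, 0x01D0_FFFF),
}


def region_of(address):
    """Return the name of the listed range containing the address, or None."""
    for name, (start, end) in DEBUG_RANGES.items():
        if start <= address <= end:
            return name
    return None


# The two addresses that failed in the trace log posted earlier in this thread.
for address in (0x54164FB0, 0x01BC013C):
    print(f"0x{address:08X} -> {region_of(address)}")
```

The failed write to 0x54164fb0 falls inside the CSTF range, which matches the "Unable to unlock CS_TF" error in the log. The failed read from 0x01bc013c is not covered by any of the listed ranges.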

    Regards,

    Rafael

  • Hi Rafael,

    I think it should be possible to add MMU translations for address ranges 1-3. I don't know how access to ranges 4) and 5) could work because these addresses are all reserved in the DSP memory map of DRA750. Is it really required to access this PLL (for TPIU Sink) and ADTF by the DSP or will that be done just by the DebugSS?

    Regards, Manfred

  • Manfred,

    I am trying to find a testcase with which I could test all this - at this point I am covering these aspects from a purely theoretical perspective.

    Just like you, I found that these addresses belong to a reserved memory space. I suspect they belong to the memory space of the ICEPICK router (or DebugSS) and therefore may not need to be remapped. The TRM does not mention the TPIU and the ADTF anywhere other than in the DebugSS, whose clock is enabled via the register CM_EMU_CLKSTCTRL and is controlled via a series of other CLK registers.

    Were you able to make Trace operational in the DSP core just by enabling 1~3 above? 

    Regards,

    Rafael

  • Manfred, 

    I couldn't find an MMU example for C66x cores with which to experiment with its configuration and validate this proposition.

    Therefore I am stuck, unless you have code that you could share offline. 

    Given this scenario, I will close this thread at this time, but please feel free to reopen if you get a breakthrough - I will certainly do the same. 

    I apologize for the inconvenience,

    Rafael

  • Rafael,

    I found some time to look into this problem again. I was able to reproduce the PC Trace startup failure by configuring the DSP MMU0 so that only the address ranges used by my test program are enabled. When I then start PC Trace, I see exactly the problem reported by my customer. After adding another TLB entry that allows accesses to the address range 0x54100000..0x541FFFFF, PC Trace starts successfully.
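As a quick cross-check, the sketch below tests which of the address ranges named earlier in this thread fall inside the single entry 0x54100000..0x541FFFFF. The range values are copied from the earlier reply. The helper names are hypothetical, and bounds are treated as inclusive.

```python
TLB_ENTRY = (0x54100000, 0x541FFFFF)  # entry added in the test described above

# Ranges quoted earlier in this thread (inclusive bounds assumed).
RANGES = {
    "CSTF": (0x54164000, 0x54164FFF),
    "Trace Buffer": (0x54167000, 0x54167FFF),
    "TPIU": (0x54163000, 0x54163FFF),
    "PLL (for TPIU Sink)": (0x02310000, 0x02311000),
    "ADTF": (0x01D0F000, 0x01D0FFFF),
}


def covers(outer, inner):
    """True if the inclusive range `inner` lies completely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def split_by_coverage(entry, ranges):
    """Return (covered, uncovered) range names for one translation entry."""
    covered = [name for name, rng in ranges.items() if covers(entry, rng)]
    uncovered = [name for name in ranges if name not in covered]
    return covered, uncovered


ENTRY_SIZE = TLB_ENTRY[1] - TLB_ENTRY[0] + 1  # 0x100000 bytes, i.e. 1 MiB

covered, uncovered = split_by_coverage(TLB_ENTRY, RANGES)
print("covered:  ", covered)
print("uncovered:", uncovered)
```

The entry covers CSTF, the Trace Buffer, and the TPIU, but not the PLL or ADTF ranges. That PC Trace starts with this entry alone is at least consistent with the earlier suspicion that those two ranges are not accessed through MMU0.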

    But in my opinion that shouldn't be the final solution: it requires an additional entry in the customer's MMU configuration, the requirement has to be documented for anyone using PC Trace, and the customer's source code has to change just to use a debugging feature of the device and CCS.

    What also worked for me (instead of the additional TLB entry) was disabling MMU0 in register DSP_SYS_MMU_CONFIG, starting PC Trace, and then enabling MMU0 again. I think CCS should be able to do this itself when the Hardware Trace Analyzer is configured for one of the DSP cores of DRA7xx devices. What do you think?

    Best regards,

    Manfred

  • Manfred,

    At first glance it is certainly possible - unfortunately I lack the proper test case here to try this out, but it would be certainly interesting to verify.

    For a definitive fix I would have to check with the dev team - this may well be applicable to other devices as well, therefore guaranteeing a more generic/broad solution. 

    I will return to this thread. 

    Regards,

    Rafael 

  • Rafael,

    I've generated a very minimal test case for reproducing this problem with PC Trace startup. The complete CCS project including the debug binary is attached below. For reproducing the problem just follow these steps.

    1. In CCS launch a target configuration to connect to a J6 EVM
    2. Connect to CortexA15_0
    3. Scripts -> DRA7xx MULTICORE Initialization -> DSP1SSClkEnable_API
    4. Connect to DSP1 and load the provided 'PCTrace_testcase.out' program
    5. Break program execution after function dsp0_mmu0_config()
    6. If you now try to start PC Trace, you should get the error message "Could not run analyzer on C66xx_DSP1. Cause: EnablePort failed."

    There's already a possible fix implemented in the MMU configuration which can be enabled in 'mmu.c' by setting PCTRACE_FIX to '1'.

    Alternatively, you can test my preferred option: keep PCTRACE_FIX set to '0', use the same breakpoint after dsp0_mmu0_config() has executed, and set register DSP_SYS_MMU_CONFIG (DSP local address 0x01D00018) to '0' before starting PC Trace. Once PC Trace is up, set DSP_SYS_MMU_CONFIG back to '1' to enable the MMU and continue DSP program execution.
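The disable/start/re-enable sequence can be sketched as a small wrapper. This is only an illustration of the ordering, not CCS or DSS code: `read_reg` and `write_reg` are hypothetical callables standing in for whatever debugger memory access is available, and the demo uses a plain dictionary as fake target memory.

```python
from contextlib import contextmanager

DSP_SYS_MMU_CONFIG = 0x01D00018  # DSP-local address quoted in the post above


@contextmanager
def mmu0_disabled(read_reg, write_reg):
    """Write 0 to DSP_SYS_MMU_CONFIG for the duration of the block, then restore it.

    read_reg(address) and write_reg(address, value) are hypothetical accessors.
    """
    saved = read_reg(DSP_SYS_MMU_CONFIG)
    write_reg(DSP_SYS_MMU_CONFIG, 0)
    try:
        yield
    finally:
        # Restore even if starting trace fails, so the program never resumes
        # with the MMU left disabled.
        write_reg(DSP_SYS_MMU_CONFIG, saved)


# Demo against fake target memory; the value 1 means "MMU enabled" as in the post.
fake_memory = {DSP_SYS_MMU_CONFIG: 1}
observed = []
with mmu0_disabled(fake_memory.__getitem__, fake_memory.__setitem__):
    observed.append(fake_memory[DSP_SYS_MMU_CONFIG])  # PC Trace would be started here
observed.append(fake_memory[DSP_SYS_MMU_CONFIG])
print(observed)
```

The sketch restores the previously read value instead of hard-coding '1', in case the register holds other settings that should survive the sequence.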

    Best regards,

    Manfred

    https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/81/PCTrace_5F00_testcase.7z

  • Manfred,

    My apologies; I missed your last reply.

    Let me try your testcase and I will report back with my findings. 

    Regards,

    Rafael

  • Manfred,

    I will be able to take a look at this today. Sorry for the delays. 

    Regards,

    Rafael

  • Manfred,

    Thanks again for your patience and for sending the testcase. Today I filed bug DBGTRC-5152. In about half an hour, you can check its status via the SDOWP link in my signature below.

    I was able to reproduce this issue in both the DRA750 and DRA750P. I will try to test this in TDA320 as well. 

    I apologize for the inconvenience,

    Rafael