TMS320DM8168: DDR data verification error at 0x80000003 on DM8168 (1 board out of 40)

Part Number: TMS320DM8168

Hi Team,

  1. We designed a board using the TMS320DM8168CCYGA2 processor nearly 10 years ago, and the board is in production. Recently we built 40 boards.
  2. On one of those 40 boards, we are facing an issue while flashing U-Boot. The remaining 39 boards were flashed successfully without any issue.
  3. When we try to flash the processor through Code Composer Studio, we first load the NAND writer setting file. At that instant, CCS reports a data verification error at address 0x80000003. On the other 39 boards, the same GEL file and NAND writer binary load successfully and NAND flashing works.
  4. Our question is: Why does CCS report a verification error at address 0x80000003 during NAND writer file load?
  5. I have also attached the screenshot of the error.

We would appreciate your guidance on resolving this issue. Looking forward to your helpful response.

Environment

Processor: TMS320DM8168CCYGA2

External RAM: MT41K128M16JT-125 IT:K (DDR3)

CCS Version: 10.1.1

JTAG Probe: Texas Instruments XDS100v2 USB Debug Probe

Things we have already checked:

  • We checked and compared all the power rails, clock signals and reset lines on the faulty board, and everything looks fine.
  • We replaced the DDR termination resistors. Later, we replaced the DDR device too.
  • We replaced the series resistors on the JTAG and EMU interfaces.

  • Hi,

      If the same firmware and toolchain (e.g. CCS, XDS100, etc.) are being used to program the SoC, then we can rule out issues related to the firmware and toolchain.

      Can you do an A-B-A swap test if you think this is an SoC-level issue?

    The A-B-A Swap Method is a simple cross check test, which can confirm the observed issue is not systemic.

    • A-B-A Swap Method
      (1) Remove the suspected component (A) from the original failing board.
      (2) Replace the suspected component (A) with a known good component (B) and check if the original board now works properly.
      (3) Mount the suspected component (A) to a known good board and see if the same failure occurs on the good board.

    Step 3 is important because it helps us to exclude any possibility that the issue is caused by a systemic issue or the interaction of multiple slightly bad components on a good board.

  • Hi,

    Thank you for the quick response! As you rightly mentioned, the tools, environment, versions and firmware all work with the 39 other boards, so this is purely a hardware issue. Here, we need your assistance in understanding the error. Looking at the error messages, we felt that it is related to DDR0. Hence we replaced all the termination resistors and DDR devices on DDR0, but that did not help. We did not touch the resistors and DDR device on the DDR1 interface. Do you think we need to replace the DDR1 components too?

    Or do you think these error messages give any lead on which section should be addressed and checked?
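One way to read the reported address, as a rough sketch only: 0x80000003 is byte offset 3 from 0x80000000. The snippet below splits a failing byte address into a word index and a byte lane. It assumes that 0x80000000 is the start of the DDR0 window in the GEL memory map, that the DDR interface is 32 bits wide, that the system is little-endian with no byte-lane swapping on the board, and that the reported address is the first byte that mismatched. The names `DDR0_BASE`, `BUS_BYTES` and `decode_fail_address` are made up for illustration.

```python
DDR0_BASE = 0x80000000   # assumption: start of the DDR0 window in the memory map
BUS_BYTES = 4            # assumption: 32-bit DDR interface, i.e. 4 byte lanes


def decode_fail_address(addr, base=DDR0_BASE, bus_bytes=BUS_BYTES):
    """Split a failing byte address into (word_index, byte_lane, (low_bit, high_bit))."""
    offset = addr - base
    if offset < 0:
        raise ValueError("address is below the assumed DDR base")
    # Word index selects the 32-bit word, byte lane selects the byte within it.
    word_index, byte_lane = divmod(offset, bus_bytes)
    low_bit = 8 * byte_lane
    return word_index, byte_lane, (low_bit, low_bit + 7)


word, lane, bits = decode_fail_address(0x80000003)
print(f"word {word}, byte lane {lane}, data bits D{bits[1]}..D{bits[0]}")
# prints "word 0, byte lane 3, data bits D31..D24"
```

Under those assumptions, the first mismatch falls in byte lane 3 (D31..D24) of the very first word, which would point at the data, strobe and mask signals of that lane rather than at the whole interface. Whether lane 3 really maps to D31..D24 on this board has to be confirmed against the schematic.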

    We built 40 boards and 39 have been delivered to the customer. So we have only 1 board, and that is the faulty board. We don't have any known-good board to swap with. Anyway, we can replace the processor with a new processor.

  • Anyway, we can replace the processor with a new processor.

    Hi,

      Yes, this is what I meant by the A-B-A swap test. Replace the processor on the failing board with a known good processor. If the board then works, the issue is most likely isolated to the processor, not the resistors and DDR device on the board. Then mount the suspected processor on a known good board (one of the 39); if it fails there, that further confirms an issue with the processor.

  • That is exactly the issue! We delivered 39 boards to the customer and we can't get them back. An A-B-A swap means swapping the processors between 2 boards. All we have is 1 board, the faulty 40th board.

    The only thing we can do is remove the processor and assemble a fresh new processor.

    Because the processor is costly, and because the assembly process is also costly and critical, we wanted to first confirm at the discrete level. What if the issue still persists after the processor replacement? What if the issue is with some resistor, capacitor or low-density BGA IC such as the DDR IC? Replacing resistors, capacitors or the DDR IC is cheaper than replacing the processor. Hence, what we are looking for is a way to make sure there is no issue with any discretes or other smaller ICs before touching the processor IC.
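One low-cost check that might help narrow this down before any rework is a walking-ones data-bus test on a single DDR word, which can show whether specific data bits fail instead of the whole bus. The sketch below only models the algorithm on the host. `write_word` and `read_word` are hypothetical stand-ins for whatever target memory access is available (for example, manual writes and reads in the CCS memory view), and `FakeMemory` is a simulated memory used purely to demonstrate the result; none of these names come from CCS or from TI software.

```python
def walking_ones_test(write_word, read_word, addr, width=32):
    """Write a single 1 bit in each position and return the bit positions that misread."""
    bad_bits = set()
    for bit in range(width):
        pattern = 1 << bit
        write_word(addr, pattern)
        diff = pattern ^ read_word(addr)   # bits that differ from what was written
        for b in range(width):
            if diff & (1 << b):
                bad_bits.add(b)
    return sorted(bad_bits)


class FakeMemory:
    """Simulated 32-bit memory with optional stuck-at-0 data bits (demo only)."""

    def __init__(self, stuck_low_mask=0):
        self.cells = {}
        self.stuck_low_mask = stuck_low_mask

    def write_word(self, addr, value):
        # Bits in stuck_low_mask are forced to 0, as if that data line were faulty.
        self.cells[addr] = value & 0xFFFFFFFF & ~self.stuck_low_mask

    def read_word(self, addr):
        return self.cells.get(addr, 0)


good = FakeMemory()
faulty = FakeMemory(stuck_low_mask=1 << 27)   # hypothetical fault on D27

print(walking_ones_test(good.write_word, good.read_word, 0x80000000))      # prints []
print(walking_ones_test(faulty.write_word, faulty.read_word, 0x80000000))  # prints [27]
```

If only one or a few bits within one byte lane fail, the signals of that lane (data, strobe, mask, and their solder joints and vias) are the first things to inspect. Note that walking ones only exposes certain fault classes, such as bits stuck low or shorted together; a bit stuck high needs the inverse walking-zeros pattern, and timing or training problems may not show up in a single-word test at all.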