TMS320C6678: How to resolve DSP helloworld exception

Part Number: TMS320C6678
Other Parts Discussed in Thread: SYSBIOS, CDCM6208

Hi,

We are using four TMS320C6678 DSPs in our design and are trying to perform Ethernet communication between them through an onboard Ethernet switch.

Using the HelloWorld project, we are able to send and receive packets between the DSPs.

However, HelloWorld running on one of the DSPs crashes after a long run of about 1 hour. We then tried to ping that DSP from another system, but the ping failed.

A4=0x30 A5=0x0
A6=0x1 A7=0x0
A8=0x807e7520 A9=0x0
A10=0x807f0a58 A11=0x2b
A12=0x807f6508 A13=0x8004ecd4
A14=0x0 A15=0x8004ef28
A16=0x8004eca2 A17=0x0
A18=0x8004ec98 A19=0x0
A20=0x0 A21=0x0
A22=0x0 A23=0x0
A24=0x0 A25=0x1
A26=0x0 A27=0x2005020
A28=0x0 A29=0x43
A30=0x42 A31=0x1
B0=0x0 B1=0x1
B2=0x1 B3=0x80910000
B4=0x0 B5=0x90
B6=0x100 B7=0x6c
B8=0x73 B9=0x3a
B10=0x33 B11=0x807e7520
B12=0x33 B13=0x807b4958
B14=0x807f91d8 B15=0x8004eca0
B16=0x30 B17=0x8004ef14
B18=0x58 B19=0x3a
B20=0x40 B21=0x2e5
B22=0xf B23=0x0
B24=0x0 B25=0x3000
B26=0x3000 B27=0x0
B28=0x0 B29=0x3
B30=0x2 B31=0xffffffff
NTSR=0x1020d
ITSR=0xf
IRP=0x807e12c0
SSR=0x0
AMR=0x0
RILC=0x0
ILC=0x0
Exception at 0x80910000
EFR=0x2 NRP=0x80910000
Internal exception: IERR=0x18
Opcode exception
Resource conflict exception
ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0x80910000, sp = 0x8004eca0.
To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
xdc.runtime.Error.raise: terminating execution

We have checked the DDR and it seems fine.

Please guide me in analysing this crash dump so that I can find the reason for the exception.

I have gone through https://e2e.ti.com/support/processors/f/791/t/163824?How-do-I-resolve-Internal-Resource-Conflict-Exceptions- but that case is different from mine.

Note: The same HelloWorld runs on the other DSPs without any issue.

  • Hi,

    Please clarify which Processor SDK RTOS release you are using for this issue. Are you running the NIMU_emacExample_EVMC6678C66BiosExampleProject? What kind of traffic runs between the C6678 and the host?

    Helloworld project, able to send /receive the packet between the DSP's. >>>>>>> Is this a C6678-to-C6678 test? Hello world is a UDP echo test; how do you run a continuous test for hours? Does ping keep working for hours?

    Helloworld running in one of the DSP >>>>>>>>> Which core do you run the application on? Core 0? Do you mean you have several boards doing the same test and one of the boards crashed?

    The same HelloWorld running on other DSP's without any issue. >>>>>>> What do you mean by "other DSP"? Other C6678 boards?

    Regards, Eric

  • what Processor SDK RTOS release for this issue? 

    mcsdk_2_01_02_06
    pdk_C6678_1_1_2_6
    ndk_2_21_01_38

    Are you running the NIMU_emacExample_EVMC6678C66BiosExampleProject? What kinds of traffic between C6678 and host? 

    I am running C:\ti\mcsdk_2_01_02_06\examples\ndk\hello world project.

    Helloworld project, able to send /receive the packet between the DSP's. >>>>>>> C6678 to C6678 test?
    hello world is a UDP echo test, how do you do continuous test for hours? Will ping works for hours?

    I am not using the UDP echo task. I created UDP client and server tasks in the HelloWorld project and run one DSP as the UDP server and another DSP as the UDP client. We have an onboard 16-port Ethernet switch that connects all the Ethernet ports of the DSPs. One port of the Ethernet switch is connected to the local LAN. From a Windows machine, we ping the DSP IP address and check the link status continuously.

    Helloworld running in one of the DSP >>>>>>>>> what core you run the application? core 0? 

    I am running in core 0.

    Do you mean you have several boards doing the same test and one of the board crashed?

    The code crashes on all the boards.

    The same HelloWorld running on other DSP's without any issue. >>>>>>> What do you mean other DSP? Other C6678 boards?
    

    We have four DSPs (DSP1, DSP2, DSP3, DSP4) on the same board. Except for DSP3, HelloWorld works fine between the other DSPs (DSP2<->DSP4, DSP1<->DSP2, DSP4<->DSP1).

    Both the server and the client HelloWorld applications crash on DSP3.

  • Hi,

    MCSDK is obsolete and not supported. Please migrate to the latest Processor SDK RTOS for C6678 from http://software-dl.ti.com/processor-sdk-rtos/esd/C667x/latest/index_FDS.html.

    On your board with 4 DSPs, what is different about the software running on DSP3 compared with DSP1, 2, and 4? Why does DSP3 crash while the others keep working? Or did you see the crash happen randomly on any of the DSPs?

    For debugging a SYSBIOS crash, there are some tips here:

    http://software-dl.ti.com/processor-sdk-rtos/esd/docs/latest/rtos/index_how_to_guides.html#exception-handling

    Regards, Eric

  • All the DSPs run the same IBL and the same application.

    Only the application on DSP3 crashes, while the application on the other DSPs runs continuously.

    I went through the link on debugging the exception and loaded the HelloWorld object and symbol file into external memory. When I started ROV before executing the .out, it showed the message below:

    XDC PATH or XDC Tools location not set. 

    Set Project/global RTSC preference and relaunch ROV. 

    I am using CCS 5.3 and XDCtools 3.23.4.60.

  • Hi,

    See some SYSBIOS ROV discussions here:

    https://e2e.ti.com/support/tools/ccs/f/81/t/149431

    Regards, Eric

  • I need to load the symbol file along with the object file to debug.

    I could not find a symbol file in the project. Could you tell me which file extension to look for in the HelloWorld project?

  • Hi,

    Is your application booted from some media, such as NOR/NAND flash? If not, why do you need to load symbols to debug? I would think that using CCS/JTAG to load the application with the DSP in no-boot mode is simpler for debugging.

    Anyway, if you want to load symbols, the file is still your executable (*.out); just use "Load Symbols" in CCS instead of "Load Program".

    Regards, Eric

  • We were booting the application from NOR but could not see any exception messages on the UART, so we removed the application from NOR and loaded it through CCS.

    The link you shared for debugging the exception says to load the symbols and the .out into memory. I am just following that.

    ROV still throws the same error message even after enabling the products in the preferences.

    Is there any other way to find out why the exceptions occurred?

  • Hi,

    I have attached crash screenshots of our application. The crash and the exception address are different each time.

    Kindly share your observations.

    [Screenshots: crash-1, crash-2, exception of crash-2]

  • Hi,

    Is it always C66x_16 that crashes? What is special about this DSP, given that all of them run the same code?

    For crash_1, it looks like an initialization problem: the PA add_mac call times out. If this SoC goes through a proper power-on reset so that the PA is idle, you should not hit this timeout. As for the heap free, I am not sure whether it is a side effect of the PA add_mac failure.

    For crash_2, you need to look into B3 and NRP; check the tips in this E2E thread:

    https://e2e.ti.com/support/legacy_forums/embedded/tirtos/f/355/t/150087

    Regards, Eric
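
    To make the B3/NRP check concrete, here is a minimal C sketch that decodes the EFR and IERR values from an exception dump and tests whether the faulting PC (NRP) equals the return-address register B3. The bit positions are assumptions based on the C66x exception register layout and should be verified against the C66x CPU and Instruction Set Reference Guide; the function names are hypothetical and are not part of SYS/BIOS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* IERR bit names, lowest bit first. Assumed layout; verify against the
 * C66x CPU and Instruction Set Reference Guide before relying on it. */
static const char *const ierr_names[] = {
    "Instruction fetch exception",  /* bit 0 */
    "Fetch packet exception",       /* bit 1 */
    "Execute packet exception",     /* bit 2 */
    "Opcode exception",             /* bit 3 */
    "Resource conflict exception",  /* bit 4 */
    "Resource access exception",    /* bit 5 */
    "Privilege exception",          /* bit 6 */
    "Loop buffer exception",        /* bit 7 */
    "Missed stall exception"        /* bit 8 */
};

/* Writes the names of all set IERR bits into out, separated by "; ".
 * Returns the number of recognised bits that were set. */
int decode_ierr(uint32_t ierr, char *out, size_t out_len)
{
    size_t n = sizeof ierr_names / sizeof ierr_names[0];
    int count = 0;

    if (out_len == 0) {
        return 0;
    }
    out[0] = '\0';
    for (size_t bit = 0; bit < n; bit++) {
        if (ierr & (1u << bit)) {
            if (count > 0) {
                strncat(out, "; ", out_len - strlen(out) - 1);
            }
            strncat(out, ierr_names[bit], out_len - strlen(out) - 1);
            count++;
        }
    }
    return count;
}

/* Assumption: EFR bit 1 flags an internal exception. */
int efr_is_internal(uint32_t efr)
{
    return (int)((efr >> 1) & 1u);
}

/* Returns 1 when the faulting PC equals the return-address register B3. */
int faulted_at_return_address(uint32_t nrp, uint32_t b3)
{
    return nrp == b3;
}
```

    With the values from the dump in the first post (EFR=0x2, IERR=0x18, NRP=0x80910000, B3=0x80910000), decode_ierr() reports the opcode and resource conflict exceptions, matching the two lines SYS/BIOS printed, and faulted_at_return_address() returns 1. That pattern is consistent with a function returning through B3 to an address that does not hold valid code, which is why B3 and NRP are the first registers to inspect.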

  • Hi Eric,

    HelloWorld seems to run continuously when the PLL flag is set to 1 in the EVM_init() function; if the flag is turned off, the program crashes after some time.
    On the other DSPs the PLL flag is turned off and the application works fine.
    Below is our PLL configuration in IBL:

    /* Main PLL: 100 MHz reference, 1 GHz output */
    ibl.pllConfig[ibl_MAIN_PLL].doEnable      = 1;
    ibl.pllConfig[ibl_MAIN_PLL].prediv        = 1;
    ibl.pllConfig[ibl_MAIN_PLL].mult          = 20;
    ibl.pllConfig[ibl_MAIN_PLL].postdiv       = 2;
    ibl.pllConfig[ibl_MAIN_PLL].pllOutFreqMhz = 1000;

    /* Network PLL: 1050 MHz output */
    ibl.pllConfig[ibl_NET_PLL].doEnable       = 1;
    ibl.pllConfig[ibl_NET_PLL].prediv         = 1;
    ibl.pllConfig[ibl_NET_PLL].mult           = 21;
    ibl.pllConfig[ibl_NET_PLL].postdiv        = 2;
    ibl.pllConfig[ibl_NET_PLL].pllOutFreqMhz  = 1050;
    Can you let me know how this makes a difference?  
    Do you suspect the DSP itself or its internal memory?
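
    As a sanity check on the values above, the PLL output can be recomputed from the reference clock, multiplier, and dividers. The sketch below assumes that the IBL fields hold the actual multiplier and divider values (not register encodings) and that the network PLL is also fed from a 100 MHz reference; pll_cfg_t and pll_out_khz() are hypothetical names, not IBL APIs.

```c
#include <stdint.h>

/* Hypothetical mirror of the three IBL fields used in the calculation. */
typedef struct {
    uint32_t prediv;
    uint32_t mult;
    uint32_t postdiv;
} pll_cfg_t;

/* f_out = f_ref * mult / (prediv * postdiv), computed in kHz to stay in
 * integer arithmetic. Returns 0 for an invalid (zero) divider. */
uint32_t pll_out_khz(uint32_t ref_khz, pll_cfg_t cfg)
{
    if (cfg.prediv == 0 || cfg.postdiv == 0) {
        return 0;
    }
    return (uint32_t)(((uint64_t)ref_khz * cfg.mult) /
                      ((uint64_t)cfg.prediv * cfg.postdiv));
}
```

    With a 100000 kHz reference, {prediv 1, mult 20, postdiv 2} gives 1000000 kHz and {prediv 1, mult 21, postdiv 2} gives 1050000 kHz, which agrees with the pllOutFreqMhz fields. The main and network PLL settings are therefore self-consistent, so the difference between running with the PLL flag on and off is more likely to lie in what else EVM_init() reprograms.
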
  • Hi Eric,

    Is there any chip-level test to check that the DSP is working properly?

  • Hi,

    Are you able to move the code into L2 and/or MSMC, without using DDR for the test? Where did you initialize the DDR PLL and how many times? The failure happened in the DDR and I wonder if the DDR is stable? 

    Regards, Eric

  • I am initializing the PLL in IBL alone. In EVM_init, I have set the PLL flag to 0.

    Also, we have checked the full 1 GB of DDR memory and it works fine. The code was executed from MSMC RAM.

    Do you suspect the PLL configuration?
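
    A single pass over DDR can succeed on a marginal interface, so a repeated pattern test is a more useful check. The following is a minimal sketch of such a test and not the test used in this thread; mem_pattern_test() is a hypothetical helper. On the target, base would point into DDR3 and the region would need to be uncached (or the cache written back and invalidated) so that the reads actually reach the memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes an index-dependent pattern and its inverse over the region,
 * reading everything back after each write phase. The seed changes on
 * every pass. Returns the number of mismatching words. */
size_t mem_pattern_test(volatile uint32_t *base, size_t words, unsigned passes)
{
    size_t errors = 0;

    for (unsigned p = 0; p < passes; p++) {
        uint32_t seed = 0xA5A5A5A5u ^ (uint32_t)p;

        /* Phase 1: pattern */
        for (size_t i = 0; i < words; i++) {
            base[i] = seed ^ (uint32_t)i;
        }
        for (size_t i = 0; i < words; i++) {
            if (base[i] != (seed ^ (uint32_t)i)) {
                errors++;
            }
        }

        /* Phase 2: inverted pattern, so every bit is toggled */
        for (size_t i = 0; i < words; i++) {
            base[i] = ~(seed ^ (uint32_t)i);
        }
        for (size_t i = 0; i < words; i++) {
            if (base[i] != ~(seed ^ (uint32_t)i)) {
                errors++;
            }
        }
    }
    return errors;
}
```

    Running many passes over a long period, ideally while the Ethernet traffic is active, is closer to the conditions under which the crash appears after about an hour.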

  • Hi,

    I am still not clear:

    1. Can the failure happen on DSPs other than DSP3, or does it only happen on DSP3?

    2. "The HelloWorld seems to run constantly when the pll flag is set to one in the EVM_init() function if that is turned off the program crash after some time. " "I am initializing the PLL in IBL alone. In EVM_init, I have set the PLL flag to 0.". =======> can you clarify:

    a. PLL setting in IBL, EVM_init with PLLflag = 0, DSP3 crash?

    b. PLL setting in IBL, EVM_init with PLLflag = 1, DSP3 crash?

    c. PLL setting in IBL, EVM_init with PLLflag = 0, DSP1, 2, 4 crash?

    d. PLL setting in IBL, EVM_init with PLLflag = 1, DSP1, 2, 4 crash?

    3. Also, have checked full 1G DDR memory and its work fine. This Executed the code from MSMC Ram. ==========> what if hello world moved into MSMC or L2 without using DDR?

    In general, you can only initialize the PLL once. 

    Regards, Eric


  • Hi,

    1. If the failure can happens on other DSPs than DSP 3? Or it only happens on DSP3?

    The failure happens only on DSP3, not on the other DSPs.

    2. "The HelloWorld seems to run constantly when the pll flag is set to one in the EVM_init() function if that is turned off the program crash after some time. " "I am initializing the PLL in IBL alone. In EVM_init, I have set the PLL flag to 0.". =======> can you clarify:

    We initialize PLL either in IBL or in the application through EVM_init function, not in both. For debugging this crash, we kept DSP in no-boot mode and initialized the PLL in the application. 

    a. EVM_init with PLLflag = 0, DSP3 crash?

    yes

    b. EVM_init with PLLflag = 1, DSP3 crash?

    No crash.

    c. EVM_init with PLLflag = 0, DSP1, 2, 4 crash?

    no crash

    d. EVM_init with PLLflag = 1, DSP1, 2, 4 crash?

    no crash

    3. Also, have checked full 1G DDR memory and its work fine. This Executed the code from MSMC Ram. ==========> what if hello world moved into MSMC or L2 without using DDR?

    We tried running the HelloWorld application from both MSMC and DDR. In both cases, the HelloWorld application works if we set the PLL flag to 1 and crashes if we set the PLL flag to 0.

    The same IBL and application work fine on the other DSPs with the PLL flag off.

  • Hi,

    Thanks for the clarification! So this is a DSP3-only problem. Do multiple boards show the same problem, or only one board? Also, is there any hardware problem with this DSP3, such as power or clock jitter?

    >>>>>the HelloWorld application works If we keep the PLL flag to 1 and crashes if we set the PLL flag to 0.>>>> 

    >>>>>

    We initialize PLL either in IBL or in the application through EVM_init function, not in both. For debugging this crash, we kept DSP in no-boot mode and initialized the PLL in the application. 

    a. EVM_init with PLLflag = 0, DSP3 crash?

    yes

    b. EVM_init with PLLflag = 1, DSP3 crash?

    No crash. >>>>>>>

    Still need some clarification on DSP3:

    1. SBL + EVM_init with PLLflag = 0 ======>crash? 

    2. SBL + EVM_init with PLLflag = 1 ======> no crash? This is double PLL programming and should be avoided.

    3. For no-boot mode, I assume you DON'T use a GEL file for PLL setup:

    3a. EVM_init with PLLflag = 0, I guess you can't run the NDK because the clock is too low

    3b. EVM_init with PLLflag = 1, crash or not?

    Regards, Eric

  • Hi Eric,

    If any hardware problem to this DSP3, like power and clock jitter?

    The same power supply and the same clocks (core, PA, DDR) from the clock driver feed all the DSPs on the board. Of these, only DSP3 shows the crash.

    Today we tuned the DDR PLL to run at a lower clock, 400 MHz instead of 666.6 MHz, and the HelloWorld application seems to run for a prolonged time.

    I guess that is why HelloWorld ran without a crash when I set the PLL flag to 1.

    If clock jitter were the issue, it should affect all the DSPs on the board. Why DSP3 alone?

    Can you provide your suggestions on this?

  • Hi,

    I'm from the software side, so please provide details of how the clocks are generated for each SoC for our hardware people to look at. Please also comment on whether the issue happens on one board or on multiple boards.

    Regards, Eric 

  • Hi,

    This is happening on all of our production boards. We are using four CDCM6208 clock generators to generate the core/PASS clock, SRIO clock, PCIe clock, and MCM clock. Below is our DDR clock section.

    Please share your observations.

  • Hi,

    I will let the hardware expert comment on the DDR clock portion.

    You mentioned that when the code was moved from DDR to MSMC, it also crashed. The MSMC is clocked from the core PLL (CLK/2), not the DDR PLL, so something still does not add up. Is there any clock issue with the SoC3 core PLL?

    Do you have a comparison of the IBL PLL code and the EVM_init() code with pllflag = 1, to see whether they do the same thing?

    Regards, Eric

  • Vidya,

    I read through this series of posts.  I did not see an answer to Eric's question about additional boards.  How many boards did you produce?  How many have you tested?  Do they all have instability with DSP3?

    Tom

  • Hi Tom,

    We produced 10 boards as the initial release.

    All 10 boards have the DSP3 issue.

    The measured voltages and clocks are the same on all 10 boards.

  • We have configured the core for 1 GHz, and sysclk measures 166.6 MHz.

    The PLL code is the same in IBL and evm_init.

    The clock source is common to all the DSPs. We cannot understand how one DSP fails to work properly when the other DSPs work perfectly.
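
    The measured sysclk can be cross-checked against the configured core clock. The sketch below assumes that the system clock output is the main PLL output divided by 6, which matches the 166.6 MHz measured for a 1 GHz core but should be confirmed in the C6678 data manual; both function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: system clock output = core clock / 6. */
uint32_t sysclkout_khz(uint32_t core_khz)
{
    return core_khz / 6u;
}

/* Returns 1 when the measured value is within tol_khz of the expected one. */
int within_tolerance(uint32_t measured_khz, uint32_t expected_khz,
                     uint32_t tol_khz)
{
    uint32_t diff = measured_khz > expected_khz ? measured_khz - expected_khz
                                                : expected_khz - measured_khz;
    return diff <= tol_khz;
}
```

    For a 1000000 kHz core clock this gives 166666 kHz, so a measurement of 166600 kHz is within 100 kHz of the expected value. Under the divide-by-6 assumption, that points to the main PLL being programmed as intended on this DSP.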

  • Hi,

    As you have 10 boards and all of them show the DSP3 failure when running the same code, it is a hardware issue rather than a software one.

    Are you able to isolate this to a DDR PLL issue only? You indicated that reducing the DDR clock speed helped. I would suggest that you completely disable the DDR initialization during this test for DSP3, check the application memory map to make sure nothing is placed in DDR3, and see whether the test still crashes.

    Next, scope captures of the clocks going into a good and a bad DSP would also help to show whether there is any clock distribution issue into DSP3. Also, capture CORECLKOUT to verify that the clock PLL is programmed correctly.

    Regards, Eric
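
    One way to check the memory map is to classify each section address from the linker map file by region. The sketch below is illustrative only; the address ranges are assumptions taken from the C6678 memory map (local L2 at 0x00800000, MSMC SRAM at 0x0C000000, DDR3 from 0x80000000) and should be checked against the data manual, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    REGION_L2_LOCAL,
    REGION_MSMC,
    REGION_DDR3,
    REGION_OTHER
} region_t;

/* Classifies a 32-bit address. Ranges are assumed, see the text above. */
region_t classify_addr(uint32_t addr)
{
    if (addr >= 0x00800000u && addr <= 0x0087FFFFu) {
        return REGION_L2_LOCAL;   /* 512 KB local L2 SRAM */
    }
    if (addr >= 0x0C000000u && addr <= 0x0C3FFFFFu) {
        return REGION_MSMC;       /* 4 MB MSMC SRAM */
    }
    if (addr >= 0x80000000u) {
        return REGION_DDR3;       /* DDR3 data space */
    }
    return REGION_OTHER;
}

/* Counts how many of the given addresses fall in DDR3. */
size_t count_ddr3(const uint32_t *addrs, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (classify_addr(addrs[i]) == REGION_DDR3) {
            count++;
        }
    }
    return count;
}
```

    Applied to the dump in the first post, both the exception PC (0x80910000) and the stack pointer (0x8004eca0) classify as DDR3, so in that run code and stack were both in DDR. A DDR-free build should give a count of zero for every section start and end address in the map file.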

  • Hi Eric,

    As you suggested, we compared the clocks around DSP3 with those of a working DSP. The DSP3 clocks seem good, and we didn't find any voltage drop during the failure case.

    Since the other DSPs work fine from the same source input, we are not able to narrow down the cause of the DSP3 failure.

    What could be the reason for the DSP3 DDR being unstable at 666.6 MHz?

    Could you advise on this?

  • Vidya,

    Have you proven that the failure is related to DDR on DSP3?  Eric mentioned that you moved the code to MSMC and it still failed.  Can you re-run this test where you also skip the DDR initialization to prove that your application is absolutely not using DDR?  If we can prove that the intermittent fault is due to the DDR implementation, that is a different path for debug.

    Tom

  • Tom,

    We are able to run a simple application without crashing when using MSMC RAM and excluding DDR memory.

    Our actual application requires a large amount of memory, as it tests all interfaces (SRIO, PCIe, Ethernet) and their bandwidth.

    We even reduced this application to the micro level, testing one interface at a time, i.e., testing only SRIO and excluding PCIe and Ethernet. Even this crashes.

    I suspected that the memory cannot keep up with continuous write/read cycles, so I tried a lower DDR clock and the application worked fine.

    Both the DSP and the DDR chip support 400, 533, 666, and 800 MHz, and the same DDR chip is used for all the DSPs.

  • Vidya,

    OK, that does indicate that the problem is probably in the DDR layout.  This also correlates with the result that DSP3 fails on all 10 boards while the other 3 run robustly.  The KeyStone I DDR3 interface bring-up Application Report (SPRACL8) provides step-by-step guidance and links to other documents and tools to help commission a robust DDR interface.  Please provide a length report for each DSP showing that the routing rules have been met.  We will need to see the PHY_CALC sheet for each DSP as well.

    Tom

  • Hi Tom,

    I am a colleague of Vidya and worked on the DDR3 interface. I have attached the length-matching spreadsheet for all DSPs and the DDR3_PHY spreadsheet for DSP3 to this mail.

    DDR3_Length_Match.xls

    DSP3_DDR3_PHY.xlsx

    Regards,

    Avinash N

  • Avinash,

    Can you provide the same for one of the working DSPs?

    Tom

  • Hi Tom,

    I have attached the DSP1 DDR3 PHY spreadsheet to this mail for your reference.

    DSP1_DDR3_PHY.xlsx

    Regards,

    Avinash N

  • Avinash,

    From your spreadsheets, it appears that you have the length matching and the PHY CALC sheets populated correctly.  The next thing to check is the trace width and spacing rules, making sure that there are no nearby circuits coupling into the DDR area and that the routes have proper reference planes and decoupling for return currents, where needed.  You will need to review the layout for these issues.

    Tom

  • Hi Tom & Eric,

    I have forwarded your suggestions to our team.

    Our hardware and design teams are looking into the DDR area.

    We will update you on this.

    Thanks for the timely response. 

  • Vidya, Avinash,

    Please keep us posted on your progress.

    Tom

  • Vidya, Avinash,

    I recommend that we close this thread.  If you have feedback in the next few weeks, you can post to this thread and it will re-open.  If it has locked, you can open a linked thread.

    Tom

  • Hi Tom & Eric,

    We designed a custom board with a TMS320C6678 (1 GHz) and DDR3 (data rate 800 MHz). We customized the GEL file and tested the DDR3, and it was working fine. The EVM IBL is designed for 1333 MHz, and because of this the IBL crashes.

    How do we change the IBL for an 800 MHz DDR3 data rate (i.e., a 400 MHz clock)?

    Regards,

    Avinash N
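
    Because DDR3 transfers data on both clock edges, an 800 data rate corresponds to a 400 MHz clock, and timing parameters that are programmed in clock cycles have to be recomputed for the new clock period. The sketch below shows only that arithmetic; the function names are hypothetical and the 13750 ps figure in the note is a placeholder, not a value from any DDR3 datasheet used in this thread.

```c
#include <stdint.h>

/* DDR clock in kHz for a given data rate in MT/s (two transfers per clock). */
uint32_t ddr3_clock_khz_from_rate(uint32_t data_rate_mtps)
{
    return data_rate_mtps * 1000u / 2u;
}

/* Clock period in picoseconds, rounded down. Returns 0 for a zero clock. */
uint32_t clock_period_ps(uint32_t clock_khz)
{
    return clock_khz ? (uint32_t)(1000000000ull / clock_khz) : 0u;
}

/* Number of whole clock cycles needed to cover a timing of t_ps
 * (rounded up). Returns 0 for a zero period. */
uint32_t ps_to_cycles(uint32_t t_ps, uint32_t period_ps)
{
    return period_ps ? (t_ps + period_ps - 1u) / period_ps : 0u;
}
```

    For example, a data rate of 800 gives a 400000 kHz clock with a 2500 ps period, while 1333 gives 666500 kHz and about 1500 ps. A placeholder timing of 13750 ps then needs 6 cycles at the slower clock but 10 cycles at the faster one, so settings computed for 1333 generally cannot be reused unchanged at 800.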

  • Hi,

    I saw you opened another E2E: https://e2e.ti.com/support/processors/f/791/t/907992. I followed up there and I'm closing this thread.

    Regards, Eric