NDK 2.0 crashes evm648

Hello,

I am evaluating the network throughput of NDK 2.0. I am seeing lots of crashes.

I  have already installed the DM648 patch.

 

My setup:

I connected my DM648 eval board and PC host to a 1 Gb NETGEAR switch.

I ran the client.prj sample code with Code Composer. After a network connection was established, I ran the winapp "recv 192.168.14.53" on my PC host.

The winapp stopped running after a while, and the DSP kernel seemed to crash and get stuck in a loop (see below):

.......--------Message from DOS window------------------

510 Requesting 46720 bytes...receive...passed ---23827200 bytes/s

520 Requesting 46720 bytes...receive...passed ---24294400 bytes/s

--------------------------------------------------------------------------------------------

 

I also tried running the iperf application and saw the same crashes.

I ran client.prj with Code Composer and ran "iperf -c 192.168.14.53 -p 1001 -n 81920000" from my PC host.

The DSP gets stuck in the same loop after I run that command 5 to 30 times.

My question: What causes the DSP code to stop working?

 

 

---------------------------DSP start up message ----------------

TCP/IP Stack 
colorMnmtNetBase NC_NetStart
Using MAC Address: 00-21-ba-2e-0a-12
cpsw_MDIO_Init
SetPhyMode:000021E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:8192 LPBK:0
cpsw_MDIO_Init
SetPhyMode:000021E1 Auto:1, FD10:64, HD10:32, FD100:256, HD100:128, FD1000:8192 LPBK:0
 EMAC should be up and running 
EMAC has been started successfully
Registeration of the EMAC Successful
cpsw_MDIO_FindingState: PhyNum: 0
cpsw_MDIO_FindingState: PhyNum: 1
cpsw_MDIO_PhYReset(0)
Enable Phy to negotiate external connection
NWAY Advertising: FullDuplex-1000 FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10 
cpsw_MDIO_PhYReset(1)
Enable Phy to negotiate external connection
NWAY Advertising: FullDuplex-1000 FullDuplex-100 HalfDuplex-100 FullDuplex-10 HalfDuplex-10 
Network Added: If-1:192.168.14.53
Service Status: Telnet   : Enabled  :          : 000
Service Status: HTTP     : Enabled  :          : 000
 Negotiated connection: FullDuplex 1000 Mbs
Link Status: 1000Mb/s Full Duplex on PHY 1

 

----------------------------DSP gets stuck in the following loop-------------------

E40FAFE4 1FFF3C12 ||         CALLP.S2      _HWI_disable (PC-1568 = 0xe40fa9c0),B3
E40FAFE8          C$L1:
E40FAFE8 0002A120            BNOP.S1       C$L1 (PC+8 = 0xe40fafe8),5
E40FAFEC 00000000            NOP          
E40FAFF0 00000000            NOP          
E40FAFF4 00000000            NOP          
E40FAFF8 00000000            NOP          
E40FAFFC 00000000            NOP          
E40FB000          CLK_TIMEFXN, _CLK_gethtime, CLK_F_gethtime:
E40FB000 000C0362            B.S2          B3
E40FB004 022803E2            MVC.S2        TSCL,B4
E40FB008 021018F0            OR.D1X        0,B4,A4
E40FB00C 00004000

-------

 

 

 

  • That loop looks like the "abort" loop. When BIOS finds a critical problem, it calls SYS_abort(). Can you look in your LOG window and see if there's an error message? If you are in CCSv3, you can get to this via the BIOS tools menu. Look at the LOG_system log. If you are using CCSv4, use the "ROV" tool, navigate to the LOG module, and look for LOG_system. The error string should give a hint at the source of the problem.

  • Hi Karl,

    Thanks for your reply.

    How do I enable LOG_system?

    I saw LogTrace and DVTEcent_Log, but there was no LOG_System log.

    From the "Execution Graph Data" window, I got the following error message:

     960461 SYS abort called with message '***LOCK NOT CALLED IN TSK CONTEXT'

    Any idea?

    Ludy

     

    960446 TSK: ready tskNdkStackTest (0xe40ff39c)

    960447 TSK: running tskNdkStackTest (0xe40ff39c)

    960448 SWI: end KNL_swi (TSK scheduler) (0xe40ff92c) state = done

    960449 SWI: begin KNL_swi (TSK scheduler) (0xe40ff92c)

    960450 SWI: end KNL_swi (TSK scheduler) (0xe40ff92c) state = done

    960451 SWI: begin KNL_swi (TSK scheduler) (0xe40ff92c)

    960452 SWI: end KNL_swi (TSK scheduler) (0xe40ff92c) state = done

    960453 SWI: begin KNL_swi (TSK scheduler) (0xe40ff92c)

    960454 SWI: end KNL_swi (TSK scheduler) (0xe40ff92c) state = done

    960455 SWI: begin KNL_swi (TSK scheduler) (0xe40ff92c)

    960456 SWI: end KNL_swi (TSK scheduler) (0xe40ff92c) state = done

    960457 SWI: begin KNL_swi (TSK scheduler) (0xe40ff92c)

    960458 TSK: blocked tskNdkStackTest (0xe40ff39c) on <unknown handle> SEM

    960459 TSK: running dynamic TSK (0xe40ff334)

    960460 SWI: end KNL_swi (TSK scheduler) (0xe40ff92c) state = done

    960461 SYS abort called with message '***LOCK NOT CALLED IN TSK CONTEXT'

     

  • Ludy --

    The "LOCK NOT CALLED IN TSK CONTEXT" error you show above is usually caused by an ISR or SWI function calling one of the RTS functions that is only allowed from thread context. BIOS registers lock/unlock functions with the RTS library. The RTS library has a few functions that are not reentrant and need this lock to keep multiple threads from corrupting some global data structures.

    The most common suspect is an ISR function calling printf(). printf() internally acquires a lock.
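
To illustrate the mechanism described above, here is a minimal host-side simulation in C. It is not DSP/BIOS code, and every name in it (current_ctx, rts_lock, locked_print, isr_safe_log, drain_log) is hypothetical. It only models the idea that a print routine needing a task-only lock fails when called from interrupt context, while a lock-free record buffer can be filled from an ISR and drained later from a task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the scheduler's notion of the current context. */
typedef enum { CTX_TASK, CTX_ISR } context_t;
context_t current_ctx = CTX_TASK;

/* Counts simulated aborts; a real system would call SYS_abort() and spin. */
int abort_count = 0;

/* Simulated RTS lock: only legal from task context. */
bool rts_lock(void)
{
    if (current_ctx != CTX_TASK) {
        abort_count++;          /* models "LOCK NOT CALLED IN TSK CONTEXT" */
        return false;
    }
    return true;
}

/* Simulated printf-like routine: must take the lock before touching stdout. */
bool locked_print(const char *msg)
{
    if (!rts_lock()) {
        return false;
    }
    fputs(msg, stdout);
    return true;
}

/* Lock-free log: a small ring of fixed-size records, safe to fill from an ISR. */
#define LOG_SLOTS   8
#define LOG_MSG_LEN 48
char log_ring[LOG_SLOTS][LOG_MSG_LEN];
unsigned log_head = 0;          /* next slot to write */
unsigned log_tail = 0;          /* next slot to read  */

void isr_safe_log(const char *msg)
{
    assert(msg != NULL);
    char *slot = log_ring[log_head % LOG_SLOTS];
    strncpy(slot, msg, LOG_MSG_LEN - 1);
    slot[LOG_MSG_LEN - 1] = '\0';
    log_head++;
    if (log_head - log_tail > LOG_SLOTS) {
        log_tail = log_head - LOG_SLOTS;   /* overwrite the oldest record */
    }
}

/* Called from task context: print buffered records, return how many were printed. */
int drain_log(void)
{
    int printed = 0;
    while (log_tail != log_head) {
        if (!locked_print(log_ring[log_tail % LOG_SLOTS])) {
            break;              /* wrong context: leave records queued */
        }
        log_tail++;
        printed++;
    }
    return printed;
}
```

In this model, calling locked_print() while current_ctx is CTX_ISR trips the abort counter, whereas isr_safe_log() never touches the lock and leaves the actual output to a later drain_log() call made from task context.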

    I am not familiar with the ethernet driver for DM648, but I did a quick review of the code and it looks like there might be some printf() calls that could be made from ISR context.

    If you are able to rebuild the ethernet driver, I would suggest changing the printf() calls in the driver to 'LOG_printf()' calls.  Or, you can place breakpoints on the error conditions and see if one of these is to blame.  Look for 'printf()' in the hal/evmdm648\ethss_dm648\*.c.  
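
As a rough sketch of what such a replacement might look like: DSP/BIOS LOG_printf() takes a LOG object, a constant format string, and at most two arguments, so a three-argument printf() such as the "Tx/Rx/FatalError" status message in ethdriver.c would need to be split into two calls. The DRV_LOG macro, the USE_DSPBIOS build flag, the report_dma_status() function, and the LOG object name trace are assumptions made for illustration and are not part of the driver. The #else branch is a host-only stand-in so the sketch can be compiled off-target.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_DSPBIOS                  /* hypothetical flag for the target build */
#include <std.h>
#include <log.h>
extern LOG_Obj trace;               /* assumption: a LOG object named "trace"
                                       exists in the BIOS configuration */
#define DRV_LOG(fmt, a0, a1) LOG_printf(&trace, (fmt), (a0), (a1))
#else
/* Host fallback: record the format pointer and two arguments per entry. */
typedef struct {
    const char *fmt;
    long a0;
    long a1;
} drv_log_rec;

#define DRV_LOG_MAX 16
drv_log_rec drv_log_buf[DRV_LOG_MAX];
size_t drv_log_count = 0;

void drv_log_put(const char *fmt, long a0, long a1)
{
    assert(fmt != NULL);
    if (drv_log_count < DRV_LOG_MAX) {
        drv_log_buf[drv_log_count].fmt = fmt;
        drv_log_buf[drv_log_count].a0 = a0;
        drv_log_buf[drv_log_count].a1 = a1;
        drv_log_count++;
    }
}
#define DRV_LOG(fmt, a0, a1) drv_log_put((fmt), (long)(a0), (long)(a1))
#endif

/* Was: printf("Tx: %d Rx: %d FatalError: %d \n", tx, rx, err);
 * Split into two records because each record carries only two arguments. */
void report_dma_status(int tx_pending, int rx_pending, int err_pending)
{
    DRV_LOG("Tx: %d Rx: %d", tx_pending, rx_pending);
    DRV_LOG("FatalError: %d", err_pending, 0);
}
```

Routing every driver message through a single macro like this keeps the change easy to review and easy to revert, and it avoids the RTS lock because the log call only stores a small record instead of formatting text.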

    The ones I suspect are in red below.  Can you place a breakpoint on those and see if you hit them?  And then see if this abort occurs soon after that?

    Regards,
    -Karl-

    [ethss_dm648]% grep printf *.c
    cpsw_miimdio.c:#define mdioPrintf printf
    csl_emac.c:        printf("Error in SetMacCfg\n");
    csl_emac.c:            printf("\nWARN: DDC_cpsw3gSetSwitchFlowControl: Invalid in port %d", hInPort->portNum);
    csl_emac.c:        printf("InitTx Channel : Unable to allocate %d BDs for channel %d.%d BDs already in use\n",
    csl_emac.c:        printf("InitTx Channel : Unable to allocate %d BDs for channel %d.%d BDs already in use\n",
    csl_emac.c:        printf("NetChOpen: Channel number invalid \n");
    csl_emac.c:        printf("NetChOpen: %s Channel %d already initialized\n",
    csl_emac.c:        printf("NetChOpen: Error in initializing %s channel %d",
    csl_emac.c:        printf("NetChOpen: Error enabling channel %d in %d direction\n",
    csl_emac.c:                         printf("\nERROR:cpdmaOpen: Error enabling channel %d", channel);
    ethdriver.c:    printf("Tx: %d Rx: %d FatalError: %d \n", Status.DmaStatus.txPending, Status.DmaStatus.rxPending, Status.DmaStatus.errPending);
    ethdriver.c:            printf("Since EEPROM MAC address is Zero we use th MAC Address = %02x-%02x-%02x-%02x-%02x-%02x\n",
    ethdriver.c:        printf("EMAC OPEN Returned error \n");
    ethdriver.c:    printf(" EMAC should be up and running \n");
    ethdriver.c:        printf("EMAC Close Returned error %08x\n",i);
    ethdriver.c:        printf("EMAC_setReceiveFilter Returned error %08x\n",i);
    ethdriver.c:            printf("EMAC_sendPacket() returned error %08x\n",i);
    ethdriver.c:                printf("Error configuring the EMAC Hardware: %d \n", retVal);
    ethdriver.c:        printf("Error setting up Tx/Rx Interrupts \n");
    nimu_eth.c:        printf ("EMAC has been started successfully\n");
    nimu_eth.c:        printf ("Error: Unable to allocate private memory data\n");
    nimu_eth.c:        printf ("Error: Unable to allocate memory for the EMAC\n");
    nimu_eth.c:        printf ("Error: Unable to register the EMAC\n");
    nimu_eth.c:    printf ("Registeration of the EMAC Successful\n");

  • Hi Karl,

    I couldn't set a breakpoint at any of the printf statements you mentioned.

    Was it because the library code was optimized?

    I can set breakpoints at the first line of the individual functions.

    FYI:

    The crash was fixed after I replaced all printf() calls with LOG_printf(...) in nimu_eth.c, ethdriver.c, and csl_emac.c and rebuilt the driver.

    This fixed the SYS_abort problem.

    Thanks,

    Ludy