CC2530: All devices lost in Network after Power cycle - ZIGBEE-LINUX-SENSOR-TO-CLOUD_1.0.1

Part Number: CC2530
Other Parts Discussed in Thread: Z-STACK

In the log corresponding to seriallog.txt.9, I performed several resets on the CC2530 in order to try to recover from a non-responsive situation.

I only managed to recover after a power cycle, but then the network was empty, although several devices were associated before I cycled the power.
=> What is the reason?

Further, while performing hardware resets using the reset of the CCDebugger connected to the CC2530, I noticed that some communication seemed to come back from the CC2530 (seriallog.txt.9), but the gateway did not manage to start with that. I do not yet know what to look for in the serial communication to tell whether the CC2530 responds sufficiently for the gateway to consider it alive.

'seriallog.txt.10' corresponds to the communication after the power cycle, and the Zigbee gateway starts up, but all the devices are gone!
=> Why did they disappear?

I am trying to understand the causes so that I can implement countermeasures in deployed setups.

PowerCycleNeeded.zip

  • Hi,

    le_top said:
    I had several devices associated

    How many is several devices?

    What Z-Stack configurations do you have on the CC2530 ZNP ? Is this the default or what modifications have been made?

    This is a relevant app note: 

    Regards,

    Toby

  • Thanks for the application note, I did read it.

    Regarding the number of devices associated, DbHistoryFull/20201108_170430.DbDeviceInfo.csv in the provided zip file shows the following valid entries:

    Device Database
    IeeeAddress,NwkAddress,Status,MfgId,EP_Count,ParentIeeeAddress,CapInfo
            CC:CC:CC:FF:FE:3A:2F:33 , 0x69F6 , 0x01, 0x1246 , 1, 00:12:4B:00:10:22:82:77, 0x00
            00:0D:6F:00:0E:8A:75:6F , 0xE58D , 0x01, 0x1163 , 1, 00:12:4B:00:10:22:82:77, 0xBE
            00:12:4B:00:01:DD:7A:D7 , 0xE6A4 , 0x01, 0x0000 , 3, 00:12:4B:00:10:22:82:77, 0x0E
            00:12:4B:00:01:DD:78:72 , 0x3F73 , 0x01, 0x0000 , 3, 00:12:4B:00:10:22:82:77, 0x0E
            70:B3:D5:B1:30:01:01:1E , 0x92A4 , 0x00, 0x1234 , 1, 00:12:4B:00:10:22:82:77, 0xBE
            00:15:8D:00:04:7B:83:69 , 0x46F4 , 0x00, 0x1037 , 1, 00:12:4B:00:10:22:82:77, 0xBE
    

    zigbeegw.20201108210510.log confirms 0 devices after the power cycle (I did not include that database state):

    [2020-11-08 21:17:21.494,917] [NWK_MGR/LSTN] PKTBODY:                                                          cmdId = NWK_GET_DEVICE_LIST_REQ
    [2020-11-08 21:17:21.495,333] [NWK_MGR/LSTN] MISC1  : Processing Get Device List Request.
    [2020-11-08 21:17:21.497,396] [NWK_MGR/LSTN] MISC1  :  found 0 Device Records

    I am using Koenkk's build (from GitHub) for recent trials.

    I have carefully reviewed the source code changes for that build, applied the missing corrections proposed on the dedicated page, and compiled that version as well. I'll be using that self-built version again after applying the fix for the GP-related startup issue.
    I'll do so when less of my attention is consumed by the gateway. I intend to set up an environment where I can easily launch the IAR debugger in case the ZNP seems to be failing.

  • It looks like the database is actually opened successfully (in zigbeegw.20201108211647):

        [2020-11-08 21:17:18.516,840] [NWK_MGR/MAIN] MISC1  : Opened Database 'DbDeviceInfo.csv', 0 records
        [2020-11-08 21:17:18.523,423] [NWK_MGR/MAIN] MISC1  : Opened Database 'DbEndpoints.csv', 0 records

    These files are deleted when a hard reset is performed (nwkmgrsrv.c: processNwkZigbeeSystemResetReq --> nwkMgrDatabaseReset). Hard reset will also clear any NV items on the ZNP (in zstackpb.c: processSysResetReq --> znpReset), so communication between this ZNP (post-reset) and the previous network will be nullified.

  • Thank you Toby

    However, I did not do a hard_reset - I would not have reported this in that case.

    I've added extra information regarding the RESET_IND messages from the device. I grepped the log for these, as well as for the NWK_MGR startup lines that reveal whether a startup was normal, reset_soft or reset_hard, to confirm that I did not mistakenly do a hard reset.

    Here is the result of that grep:

    $ egrep -a -ir '(reset_|NETWORK_MGR)' DevicesLostAfterPowerCycle20201108_211721.log
    [2020-11-08 17:49:30] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 17:49:30 UTC 2020[2020-11-08 17:49:30.894,980] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 17:49:34] Startup phase 3 completed successfully, server started (NETWORK_MGR_PID=783) on Sun Nov 8 17:49:34 UTC 2020
    [2020-11-08 21:08:43] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:08:43 UTC 2020[2020-11-08 21:08:43.921,159] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:08:48] Startup phase 3 completed successfully, server started (NETWORK_MGR_PID=3790) on Sun Nov 8 21:08:47 UTC 2020
    [2020-11-08 21:09:41.630018] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:10:01.184318] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:10:20.795598] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:10:39.181625] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:10:48] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:10:48 UTC 2020[2020-11-08 21:10:48.410,194] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:10:52] Startup phase 3 completed successfully, server started (NETWORK_MGR_PID=4297) on Sun Nov 8 21:10:52 UTC 2020
    [2020-11-08 21:11:07.777859] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:11:16] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:11:16 UTC 2020[2020-11-08 21:11:16.495,576] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:11:32.120617] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:11:39] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:11:39 UTC 2020[2020-11-08 21:11:39.832,457] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:11:43] Startup phase 3 completed successfully, server started (NETWORK_MGR_PID=4509) on Sun Nov 8 21:11:43 UTC 2020
    [2020-11-08 21:11:59.263805] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:12:18.203730] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:12:38.325682] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:12:46] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:12:46 UTC 2020[2020-11-08 21:12:46.230,491] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:12:50] Startup phase 3 completed successfully, server started (NETWORK_MGR_PID=4786) on Sun Nov 8 21:12:50 UTC 2020
    [2020-11-08 21:13:05.284661] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:13:23.461206] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:13:43.344988] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:14:03.374505] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:14:23.085424] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:14:43.006739] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:15:00.239011] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:15:09] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:15:09 UTC 2020[2020-11-08 21:15:09.614,768] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:15:23.961940] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:16:59.770980] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:17:08.781509] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:17:18] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF ' on Sun Nov 8 21:17:18 UTC 2020[2020-11-08 21:17:18.316,405] [NWK_MGR/MAIN] INFO   :  there are 2 args
    [2020-11-08 21:17:22] Startup phase 3 completed successfully, server started (NETWORK_MGR_PID=1007) on Sun Nov 8 21:17:22 UTC 2020
    [2020-11-08 22:31:48.852474] > [SYS/AREQ] **RESET_IND** PWRON(0) Protocol 2 ProductID 2 SWVer 2.7.2     (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 00 02 02 02 07 02 c0
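
One way to check from the raw serial bytes whether the CC2530 is answering is to decode the RESET_IND frames in the grep output above. The following is a minimal Python sketch, not part of the gateway: the function names are hypothetical, the field labels (Protocol, ProductID, SWVer) simply follow the decoded log lines, and only the two reset reasons that appear in this log are mapped. It assumes the usual ZNP serial framing (0xFE start byte, length, two command bytes, payload, XOR checksum), which is consistent with both frame variants shown in the log.

```python
from functools import reduce

SOF = 0xFE
# Only the reasons that appear in the log of this thread; anything else is
# reported as UNKNOWN(n) rather than guessed.
RESET_REASONS = {0x00: "PWRON", 0x01: "EXTRST"}


def parse_mt_frame(hex_bytes):
    """Parse one frame given as space-separated hex, e.g. 'fe 06 41 80 ...'."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if len(raw) < 5 or raw[0] != SOF:
        raise ValueError("not a ZNP frame (missing 0xFE start byte)")
    length = raw[1]
    if len(raw) != length + 5:
        raise ValueError("length byte does not match frame size")
    # The checksum is the XOR of every byte between the start byte and the
    # checksum itself (length, both command bytes and the payload).
    fcs = reduce(lambda a, b: a ^ b, raw[1:-1], 0)
    if fcs != raw[-1]:
        raise ValueError("bad checksum")
    return {"cmd0": raw[2], "cmd1": raw[3], "payload": raw[4:-1]}


def decode_reset_ind(hex_bytes):
    """Decode a SYS RESET_IND frame (command bytes 0x41 0x80)."""
    frame = parse_mt_frame(hex_bytes)
    if (frame["cmd0"], frame["cmd1"]) != (0x41, 0x80):
        raise ValueError("not a RESET_IND frame")
    reason, protocol, product, major, minor, maint = frame["payload"]
    return {
        "reason": RESET_REASONS.get(reason, "UNKNOWN(%d)" % reason),
        "protocol": protocol,
        "product_id": product,
        "sw_version": "%d.%d.%d" % (major, minor, maint),
    }
```

Calling `decode_reset_ind("fe 06 41 80 01 02 02 02 07 02 c1")` on the frame logged after each CCDebugger reset gives the reason EXTRST and version 2.7.2, so at least the boot indication of the ZNP is intact in those cases.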

    The following two other startups show how reset_soft and reset_hard appear in the log when they do happen (extracted from logs a day earlier):

    zigbeegw.20201107152825.log:[2020-11-07 15:39:57] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF --reset_soft ' on Sat Nov 7 15:39:56 UTC 2020[2020-11-07 15:39:57.109,756] [NWK_MGR/MAIN] INFO   :  there are 3 args
    zigbeegw.20201107152825.log:[2020-11-07 15:39:57.132,977] [NWK_MGR/MAIN] INFO   :  option reset_soft selected
    zigbeegw.20201107152825.log:[2020-11-07 15:42:04] starting NETWORK_MGR, cmd ' ./NWKMGR_SRVR_arm 127.0.0.1:2536 -v 0xFFFFFFFF --reset_hard ' on Sat Nov 7 15:42:04 UTC 2020[2020-11-07 15:42:04.282,096] [Z_STACK/READ] INFO   : Received 1 bytes, subSys 0x45, cmdId 0xC0
    zigbeegw.20201107152825.log:[2020-11-07 15:42:04.395,891] [NWK_MGR/MAIN] INFO   :  option reset_hard selected

    For convenience, I am adding a log with more automatically decoded information, combined with some logs from earlier runs.

    DevicesLostAfterPowerCycle20201108_211721.zip

  • In addition to my previous reply, I conclude that it is not the Power Cycle that resulted in losing the device list.

    It all happened in under 5 minutes:

    ****** This details the startup sequence where the gateway/ZNP went from 6 devices to 0
    **** At this time, the NWK_MGR still finds 6 devices
    [2020-11-08 21:12:53.207,117] [NWK_MGR/LSTN] MISC1  : NWK_DEVICE_LIST_MAINTENANCE_REQ
    [2020-11-08 21:12:53.207,216] [NWK_MGR/LSTN] MISC1  :  on all
    [2020-11-08 21:12:53.207,665] [NWK_MGR/LSTN] MISC1  :  found 6 Device Records
    **** But "Startup phase 4 failed":
    [2020-11-08 21:12:54] Startup phase 4 failed
    **** So the gateway servers are started again; there are multiple restarts.
    **** I could not see anything suggesting that the Device Records would be lost.
    [2020-11-08 21:13:05.284661] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:13:23.461206] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:13:43.344988] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:14:03.374505] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:14:23.085424] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:14:43.006739] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:15:00.239011] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:15:23.961940] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:16:59.770980] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    ***** The next reset eventually leads to complete startup:
    [2020-11-08 21:17:08.781509] > [SYS/AREQ] **RESET_IND** EXTRST(1) Protocol 2 ProductID 2 SWVer 2.7.2    (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 01 02 02 02 07 02 c1
    [2020-11-08 21:17:30] when we see something missing we will send a SIGUSR2 to pid 560
    *****
    ***** But between the reset and full startup the NWKMGR reports 0 Device records.
    [2020-11-08 21:17:21.478,576] [NWK_MGR/LSTN] MISC1  :  found 0 Device Records
    
    ***** Which seems to happen before the POWER_ON RESET that is reported later.
    ***** So it does not seem to be triggered by the Power Cycle at all:
    [2020-11-08 22:31:48.852474] > [SYS/AREQ] **RESET_IND** PWRON(0) Protocol 2 ProductID 2 SWVer 2.7.2     (SYS:1/TYPE:40/CMD:80) ( 11)::fe 06 41 80 00 02 02 02 07 02 c0
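
The manual timeline above can also be reproduced with a small script. This is a rough Python sketch under stated assumptions: the function names are hypothetical, and the regular expressions only cover the three timestamp styles and the two message types ("found N Device Records" and RESET_IND) quoted in this thread. It sorts the events by time, which matters because the combined log is not strictly chronological, and reports the first point where the device count drops.

```python
import re
from datetime import datetime

# Timestamp styles seen in this thread:
# "[... 21:12:54]", "[... 21:12:53.207,665]" and "[... 21:13:05.284661]".
TS = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.(\d{3}),?(\d{3}))?\]")
COUNT = re.compile(r"found (\d+) Device Records")
RESET = re.compile(r"\*\*RESET_IND\*\* (\w+)\(\d+\)")


def extract_events(lines):
    """Return (timestamp, kind, value) tuples sorted by time."""
    events = []
    for line in lines:
        m = TS.search(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if m.group(2):
            stamp = stamp.replace(microsecond=int(m.group(2) + m.group(3)))
        count = COUNT.search(line)
        reset = RESET.search(line)
        if count:
            events.append((stamp, "count", int(count.group(1))))
        elif reset:
            events.append((stamp, "reset", reset.group(1)))
    return sorted(events)


def first_drop(events):
    """Return (last_good, first_bad, resets_between), or None if no drop."""
    last_good = None
    resets = []
    for stamp, kind, value in events:
        if kind == "reset":
            resets.append((stamp, value))
        elif last_good is not None and value < last_good[1]:
            return last_good, (stamp, value), resets
        else:
            last_good = (stamp, value)
            resets = []
    return None
```

Feeding the log lines to `first_drop(extract_events(lines))` returns the last good count, the first lower count, and the reset reasons seen in between, which makes it easy to confirm whether a PWRON reset occurred before the device records disappeared.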

  • This question has been stale for over a week; I suppose that nobody at TI is going to look any further into this.

  • The CSV database parser has a bug that shows up when a parse error occurs: at best it resulted in a memory access exception, and on most launches it went undetected.

    I fixed the parser and the CSV file and my systematically empty list was filled again ;-).

    I am not sure that this is the root cause for the case mentioned in this thread, but in case something like this happens again, I know where to look.
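
As a possible countermeasure for deployed setups, the database file could be sanity-checked before the gateway servers are launched, so that a malformed row is reported instead of silently ending in an empty device list. The following Python sketch is only an illustration and not the fix made in the gateway's C parser: the function name is hypothetical, and the expected layout (a "Device Database" title line, the header line, then seven comma-separated fields per row) is assumed from the DbDeviceInfo.csv excerpt quoted earlier in this thread.

```python
import re

EXPECTED_FIELDS = 7  # IeeeAddress .. CapInfo, per the header quoted in this thread
IEEE = re.compile(r"^(?:[0-9A-Fa-f]{2}:){7}[0-9A-Fa-f]{2}$")
HEX = re.compile(r"^0x[0-9A-Fa-f]+$")


def check_device_db(text):
    """Split database text into (good_rows, problems) without raising."""
    good, problems = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if (not stripped or stripped == "Device Database"
                or stripped.startswith("IeeeAddress")):
            continue  # blank line, title line or column header
        fields = [f.strip() for f in stripped.split(",")]
        if len(fields) != EXPECTED_FIELDS:
            problems.append(
                (number, "expected %d fields, got %d" % (EXPECTED_FIELDS, len(fields))))
        elif not IEEE.match(fields[0]) or not IEEE.match(fields[5]):
            problems.append((number, "malformed IEEE address"))
        elif (not all(HEX.match(fields[i]) for i in (1, 2, 3, 6))
                or not fields[4].isdigit()):
            problems.append((number, "malformed numeric field"))
        else:
            good.append(fields)
    return good, problems
```

A startup script could call `check_device_db(open("DbDeviceInfo.csv").read())` and, if the problem list is not empty, keep a copy of the file and log the offending line numbers before starting NWK_MGR.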