ZED cannot rejoin the ZC after the ZC is "restarted" frequently

Other Parts Discussed in Thread: Z-STACK

Hi all,

There are five ZEDs (three with power saving enabled, two without), and all of them have NV_RESTORE enabled.

After they have joined the ZC, I "restart" the ZC.

By "restart" I mean executing the code below. (Disabling NV_RESTORE on the ZC and power-cycling it may also be enough to trigger the issue.)

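    // Clear both the saved network state and the configuration on the next boot.
    uint8 pBuf = ZCD_STARTOPT_CLEAR_STATE | ZCD_STARTOPT_CLEAR_CONFIG;
    (void)znp_nv_write(ZCD_NV_STARTUP_OPTION, 0, 1, &pBuf);
    // Reset the ZNP so the new startup option takes effect.
    zapPhyReset(zapAppPort);
    return;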
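
For readers following along, here is a stand-alone sketch of what the one-byte startup option written above encodes. The bit values are an assumption based on the usual ZComDef.h definitions (ZCD_STARTOPT_CLEAR_CONFIG = 0x01, ZCD_STARTOPT_CLEAR_STATE = 0x02), so check your own headers. The helper name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; verify against ZComDef.h in your Z-Stack release. */
#define STARTOPT_CLEAR_CONFIG 0x01u /* restore default configuration items */
#define STARTOPT_CLEAR_STATE  0x02u /* discard the saved network state     */

/* Hypothetical helper: true when the option byte clears both the
 * configuration and the network state, so the device boots as if it
 * had never formed or joined a network. */
static bool startopt_is_full_reset(uint8_t opt)
{
    const uint8_t both = STARTOPT_CLEAR_CONFIG | STARTOPT_CLEAR_STATE;
    return (opt & both) == both;
}
```

With both bits set, the ZC ignores whatever NV_RESTORE saved and forms a new network, which is why the ZEDs have to find it again.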

After six test runs, one of the POWER_SAVING ZEDs could no longer rejoin the network (the others were fine).

After the fifth "restart", that ZED had short address 0xE059.

After the ZC finished the sixth "restart", 0xE059 could not rejoin the ZC any more.

I saw a lot of Leave commands from the ZC to 0xE059, with both the rejoin bit and the request bit set to 1 in the command options.
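
As a reading aid for the sniffer log, a minimal sketch of how the options byte of a NWK Leave command can be decoded. The bit positions follow my reading of the Zigbee NWK specification (bits 0-4 reserved, bit 5 rejoin, bit 6 request, bit 7 remove children); the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* NWK Leave command options (assumed layout, check the Zigbee spec). */
#define LEAVE_OPT_REJOIN          0x20u /* device should rejoin after leaving */
#define LEAVE_OPT_REQUEST         0x40u /* sender asks the receiver to leave  */
#define LEAVE_OPT_REMOVE_CHILDREN 0x80u /* receiver's children leave as well  */

typedef struct {
    bool rejoin;
    bool request;
    bool remove_children;
} leave_options_t;

/* Split the options byte into its three flags. */
static leave_options_t decode_leave_options(uint8_t opt)
{
    leave_options_t o;
    o.rejoin          = (opt & LEAVE_OPT_REJOIN) != 0;
    o.request         = (opt & LEAVE_OPT_REQUEST) != 0;
    o.remove_children = (opt & LEAVE_OPT_REMOVE_CHILDREN) != 0;
    return o;
}
```

Under that layout, a Leave with request = 1 and rejoin = 1 decodes from an options byte of 0x60: the ZC is telling the ZED to leave and then rejoin.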

Here is the log: 1234.rar

(PS: search for 12324; that is where the issue begins.)

I waited a long time, but this ZED did not rejoin the network.

So I powered the ZED off and on again, but it still could not rejoin.

Then I turned on permit join on the ZC, and the ZED rejoined successfully with a new short address, 0xBB05.

So what is the cause of this issue?

Why can the other ZEDs rejoin successfully?

BR!

 

  • What version of Z-Stack are you using and did you change any compilation flags on the coordinator or the end devices? I tried reproducing your issue on 4 (1 ZC, 3 ZED) devices with Z-Stack Home 1.2.2a + all default flags except with NV_RESTORE and NV_INIT on all devices, 2 end devices with POWER_SAVINGS and 1 without, but I did not experience the same problems as you. Every time I reset ZC + cleared NV, all ZEDs successfully rejoined the ZC on a new PAN with the same short addresses.

    If you can be more specific about your setup I can try to reproduce the issue again.
  • Z-Stack Home 1.2.2a

    ZC (using the ZNP project) compile flags:

    ZC_Complie_flag.rar

    ZED with power saving, compile flags:

    ZED_POWSAVING_Complie_flag.rar

    ZED without power saving, compile flags:

    ZED_WithoutPSAVING_Complieflag.rar

    Has the bug described below been fixed in Z-Stack Home 1.2.2a?

    https://e2e.ti.com/support/wireless_connectivity/zigbee_6lowpan_802-15-4_mac/f/158/t/275570

  • Hi JasonB,
    Do you know why the ZC is sending the Leave command?
  • Do you have child aging configured? It is disabled by default, but check your values of zgChildAgingEnable and associated variables in ZGlobals.c
  • Yes.

    It is enabled on both the ZC and the ZEDs.

    Does it matter?

    Why would the ZC send a Leave command to the ZED while the ZED is sending a rejoin request?

     

  • If a ZED gets aged out for whatever reason and it tries to poll for data, ZC will send it a leave command because the ZED is no longer in the ZC association table. Do you need to have child aging enabled in your application?
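
    The behavior described above can be modeled with a small stand-alone sketch. This is not Z-Stack code: the table layout and function names are hypothetical, and it only illustrates the rule "a data poll from an address that is no longer in the association table is answered with a Leave".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t short_addr;
    bool     in_use;
} child_entry_t;

typedef enum { POLL_SERVE_DATA, POLL_SEND_LEAVE } poll_action_t;

/* Parent side: decide how to answer a data poll from `src`. */
static poll_action_t on_data_poll(const child_entry_t *table, size_t n,
                                  uint16_t src)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].in_use && table[i].short_addr == src) {
            return POLL_SERVE_DATA; /* still a known child */
        }
    }
    return POLL_SEND_LEAVE; /* aged out or never joined */
}

/* Child aging: drop the entry for `addr`, if present. */
static void age_out(child_entry_t *table, size_t n, uint16_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].in_use && table[i].short_addr == addr) {
            table[i].in_use = false;
        }
    }
}
```

    In this model, a child such as 0xE059 is served normally until `age_out` removes it; its next poll then yields `POLL_SEND_LEAVE`.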

  • But why doesn't the ZED join back? If child aging is enabled and the parent asks the device to leave because it has already aged out of the association list, the device should join back.
  • Yes, I do!

    If we disable child aging, how can the ZC know that a node is no longer there?

    BR!

  • Hi JasonB,

    Is there any update on this issue?

    Last night I set up 15 ZEDs in the house, and I found that 8 of them disconnected from the ZC and could not rejoin the network any more.

    When I turned on permit join, those ZEDs could rejoin the network again.

    But NV_RESTORE is enabled on the ZC and on all the ZEDs. Why can some ZEDs not rejoin until the ZC turns on permit join?

    In addition, those 8 disconnected ZEDs were in a "race condition".


  • Hi Yikai,

    Have you encountered this issue? I think it always happens in the race condition.
    Please refer to the reply below!

    Anyway, I still don't know why the ZC keeps sending the route request; it makes the network very busy at times.

    While it is sending the route request, I see data requests from the ZEDs being retransmitted, which easily puts a ZED into the orphan state.


    BR!

  • What do you mean by race condition?
  • This fixed a similar issue for me on a previous version of the stack.

    http://e2e.ti.com/support/wireless_connectivity/zigbee_6lowpan_802-15-4_mac/f/158/p/208581/1337128#1337128

    https://e2e.ti.com/support/wireless_connectivity/zigbee_6lowpan_802-15-4_mac/f/158/t/383622

    I have migrated to 1-2-2a Home but don't know yet if the modification to ZDApp_NetworkInit() is still required.

  • I mean at the "edge" of the ZC's network,

    where the RSSI value is lower...

  • Did you try the modification to ZDApp_NetworkInit that Andrew suggested?
  • No, not yet!

    But I see that ZDApp_NetworkInit in HA 1.2.2a does not contain that code.

    Does that mean the new version forgot to include this patch?

    Or has it already been applied in the kernel?

    So I am unsure whether adding that code will cause new problems!

    BR!
  • Some TI patches are erratic, and I have no idea whether TI really put this patch into its kernel. So I suggest you apply that modification to Z-Stack Home 1.2.2a and test again.
  • Hi JasonB,

    I found that when the ZC does the "restart" described above, a ZED can sometimes rejoin, but a ZR cannot rejoin at all!

    And when the ZR is disconnected, the ZEDs that are children of this ZR cannot link to the ZC any more.

    But when I power off the ZR, those ZEDs can link to the ZC directly!


    Anyway, I have a question:

    When more than 20 ZEDs are linked directly to the ZC, how can I delete some of them to make room for new ZEDs to join?

    Or can I only wait for child aging to time out before I can add a new ZED?


    BR!

  • You could change the compile option NWK_MAX_DEVICE_LIST from its default value of 20, or you could write your own routine in your application to manually remove associated devices, using the functions declared in AssocList.h
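
    To illustrate the second option, here is a stand-alone model of a fixed-size association table where removing one entry makes room for a new joiner. It deliberately does not use the real AssocList.h functions: every name below is hypothetical, and the capacity is set to 3 only to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVICE_LIST 3 /* stand-in for NWK_MAX_DEVICE_LIST */

typedef struct {
    uint16_t short_addr;
    bool     in_use;
} assoc_slot_t;

/* Add a device; fails when every slot is already taken. */
static bool assoc_add(assoc_slot_t *t, uint16_t addr)
{
    for (size_t i = 0; i < MAX_DEVICE_LIST; i++) {
        if (!t[i].in_use) {
            t[i].short_addr = addr;
            t[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Remove a device by short address; returns true if it was present. */
static bool assoc_remove(assoc_slot_t *t, uint16_t addr)
{
    for (size_t i = 0; i < MAX_DEVICE_LIST; i++) {
        if (t[i].in_use && t[i].short_addr == addr) {
            t[i].in_use = false;
            return true;
        }
    }
    return false;
}
```

    In a real application the removal policy (which child to evict, and when) is the part you would have to design yourself.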

    In regards to your original problem, I have still been unsuccessful in reproducing the same problem on our devices. Did you attempt to implement the changes proposed by Andrew above? If not, please try them out and see if you notice any changes in the behavior of your network.

  • OK, I will give it a try!

    But has it really not been applied in the kernel of HA 1.2.2a?
    I am afraid that adding that code will cause unpredictable issues!


    I also still have a question about how the ZR and the ZED re-link to the ZC.


    With the same "restart" on the ZC as above:

    1. Why can a ZED rejoin the network successfully and send messages to the ZC (most of the time)?

    2. Why can a ZR no longer link to the ZC?


    I know the ZC's association list is deleted by the "restart",
    but why do the ZR and the ZED behave differently?


    BR!
  • Cetri's patch is applied to ZDApp, which I believe is not part of the Z-Stack kernel. With NV_RESTORE enabled, a ZED sends a beacon request to rejoin its original network after a restart. A ZR with NV_RESTORE enabled, however, does not send a beacon request; it simply resumes its previous state after a restart. I think that is why you see different behavior.
  • Hi JasonB,

    I found something strange.

    This time I did not make the ZC "restart" frequently; I just left it running.

    The ZEDs send an IEEE request to the ZC every 45 s, and 16 ZEDs are linked to the ZC.

    Over a 24-hour test, I saw 16 PAN IDs in the log, although there is actually only one ZC.

    Log: CHANGE PANID.rar

    I am curious why the ZC changes its PAN ID so frequently even though NV_RESTORE is enabled. The log above shows the PAN ID changes.

    Look at packet 337 in the log: Ubiqua shows it as Reserved, and I have no idea what this message means.

    Is it related to the issue of ZEDs not being able to rejoin the network until the ZC turns on permit join?

    Also, I always see a beacon in response when the ZED sends a beacon request, but the ZED cannot rejoin until the ZC turns on permit join!

    BR!

  • Hi JasonB,

    Could you please look at the log below, from packet 13647 to 14716.

    At 13647, the ZED (0xF5E8) becomes an orphan node and tries several times to rejoin. Each time there is a successful rejoin response, but the ZED still sends a beacon request and then another rejoin request, which the ZC does not answer, so the ZED goes back to sending beacon requests and the cycle repeats. When I turned on permit join on the ZC, the ZED rejoined successfully with a new short address (0xF2F4).

    SDFF.rar

          

  • In response to packet #337, that appears to be junk data picked up by your packet sniffer, unrelated to your own network. You can see the RSSI (link quality) is very low, which means that whatever is sending that signal is much farther away from your sniffer than your network devices. This is normal, I see packets like that in my own testing occasionally.

    The PANID list above only shows 1 network containing more than 1 device, PAN 0x9456. My guess is that the other networks you see listed in your sniffer log are present from junk data packets picked up by your sniffer, similar to packet #337 from above. In that CHANGE PANID log you can see that packet #337 corresponds to PAN 0x65E9, and there are no other packets corresponding to that PAN in that log. Do you have the log for the image you posted above with all the different PANIDs?
  • Can you try adding the random-key loop shown below to zgPreconfigKeyInit() and test again to see whether the ZED can still rejoin the ZC after the ZC restarts with ZCD_STARTOPT_CLEAR_STATE | ZCD_STARTOPT_CLEAR_CONFIG?

    static uint8 zgPreconfigKeyInit( uint8 setDefault )
    {
      uint8 zgPreConfigKey[SEC_KEY_LEN];
      uint8 status;

      // Initialize the Pre-Configured Key to the default key
      osal_memcpy( zgPreConfigKey, defaultKey, SEC_KEY_LEN );

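      // Generate a random key and assign it to the pre-configured key.
      // osal_rand() returns 16 bits, so fill two bytes per iteration; writing
      // a uint16 at every byte offset would overrun the buffer at the last index.
      for ( uint8 i = 0; i < SEC_KEY_LEN; i += 2 )
      {
        uint16 r = osal_rand();
        zgPreConfigKey[i]     = LO_UINT16( r );
        zgPreConfigKey[i + 1] = HI_UINT16( r );
      }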
    ...
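
    For anyone who wants to try the key-filling idea outside the stack, here is a minimal stand-alone sketch. It assumes an even key length and uses the C library's rand() as a stand-in for osal_rand(); rand() is not a cryptographic source, and the function name is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Fill `key` with pseudo-random bytes, two bytes per 16-bit random value.
 * `len` is expected to be even (16 for a Zigbee network key). The loop
 * never writes at or beyond key[len]. */
static void fill_random_key(uint8_t *key, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2) {
        /* rand() stands in for osal_rand(); keep only the low 16 bits. */
        uint16_t r = (uint16_t)(rand() & 0xFFFF);
        key[i]     = (uint8_t)(r & 0xFF); /* low byte  */
        key[i + 1] = (uint8_t)(r >> 8);   /* high byte */
    }
}
```

    Stepping by two bytes keeps every write inside the key buffer, which a 16-bit store at each single-byte offset would not.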

  • Hi JasonB,

    Thank you very much. So the PAN ID changes are not related to this issue.

    Last night I found something new.

    ===> 1 <===

    As you know, HA 1.2.2a uses a new child aging mechanism.

    Normally I see an Orphan Notification followed by a Coordinator Realignment.
    But when the issue of the ZED not rejoining happens, I see an Orphan Notification without a Coordinator Realignment!

    (PS: the link quality is good, so there is no need to ask whether the ZED is far from the ZC.)

    Q1:

    If the ZC fails to send the Coordinator Realignment, would that cause the ZED to disconnect from the network?

    Q2:

    What would cause the ZC to fail to send the Coordinator Realignment?

    ===> 2 <===

    While the ZED was sending beacon requests and could not rejoin the network,

    I compared the beacons sent by the ZC and found that they differ before and after permit join is turned on.

    Before turning on permit join, the beacon's association permit bit is 0.

    After turning on permit join, the beacon's association permit bit is 1.

            

     

    Q3:

    Setting aside why the ZED disconnects even though the link quality is good:

    why does the ZC send beacons with the association permit bit cleared? (That prevents the ZED from rejoining the network directly.)

    This is why, as I told you, every time this issue happens I turn on permit join and then the ZED can rejoin.

    So it seems that NV_RESTORE does not work! Is that right?

    Q4:

    But this only happens on some ZEDs (sometimes ZED1, sometimes ZED2; it is random); the others are not affected.

    As far as I can see, the beacon request carries no device-specific information, no matter which ZED sends it.

    So how does the ZC know for which beacon request it should set the association permit bit and for which it should not?
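
    For comparing the two beacons, a small sketch of how the association permit bit can be read from the 16-bit superframe specification field of an IEEE 802.15.4 beacon. The layout used here (bit 14 = PAN coordinator, bit 15 = association permit) is my reading of the 802.15.4 MAC specification; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Superframe specification bits (assumed layout, check IEEE 802.15.4). */
#define SF_PAN_COORDINATOR    0x4000u /* bit 14 */
#define SF_ASSOCIATION_PERMIT 0x8000u /* bit 15 */

/* True when the beacon sender currently allows association (permit join). */
static bool sf_association_permit(uint16_t sf)
{
    return (sf & SF_ASSOCIATION_PERMIT) != 0;
}

/* True when the beacon was sent by the PAN coordinator (the ZC). */
static bool sf_pan_coordinator(uint16_t sf)
{
    return (sf & SF_PAN_COORDINATOR) != 0;
}
```

    Under this layout the bit describes the sender's own permit-join state at the time the beacon is sent, so every device that hears the same beacon sees the same value.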

     

    BR!

  • Here is the log: orphan&coordinator realignment issue.rar

    At item 6509,

    you can see a ZED send an orphan notification with no coordinator realignment in response; it then switches to sending beacon requests.

    I let it scan all night, and it never managed to rejoin.

    At item 145638,

    I turned on permit join, and it rejoined successfully.

    BR!

  • Did you apply the code that generates a random key and assigns it to the pre-configured key on your ZC when you captured the sniffer log in orphan&coordinator realignment issue.rar?
  • Generate a random key and assign it to the pre-configured key on the ZC?
    Sorry, I have no idea about that.

    What do you mean?


    The ZC is running the ZNP project!
  • I suspect the ZC always uses the same network key after a factory reset, and that is why a ZED can still join the fresh Zigbee network formed after resetting the ZC. So I gave you that code to change the network key every time the ZC forms a new network. I suggest you verify this first. I expect that a ZED should not be able to join a fresh Zigbee network formed by resetting the ZC.
  • Oh, sorry, I missed your post.

    OK, I will try it, but could you please tell me what that code does? Did you find something in the log?

    Q1:

    I found that the ZED sends an orphan notification and gets neither a coordinator realignment nor an ACK. It then starts sending beacon requests, but it cannot rejoin the network, even though a beacon comes back from the ZC, until I turn on permit join on the ZC side.

    Please refer to the log below, item 6059:

    6443.orphan&coordinator realignment issue.rar

    Q2:

    I compared the ZC's beacons and found the following:

    ZEDs that rejoin successfully: they can rejoin the network even when the beacon does not have association permit set.

    ZEDs that fail to rejoin: they only keep sending beacon requests if the beacon does not have association permit set.

    So I think the problem may be on the ZED side (the rejoin part); it seems the ZED does not realize that it has already joined the network.

    When it receives a beacon without association permit set, it does not send a rejoin request.

    So should we write the NIB to NV again ourselves when the device joins the network?

    BR

  • The code changes the network key every time the ZC forms a new network. That way, I expect a ZED should not be able to join a fresh Zigbee network formed by resetting the ZC.
  • Hi Andrew,
    Adding that code did not solve the problem.

    It now seems that the ZED did not save the NIB when it joined the network.

    When it loses the network, it cannot rejoin, even though it gets a beacon in response to its beacon request.

    BR!
  • Hi Yikai,

    With that code, the ZED can no longer join the network after the ZC does the "restart" described above.

    But the serious problem now is that the ZED cannot rejoin the network even though it sends a beacon request and gets a beacon in response.

    It seems that the ZED lost the NIB information, or never saved it after joining.

    BR!
  • Hi JasonB,

    I tested this all night with the ZEDs in their normal setup (just joining the network and sending a message to the ZC every 45 s).

    This morning I found that more than 5 ZEDs had lost the network and could not rejoin, even though there is a beacon in response to each beacon request.

    When this issue happens (the ZED sends beacon requests), there is no Coordinator Realignment and no ACK. See packets:

    2588,

    10699,

    11111,

    44195,

    43516,

    Log is here:

    But I still have a question about rejoin.

    A ZED that has joined should rejoin the network automatically, but now this function seems to fail.

    Is there a bug in the state machine?

    Anyway, I am worried about the NWK_PARENT_INFO_ORPHAN_NOTIFICATION mode.

    If I change the child aging mode to NWK_PARENT_INFO_MAC_DATA_POLL,

    how should I configure the ZED?

    Do I need to set the ZED's poll rate to a particular value?

    Are there any other settings?

    BR!

  • Hi JasonB,

    When this issue happens, I use debug mode to inspect the ZED's state.
    I set a breakpoint in ZDO_beaconNotifyIndCB()

    and found that pNwkDesc differs from pBeacon, so "found" is set to false.


    Then I found that the ZED never enters this block:

    // check if this device is a better choice to join...
    // ...dont bother checking assocPermit flag is doing a rejoin
    if ( ( pBeacon->LQI > gMIN_TREE_LQI ) &&
    ( ( pBeacon->permitJoining == TRUE ) || ( _tmpRejoinState ) ) )
    {

    }


    As I see in the log, the LQI is good and greater than gMIN_TREE_LQI, and permitJoining is FALSE.

    So it depends on the value of _tmpRejoinState.

    But I searched all of ZAPP.c and did not find anywhere that sets _tmpRejoinState to FALSE.

    Why can't it enter the "if" block above?
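
    To make the dependency explicit, here is the quoted condition rewritten as a stand-alone predicate. It is only a model of the `if` shown above, with a made-up function name and plain parameters in place of pBeacon, gMIN_TREE_LQI and _tmpRejoinState.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the quoted check: a beacon is considered only when its LQI
 * exceeds the minimum AND either the sender permits joining or the
 * device is currently in a rejoin attempt. */
static bool beacon_is_join_candidate(uint8_t lqi, uint8_t min_lqi,
                                     bool permit_joining, bool rejoin_state)
{
    return (lqi > min_lqi) && (permit_joining || rejoin_state);
}
```

    With a good LQI and permitJoining FALSE, the predicate is true only when the rejoin flag is set, so skipping the block is consistent with _tmpRejoinState being zero at the moment the beacon is processed.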

    BR!