Z-Stack 3.0.2 bug report (devices can never join the ZC)

Other Parts Discussed in Thread: Z-STACK, CC2538, CC2530

Hi Ryan:

Previously mentioned topic: https://e2e.ti.com/support/wireless-connectivity/zigbee-and-thread/f/158/p/767378/2838938?tisearch=e2e-sitesearch&keymatch=TP2_LEGACY_ZC#2838938

As we all know, some customers want Zigbee HA 1.2 devices to join a standard Zigbee 3.0 network formed by a ZC. Zigbee HA 1.2 and 3.0 are not fully compatible on the ZC side, so customers have to define the macro TP2_LEGACY_ZC whenever they build the ZC project. This can lead to a typical failure in which devices are never able to join the network; it looks as if the ZC does not process "Update Device" correctly. I resolved this issue recently, and the following code is my workaround.

/****************************************************************************
* @fn          bdb_TCProcessJoiningList
*
* @brief       Process the timer to handle the joining devices if the TC link
*              key is mandatory for all devices
*
* @param       none
*
* @return      none
*/
void bdb_TCProcessJoiningList(void)
{
  bdb_joiningDeviceList_t* tempJoiningDescNode;
  
  if(bdb_joiningDeviceList)
  {
    tempJoiningDescNode = bdb_joiningDeviceList;
  
    while(tempJoiningDescNode)
    {
      if(tempJoiningDescNode->NodeJoinTimeout)
      {
        tempJoiningDescNode->NodeJoinTimeout--;
      }
      
      if(tempJoiningDescNode->NodeJoinTimeout == 0)
      {
        //Check if the key exchange is required
        if(bdb_doTrustCenterRequireKeyExchange())
        {
          AddrMgrEntry_t entry;
          entry.user = ADDRMGR_USER_DEFAULT;
          osal_memcpy(entry.extAddr,tempJoiningDescNode->bdbJoiningNodeEui64, Z_EXTADDR_LEN);
          
          if(AddrMgrEntryLookupExt(&entry))
          {
            ZDSecMgrAPSRemove(entry.nwkAddr,entry.extAddr,tempJoiningDescNode->parentAddr);
          }
          
          //An expired device is either a legacy device that does not use the TCLK
          //entry or was removed from the network because of the timeout. Either way
          //it uses neither the TCLK entry nor the Security user in the address
          //manager, so free the entry in both tables.
          
          uint16 keyNvIndex;
          uint16 index;        
          APSME_TCLKDevEntry_t TCLKDevEntry;
          uint8 found;
          
          //Remove the entry in address manager
          ZDSecMgrAddrClear(tempJoiningDescNode->bdbJoiningNodeEui64);
          
          //search for the entry in the TCLK table
          keyNvIndex = APSME_SearchTCLinkKeyEntry(tempJoiningDescNode->bdbJoiningNodeEui64,&found, NULL);
          
          //If found, erase it.
          if(found == TRUE)
          {
            osal_memset(&TCLKDevEntry,0,sizeof(APSME_TCLKDevEntry_t));
            TCLKDevEntry.keyAttributes = ZG_DEFAULT_KEY;
            
            //Convert the NV item ID into an index into the frame counter table
            //and reset the counters for this entry
            index = keyNvIndex - ZCD_NV_TCLK_TABLE_START;
            
            TCLinkKeyFrmCntr[index].rxFrmCntr = 0;
            TCLinkKeyFrmCntr[index].txFrmCntr = 0;
            
            //Update the entry
            osal_nv_write(keyNvIndex,0,sizeof(APSME_TCLKDevEntry_t), &TCLKDevEntry );
          }
          
          if(pfnTCLinkKeyExchangeProcessCB)
          {
            bdb_TCLinkKeyExchProcess_t bdb_TCLinkKeyExchProcess;
            osal_memcpy(bdb_TCLinkKeyExchProcess.extAddr,tempJoiningDescNode->bdbJoiningNodeEui64, Z_EXTADDR_LEN);
            bdb_TCLinkKeyExchProcess.status = BDB_TC_LK_EXCH_PROCESS_EXCH_FAIL;
            
            bdb_SendMsg(bdb_TaskID, BDB_TC_LINK_KEY_EXCHANGE_PROCESS, BDB_MSG_EVENT_SUCCESS,sizeof(bdb_TCLinkKeyExchProcess_t),(uint8*)&bdb_TCLinkKeyExchProcess);
          }
        }
       
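        //Save the next node before freeing the current one, since the node
        //must not be dereferenced once it has been released
        {
          bdb_joiningDeviceList_t* nextJoiningDescNode = tempJoiningDescNode->nextDev;

          //Free the device from the list
          bdb_TCJoiningDeviceFree(tempJoiningDescNode);
          tempJoiningDescNode = nextJoiningDescNode;
          continue;
        }
      }
      tempJoiningDescNode = tempJoiningDescNode->nextDev;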
    }
  }

  //we are done with the list
  if(bdb_joiningDeviceList == NULL)
  {
    osal_stop_timerEx(bdb_TaskID,BDB_TC_JOIN_TIMEOUT);
  }
}
  • Hi,

    Thanks for sharing.

    We'll look into this and get back to you.

    Regards,
    Toby

  • Hi,

    I'm not entirely sure how this fixes the issue. Can you share how you reached this conclusion?

    From what I can see, you are bypassing the function call ZDSecMgrAddrClear(.) and the sequence of function calls that clears the TCLK from NV.

    I don't see how this affects whether or not the ZC sends (or tunnels) the NWK key.
    In both cases of "direct association" or "update device", the NWK key is sent by the ZC after the ZC calls ZDSecMgrDeviceJoin(.).
    To me, this is independent of bdb_TCProcessJoiningList(.). Each device in the list is given up to 1000 ms before bdb_TCProcessJoiningList(.) actually processes that device. The Transport Key should've happened before this time passes.

    Regards,
    Toby

  • Hi,

    Maybe Ryan could confirm whether this workaround works. Of course, you can also reproduce the issue yourself by following the previously mentioned topic. You only need three or four ZRs. Join the first ZR directly to the ZC; this succeeds. Then join the second ZR through the first ZR; Update Device and Transport Key work well. Then join the other ZRs one by one. You will find that the third and later ZRs can never join the ZC indirectly.

    According to the brief of bdb_TCAddJoiningDevice, this function is responsible for tracking whether a "Request Key" is issued when a standard Zigbee 3.0 device joins the network. Zigbee HA 1.2 devices, however, do not issue "Request Key" when they join, so the ZC does not need to care about the key exchange procedure for them and only needs to free the entry from the bdb_joiningDeviceList linked list when it times out (the NodeJoinTimeout variable reaches zero).
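
    To make the timeout handling described above concrete, here is a minimal, self-contained sketch of a joining-device list that is aged once per timer tick. It is a simplified model, not Z-Stack code: the type joining_node_t and the functions joining_list_add, joining_list_tick and joining_list_count are hypothetical names chosen for illustration, and the key/address cleanup is reduced to a counter. The key_exchange_required flag stands in for the result of bdb_doTrustCenterRequireKeyExchange(); an expired entry is always freed, and the extra cleanup only runs when that flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EXT_ADDR_LEN 8

/* Hypothetical stand-in for one tracked joining device (not the Z-Stack type). */
typedef struct joining_node {
    uint8_t eui64[EXT_ADDR_LEN];
    uint8_t timeout_ticks;          /* remaining ticks before the entry expires */
    struct joining_node *next;
} joining_node_t;

/* Add a device at the head of the list; returns false if allocation fails. */
bool joining_list_add(joining_node_t **head, const uint8_t *eui64, uint8_t ticks)
{
    assert(head != NULL && eui64 != NULL);

    joining_node_t *node = malloc(sizeof *node);
    if (node == NULL) {
        return false;
    }
    memcpy(node->eui64, eui64, EXT_ADDR_LEN);
    node->timeout_ticks = ticks;
    node->next = *head;
    *head = node;
    return true;
}

/* Age every entry by one tick and free the entries that have expired.
 * Returns how many expired entries went through the (placeholder) key and
 * address cleanup, which only happens when key_exchange_required is true. */
int joining_list_tick(joining_node_t **head, bool key_exchange_required)
{
    assert(head != NULL);

    int cleaned = 0;
    joining_node_t **link = head;   /* points at the pointer that refers to the current node */

    while (*link != NULL) {
        joining_node_t *node = *link;

        if (node->timeout_ticks > 0) {
            node->timeout_ticks--;
        }

        if (node->timeout_ticks == 0) {
            if (key_exchange_required) {
                cleaned++;          /* placeholder for removing keys and address entries */
            }
            *link = node->next;     /* unlink first, so the freed node is never read again */
            free(node);
        } else {
            link = &node->next;
        }
    }
    return cleaned;
}

/* Count the entries that are still being tracked. */
size_t joining_list_count(const joining_node_t *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next) {
        count++;
    }
    return count;
}
```

    In this model a legacy joiner that never requests a key simply ages out: with key_exchange_required set to false its entry is freed and nothing else is touched. The sketch also unlinks each node before freeing it, so the loop never reads a node after it has been released.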

    Screenshots and Ubiqua sniffer logs are in the attached archive:

    could not join network forever.7z

  • Ok, thanks for explaining and sharing your workaround.

    behold said:
    You only need three or four ZRs. Join the first ZR directly to the ZC; this succeeds. Then join the second ZR through the first ZR; Update Device and Transport Key work well. Then join the other ZRs one by one. You will find that the third and later ZRs can never join the ZC indirectly.

    Do you mean that it's necessary to have multiple ZR join indirectly through one ZR?
    In other words, would the network topology look something like:
    ZC --- ZR1 --- {ZR2, ZR3, ZR4, ...} ?
    Does Z-Stack 1.2.2a need to run on ZR2, ZR3, ZR4, ... ?

    behold said:
    According to the brief of bdb_TCAddJoiningDevice, this function is responsible for tracking whether a "Request Key" is issued when a standard Zigbee 3.0 device joins the network. Zigbee HA 1.2 devices, however, do not issue "Request Key" when they join, so the ZC does not need to care about the key exchange procedure for them and only needs to free the entry from the bdb_joiningDeviceList linked list when it times out (the NodeJoinTimeout variable reaches zero).

    Based on the logs, it seems the issue is that the ZC does not send the Transport Key after receiving Update Device. But for both an indirect join (Update Device) and a direct join, the Transport Key should be sent before bdb_TCProcessJoiningList processes the device.

  • Yes, the bug only affects indirect joins to a network formed by a Zigbee 3.0 ZC. The first indirect ZR (for example, ZR2 joining through ZR1) works well, but ZR3, ZR4, ... can never join the network.

  • The following is my setup:

    • ZC:
      • CC2538, Z-Stack 3.0.2
      • NWK_MAX_DEVICE_LIST == 1 (nwk_globals.h) (to enforce join through ZR1)
      • #define MAX_NEIGHBOR_ENTRIES    1 (to enforce join through ZR1)
      • -DTP2_LEGACY_ZC (f8wConfig.cfg)
    • ZR1:
      • CC2538, Z-Stack 1.2.2a
    • ZRn (n > 1):
      • CC2538, Z-Stack 1.2.2a

    With this setup, I was able to join multiple ZRn, such that network topology was ZC --- ZR1 --- {ZR2, ZR3, ZR4, ...}. The Transport Key was sent as expected to each ZR2, ZR3, ZR4.

    This setup was done with the default bdb.c file.
    Can you elaborate further on how you are able to reproduce the issue of ZR2, ZR3, ZR4, ... not being able to join (no Transport Key)?
    Can you retry the setup with out-of-box code, and erase all flash before loading the images onto the devices?

  • Please reproduce the issue with the ZC built with Z-Stack 3.0.2 on the CC253x platform with TP2_LEGACY_ZC defined, and ZR1, ZR2, ZR3, ... built with Z-Stack 3.0.2 on the CC2530 platform.

    Add the following macros to the znp.cfg file:

    -DMAC_CFG_TX_DATA_MAX=5
    -DMAC_CFG_TX_MAX=8
    -DMAC_CFG_RX_MAX=5
    
    /* www.ti.com/.../swra427c.pdf */
    -DSRC_RTG_EXPIRY_TIME=30
    -DCONCENTRATOR_ENABLE=TRUE
    -DCONCENTRATOR_DISCOVERY_TIME=60
    -DMAX_RTG_SRC_ENTRIES=10
    -DCONCENTRATOR_ROUTE_CACHE=1
    -DMTO_RREQ_LIMIT_TIME=5000
    -DLINK_DOWN_TRIGGER=12
    -DNWK_ROUTE_AGE_LIMIT=30
    -DDEF_NWK_RADIUS=15
    -DDEFAULT_ROUTE_REQUEST_RADIUS=10
    -DROUTE_DISCOVERY_TIME=13
    -DZDNWKMGR_MIN_TRANSMISSIONS=10
    -DNWK_LINK_STATUS_PERIOD=30
    -DNWK_MAX_DEVICE_LIST=10

  • The issue is related to the ZC (the ZR in your log sends the Update Device as expected).
    Even with those settings on the ZC, and using 3.0.2 ZRs, I am still able to join multiple routers (e.g. ZR2, ZR3, ZR4...) through ZR1 to ZC.

    Using default bdb.c, the steps are:

    1. Use your settings for ZC (except NWK_MAX_DEVICE_LIST=10; I limited this so that only ZR1 can join directly to ZC).
    2. Use out-of-box ZR1, ZR2, ...
    3. Join ZR1 to ZC
    4. Join ZRn (n > 1) to network. They will join through ZR1 since ZC only accepts one direct join.

    With this, I see all the ZRn join successfully through ZR1.

    Can you retry your setup (i.e. use default bdb.c, rebuild, erase flash on all devices, flash with the new images) to see if you still see the behavior?

  • Thanks for sharing your modified bdb.c. In addition to your previous changes, it seems you also commented out two calls of bdb_nwkSteeringDeviceOnNwk(.).
    I recommend that you retry with the default bdb.c from Z-Stack 3.0.2, attached here: 1538.bdb.c