CC2652R: Coordinator dropping inaccessible devices

Part Number: CC2652R
Other Parts Discussed in Thread: CC2530, Z-STACK

Hi guys,

I would like to understand the default behavior of a Zigbee network when it comes to devices with a weak signal and, as a result, a lot of failed messages.

Let's say you have a battery-powered sleepy device which reports once an hour and auto-polls every 10 seconds.

In my experience, everything works fine when the device is placed near the coordinator and there are no communication failures.

But when I move the device to a place with bad signal coverage, I see the following behavior:

- The end device is able to report to the coordinator. Some reports may not be delivered, but even if, e.g., the report after one hour is lost, the next one after two hours is accepted by the coordinator.

- The end device polls for new messages every 10 seconds. The coordinator is set to expire messages after 30 seconds. When the coordinator tries to deliver a message to the end device and the delivery fails, perhaps several times in a row, it seems to decide to ignore that end device. Once this happens, the coordinator refuses to send anything to the end device, even after the end device is moved to a better location. Reports from the device, on the other hand, are still received by the coordinator without problems. The only thing that helps is restarting the end device, after which everything starts working again (I'm not sure why; maybe because of a re-join?).

So I'm not sure what happens in such cases. 

- Does the coordinator remove such problematic devices from its neighbor list after some number of failures?

- Is there any way to prevent such behavior?

- Is there anything that can be configured on the end device and/or coordinator?

I'm using a CC2652R with zigbee2mqtt as the coordinator and, e.g., a CC2530 as a sleepy end device with Z-Stack 1.2, nwk_auto_poll with a polling interval of 10 seconds. Power management is set to battery and POWER_SAVING is set.

I have also noticed this behavior with some commercial end devices.

It seems to me that when the coordinator encounters some number of communication failures, it removes the end device from the neighbor list and after that doesn't even try to deliver anything to it. Only when the end device re-joins the network (e.g. after the batteries are removed) is the table refreshed and the coordinator able to communicate with the device again.

Can I do anything to force the coordinator NOT to give up on a device even if there is a temporary communication problem?

Thank you.

  • Can you provide a sniffer log to illustrate your issue?

  • Thanks for the fast reply :) I will try to sniff and collect some logs. For now I'm not even sure whether it is the coordinator that refuses messages from the end device, or the end device that stopped polling. I was hoping to get a better understanding of how it should work under such circumstances. What is an end device supposed to do by default when there is a weak connection to the coordinator and no router around? Should it stop polling or keep trying forever? What should the coordinator do? Should it remove such a device from its neighbor list or just wait? Which settings are relevant on the end-device side and which on the coordinator side?

  • When an end device's polls fail, or it cannot get a MAC ACK when sending messages, it enters the orphan state and starts rejoining the network. A Zigbee HA 1.2 or Zigbee 3.0 device that cannot rejoin the network backs off for a while and then repeats the same loop until it joins back.
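
To make the loop described above concrete, here is a minimal sketch of it as a small state machine in plain C. This is not Z-Stack code: the state names, `ed_init`, `ed_step`, and the three thresholds are hypothetical and chosen only for illustration. The real stack drives this from its own events and configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thresholds, for illustration only (not Z-Stack defaults). */
#define ED_MAX_POLL_FAILURES   3u  /* failed polls before going orphan   */
#define ED_MAX_REJOIN_ATTEMPTS 2u  /* failed rejoins before backing off  */
#define ED_BACKOFF_TICKS       4u  /* ticks to stay quiet during backoff */

/* Hypothetical states; they do not mirror Z-Stack's devStates_t. */
typedef enum { ED_JOINED, ED_ORPHAN, ED_BACKOFF } ed_state_t;

typedef struct {
    ed_state_t state;
    uint8_t    poll_failures;   /* consecutive failed polls            */
    uint8_t    rejoin_attempts; /* failed rejoins since last backoff   */
    uint8_t    backoff_ticks;   /* remaining ticks in the backoff wait */
} ed_ctx_t;

void ed_init(ed_ctx_t *c)
{
    assert(c != NULL);
    c->state = ED_JOINED;
    c->poll_failures = 0;
    c->rejoin_attempts = 0;
    c->backoff_ticks = 0;
}

/* Advance by one tick. link_ok says whether this tick's poll or rejoin
 * exchange with the parent succeeded. Returns the new state. */
ed_state_t ed_step(ed_ctx_t *c, bool link_ok)
{
    assert(c != NULL);
    switch (c->state) {
    case ED_JOINED:
        if (link_ok) {
            c->poll_failures = 0;
        } else if (++c->poll_failures >= ED_MAX_POLL_FAILURES) {
            c->state = ED_ORPHAN;          /* parent considered lost */
            c->rejoin_attempts = 0;
        }
        break;
    case ED_ORPHAN:
        if (link_ok) {
            c->state = ED_JOINED;          /* rejoin succeeded */
            c->poll_failures = 0;
        } else if (++c->rejoin_attempts >= ED_MAX_REJOIN_ATTEMPTS) {
            c->state = ED_BACKOFF;         /* stop trying for a while */
            c->backoff_ticks = ED_BACKOFF_TICKS;
        }
        break;
    case ED_BACKOFF:
        if (--c->backoff_ticks == 0) {
            c->state = ED_ORPHAN;          /* restart the rejoin loop */
            c->rejoin_attempts = 0;
        }
        break;
    }
    return c->state;
}
```

Note that in this sketch a link that never recovers keeps the device cycling between the orphan and backoff states indefinitely; it only returns to the joined state once a rejoin exchange actually succeeds.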

  • Thanks. I would expect that behaviour. In one of the older threads in this forum I found the following:

    Since you use a Z-Stack 3.0 coordinator, it will age out a device which doesn't poll for a long time. When the aged-out device tries to send a message to the coordinator, the device won't be on the association list and the coordinator will ask the device to leave and rejoin.

     

    What would happen if my Z-Stack 1.2 end device does not poll for a "longer" time because of a poor-quality connection, and my 3.0-based coordinator then marks it as "aged out" and asks it to leave? What happens to my ZED when it has only the default implementation, with no custom handling of any network state? Would the device re-join automatically when asked to leave, or does it need to implement such a mechanism manually? Thank you.

  • Hi Peter,

    Please refer to the Child Management Sections of the Z-Stack User's Guide.  If the END_DEV_TIMEOUT_VALUE is surpassed, the ZC will remove the device from its association table but not delete its information from the network.  It will also attempt to send the end device a leave request with rejoin enabled.  Thus the ZED is entirely able to rejoin the network via a separate ZR parent or the ZC once it regains a better connection.  Being in an orphan state, it will attempt to rejoin the network in the manner described by YK.

    Regards,
    Ryan
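
The child-aging behaviour described above can be sketched roughly as follows. This is a simplified model and not the Z-Stack implementation: `child_entry_t`, `child_age`, and `parent_on_frame` are hypothetical names. The index-to-seconds mapping (index 0 means 10 seconds, index n from 1 to 14 means 2^n minutes) is my understanding of the Zigbee end device timeout enumeration and should be treated as an assumption to check against your stack version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed enumeration: 0 -> 10 s, n in 1..14 -> 2^n minutes. */
uint32_t timeout_index_to_seconds(uint8_t index)
{
    assert(index <= 14);
    if (index == 0) {
        return 10u;
    }
    return ((uint32_t)1 << index) * 60u;
}

/* Hypothetical, simplified association table entry on the parent. */
typedef struct {
    uint16_t short_addr;
    uint32_t last_heard_s;   /* time of the last frame from the child */
    uint8_t  timeout_index;  /* timeout enumeration index             */
    bool     associated;     /* still in the association table?       */
} child_entry_t;

typedef enum {
    PARENT_ACCEPT,                 /* child known: handle the frame     */
    PARENT_SEND_LEAVE_WITH_REJOIN  /* child aged out: ask it to rejoin  */
} parent_action_t;

/* Periodic aging check. Returns true if the child was aged out by
 * this call, i.e. removed from the association table. */
bool child_age(child_entry_t *c, uint32_t now_s)
{
    assert(c != NULL);
    if (!c->associated) {
        return false;
    }
    if (now_s - c->last_heard_s > timeout_index_to_seconds(c->timeout_index)) {
        c->associated = false;
        return true;
    }
    return false;
}

/* Called when a frame (poll or data) arrives from the child. */
parent_action_t parent_on_frame(child_entry_t *c, uint32_t now_s)
{
    assert(c != NULL);
    if (c->associated) {
        c->last_heard_s = now_s;   /* any frame acts as a keepalive */
        return PARENT_ACCEPT;
    }
    return PARENT_SEND_LEAVE_WITH_REJOIN;
}
```

In this model, a sleepy device whose polls stop reaching the parent for longer than the timeout is dropped from the table, and its next frame is answered with a leave request with rejoin rather than being silently ignored.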

  • Hi guys. Thanks a lot for your answers. It would be great if it worked the way you described, but it seems that as soon as my end device gets into the orphan state it does not try to re-join. That only happens when I restart it by removing the batteries. Otherwise it just sends its periodic, timer-triggered reports, but it is not able to receive anything (write commands). Well, I need to sniff and debug. It's quite strange, since I didn't tweak anything unusual.

    Thanks a lot for your inputs.

  • You could also try debugging ZDApp.h to determine whether the device enters the DEV_NWK_ORPHAN state and thus begins periodic ZDO_REJOIN_BACKOFF events.

    Regards,
    Ryan
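
If attaching a debugger is awkward on a battery device, one option is to keep a small in-RAM trace of network state transitions and inspect it later. Below is a minimal sketch of such a trace. All names here (`state_trace_t`, `trace_record`, and the two `APP_STATE_*` values) are hypothetical; in real firmware you would pass in whatever state value your application receives from the stack on a state change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_DEPTH 8u

/* Illustrative values only; use the stack's own state values. */
#define APP_STATE_END_DEVICE 1u
#define APP_STATE_ORPHAN     2u

typedef struct {
    uint8_t  state;    /* state that was entered        */
    uint32_t time_ms;  /* timestamp of the transition   */
} trace_entry_t;

typedef struct {
    trace_entry_t entries[TRACE_DEPTH]; /* ring buffer of transitions */
    uint32_t      count;                /* total transitions recorded */
    uint8_t       last_state;
} state_trace_t;

void trace_init(state_trace_t *t, uint8_t initial_state)
{
    assert(t != NULL);
    t->count = 0;
    t->last_state = initial_state;
}

/* Record a transition; repeated notifications of the same state are
 * ignored. Returns 1 if an entry was stored, 0 otherwise. */
int trace_record(state_trace_t *t, uint8_t new_state, uint32_t now_ms)
{
    assert(t != NULL);
    if (new_state == t->last_state) {
        return 0;
    }
    trace_entry_t *e = &t->entries[t->count % TRACE_DEPTH];
    e->state = new_state;
    e->time_ms = now_ms;
    t->count++;
    t->last_state = new_state;
    return 1;
}

/* Count how many of the stored entries entered the given state. */
uint32_t trace_count_state(const state_trace_t *t, uint8_t state)
{
    assert(t != NULL);
    uint32_t stored = (t->count < TRACE_DEPTH) ? t->count : TRACE_DEPTH;
    uint32_t n = 0;
    for (uint32_t i = 0; i < stored; i++) {
        if (t->entries[i].state == state) {
            n++;
        }
    }
    return n;
}
```

Calling `trace_count_state(&trace, APP_STATE_ORPHAN)` after a test run then shows whether the device ever entered the orphan state, and the timestamps show how far apart the transitions were.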

  • Hi guys, after some investigation I found that the problem was caused by signal quality so poor that the end device was not able to get out of the orphan state. Probably it was not even possible to complete the communication needed to re-join the coordinator or find a new parent. I've fixed some antenna problems, and it seems that the device is back on track and behaves as expected, even in situations with a temporarily weak signal. Thanks a lot for the support. At least it helped me to understand what to expect and to narrow down the problem.