CC2652R: Neighbor tables on power cycle

Grant China

Part Number: CC2652R
Other Parts Discussed in Thread: Z-STACK

Hello,

We are seeing a problem in a large network with 55 routers and a coordinator, all on the same Zigbee network and in close proximity. If we power off 5 routers, and leave the other routers and a coordinator running for several minutes, when we power on the 5 routers again, the 5 routers send a route discovery message to try and discover a route to the coordinator. None of the other routers rebroadcast the route request, and the coordinator does not send a route reply.

We're on a relatively old version of the SimpleLink CC13X2-CC26x2 SDK, version 4.20.104. We plan to move to a more current version, but that is not going to happen in the short term.

We have a theory about why this is happening and would appreciate any suggested fixes. We noticed that the link status messages from the 50 routers do not include the short addresses of the 5 that were powered off. Since the 5 routers are not in any neighbor tables, maybe the 50 routers do not rebroadcast the route discovery packets because they do not see these routers as neighbors. Similarly, the coordinator may not send a route reply because it doesn’t receive a route discovery broadcast from any neighbors it knows about. When it receives a route discovery broadcast from one of the 5 that were powered off, it doesn’t have their addresses in its own neighbor table either. (The 50 routers do rebroadcast the occasional route discovery commands sent from one of the 50, and the coordinator sends a route reply to these.) I confirmed the routers have the same PAN ID, network key, network key sequence number, etc. We let the network run for many minutes after powering on the 5. During that time, the link status messages from the other 50 routers did not seem to change at all.

Is this a known bug in the TI stack? Or is there a setting we could enable that would cause the stack to add late nodes into the neighbor table? They seem to be in a state where they are excluded from the rest of the network.

We are considering adding our own code to remove 1 or 2 entries from the neighbor table occasionally to help allow devices to get into the neighbor tables of other routers. But if there are any better solutions, or stack settings we are missing, we would appreciate any other direction.

Our neighbor table size is 16 entries, and our networks can scale up to over 100 routers, so we don’t want to just make the neighbor table large. We are also not interested in, and not able to support, many-to-one and source routing right now. (Long story.)

Thanks in advance,

Grant China, WattIQ Inc.

5 months ago

Ryan Brown1, 5 months ago

Hello Grant,

Thank you for giving the full description of your system's behavior. It may also be helpful to provide a sniffer log that demonstrates the issue. However, the behavior you've described is expected and possible within the boundaries of the Zigbee protocol and the Z-Stack network configuration. With a large number of routers in close proximity and source routing disabled, it is feasible that the routers' neighbor tables are being completely filled. In these instances, a route discovery from the rejoined router will not be rebroadcast, as there is no table space left to add more entries.

As you've already mentioned, it is possible to increase MAX_NEIGHBOR_ENTRIES to accommodate more neighbors. This is feasible, but you should be cautious about having too many routers in close proximity and causing broadcast storms. Is it possible to use ZEDs for some of these devices instead of ZRs? You could also reduce the value of GOOD_LINK_COST, also inside nwk_globals.h, to restrict the quality of incoming link status messages before that router is added to the local neighbor table. The nwk_util.h file contains several APIs you can use in your application to modify the neighbor table directly, if necessary.
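As a rough illustration of how a lower link-cost threshold thins out the neighbor table, here is a minimal C sketch. It assumes the threshold acts as an upper bound on the acceptable link cost (Zigbee link costs run from 1, best, to 7, worst). The names `sketch_admit_neighbor` and `sketch_count_admitted` are hypothetical and are not part of Z-Stack; the real admission logic lives inside the stack and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not Z-Stack code: a candidate neighbor is admitted
 * only if its link cost (1 = best, 7 = worst) does not exceed the threshold. */
bool sketch_admit_neighbor(uint8_t link_cost, uint8_t threshold)
{
    assert(link_cost >= 1u && link_cost <= 7u);
    return link_cost <= threshold;
}

/* Count how many candidates in `costs` would pass the given threshold. */
size_t sketch_count_admitted(const uint8_t *costs, size_t n, uint8_t threshold)
{
    assert(costs != NULL || n == 0u);
    size_t admitted = 0u;
    for (size_t i = 0u; i < n; i++) {
        if (sketch_admit_neighbor(costs[i], threshold)) {
            admitted++;
        }
    }
    return admitted;
}
```

With sample costs of 1, 1, 2, 3, 3, 5 and 7, a threshold of 3 admits five candidates, while a threshold of 1 admits only two, which leaves more free slots for nearby routers with strong links.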

Regards,
Ryan

Grant China, 5 months ago, in reply to Ryan Brown1

Thank you for your very helpful response! We are far enough in the current architecture that we will not be able to convert some of these to end devices. We were able to implement a simple algorithm to periodically delete one neighbor table entry if the table is full, and that seems to have solved our initial problem.
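For anyone following along, the idea can be sketched roughly as below. This is only an illustration against a stand-alone table, not our production code and not the Z-Stack neighbor table API. The struct, the `SKETCH_` names, and the choice to evict the entry with the worst link cost are all assumptions made for the example; which entry to drop is a policy decision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE   16u
#define SKETCH_INVALID_ADDR 0xFFFEu /* hypothetical "empty slot" marker */

/* Hypothetical stand-in for a neighbor table entry. */
typedef struct {
    uint16_t short_addr;
    uint8_t  link_cost; /* 1 = best, 7 = worst */
} sketch_neighbor_t;

/* Call periodically. If every slot is in use, free the slot holding the
 * worst link cost (first one on ties) and return the evicted address.
 * If any slot is already free, do nothing and return SKETCH_INVALID_ADDR. */
uint16_t sketch_evict_if_full(sketch_neighbor_t *table, size_t n)
{
    assert(table != NULL && n > 0u);
    size_t worst = 0u;
    for (size_t i = 0u; i < n; i++) {
        if (table[i].short_addr == SKETCH_INVALID_ADDR) {
            return SKETCH_INVALID_ADDR; /* table is not full */
        }
        if (table[i].link_cost > table[worst].link_cost) {
            worst = i;
        }
    }
    uint16_t evicted = table[worst].short_addr;
    table[worst].short_addr = SKETCH_INVALID_ADDR;
    table[worst].link_cost = 0u;
    return evicted;
}
```

Freeing one slot at a time keeps most existing links intact while still giving a returning router a chance to be added on its next link status exchange.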

We are now seeing a case where some packets seem to be dropped by the coordinator. As a quick background, we are running a test network with 30 routers. The routers send periodic packets to the coordinator, but these do not have APS acknowledgments enabled. (Without going into detail, enabling acknowledgements is not a desirable or needed solution right now.) The coordinator has a neighbor table size of 16 and a routing table size of 40.

In the failure case, one particular router is able to establish a route that hops through another router, and reaches the coordinator reliably. But for some reason, this router will sometimes start sending packets directly to the coordinator. When it routes directly to the coordinator, the coordinator sends a MAC acknowledgment, but the data never gets into the application code. It appears the coordinator’s stack is discarding the packet. When the router eventually routes data through the 1-hop router again, the coordinator application does receive the data.

I understand why the routing changes. When the router routes directly to the coordinator, the router’s link status broadcast includes the coordinator in the list of neighbors, so the coordinator is in the router’s neighbor table. When the router routes through another router, the router’s link status message shows the coordinator is not in the list of neighbors. This makes sense as I’m guessing the routing algorithm routes directly to neighbor nodes, or uses the routing table if the destination is not a neighbor.
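To make that guess concrete, here is a small C sketch of the next-hop selection I am assuming. It is a model of my understanding only, not the actual Z-Stack routing code; `sketch_next_hop`, `sketch_route_t`, and `SKETCH_NO_ROUTE` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_ROUTE 0xFFFEu /* hypothetical "no next hop known" marker */

/* Hypothetical routing table entry: destination and the hop toward it. */
typedef struct {
    uint16_t dest;
    uint16_t next_hop;
} sketch_route_t;

/* Assumed rule: if the destination is a neighbor, send to it directly;
 * otherwise fall back to the routing table. */
uint16_t sketch_next_hop(uint16_t dest,
                         const uint16_t *neighbors, size_t n_neighbors,
                         const sketch_route_t *routes, size_t n_routes)
{
    assert(neighbors != NULL || n_neighbors == 0u);
    assert(routes != NULL || n_routes == 0u);
    for (size_t i = 0u; i < n_neighbors; i++) {
        if (neighbors[i] == dest) {
            return dest; /* direct transmission to a neighbor */
        }
    }
    for (size_t i = 0u; i < n_routes; i++) {
        if (routes[i].dest == dest) {
            return routes[i].next_hop; /* routed via an intermediate hop */
        }
    }
    return SKETCH_NO_ROUTE;
}
```

Under this model, a router with a route to the coordinator (0x0000) via 0x1234 uses that hop until 0x0000 shows up in its own neighbor list, at which point it switches to sending directly, which matches what we see on the sniffer.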

So one big question we have is: what would cause the coordinator to drop the packet when it is received directly from the router? I confirmed the router is not in the coordinator’s neighbor table. Does the coordinator's stack drop direct packets if the sender is not in its neighbor table?

Also, what would cause the coordinator to get into the router’s neighbor table? We are not adding it in our application from what I can tell. Does the stack have code that tries to add the coordinator into a router’s neighbor table in some cases (rx broadcast packet, rx link status, etc)? If so, is there a way we can disable that code? It also appears that something is causing the coordinator’s address to be removed from the neighbor table (which allows the router to resume using the 1-hop route, which is successful). Maybe the stack sees the router is not in the coordinator’s link status messages and removes it?

Regards,

Grant

Ryan Brown1, 5 months ago, in reply to Grant China

Hi Grant,

You've identified the correct issue: the router link status views the coordinator as a valid neighbor and sends it a packet directly, whereas the coordinator link status does not view the router as a valid neighbor and thus drops the packet.

The Z-Stack routing layer does not treat the coordinator as a special case. Routing has been improved since v4.20, but you've explained that it isn't possible to migrate at the moment.

To resolve this issue, you can increase the neighbor table size of the coordinator or decrease the good link cost of the router. Alternatively, if you can detect the issue from the router application, you can send a route request from the router to the coordinator (or remove the coordinator from the router's neighbor table) and then try sending the packet again.
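A minimal sketch of the detection step might look like the following. It assumes the router application can somehow learn which neighbor addresses the coordinator listed in its most recent link status message; how to obtain that list is left open here, and every name in the block (`sketch_check_coordinator_link`, `sketch_action_t`) is hypothetical rather than a Z-Stack API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    SKETCH_SEND_AS_IS,               /* no one-way link detected */
    SKETCH_DROP_ENTRY_AND_REDISCOVER /* remove coordinator entry, rediscover
                                        the route, then resend the packet */
} sketch_action_t;

/* Decide what the router application should do before sending.
 * coord_link_status: addresses the coordinator advertised as neighbors
 * (assumed to be available to the application). */
sketch_action_t sketch_check_coordinator_link(bool coordinator_in_local_table,
                                              uint16_t my_addr,
                                              const uint16_t *coord_link_status,
                                              size_t n)
{
    assert(coord_link_status != NULL || n == 0u);
    if (!coordinator_in_local_table) {
        return SKETCH_SEND_AS_IS; /* traffic is not being sent directly */
    }
    for (size_t i = 0u; i < n; i++) {
        if (coord_link_status[i] == my_addr) {
            return SKETCH_SEND_AS_IS; /* link is known on both sides */
        }
    }
    /* The router lists the coordinator, but not the other way around. */
    return SKETCH_DROP_ENTRY_AND_REDISCOVER;
}
```

When the result is `SKETCH_DROP_ENTRY_AND_REDISCOVER`, the application would apply one of the recovery steps described above and retry the transmission.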

Regards, Ryan
