CC2652P: About the update of the neighbor table and router table

Howjie zhou

Part Number: CC2652P
Other Parts Discussed in Thread: SYSCONFIG, Z-STACK

The coordinator configuration is basically the same as in the prior thread; the only difference is that the MTO mechanism is disabled.
The coordinator and the routers are placed together, and all are within communication range.
NWK_MAX_DEVICE_LIST = 10
MAX_NEIGHBOR_ENTRIES = 16

Q1:
There are 20+ routers in our test network. It appears that the coordinator initiates a route discovery and receives the Route Reply, but then sends data directly to the destination device without using the new path.

0x0000 and 0xEB01 are not valid neighbors of each other.
Packet Num 188935: 0x0000 initiates route discovery to 0xEB01.
Packet Num 188952: 0x0000 receives the RReply.
Packet Num 188958: 0x0000 sends a ZCL message directly to 0xEB01. At this point, 0x0000 and 0xEB01 are still not valid neighbors of each other! Is this packet sent based on Tree Link?
It seems that 0x0000 discards the Route Reply. Under what conditions can ZStack discard a Route Reply?

This phenomenon has also appeared in the prior thread.

Q2:
The refreshing of the adjacent devices of the coordinator is strange, as shown in the screenshot below:

Packet Num 185089: 0x0F80 (inCost = 1, outCost = 0) is not a valid neighbor of 0x0000.
Packet Num 185183: 0x0F80 is now a valid neighbor of 0x0000, but the Link Status message of 0x0F80 does not contain 0x0000 (the neighbor table of 0x0F80 may have no free entry for 0x0000). Why does 0x0000 treat 0x0F80 as a valid neighbor at this point?

Packets 185375 and 185453 repeat this process.

filter.zip

over 4 years ago

0 Howjie zhou over 4 years ago

Intellectual 475 points

Adding the capture file from before filtering:

origin.zip

0 Ryan Brown1 over 4 years ago in reply to Howjie zhou

TI__Guru**** 219977 points

Hi Howjie,

Since MTO is disabled, source routing is not enabled either. Therefore, unlike in SWRA650, it would be advisable to increase the value of MAX_RTG_ENTRIES (40 by default) in ti_zstack_config.h as generated by SysConfig. With so many devices active in the network, the ZC is likely receiving the Route Reply from 0x5A14 but is unable to add this path to its routing table, so it sends its message directly as a final effort to deliver it.
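
As a rough illustration of that suggestion, the routing table size is a compile-time define. The value 60 below is only a placeholder, not a recommendation, and would need to be sized against the available RAM. Because ti_zstack_config.h is a generated file, a manual edit may be overwritten the next time the configuration is regenerated.

```c
/* ti_zstack_config.h (generated by SysConfig) */

/* Default is 40. The value 60 is an illustrative assumption only;
 * choose the real value according to network size and available RAM. */
#define MAX_RTG_ENTRIES 60
```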

The second question could be a related issue. The ZC is aware that it can receive incoming Link Status messages from 0x0F80 but that this ZR is not able to receive outgoing messages from the ZC. However, since the routing table is full, the ZC attempts a direct message before attempting to find an existing path through a Route Request.

Regards,
Ryan

0 Howjie zhou over 4 years ago in reply to Ryan Brown1

Hi Ryan,

About Q1:
There are only 25+ routers in the test network, so a routing table of 40 entries should ideally be sufficient.
The subsequent communication process of 0x0000 and 0xEB01 is as follows:

Step 3 shows that 0x0000 has not processed the RReply from step 2.
Step 6 shows that 0x0000 correctly handles the RReply from step 5.

Why are the two RReply messages handled so differently?

About Q2:
At packet Num 185183, the outCost of 0x0F80 becomes 1 in the Link Status of 0x0000. Why?

Thanks,

Howjie

0 Ryan Brown1 over 4 years ago in reply to Howjie zhou

Thank you for providing further clarity. I've noticed that this is not an isolated instance, as there are a multitude of similar interactions between the ZC and ZRs. The behavior is consistent in most cases: the ZR does not have the ZC in its Link Status entries, while the ZC has a down/poor outgoing cost for the ZR.

It is possible that since the outgoing cost is 0x00, the ZC sends the route request but ignores the reply since a route exists (albeit a bad one). Then, because of the "No route available" Network Status from 0xEB01 (in 188960, not shown) it deletes the route completely so on the next attempt the route response is actually saved & applied.
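
The sequence hypothesized above can be written down as a toy model. This is only a sketch of the theory, not Z-Stack code: the `toy_route_t` type and both helper functions are hypothetical, and the rule that a Route Reply is ignored while any entry for the destination exists is precisely the assumption under test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-destination routing slot; not a Z-Stack structure. */
typedef struct {
    bool     in_use;
    uint16_t dst;
    uint16_t next_hop;
} toy_route_t;

/* Assumed rule: a Route Reply is ignored while an entry for dst already
 * exists, even if that entry is a bad one. Returns true if the reply
 * was stored. */
bool toy_on_route_reply(toy_route_t *r, uint16_t dst, uint16_t via)
{
    if (r->in_use && r->dst == dst) {
        return false;
    }
    r->in_use = true;
    r->dst = dst;
    r->next_hop = via;
    return true;
}

/* A "No route available" Network Status removes the entry for dst. */
void toy_on_no_route_status(toy_route_t *r, uint16_t dst)
{
    if (r->in_use && r->dst == dst) {
        r->in_use = false;
    }
}
```

Under this model, the first reply relayed by 0x5A14 is dropped because a stale entry for 0xEB01 is still present, the Network Status clears that entry, and the second reply is then stored.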

I have a theory, and I would appreciate further debugging on your setup to test it. I will communicate further instructions directly through Shuyang.

Regards,
Ryan

0 Howjie zhou over 4 years ago in reply to Ryan Brown1

Hi Ryan,

For question 1, Shuyang has told us how to debug. If there is new information, we will synchronize it to Shuyang.

In addition, we have a question about data transmission that needs to be confirmed:

If the device calls AF_DATA_Req (with APS ACK disabled) twice to send data, and the first message is already in MACTxQueue, will the second message be buffered in MACTxQueue or NWKTxQueue?

If the path changes before the second message is sent, will the new path or the old path be used when the second message is sent?

Regarding question 2, what might cause it?
There is no 0x0000 in the Link Status of 0x0F80. Normally, the outCost of 0x0F80 in the Link Status of 0x0000 should always be 0, but the sniffer shows that 0x0000 changes the outCost of 0x0F80 to 1. This seems very strange, and such an unexpected change will cause frequent path changes.

Thanks,

Howjie

0 Ryan Brown1 over 4 years ago in reply to Howjie zhou

Messages will queue on the NWK layer until receiving confirmation from the MAC layer that the previous message was transmitted. Messages are stored in a buffer after the next routing hop is determined, so the old path will be used. The window for which this behavior could occur is minimal.
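
A minimal sketch of why the old path is used: if the next hop is copied into the buffered frame at enqueue time, a later routing change does not affect frames that are already waiting. The queue type and function below are hypothetical illustrations, not Z-Stack structures.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_QUEUE_LEN 4

/* Hypothetical buffered frame: the next hop is stored with the frame. */
typedef struct {
    uint16_t dst;
    uint16_t next_hop;
} toy_frame_t;

typedef struct {
    toy_frame_t frames[TOY_QUEUE_LEN];
    size_t      count;
} toy_nwk_queue_t;

/* Buffer a frame, capturing the next hop that is current at this moment.
 * Returns 0 on success, -1 if the queue is full. */
int toy_enqueue(toy_nwk_queue_t *q, uint16_t dst, uint16_t current_next_hop)
{
    if (q->count >= TOY_QUEUE_LEN) {
        return -1;
    }
    q->frames[q->count].dst = dst;
    q->frames[q->count].next_hop = current_next_hop;
    q->count++;
    return 0;
}
```

In this model, a frame queued while the route points at one next hop keeps that next hop even if the routing entry changes before the frame is handed to the MAC layer.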

I have no leads as it pertains to your second question. I can only surmise that the routing entry was refreshed to the default link cost of one, although there is nothing in the sniffer log indicating why this would occur. Is it a recurring issue or rarely observed?

Please note that further responses will be delayed until 11/29 due to the U.S. Thanksgiving holiday.

Regards,
Ryan

0 Shuyang Zhong over 4 years ago in reply to Ryan Brown1

TI__Expert 4930 points

Hi Ryan,

The customer has reproduced the issue in which the coordinator does not handle the RReply (we are not sure whether the cause is the same).

When the problem recurred, the MAC layer of the coordinator received the RReply (ReqID = 9) message, but execution never reached RTG_ProcessRrep().

It seems that after 0x0000 received the RReply message from 0xDAF8, the message was discarded between the MAC layer and the NWK layer. Could this be because of the NWK frame counter?

About 40 seconds later, the customer continued to control the 0xA718 device. The route discovery (ReqID = 10) is successful. From the sniffer, the communication path is 0x0000 → 0x152F → 0xA718, and it is symmetrical.

In the process of ReqID = 9, the routing table of 0xA718 stores routing entries:

destAddr-0x0000,

nextHopAddr-0xDAF8.

In the process of ReqID = 10, 0xA718 received the RReq message from 0x152F. Since 0xA718 returned an RReply, RTG_ProcessRreq() → RTG_UpdateRtDiscEntry() returned RTG_SUCCESS, and RTG_AddRtgEntry() should have been called to modify pRtg->nextHopAddr. In other words, the routing entry of 0xA718 should have been updated to:

destAddr-0x0000,

nextHopAddr-0x152F.
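
The update expected here can be sketched as an overwrite-or-add on a small routing table. This is only a toy model of the expectation described above, not the Z-Stack implementation: the `toy_rtg_entry_t` type and the `toy_set_next_hop()` helper are hypothetical, and the real RTG_AddRtgEntry() may apply additional conditions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routing entry; field names mirror the listing above. */
typedef struct {
    bool     in_use;
    uint16_t destAddr;
    uint16_t nextHopAddr;
} toy_rtg_entry_t;

/* Overwrite the next hop for dest if an entry exists; otherwise claim the
 * first free slot. Returns false only when no entry matches and the table
 * is full. */
bool toy_set_next_hop(toy_rtg_entry_t *tbl, size_t n,
                      uint16_t dest, uint16_t next_hop)
{
    toy_rtg_entry_t *free_slot = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].in_use && tbl[i].destAddr == dest) {
            tbl[i].nextHopAddr = next_hop;
            return true;
        }
        if (!tbl[i].in_use && free_slot == NULL) {
            free_slot = &tbl[i];
        }
    }
    if (free_slot == NULL) {
        return false;
    }
    free_slot->in_use = true;
    free_slot->destAddr = dest;
    free_slot->nextHopAddr = next_hop;
    return true;
}
```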

However, when 0xA718 was controlled again three hours later, the path did not match this.

Since a ZNP was used as the coordinator, only Link Status messages were present in the network during those three hours. When the customer then used 0x0000 to control 0xA718 again, the forward path was 0x0000 → 0x152F → 0xA718, but the return path was 0xA718 → 0xDAF8. Why?

1129.zip

Best regards,

Shuyang

0 Ryan Brown1 over 4 years ago in reply to Shuyang Zhong

I believe the Route Reply from 0xDAF8 is discarded by the ZC because it does not have a link available to this ZR as seen in the Link Status messages, whereas 0x152F is a valid route. 0xA718 seems to believe that the best route to the ZC is through 0xDAF8 and not 0x152F, which is causing the problem. Practically, 0xA718 should be recognizing this issue and sending a Route Request for 0x0000. Many-to-one/source routing could help eliminate this discrepancy.

Regards,
Ryan

0 Howjie zhou over 4 years ago in reply to Ryan Brown1

Hi Ryan,

What is the purpose of using invalid neighbors in the route discovery process? Is it to choose a better communication path?

If we modify the code as follows, can we avoid the problem of invalid neighbors in the route discovery process?

Can the Route Record in MTO avoid using invalid neighbors?

Regards,

Howjie

0 Ryan Brown1 over 4 years ago in reply to Howjie zhou

Hi Howjie,

Note: source code excerpt removed. The line in question should not be modified since invalid neighbors should still be processed as explained in the comments and in the description below.

My assumption is that Route Responses from an invalid link (source node not in the neighbor table) will continue to be processed by the routing layer and either add the source node to its neighbor table (and thus exist in Link Status messages) or, if the neighbor table is full, rebroadcast the Route Request to try and find a valid link. Is this not what you are observing?

I cannot confirm whether MTO routing will resolve the issue, especially if there are limited neighbors allowed. You can review the MTO route maintenance section of the Z-Stack User's Guide to further understand the benefits it could offer to your system.

Regards,
Ryan

0 Howjie zhou over 4 years ago in reply to Ryan Brown1

Hi Ryan,

My previous description was not very clear.

Suppose the neighbors of a Zigbee device are divided into the following three types:
1. Valid neighbor: both inCost and outCost are valid values.
2. Invalid neighbor: only inCost is a valid value.
3. Potential neighbor: the neighbor does not appear in the Link Status of the device.
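
The three types above can be expressed as a small classifier. This is a minimal sketch under stated assumptions: a cost of 0 is taken to mean "no valid value" (consistent with the inCost = 1, outCost = 0 example earlier in this thread), an entry with inCost 0 is treated like an unlisted one, and the enum and function names are hypothetical rather than taken from ZStack.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    NBR_VALID,      /* inCost and outCost are both non-zero */
    NBR_INVALID,    /* only inCost is non-zero */
    NBR_POTENTIAL   /* not present in the device's Link Status */
} toy_nbr_class_t;

/* listed:   true if the neighbor appears in this device's Link Status.
 * in_cost:  incoming link cost, 0 assumed to mean "no valid value".
 * out_cost: outgoing link cost, 0 assumed to mean "no valid value". */
toy_nbr_class_t toy_classify_neighbor(bool listed,
                                      uint8_t in_cost, uint8_t out_cost)
{
    if (!listed || in_cost == 0) {
        return NBR_POTENTIAL;
    }
    if (out_cost == 0) {
        return NBR_INVALID;
    }
    return NBR_VALID;
}
```

With these assumptions, an entry reported with inCost = 1 and outCost = 0 classifies as an invalid neighbor, and the same entry with outCost = 1 classifies as valid.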

Our understanding of the cases in this thread where the coordinator does not handle the RReply is as follows:

CASE1
0x0000→0x5A14→0xEB01
When the problem occurs, 0x5A14 and 0x0000 are valid neighbors, but 0x0000 does not handle the RReply relayed by 0x5A14.

In this case, we are debugging according to your instructions to pinpoint where the failure occurs, for example RTG_HIGHER_COST.

CASE2
0x0000→0xDAF8→0xA718
When the problem occurs, 0xDAF8 and 0x0000 are potential neighbors, but 0x0000 does not handle the RReply forwarded by 0xDAF8.
According to the debug information from 0x0000, the RReply message never reached the NWK layer of 0x0000 for processing, because the two are not valid neighbors.

This situation seems to arise because ZStack tries to use potential neighbors to send the RReply, while the receiving device discards messages sent by non-valid neighbors.

To address the behavior in CASE2, we want to prohibit the use of potential neighbors in the route discovery process.
Our modification serves this purpose by adding a condition that excludes potential neighbors. However, we are still not sure about the design intent behind ZStack's use of potential neighbors, so we need you to confirm whether this is feasible.

We enabled MTO in ZStack at the beginning of the test, but it showed a strange problem during testing, so we disabled it. We will try enabling many-to-one/source routing again and report any specific problems.

Regards,

Howjie

0 Ryan Brown1 over 4 years ago in reply to Howjie zhou

During case 2, the potential neighbor 0xDAF8 has the ZC listed as an invalid neighbor in its Link Status messages. It replies to the Route Request in hopes of re-establishing a route between the source and destination nodes. When processing the Route Reply on the ZC, the intended behavior is to have the entry added to both the routing and neighbor tables. For whatever reason (e.g. full tables), this is rejected on the ZC during processing at the routing layer. So although it may be possible to modify this layer to fully discard messages from invalid or potential neighbors, there is nothing to stop the potential/invalid neighbor from replying to the next Route Request as well and repeating the process. The recommended approach is to optimize the ZC settings to allow a substantial number of routes and neighbors, within the limits of the available Flash/RAM on the CC2652P device. If not already mentioned, please be sure to increase the NV memory and heap as necessary.

Regards,
Ryan

0 Howjie zhou over 4 years ago in reply to Ryan Brown1

Increasing the size of the neighbor table and routing table can eliminate invalid and potential neighbors as far as possible. Of course, this depends on the node density, and the efficiency of data exchange may be reduced.

As long as invalid and potential neighbors are excluded during the route discovery process (the neighbor table is not modified at that point), the next RReq should not use these abnormal paths.

In addition, we found an abnormal phenomenon during the test, and the description of the problem has been sent to you through Shuyang.

Regards,

Howjie
