Hi nkn,
Thanks for your note. It sounds like you are already knee-deep in the ProFlex module and the Z-Stack, and quite knowledgeable about the products. Thank you for choosing LS Research and TI products. It would be helpful if you could share a "clean" Ubiqua sniffer trace from time t = 0 to when the problem occurs, i.e. covering the period from the start of network formation to when the route requests are observed. It will be interesting to see which node is generating the route requests, and an analysis of the link status messages from each router can tell us which routers each node considers its best neighbors. In addition, here are a few tips/comments/questions to help narrow the problem down:
1) What is the physical separation between each router?
2) Is NV_RESTORE being used?
3) What is the transmit power of the modules?
4) Do all 10 routers try to transmit at roughly the same time within the 2-second interval? Typically it makes sense to add a random start time plus some jitter so that when the nodes come up, the likelihood that they all try to send their messages at the same time is minimized (see the sketch after this list).
5) Try turning off frequency agility on all the nodes by setting ZDNWKMGR_MIN_TRANSMISSIONS=0 in f8wConfig.cfg.
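For items 4 and 5, here is roughly what I mean, as a minimal sketch against the OSAL timer API (the task ID, event flag, and function names are placeholders; your report-building code goes where indicated):

// f8wConfig.cfg -- disable frequency agility for the test (item 5)
-DZDNWKMGR_MIN_TRANSMISSIONS=0

#include "OSAL.h"
#include "OSAL_Timers.h"

/* Item 4: randomize the first report and add 0-127 ms of jitter to
 * every 2 s period so the 10 routers never transmit in lock-step. */
#define REPORT_PERIOD_MS  2000
#define SENSOR_SEND_EVT   0x0001            /* placeholder event flag */

void startReporting( uint8 myTaskId )
{
  /* random initial offset somewhere inside the first period */
  osal_start_timerEx( myTaskId, SENSOR_SEND_EVT,
                      osal_rand() % REPORT_PERIOD_MS );
}

void onSendEvent( uint8 myTaskId )
{
  /* ... build and send the periodic report here ... */

  /* re-arm with 0-127 ms of jitter on top of the base period */
  osal_start_timerEx( myTaskId, SENSOR_SEND_EVT,
                      REPORT_PERIOD_MS + (osal_rand() & 0x7F) );
}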
In the meantime, I have tried some more ways to fix the problem and can add some information:
3) Setting a transmit power of 4 or 8 dBm doesn't help. I also tried configuring the CC2591 in low-gain mode by calling the HAL_PA_LNA_RX_LGM() macro, which had no effect on the observed route requests.
4) I did add 0-127 ms of jitter to the send-interval periods, without effect.
Apart from that, I have also unsuccessfully tried increasing MAX_NEIGHBOR_ENTRIES and changing various broadcast settings (e.g. queue length).
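For reference, this is roughly how I applied those two changes (the neighbor-table value shown is only illustrative, and the placement of the macro call in the init path is my own; HAL_PA_LNA_RX_LGM() comes from the CC2530+CC2591 board support in hal_board_cfg.h):

// f8wConfig.cfg -- enlarged neighbor table (stack default is 16)
-DMAX_NEIGHBOR_ENTRIES=24

/* In the application's init function, after HAL initialization:
 * put the CC2591 into low-gain RX mode. HAL_PA_LNA is the compile
 * flag Z-Stack uses for CC2591 builds. */
#if defined( HAL_PA_LNA )
  HAL_PA_LNA_RX_LGM();
#endif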
Hi nkn,
Thank you again for your detailed responses and for spending time investigating the problem. I have been trying to analyze the problem further, and what I notice is that regular link status messages do not seem to be coming from all the nodes in the network. I'm not sure if this is just an artifact of where the sniffer was placed relative to the rest of the nodes, but if the assumption is that all nodes are within earshot, then the sniffer should be picking up regular link status messages from all of them. Each router should send a link status message at roughly 15-second intervals. If they are not doing this, then perhaps this is a clue to the aberrant behavior.
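(If I recall correctly, that 15-second figure is the NWK_LINK_STATUS_PERIOD define in the stack's network globals, and it can be pinned explicitly from f8wConfig.cfg; shown here only as a point of reference:)

// f8wConfig.cfg -- NWK link status interval, in seconds (stack default)
-DNWK_LINK_STATUS_PERIOD=15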
This is certainly not normal behavior, and something we have not seen in our own 400-node test network.
If you can, I would recommend getting the CC2531 USB dongle ($49 each from the TI e-store) and trying to capture this with the Ubilogix Ubiqua sniffer. I think you can download a 30-day free trial from ubilogix.com. The problem is a lot easier to analyze with this sniffer.
Could I also entice you to try this experiment with Z-Stack 2.5.0, the latest release? We have addressed some potential routing pitfalls in that release.
Just to confirm, are you using the default settings of MAX_RTG_ENTRIES=40 and MAX_NEIGHBOR_ENTRIES=16?
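(Those defaults can be made explicit in f8wConfig.cfg if you want to rule out a stray override; a sketch of the two lines to check:)

// f8wConfig.cfg -- routing and neighbor table sizes (stack defaults)
-DMAX_RTG_ENTRIES=40
-DMAX_NEIGHBOR_ENTRIES=16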
Guys,
Did you have any luck? I am seeing a very similar issue.
Guys,
Here is my configuration:
- 11 nodes, all configured as routers (all sending reports to the Gateway)
- Implementation is based exactly on the Sensor Demo application.
- Stack is 2.4.0-1.4.0
- All nodes seem to be storing nwk info in NVRAM.
After a reset the entire NWK comes up. All nodes respond and I get the desired data. After a few hours of operation the whole network becomes unstable: some nodes stop sending data while others keep sending, but I see a lot of "Route Req" and "Match Desc Req" messages going around the network. See below for a snapshot. The basic questions I had are:
- Is it a healthy sign that every few seconds one "Route Req" and one "Match Desc Req" get generated? I do not understand why this is happening, especially when that router is already successfully sending data to the gateway. I would understand such requests if they came from disconnected nodes.
- I took one node from this network and tested it on the bench. I found it was generating many "Route Reqs" and sending only partial data after dropping a few packets, even after power-cycling it. When I flashed new FW, it stopped sending that flood of requests and started responding neatly to the gateway - only a few Route Reqs initially, and after that none.
I suspect this node, along with a few other nodes in the earlier network, went stale. But I still do not understand why and how this happened.
Any clue what is going on here?
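For context, my nodes (re)discover the gateway the way the Sensor Demo does, with a broadcast match descriptor request. A simplified sketch of that call (the profile and cluster IDs are placeholders for my real ones):

#include "ZDProfile.h"

/* Broadcast a Match_Desc_req to rx-on-when-idle devices (0xFFFD) to
 * (re)find a gateway serving our report cluster. A node that keeps
 * repeating this never received, or never accepted, a Match_Desc_rsp. */
static void findGateway( void )
{
  zAddrType_t dstAddr;
  cId_t clusterList[] = { REPORT_CLUSTER_ID };        /* placeholder */

  dstAddr.addrMode = (afAddrMode_t)AddrBroadcast;
  dstAddr.addr.shortAddr = NWK_BROADCAST_SHORTADDR_DEVRXON; /* 0xFFFD */

  ZDP_MatchDescReq( &dstAddr, NWK_BROADCAST_SHORTADDR_DEVRXON,
                    MY_PROFILE_ID,                    /* placeholder */
                    1, clusterList,   /* match on the gateway's input cluster */
                    0, NULL,          /* no output clusters */
                    FALSE );          /* security disabled */
}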
(This snapshot was taken a few hours after NWK formation. The columns appear to be: length, channel, frame type, PAN ID, MAC src, MAC dst, MAC seq, NWK src, NWK dst, NWK seq, and APS counter.)
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 62 | 0xFC63 | 0xFFFD | 0xA6 | 0x1D |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 63 | 0x035B | 0xFFFD | 0xED | 0x21 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 64 | 0x65A6 | 0xFFFD | 0x67 | 0x1B |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 65 | 0x0000 | 0xFFFD | 0x63 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 66 | 0x035B | 0xFFFD | 0xED | 0x21 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 67 | 0x65A6 | 0xFFFD | 0x67 | 0x1B |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 68 | 0x65A6 | 0xFFFD | 0x67 | 0x1B |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 69 | 0x9C1E | 0xFFFD | 0x9D | 0xE8 |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 70 | 0x0000 | 0xFFFD | 0x65 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 71 | 0x9C1E | 0xFFFD | 0x9D | 0xE8 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 72 | 0x9C1E | 0xFFFD | 0x9D | 0xE8 |
| 50 | 11 | Reserved | 0x7701 | 0xE800 | 0xFFF8 | 128 | ||||
| 5 | 11 | Acknowledgment | 33 | |||||||
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 73 | 0x8878 | 0xFFFD | 0x1F | 0xE4 |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 74 | 0x0000 | 0xFFFD | 0x67 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 75 | 0xD1C5 | 0xFFFD | 0xDF | 0xFA |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 76 | 0x0000 | 0xFFFD | 0x69 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 77 | 0x8878 | 0xFFFD | 0x1F | 0xE4 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 78 | 0xD1C5 | 0xFFFD | 0xDF | 0xFA |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 79 | 0x8878 | 0xFFFD | 0x1F | 0xE4 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 80 | 0xD1C5 | 0xFFFD | 0xDF | 0xFA |
| 41 | 11 | NWK: Link Status | 0xC301 | 0x0000 | 0xFFFF | 81 | 0x0000 | 0xFFFC | 0x6A | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 82 | 0x6C39 | 0xFFFD | 0xFE | 0x42 |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 83 | 0x0000 | 0xFFFD | 0x6C | |
| 33 | 11 | Reserved | 0xB301 | 0x1029 | 0xFFFB | 39 | 0x691D | 0x0000 | 0x70 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 84 | 0x6C39 | 0xFFFD | 0xFE | 0x42 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 85 | 0x6C39 | 0xFFFD | 0xFE | 0x42 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 86 | 0xFC63 | 0xFFFD | 0xA8 | 0x1E |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 87 | 0x0000 | 0xFFFD | 0x6E | |
| 5 | 11 | Acknowledgment | 43 | |||||||
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 88 | 0xFC63 | 0xFFFD | 0xA8 | 0x1E |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 89 | 0x035B | 0xFFFD | 0xEF | 0x22 |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 90 | 0x0000 | 0xFFFD | 0x70 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 91 | 0xFC63 | 0xFFFD | 0xA8 | 0x1E |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 92 | 0x035B | 0xFFFD | 0xEF | 0x22 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 93 | 0x035B | 0xFFFD | 0xEF | 0x22 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 94 | 0x65A6 | 0xFFFD | 0x69 | 0x1C |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 95 | 0x0000 | 0xFFFD | 0x72 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 96 | 0x65A6 | 0xFFFD | 0x69 | 0x1C |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 97 | 0x65A6 | 0xFFFD | 0x69 | 0x1C |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 98 | 0x9C1E | 0xFFFD | 0x9F | 0xE9 |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 99 | 0x0000 | 0xFFFD | 0x74 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 100 | 0x9C1E | 0xFFFD | 0x9F | 0xE9 |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 101 | 0x9C1E | 0xFFFD | 0x9F | 0xE9 |
| 41 | 11 | NWK: Link Status | 0xC301 | 0x0000 | 0xFFFF | 102 | 0x0000 | 0xFFFC | 0x75 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 103 | 0xD1C5 | 0xFFFD | 0xE1 | 0xFB |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 104 | 0x0000 | 0xFFFD | 0x77 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 105 | 0x8878 | 0xFFFD | 0x21 | 0xE5 |
| 33 | 11 | NWK: Route Request | 0xC301 | 0x0000 | 0xFFFF | 106 | 0x0000 | 0xFFFD | 0x79 | |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 107 | 0xD1C5 | 0xFFFD | 0xE1 | 0xFB |
| 36 | 11 | ZDP: Match_Desc_req | 0xC301 | 0x0000 | 0xFFFF | 108 | 0x8878 | 0xFFFD | 0x21 | 0xE5 |
Hi guys,
I seem to have observed a similar issue.
I have found that "APS Ack" and "ZDP: MatchDescRsp" messages are not being routed back to the source device in a multi-hop network with a concentrator. I have witnessed this problem on several networks with route paths two to three hops deep.
Observed behaviour:
When a network is newly formed, all data packets are correctly routed to and from the network concentrator as each device reports in at a set interval. The network will work like this for days, but over time devices that are three to four hops deep seem to disappear from the network, i.e. they stop reporting in to the network concentrator. I believe this is because they are not receiving the "APS Ack" for their report messages and unbind from the concentrator, since it appears to no longer exist. It also appears that when the now-unbound device tries to re-find the concentrator through a "MatchDescReq", it does not receive the "MatchDescRsp".
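To illustrate the path I mean, here is a stripped-down sketch of my report/ack handling (the endpoint descriptor, cluster ID, endpoint number, and findGateway() are placeholders for my actual code):

#include "AF.h"

static endPointDesc_t sensorEp;   /* registered elsewhere (placeholder) */
static uint8 reportTransId;

/* Send a report to the concentrator with an APS acknowledgment requested. */
static void sendReport( uint8 *buf, uint16 len )
{
  afAddrType_t dstAddr;
  dstAddr.addrMode = (afAddrMode_t)Addr16Bit;
  dstAddr.addr.shortAddr = 0x0000;               /* the concentrator */
  dstAddr.endPoint = GATEWAY_ENDPOINT;           /* placeholder */

  AF_DataRequest( &dstAddr, &sensorEp,
                  REPORT_CLUSTER_ID,             /* placeholder */
                  len, buf, &reportTransId,
                  AF_ACK_REQUEST, AF_DEFAULT_RADIUS );
}

/* Called from the task loop on an AF_DATA_CONFIRM_CMD message. When the
 * APS ack never arrives, the confirm fails after the APS retries, and
 * the device falls back to MatchDescReq discovery -- where it gets stuck. */
static void handleDataConfirm( afDataConfirm_t *pConfirm )
{
  if ( pConfirm->hdr.status != ZSuccess )
  {
    findGateway();   /* placeholder: broadcast a fresh MatchDescReq */
  }
}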
I have also found that if I try to connect to one of the seemingly vanished devices by brute force, i.e. constantly querying for an attribute, I can eventually establish a link to it. Once a link is re-established, the device will rebind to the concentrator and start reporting again for a period, before it eventually stops again. I have also found that if I add a new concentrator to the network, all devices will bind to it, subsequently create a routing path to the new concentrator, and report in, even those that had stopped reporting to the old concentrator. This proves that all the devices in the network are stable and running.
I believe the problem is caused by source routing tables not being recorded properly or routing tables not being updated correctly. I have tried experimenting with different settings in the network concentrator, i.e. setting different discovery times and route cache options, but always get the same result.
Has anyone had the same issue and managed to resolve it?
----------------------------------------------------------------------------------------------------------------------------------------
My test network:
I have a real-world network set up with 14 routers spread across a campus made up of 5 buildings, with open spaces and trees between the buildings. My network breaks down as:
- 6 routers within the same room as the concentrator (my office), one hop from the concentrator,
- 3 routers at two hops,
- 2 routers at three hops,
- 3 routers at four hops.
Hardware:
Custom boards with a CC2530 + CC2591 feeding an F antenna
Running Z-Stack 2.5.0; I think I have also observed this behaviour with Z-Stack 2.3.1.
MAC Settings:
MAC_CFG_TX_DATA_MAX 3
MAC_CFG_TX_MAX 5
MAC_CFG_RX_MAX 5
Network settings:
ROUTE_EXPIRY_TIME=0
APSC_ACK_WAIT_DURATION_POLLED=3000
NWK_INDIRECT_MSG_TIMEOUT=30
MAX_RTG_SRC_ENTRIES=12
SRC_RTG_EXPIRY_TIME=10
MAX_RREQ_ENTRIES=8
APSC_MAX_FRAME_RETRIES=3
MAX_POLL_FAILURE_RETRIES=2
MAX_BCAST=9
Concentrator settings I have tried:
CONCENTRATOR_ENABLE true
CONCENTRATOR_DISCOVERY_TIME 0, 15 and 60
CONCENTRATOR_ROUTE_CACHE false and true (not sure if the CC2530 can support a route cache)
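For reference, this is how I have been switching those experiments in the compile options (values shown are from the runs above; the flag spellings are from ZGlobals.h as best I can tell):

// f8wConfig.cfg -- concentrator / source-routing experiment
-DCONCENTRATOR_ENABLE=TRUE
-DCONCENTRATOR_DISCOVERY_TIME=60    // also tried 0 and 15
-DCONCENTRATOR_ROUTE_CACHE=TRUE     // also tried FALSE
-DMAX_RTG_SRC_ENTRIES=12
-DSRC_RTG_EXPIRY_TIME=10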
Dear all,
We too are facing the same problem as above. Please share if anybody has found a workaround or solution.
Hoping for some support.
Thanks in advance,
Adarsh