Other Parts Discussed in Thread: CC2652R, CC1312R7, CC1312R
Hi,
In one of our networks with 20 sensor nodes, the collector occasionally becomes non-responsive (roughly once every 1 to 2 months). During this time, if a sensor node becomes orphaned and is then restarted by the external MCU to rejoin the network, it goes into the orphan state again in about 15 seconds. This keeps happening until the collector is restarted by the external MCU.
This seems to coincide with a high number of pending messages in the TX data queue. The SDK doesn't seem to provide an easy way to track the number of messages in the TX queue, so we added code to track pending messages in the TX data queue manually. During the non-responsive period, the number of pending messages in the TX data queue usually stays around 16 to 20. I'm not entirely sure whether this is a result or a cause of the collector becoming non-responsive.
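To make the kind of manual tracking described above concrete, here is a minimal sketch. It assumes a counter that is incremented when a data request is submitted and decremented when the matching data confirm arrives. All names (`TxQueueTracker`, `txTrackerOnRequest`, and so on) are hypothetical and are not SDK APIs, and the "stuck" check is only an illustration of how a persistently high count could be flagged. It is not the exact code running on the collector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker; none of these names come from the SDK. */
typedef struct {
    uint16_t pending;     /* requests submitted but not yet confirmed */
    uint16_t highWater;   /* largest value 'pending' has reached */
    uint16_t stuckTicks;  /* consecutive checks with pending >= threshold */
} TxQueueTracker;

void txTrackerInit(TxQueueTracker *t)
{
    t->pending = 0;
    t->highWater = 0;
    t->stuckTicks = 0;
}

/* Assumed call site: right after a data request is accepted for TX. */
void txTrackerOnRequest(TxQueueTracker *t)
{
    t->pending++;
    if (t->pending > t->highWater) {
        t->highWater = t->pending;
    }
}

/* Assumed call site: the data confirm handler, for any confirm status. */
void txTrackerOnConfirm(TxQueueTracker *t)
{
    if (t->pending > 0) {
        t->pending--;
    }
}

/*
 * Call periodically. Returns true once 'pending' has been at or above
 * 'threshold' for 'maxTicks' consecutive calls.
 */
bool txTrackerIsStuck(TxQueueTracker *t, uint16_t threshold, uint16_t maxTicks)
{
    if (t->pending >= threshold) {
        if (t->stuckTicks < UINT16_MAX) {
            t->stuckTicks++;
        }
    } else {
        t->stuckTicks = 0;
    }
    return t->stuckTicks >= maxTicks;
}
```

With a counter like this, logging `pending` and `highWater` alongside the time of the first orphan event might help show whether the queue fills up before or after the collector stops responding.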
Here are the TX/RX queue sizes as defined in the collector.opts file:
-DMAC_CFG_TX_DATA_MAX=60 -DMAC_CFG_TX_MAX=150 -DMAC_CFG_RX_MAX=16
Other relevant information: 1) the MAX device number is left at the default of 50; 2) POLLING_INTERVAL is changed to 60 seconds; 3) SDK version: 7.40.
There is a recent post on a related issue with the CC2652R. Presumably something has been done about it in the latest SDK, v8.30.
Can people with knowledge of the SDK internals elaborate on how we should handle this situation? I suspect this has something to do with insufficient RAM on the collector radio, because we have never observed this behavior in smaller networks with fewer sensor nodes. However, we also haven't observed it in 15.4-FH, even though the network size, TX/RX data queue sizes, and pretty much everything else remain the same.
So please advise.
Thanks,
ZL