CC1352P7: Linux collector + mac coprocessor - lockup/resource exhaustion - Beagle Play hardware

Part Number: CC1352P7

I'm using a Beagle Play as a collector in a Sub-GHz system, configured for 200 kbps, frequency-hopping operation.

The collector binary has some light customization to add messages that are specific to our application, and the collector that runs on the Linux side has matching changes. The co-processor binary is built from SDK 7.10.2.23, and the Linux binary uses the latest release, 4.40.00.03. The only modification to the co-processor is an increase in the number of Tx and Rx queue entries.

The sensors run in a dual-mode configuration, using DMM to run the Sub-GHz stack and the BLE stack simultaneously. They use SDK 6.10.0.29. I am using a custom application, written in Python, to talk to the collector over the TCP socket, the same way the TI example app does.

The issue I've run into is that after a long period of runtime (6 weeks or so) with 8 sensors joined and reporting, the co-processor appears to have stopped responding. The sensors were NOT joined, as indicated by a few dead batteries from attempted re-joins and by reading the provisioning profile's sensor status characteristic. The collector socket was still open and nothing was observably wrong, but it was apparent that communication between the co-processor and the upper level had broken down.

What I'm looking for are suggestions as to what might have happened, and how I might gather better information the next time this happens. I will need to implement a solution that can be deployed to the field and inspected after a failure, since it will not always be practical to have direct, real-time access to misbehaving units. I suspect a resource leak, for example MAC Tx or Rx queue entries that were never released.

What I've done in the meantime is to put a watchdog on the communications from the collector to my application. If I don't see any sensor packets for fifteen minutes, I kill the collector, reset the co-processor, restart the collector and re-establish the TCP connection. This is quite a crude solution. It should be effective, but it's not great.
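
For reference, the watchdog logic is roughly the following. This is a minimal sketch, not my actual code: the class name `CollectorWatchdog` and the `recover` callback are placeholders, and the recovery steps themselves (kill the collector, reset the co-processor, restart the collector, reconnect the socket) are assumed to live in whatever function is passed in.

```python
import time
from typing import Callable


class CollectorWatchdog:
    """Triggers a recovery action when no sensor packet arrives in time.

    All names here are hypothetical. `recover` stands in for the
    kill / reset / restart / reconnect sequence described above.
    """

    def __init__(
        self,
        recover: Callable[[], None],
        timeout_s: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._recover = recover
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested
        self._last_packet = clock()
        self.recoveries = 0  # running count, useful to log for later inspection

    def packet_seen(self) -> None:
        """Call whenever a sensor packet is received from the collector."""
        self._last_packet = self._clock()

    def poll(self) -> bool:
        """Call periodically. Returns True if a recovery was triggered."""
        if self._clock() - self._last_packet < self._timeout_s:
            return False
        self._recover()
        self.recoveries += 1
        # Restart the timer so recovery is not re-triggered on every poll.
        self._last_packet = self._clock()
        return True
```

The socket read loop calls `packet_seen()` for every sensor message and `poll()` once per loop iteration or from a timer. Using a monotonic clock keeps the timeout from misfiring if the system time is adjusted (by NTP, for example).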