Other Parts Discussed in Thread: Z-STACK, CC2530
This question is related to "CC2530: Problem with modifying Z-Stack Linux Gateway code, Gateway server doesn't answer after a while": we run into the same problem, with the gateway server ceasing to work after anywhere from a few minutes to a few hours. It ends up in a state where si_send_packet() in demo/framework/socket_interface.c returns early because waiting_for_confirmation is true and stays true. The system does not recover on its own.
We decided to use the "Z-Stack Linux Gateway" because it was recommended by a TI employee here in the forum a while ago and because it supports HA (Home Automation). I have since read that the gateway is no longer supported, but the replacements don't support HA, so we are rather stuck now.
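To make the stuck state concrete, here is a minimal sketch of the kind of timeout guard that would let a sender recover from a confirmation that never arrives. It is a simplified model, not the actual gateway code: only the names si_send_packet() and waiting_for_confirmation come from socket_interface.c, while send_packet_guarded(), confirmation_expired(), confirmation_received(), and the 5-second timeout are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Assumed timeout in seconds; not a value taken from the gateway sources. */
#define CONFIRMATION_TIMEOUT_SEC 5

static bool   waiting_for_confirmation = false;
static time_t confirmation_started_at  = 0;

/* Hypothetical helper: true if a pending confirmation is older than the timeout. */
bool confirmation_expired(time_t now)
{
    return waiting_for_confirmation &&
           difftime(now, confirmation_started_at) >= CONFIRMATION_TIMEOUT_SEC;
}

/* Simplified stand-in for si_send_packet(): returns 0 if sent, -1 if busy. */
int send_packet_guarded(const void *pkt, size_t len, time_t now)
{
    assert(pkt != NULL && len > 0);

    /* Give up on a confirmation that was apparently lost. */
    if (confirmation_expired(now)) {
        waiting_for_confirmation = false;
    }

    /* Still waiting for the previous packet: refuse to send. */
    if (waiting_for_confirmation) {
        return -1;
    }

    /* The real code would write the packet to the socket here. */
    waiting_for_confirmation = true;
    confirmation_started_at  = now;
    return 0;
}

/* Hypothetical hook: call this when the confirmation for the last packet arrives. */
void confirmation_received(void)
{
    waiting_for_confirmation = false;
}
```

With this guard, a second send within the timeout is still rejected as busy, but a send attempted after the timeout goes through even if confirmation_received() was never called. Whether silently dropping a lost confirmation is safe in the real gateway is an open question.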
We haven't changed much in the gateway code. We only fixed:
- several cases of "use of uninitialized values" reported by cppcheck, some of which are definitely bugs, while others might be false positives
- a few cases of unaligned memory access, which can be a problem on ARM hardware depending on the kernel configuration
- a few other bugs, such as null-pointer dereferences and leaks
- code that caused compiler warnings
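As an illustration of the unaligned-access item in the list above, this is roughly the pattern such fixes follow: instead of casting a byte pointer to a wider integer type and dereferencing it, the value is assembled byte by byte or copied with memcpy(). The helper names read_u16_le() and read_u32_unaligned() are hypothetical and do not exist in the gateway sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 16-bit little-endian value from any byte offset.
 * Assembling the value from single bytes never issues a misaligned load. */
uint16_t read_u16_le(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint16_t)(buf[0] | ((uint16_t)buf[1] << 8));
}

/* Hypothetical helper: read a 32-bit value in host byte order from any offset.
 * memcpy() lets the compiler choose an access that is safe for the target,
 * unlike *(const uint32_t *)buf, which may fault or be fixed up slowly on ARM. */
uint32_t read_u32_unaligned(const uint8_t *buf)
{
    uint32_t value;
    assert(buf != NULL);
    memcpy(&value, buf, sizeof value);
    return value;
}
```

For example, read_u16_le(frame + 1) can be used on a received frame where the 16-bit field starts at an odd offset.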
Depending on the frequency of attribute requests, the gateway server group fails sooner or later. Restarting gateway_server often helps; sometimes restarting network_manager does. Even when attribute requests fail with the error message "BUSY - please wait for previous operation to complete", ZigBee commands still work for a while, until they start to fail as well.
Is there anything I could try or tune to keep the Z-Stack Linux Gateway from failing completely? Has the gateway ever been tested with serious real-world software, or did it never leave the demo stage with simple examples?

