Hi,
I am working on a ZigBee Smart Energy compliant in-home display that uses the CC2530 and Z-Stack 2.5.1a and joins the ZigBee network as an end device. The implementation is almost finished and we are preparing the device for official ZigBee conformance testing, but one strange issue currently prevents many of the tests from passing reliably.
In several test cases, the test harness, acting as a Trust Center, sends a ZCL Read Attribute command frame to verify that our device sends a correct reply. However, these tests fail about 20% of the time because our device replies with the wrong attribute. Each time this happens, the attribute ID in the reply has its least significant byte changed to 0x05. For example, if the Trust Center requests attribute 0x0000, our device sometimes replies with attribute 0x0005 instead.
Looking at the packet traffic with a third-party sniffer didn't reveal anything particularly interesting: the TC sends a ZCL Read Attribute packet whose payload contains a single attribute ID, 0x0000. Our device then sends a ZCL Read Attribute response whose payload contains attribute ID 0x0005 and its current value (or an unsupported attribute error, if the attribute ID has turned into something unsupported).
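To make the pattern concrete: ZCL sends each 16-bit attribute ID low byte first, so a request for 0x0000 that comes back as 0x0005 means the first byte of the Read Attribute payload was overwritten with 0x05. Below is a minimal sketch of how the payload decodes and how the observed pattern could be flagged. The function names read_attr_ids and matches_low_byte_05 are made up for illustration and are not part of Z-Stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the attribute IDs carried in a ZCL Read Attribute payload.
 * Each ID is 16 bits, transmitted low byte first.
 * Returns the number of IDs written to ids[]. (Hypothetical helper.) */
size_t read_attr_ids(const uint8_t *payload, size_t len,
                     uint16_t *ids, size_t max_ids)
{
    size_t n = 0;

    assert(payload != NULL && ids != NULL);
    while (len >= 2 && n < max_ids) {
        ids[n++] = (uint16_t)(payload[0] | ((uint16_t)payload[1] << 8));
        payload += 2;
        len -= 2;
    }
    return n;
}

/* Return 1 if `got` differs from `sent` only in that its low byte
 * has become 0x05, which is the corruption pattern described above.
 * (Hypothetical helper.) */
int matches_low_byte_05(uint16_t sent, uint16_t got)
{
    return got != sent && got == (uint16_t)((sent & 0xFF00u) | 0x05u);
}
```

Seen this way, the symptom is a single-byte overwrite at payload offset 0, not a change to the attribute ID as a whole.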
To debug the issue further, I modified Z-Stack by adding zclSendMsg(pInMsg) to zclProcessInReadCmd() in both the ZNP and ZAP level code. This way, all received ZCL Read Attribute packets were relayed to the application-level event loop, and I was able to verify that by the time the incoming command frame was processed by the ZNP in zclProcessInReadCmd(), the attribute ID had already mysteriously changed to 0x0005.
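One way to narrow this down further would be to copy the payload at the earliest point where it is available and compare it against the buffer at each later processing stage. The first stage that reports a difference is the one after which the byte was overwritten. Below is a minimal sketch of such a helper. The type and function names are hypothetical, and where to call them inside the stack is an assumption that depends on the code paths involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNAP_MAX 32  /* bytes kept per snapshot; arbitrary choice */

/* Copy of a payload taken at an early processing stage (hypothetical). */
typedef struct {
    uint8_t data[SNAP_MAX];
    size_t  len;
} payload_snapshot_t;

/* Store up to SNAP_MAX bytes of buf in the snapshot. */
void snapshot_take(payload_snapshot_t *s, const uint8_t *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    s->len = (len < SNAP_MAX) ? len : SNAP_MAX;
    memcpy(s->data, buf, s->len);
}

/* Compare buf against the snapshot over their common length.
 * Returns the index of the first differing byte, or -1 if none differ. */
int snapshot_first_diff(const payload_snapshot_t *s,
                        const uint8_t *buf, size_t len)
{
    size_t i;
    size_t n;

    assert(s != NULL && buf != NULL);
    n = (s->len < len) ? s->len : len;
    for (i = 0; i < n; i++) {
        if (s->data[i] != buf[i]) {
            return (int)i;
        }
    }
    return -1;
}
```

If the corruption matches what the sniffer shows, a comparison made just before zclProcessInReadCmd() would be expected to return offset 0, and moving the comparison earlier stage by stage should show where the byte first changes.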
This behaviour is really baffling, since these frames are encrypted and therefore the possibility of packets getting corrupted during transmission is basically out of the question. Most of the ZCL Read Attribute packets are received correctly, but the problem still occurs often enough to prevent the compliance tests from passing at the moment. It is also quite possible that similar corruption occurs with other types of command frames as well; they just haven't been tested as extensively. I have noticed that Certificate-Based Key Establishment occasionally fails for no obvious reason and succeeds on retry, so that could be related too.
Any ideas what to do about this?