CC1314R10: 802.15.4 data polling problem

Part Number: CC1314R10
Other Parts Discussed in Thread: SYSCONFIG

Tool/software:

The problem is very similar to the one described in this thread:

https://e2e.ti.com/support/wireless-connectivity/sub-1-ghz-group/sub-1-ghz/f/sub-1-ghz-forum/1190449/cc1310-what-are-the-possible-causes-of-a-sensor-node-disconnects-from-a-collector-in-ti-15-4-network-with-frequency-hopping-enabled

After adding 20-21 sensors, we observe incorrect behavior.

From the dump we see that the coordinator does not send data to the sensor after a poll request. What can we check in our code?

The RX and TX queue sizes are set to 8. The problem depends on the number of sensors in the network.

  • 8865.test.zip

    Network traffic dump

  • Can nobody help us?

  • Hi,

    Thank you for posting and thanks for the log.

    I would advise you to check the collector statistics when this happens (global variable: Collector_statistics).

    Can you give an estimate of how often it fails? (Is it every packet once the network size reaches 20 sensors?)

    Cheers,

    Marie H

  • We will try to use MAC_STATS functionality.

    The problem occurs randomly once the network reaches about 20 sensors with an upload period of 2 seconds; not every request fails.

  • Hi,

    Please do.

    Can you also check whether increasing the TX and RX buffers on the collector helps? (If no TX buffer is available when the collector application tries to queue the data packet, it will not be able to send it.)

    Cheers,
    Marie H
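
The buffer exhaustion described above can be sketched as a small counting pool. This is only an illustration under assumptions: `TxPool`, `txPoolAlloc` and `txPoolFree` are hypothetical names and are not part of the TI 15.4-Stack API; the pool size of 8 is taken from the queue size mentioned in the first post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_POOL_SIZE 8  /* queue size of 8, as stated in the first post */

/* Hypothetical bookkeeping, not a TI 15.4-Stack structure. */
typedef struct {
    uint8_t  inUse;      /* buffers currently held by pending transmissions */
    uint32_t allocFails; /* how often a send found no free buffer */
} TxPool;

/* Try to reserve one buffer for an outgoing packet. */
bool txPoolAlloc(TxPool *pool)
{
    assert(pool != NULL);
    if (pool->inUse >= TX_POOL_SIZE) {
        pool->allocFails++;  /* the packet could not be queued */
        return false;
    }
    pool->inUse++;
    return true;
}

/* Release a buffer once its transmission has been confirmed. */
void txPoolFree(TxPool *pool)
{
    assert(pool != NULL);
    if (pool->inUse > 0) {
        pool->inUse--;
    }
}
```

Counting failed allocations in this way would show whether the collector ever runs out of buffers while many sensors have data pending at the same time.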

  • We repeated the test and displayed the counters listed below. None of them increased except sensorMessagesReceived.

    Collector_statistics.ackFailures

    Collector_statistics.channelAccessFailures
    Collector_statistics.otherTxFailures
    Collector_statistics.rxDecryptFailures
    Collector_statistics.txEncryptFailures
    Collector_statistics.txTransactionExpired
    Collector_statistics.txTransactionOverflow
    Collector_statistics.sensorMessagesReceived

    It is very strange: only one sensor gets no response, yet between that sensor's data requests (between its retries) we see successful communication between the coordinator and other sensors. It looks as if communication with just one sensor gets "stuck". The next time, it happens with a different, random sensor.
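
One way to make the counter check above repeatable is to take a snapshot before and after a failed poll and report which counters moved. The sketch below assumes a local struct that mirrors only the fields named in this post; the real `Collector_statistics` type is defined in the TI example sources and may differ. `StatsSnapshot`, `statsChanged` and the `STAT_*` flags are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the counters listed above (assumed layout). */
typedef struct {
    uint32_t ackFailures;
    uint32_t channelAccessFailures;
    uint32_t otherTxFailures;
    uint32_t rxDecryptFailures;
    uint32_t txEncryptFailures;
    uint32_t txTransactionExpired;
    uint32_t txTransactionOverflow;
    uint32_t sensorMessagesReceived;
} StatsSnapshot;

/* One flag per counter, in the same order as the struct fields. */
enum {
    STAT_ACK_FAILURES            = 1u << 0,
    STAT_CHANNEL_ACCESS_FAILURES = 1u << 1,
    STAT_OTHER_TX_FAILURES       = 1u << 2,
    STAT_RX_DECRYPT_FAILURES     = 1u << 3,
    STAT_TX_ENCRYPT_FAILURES     = 1u << 4,
    STAT_TX_TRANSACTION_EXPIRED  = 1u << 5,
    STAT_TX_TRANSACTION_OVERFLOW = 1u << 6,
    STAT_SENSOR_MSGS_RECEIVED    = 1u << 7
};

/* Return a bitmask of the counters that differ between two snapshots. */
uint32_t statsChanged(const StatsSnapshot *before, const StatsSnapshot *after)
{
    uint32_t mask = 0;
    assert(before != NULL && after != NULL);
    if (before->ackFailures != after->ackFailures)                     mask |= STAT_ACK_FAILURES;
    if (before->channelAccessFailures != after->channelAccessFailures) mask |= STAT_CHANNEL_ACCESS_FAILURES;
    if (before->otherTxFailures != after->otherTxFailures)             mask |= STAT_OTHER_TX_FAILURES;
    if (before->rxDecryptFailures != after->rxDecryptFailures)         mask |= STAT_RX_DECRYPT_FAILURES;
    if (before->txEncryptFailures != after->txEncryptFailures)         mask |= STAT_TX_ENCRYPT_FAILURES;
    if (before->txTransactionExpired != after->txTransactionExpired)   mask |= STAT_TX_TRANSACTION_EXPIRED;
    if (before->txTransactionOverflow != after->txTransactionOverflow) mask |= STAT_TX_TRANSACTION_OVERFLOW;
    if (before->sensorMessagesReceived != after->sensorMessagesReceived) mask |= STAT_SENSOR_MSGS_RECEIVED;
    return mask;
}
```

For the observation reported here, the result would contain only `STAT_SENSOR_MSGS_RECEIVED`, which matches "received, but no TX failure was counted".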

  • Hi,

    Do you have the frame counter enabled (I think it goes under MAC_SECURITY)? If this one gets out of sync, the collector will start ignoring packets (for security reasons).

    Cheers,
    Marie H

  • We will try. Could you please explain what "gets out of sync" means?

  • Hi,

    "Out of sync" means in this case that the frame counter no longer matches the packet number.

    Since you see the Collector_statistics.sensorMessagesReceived counter increasing, the message is being received, but some error prevents a response from being sent to the sensor.

    This counter is incremented in dataIndCB on the collector. You can use that callback to inspect the received message if needed.

    Kind regards,
    Theo
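
To illustrate what an out-of-sync frame counter can look like, here is a simplified replay check in the general style of IEEE 802.15.4 incoming-frame security. It is a sketch under assumptions, not the stack's implementation: `PeerSecurityState` and `acceptFrame` are hypothetical names, and the actual MAC may apply additional rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state kept by the receiver. */
typedef struct {
    uint32_t nextExpected; /* lowest frame counter still acceptable */
} PeerSecurityState;

/* Accept a secured frame only if its counter has not been seen before. */
bool acceptFrame(PeerSecurityState *peer, uint32_t frameCounter)
{
    assert(peer != NULL);
    if (frameCounter == UINT32_MAX) {
        return false;                 /* counter exhausted, treated as an error */
    }
    if (frameCounter < peer->nextExpected) {
        return false;                 /* looks like a replay, so drop it */
    }
    peer->nextExpected = frameCounter + 1u;
    return true;
}
```

With this rule, a sender whose counter restarts at a lower value (for example after a reset) would have every frame rejected until its counter catches up with the value the receiver remembers.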

  • Thanks Theo,

    We placed Collector_statistics.sensorMessagesReceived++; in dataIndCB, with sendMsg right after it, so our program does send data to the sensor. The problem is that the coordinator does not send it after the poll request. We see an ACK after the DATA and DATA REQUEST frames are sent to the coordinator, so I guess the coordinator received the poll request, but no data follows.

  • Hi,

    with a poll request the sensor requests data from the collector. The collector should acknowledge the poll request and send data to the sensor.

    Can you track in the collector dataIndCB that it received the poll request and that a message is sent to the sensor returning mac success in the collector dataCnfCB?

    Please help me to understand at which point of the communication it fails.

    Kind regards,
    Theo
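
The tracking requested above could be done with a small per-sensor trace that the collector callbacks fill in, followed by a classification of where the exchange stopped. Everything in this sketch is hypothetical (`PollTrace`, `FailStage`, `classifyTrace`); it only assumes that the application can set a flag from each relevant callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags, set by the application from its MAC callbacks. */
typedef struct {
    bool pollIndSeen;    /* the collector was notified of the poll request */
    bool dataQueued;     /* the application handed a message to the MAC */
    bool dataCnfSeen;    /* the data confirm callback fired */
    bool dataCnfSuccess; /* the confirm reported success */
} PollTrace;

typedef enum {
    STAGE_NO_POLL,        /* the poll never reached the application */
    STAGE_NOTHING_QUEUED, /* poll seen, but no data was waiting */
    STAGE_NO_CONFIRM,     /* data queued, but the MAC never confirmed it */
    STAGE_CONFIRM_FAILED, /* the MAC confirmed with an error status */
    STAGE_OK              /* the data was sent and confirmed */
} FailStage;

/* Report the first stage of the poll exchange that did not complete. */
FailStage classifyTrace(const PollTrace *t)
{
    assert(t != NULL);
    if (!t->pollIndSeen)    return STAGE_NO_POLL;
    if (!t->dataQueued)     return STAGE_NOTHING_QUEUED;
    if (!t->dataCnfSeen)    return STAGE_NO_CONFIRM;
    if (!t->dataCnfSuccess) return STAGE_CONFIRM_FAILED;
    return STAGE_OK;
}
```

Logging the resulting stage for the affected sensor each time it goes without a response would narrow down whether the data is missing in the application, stuck in the MAC, or rejected on transmit.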

  • We ran a new test with MAC security disabled:

    #define CONFIG_SECURE                false 

    on sensor side and

    on collector side

    Below is a network dump filtered for the one sensor with the problem. As you can see, the coordinator fails to respond 5 times, the sensor changes state to Orphan, and then a coordinator realignment event follows. The dump does not show it, but the coordinator sends an ACK to the sensor after every message, whether DATA or DATA REQUEST.

    network dump.zip

  • Hi Theo,

    not sure I understood your question. For polling there is another callback function pollIndCB, why should we track dataIndCB for a poll request?

  • network dump full.zip

    Full network dump

  • Hi SV, 

    thank you for the full capture. 

    I apologize for the typo. You are right that the polling indication callback is pollIndCB. What I was referring to is that when the collector sends a message successfully, the MAC layer reports ApiMac status success in dataCnfCB.

    But on your sniff log we can now clearly see that the collector does not send data out to the polling sensor and because of that it transitions to orphan.

    The log also shows that the collector receives another data request after 2 s. I think we actually might run into timing issues here because the collector tries with the configuration message to setup the sensor timers such that each sensor has a different slot.


    To confirm before we go on with further debugging, have you tested the same network with a 5 s polling and reporting interval instead of 2 s and does the error persist?

    Kind regards,
    Theo

  • We changed the period. The problem occurs less often with a 5 s period and even less often with a 10 s period, so it depends on the period.

    I am not sure I understood "collector tries with the configuration message to setup the sensor timers such that each sensor has a different slot."

    We modified the original coordinator code so that the collector doesn't send any config to the sensors. We use only data messages without any config options because we don't need them. Could you please explain more about the coordinator algorithm, the config messages, and the time slots? Is the time slot necessary? I would expect that a message can arrive at the coordinator at any random time and should still be processed.

  • Hi,

    thank you for the confirmation. Let me explain what is happening.

    When you are running the 15.4-Stack and a sensor joins the collector, it needs to send a configuration request to the collector. The collector then sends a configuration message to the sensor, which is received in sensor.c processConfigRequest(). This re-initializes the polling and reporting timers. Because the configuration message is sent to different sensors at different times, their timers end up triggering in different "slots"; without this step, the timers can overlap or conflict.

    In your case it seems that exactly this kind of collision leads to some sensors no longer receiving data.

    Did you delete the processConfigRequest() (sensor) and configuration message sending (collector) from your application?

    Kind regards,
    Theo


  • We removed processConfigRequest and the config processing on the coordinator side. Could you please explain what the coordinator does before sending the config? As I understand it, the sensors just change the period for uploading data. How does the coordinator choose the timeslot?

  • I have checked generateConfigRequests in the new example project and didn't find any "timeslots" there.

    The collector uses only these predefined intervals:

    CONFIG_REPORTING_INTERVAL
    CONFIG_POLLING_INTERVAL

  • Hi SV,

    I think this explains the issue. The config request is needed for the network setup.
    Please see the procedure summary below.

    Project Setup
    - In the sensor SysConfig you configure the reporting and polling interval.
    - In the collector SysConfig you configure the reporting and polling interval.

    Network Join
    - The collector opens the network.
    - The sensor starts looking for a network and uses the reporting and polling interval from the sensor SysConfig.
    - The sensor joins the collector network.
    - The sensor sends a configuration request to the collector, which replies with a configuration message.
    - The sensor processes the configuration message and re-initializes its network timers with the collector's SysConfig reporting and polling interval.

    This last step ensures that the network timers of the sensors are reinitialized at different points in time which helps to avoid collisions and creates "different slots".

    Could you please try to use the configuration message procedure?

    Kind regards,
    Theo
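
The "different slots" idea in the summary above comes down to timer phase: two sensors with the same period fire close together only if their timers were restarted at nearly the same point within the period. The sketch below illustrates this with hypothetical helpers (`timerPhaseMs`, `phasesCollide`) and an assumed guard window; neither is a stack parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Phase of a periodic timer: offset of its restart time within the period. */
uint32_t timerPhaseMs(uint32_t restartTimeMs, uint32_t periodMs)
{
    assert(periodMs > 0);
    return restartTimeMs % periodMs;
}

/* True if two timers with the same period fire within guardMs of each other. */
bool phasesCollide(uint32_t restartA, uint32_t restartB,
                   uint32_t periodMs, uint32_t guardMs)
{
    uint32_t pa = timerPhaseMs(restartA, periodMs);
    uint32_t pb = timerPhaseMs(restartB, periodMs);
    uint32_t d  = (pa > pb) ? (pa - pb) : (pb - pa);

    /* Phases wrap around, so also measure the distance across the boundary. */
    if (periodMs - d < d) {
        d = periodMs - d;
    }
    return d < guardMs;
}
```

Two sensors restarted at the same instant keep colliding every period, while sensors restarted 300 ms apart with a 2000 ms period stay 300 ms apart, apart from clock drift.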

  • Below is the algorithm by which the collector chooses the "timeslot".

    As you can see, a fixed data upload period is used for all sensors, and I don't see any individual per-sensor interval. Please correct me if I am wrong.

    We use a fixed CONFIG_REPORTING_INTERVAL, and that is why we don't need config messages.

    CONFIG_REPORTING_INTERVAL
    CONFIG_POLLING_INTERVAL

    /*!
     * @brief Generate Config Requests for all associate devices
     *        that need one.
     */
    static void generateConfigRequests(void)
    {
    #ifndef POWER_MEAS
        int x;

        if(CERTIFICATION_TEST_MODE)
        {
            /* In Certification mode only back to back uplink
             * data traffic shall be supported*/
            return;
        }

        /* Clear any timed out transactions */
        for(x = 0; x < CONFIG_MAX_DEVICES; x++)
        {
            if((Cllc_associatedDevList[x].shortAddr != CSF_INVALID_SHORT_ADDR)
               && (Cllc_associatedDevList[x].status & CLLC_ASSOC_STATUS_ALIVE))
            {
                if((Cllc_associatedDevList[x].status &
                    (ASSOC_CONFIG_SENT | ASSOC_CONFIG_RSP))
                   == (ASSOC_CONFIG_SENT | ASSOC_CONFIG_RSP))
                {
                    Cllc_associatedDevList[x].status &= ~(ASSOC_CONFIG_SENT
                                                          | ASSOC_CONFIG_RSP);
                }
            }
        }

        /* Make sure we are only sending one config request at a time */
        if(findDeviceStatusBit(ASSOC_CONFIG_MASK, ASSOC_CONFIG_SENT) == NULL)
        {
            /* Run through all of the devices */
            for(x = 0; x < CONFIG_MAX_DEVICES; x++)
            {
                /* Make sure the entry is valid. */
                if((Cllc_associatedDevList[x].shortAddr != CSF_INVALID_SHORT_ADDR)
                   && (Cllc_associatedDevList[x].status & CLLC_ASSOC_STATUS_ALIVE))
                {
                    uint16_t status = Cllc_associatedDevList[x].status;

                    /*
                     Has the device been sent or already received a config request?
                     */
                    if(((status & (ASSOC_CONFIG_SENT | ASSOC_CONFIG_RSP)) == 0))
                    {
                        ApiMac_sAddr_t dstAddr;
                        Collector_status_t stat;

                        /* Set up the destination address */
                        dstAddr.addrMode = ApiMac_addrType_short;
                        dstAddr.addr.shortAddr =
                            Cllc_associatedDevList[x].shortAddr;

                        /* Send the Config Request */
                        stat = Collector_sendConfigRequest(
                                &dstAddr, (CONFIG_FRAME_CONTROL),
                                (CONFIG_REPORTING_INTERVAL),
                                (CONFIG_POLLING_INTERVAL));
                        if(stat == Collector_status_success)
                        {
                            /*
                             Mark as the message has been sent and expecting a response
                             */
                            Cllc_associatedDevList[x].status |= ASSOC_CONFIG_SENT;
                            Cllc_associatedDevList[x].status &= ~ASSOC_CONFIG_RSP;
                        }

                        /* Only do one at a time */
                        break;
                    }
                }
            }
        }
    #endif
    }

  • Hi SV, there is no individual interval.

    When the sensor receives the configuration message from the collector, it reinitializes its polling and reporting timers.

    The configuration message is sent to all sensors at different points in time which means they reinitialize their timers at different times.

    This leads to "different slots".

    You can see the reinitialization of the sensor timers in sensor.c -> processConfigRequest().

    Please test using this configuration message as it works by default.

    Kind regards,
    Theo

  • Thanks Theo,

    Unfortunately, it will be very hard to test because we would have to rewrite a big part of the code. I also still do not understand how it would help, since the collector doesn't expect a packet from the sensor at a predefined time. (Real timeslots would also require the clocks on the coordinator and the sensor to be synchronized, and that functionality doesn't exist.) In any case the arrival time will be random, and the collector should process the packet.

  • What else can we do to diagnose the problem?

  • Hi SV, 

    When idle, the collector is always in RX.

    It will help because the collector sends the configuration message to each sensor at a different time. The sensor timers are therefore reinitialized at different times, which helps to reduce collisions.

    What I would advise you to do is to implement your own "configuration" message. Once the network is set up, you can send a message to each sensor, one by one, that triggers it to reinitialize its polling and reporting timers. Doing this one sensor at a time will create timeslots.

    Kind regards,
    Theo
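
If a custom "configuration" message is used as suggested above, one simple way to space the sensors is to delay each restart trigger by an equal share of the reporting period. This is a sketch of that arithmetic only; `staggerOffsetMs` is a hypothetical helper, and how the trigger message is sent is left to the application.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Delay (in ms) after which the restart trigger for sensor `index`
 * (0-based) is sent, so that `deviceCount` sensors are spread evenly
 * over one period.
 */
uint32_t staggerOffsetMs(uint32_t index, uint32_t deviceCount, uint32_t periodMs)
{
    assert(deviceCount > 0 && index < deviceCount);
    /* 64-bit intermediate avoids overflow for large periods. */
    return (uint32_t)(((uint64_t)index * periodMs) / deviceCount);
}
```

For the setup discussed in this thread, 20 sensors with a 2 s period would be triggered 100 ms apart. Without clock synchronization the spacing drifts over time, so the triggers would probably need to be repeated occasionally.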

  • We ran a new test with a single sensor and saw the same problem, so it is not caused by collisions.

    18.07.25-sensor_side-wireshark (1).pcapng.zip

  • Hi SV, thank you for running this test and sharing the log. 

    Since you are using the Linux coprocessor for the collector and the embedded sensor: How did you remove the configuration message in code and what other modifications have you made to the applications?

    One of the timers causes the sensor to be orphaned. To track down which one it is, I need to understand what the sensor expects to receive.

    Kind regards,
    Theo