RTOS/CC3220: What can I do about the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error?

Part Number: CC3220


Tool/software: TI-RTOS

Hi, TI Engineer,

I used the CC3220 to complete initial product development, but when the router connected to the device uses the 802.11ac protocol, the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error will occur at high frequency, especially when the signal is weak.
I tried calling sl_Stop followed by sl_Start after the error occurs, but it didn't help; the whole chip still restarted (WDT). I don't want the whole chip to restart, as that causes other problems. How can I restart only the network processor and recover, without resetting the entire chip after this error?

Waiting for your reply online, thanks!

  • Hi user1889926,

    The CC3220 only supports 802.11bgn in station mode. Please see the CC3220 datasheet: www.ti.com/.../cc3220.pdf

    Best regards,
    Sarah
  • Hi, Pelosi,

    According to the datasheet, the CC3220 only supports 802.11b/g/n. However, 802.11ac specifies backward compatibility with 802.11b/g/n, so the CC3220 should be able to communicate with the router using 802.11b/g/n. Why does this error occur during communication?

    Best wishes,
    paul
  • Hi Paul,

    A router that supports 802.11ac should fall back to 802.11n, unless you have disabled that somehow. But the CC3220 will not connect as 802.11ac or reach those speeds. What do you define as "high frequency"?

    Errors are expected at the edge of a router's range where the signal is weak no matter the standard. What are you seeing for your error event DeviceAssert Code and Value?

    If you get a fatal error, the only option is to restart the network processor, but this shouldn't restart the MCU. If you are hitting the watchdog because of inactivity, what are you setting the timeout on sl_Stop to? We suggest 200 ms.


    Best regards,
    Sarah

  • Hi, Sarah,


    The device can connect to the 802.11ac router. The network configuration of the CC3220 follows the Network Terminal demo; there is no so-called "High Frequency" definition!

    After an error occurs, the printed information is as follows:
                [ERROR] - FATAL ERROR: Abort NWP event detected: AbortType=1, AbortData=0x263df

    I tried to restart the network device, but it didn't work. Meanwhile, the following error message also appears:
                [ERROR] - FATAL ERROR: Async event timeout detected [event opcode =0x8]

    The restart code is as follows:

    RetVal = sl_Stop(SL_STOP_TIMEOUT); //SL_STOP_TIMEOUT = 200
    ASSERT_ON_ERROR(RetVal, DEVICE_ERROR);

    RetVal = sl_Start(0, 0, 0);

    ASSERT_ON_ERROR(RetVal, DEVICE_ERROR);
    SocketCreat();

    With the above code, the CPU still hangs after the error occurs, and the other application code does not run. It looks like there is a while(1) statement that takes up the entire CPU.

    How can I restart the network device without restarting the CPU?

    Best wishes,

    paul 

  • Hi Paul,

    Your original question said "the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error will occur at high frequency, especially when the signal is weak." I was asking what you meant by high frequency there.

    sl_Stop and sl_Start only reset the network processor, not the application MCU. Are you getting any return values from sl_Stop or sl_Start? Is there a SimpleLink API function that is not returning? We need to know where the application is hanging. Please post your terminal output from when the error occurs.


    Best regards,
    Sarah

  • Hi, Sarah,

    I am sorry that I did not respond sooner; I was busy with other things.
    The SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error occurs within about 2 minutes if the signal is very weak (I removed the antenna from the device).
    When the error occurred, I added some more prints and found that after sl_Stop() completed and sl_Start() was executed, a hang occurred, and the following output was printed.

    [NETAPP EVENT] IP set to: IPv4=192.168.1.241 , Gateway=192.168.1.1
    user:
    [ERROR] - FATAL ERROR: Abort NWP event detected: AbortType=1, AbortData=0x6af5
    SL stop finish
    SL begin start
    [ERROR] - FATAL ERROR: Async event timeout detected [event opcode =0x8]

    [line:779, error code:-2018] Device error, please refer "DEVICE ERRORS CODES" section in errors.h
    SL stop finish
    [line:747, error code:-2005] Device error, please refer "DEVICE ERRORS CODES" section in errors.h

    Please help diagnose the cause of the problem and solve it. Thank you!

    yours,

    paul

  • Hi paul,

    Did you call sl_Stop()/sl_Start() directly from an asynchronous handler? Which SDK version and ServicePack version do you have on the device?

    Jan

  • Hi, Jan,
    I call sl_Stop() and sl_Start() in the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT case of the SimpleLinkFatalErrorEventHandler() function.

    The version information printed by the device is as follows:
    Platform: CC3220R
    CHIP ID: 0x31000010
    MAC: 2.0.0.0
    PHY: 2.2.0.6
    NWP: 3.6.0.3
    ROM: 0
    HOST: 2.0.1.26
    MAC address: 90:70:65:02:f3:e9

    yours,
    paul
  • Hi paul,

    Do you call sl_Stop() and sl_Start() directly inside SimpleLinkFatalErrorEventHandler()?

    It looks like your code is based on SDK version 1.60 and the ServicePack from that SDK version. This SDK version is slightly outdated but should be fine, though using the latest SDK version is recommended. You should at least update the ServicePack to the latest version: the ServicePack you are using is affected by the KRACK vulnerability, and the newest ServicePack may also fix other issues.

    Jan

  • hi, Jan,
    Yes, I call sl_Stop() and sl_Start() directly inside the SimpleLinkFatalErrorEventHandler() function, following document SWRU455E (p. 35)!
    What is the correct way to handle a fatal error?

    The SDK version is 1.60, and I used the ServicePack from that SDK. Does the ServicePack version need to match the SDK version?

    yours,
    paul
  • Hi paul,

    Calling any sl_ API from interrupt or asynchronous handler context is prohibited, especially in all SDK versions before 2.20. Calling an sl_ API from asynchronous handlers is the best way to crash the SimpleLink driver and the RTOS itself. I suppose that after calling sl_Stop()/sl_Start() inside the async handler, you end up in a hard fault.
    Page 35 of SWRU455 does not state that you need to perform the NWP restart directly from handler context.

    Yes, each SDK version is tied to a particular ServicePack version and validated with it. ServicePacks are backward compatible with previous SDK versions. Using a newer ServicePack with an older SDK will probably not be recommended by any TI engineer from the applications team, but in my experience you will not have any issues.

    Jan
  • Hi, Jan,


    You said earlier that calling any sl_ API from an interrupt or asynchronous handler is prohibited, then you said the best way is to call the API in an asynchronous interrupt. I am confused. Please specify how to call sl_Stop() and sl_Start() after the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error occurs. Thanks!


    I tried calling sl_Stop() and sl_Start() in the main thread, but after the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error occurred, the calls did not run; the thread still seems to be suspended.

    Please tell me where and how to call sl_Stop() and sl_Start() when the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error occurs. Thanks!

    yours,

    paul

  • Hi paul,

    I said that calling an sl_ API from interrupt or asynchronous handler context is the best way to CRASH the driver and the RTOS (= crash your code).

    You need to perform the NWP restart from thread context. That means you need to signal from the handler and service the event in a task. You also need to make sure that you do not call any other sl_ API once the abort event has been signalled.
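
    The split between signalling in the handler and servicing in a task could be sketched roughly as follows. This is a minimal, host-runnable illustration and not code from the SDK: g_restart_pending, on_fatal_abort(), control_task_poll() and nwp_restart() are hypothetical names, and a C11 atomic flag stands in for the semaphore or event object an RTOS application would more likely use.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag shared by the event handler and the control task.
 * On TI-RTOS a semaphore or event object would play this role. */
static atomic_bool g_restart_pending;

/* Counts restarts so the sketch can be exercised on a host machine. */
static int g_restart_count;

/* Hypothetical stand-in for the NWP restart. On the target, this is
 * where sl_Stop(200) followed by sl_Start(0, 0, 0) would be called. */
static int nwp_restart(void)
{
    g_restart_count++;
    return 0;
}

/* Called from the fatal-error handler: record the event and return at
 * once. No sl_ API is touched in handler context. */
void on_fatal_abort(void)
{
    atomic_store(&g_restart_pending, true);
}

/* One pass of the control task loop, running in thread context.
 * Returns true when a pending restart was served successfully. */
bool control_task_poll(void)
{
    /* Read and clear in one step so each signal is served only once. */
    if (atomic_exchange(&g_restart_pending, false)) {
        return nwp_restart() == 0;
    }
    return false;
}
```

    On the target, on_fatal_abort() would be the only call made from the SL_DEVICE_EVENT_FATAL_DEVICE_ABORT case of SimpleLinkFatalErrorEventHandler(), and control_task_poll() would run inside a task loop.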

    BTW, you should consider updating to the latest SDK version. The update from version 1.60 to 2.30 should go smoothly.

    Jan
  • Hi, Jan,
    Thank you for your patient answer. So where should I add this code (sl_Stop() and sl_Start())? The SL_DEVICE_EVENT_FATAL_DEVICE_ABORT error occurs during normal device operation; there is no thread context here, and the error event seems to happen in the background.
    I'm trying to update the SDK and ServicePack to the latest version now!

    yours,
    paul
  • Hi paul,

    I don't know how your code is designed, so I cannot suggest an exact place. In general, you need to set a flag (e.g. a semaphore) in the event handler and react to this flag inside your tasks. You need to break normal code execution and restart the NWP (sl_Stop/sl_Start).

    Jan
  • Hi, Jan,

    I understand what you mean. I tried setting an error event flag after the error occurs, then checking the flag and performing the restart in another task. However, after the error event occurs, the restart code in the other task is not executed. In other words, once the error event occurs, the system hangs and nothing else is executed; that is what I meant earlier by a while(1) statement somewhere.
    That is why I don't know where to add my sl_Stop/sl_Start!

    yours,
    paul
  • Hi paul,

    That while(1) loop is not just "somewhere"; it is probably in the hard fault handler. For how to debug exceptions in TI-RTOS, see training.ti.com/debugging-common-application-issues-ti-rtos from 7:16.

    There is no reason that FATAL_DEVICE_ABORT would cause a hard fault unless you are doing something significantly wrong in your code. I am not aware of any bug in SimpleLink driver 1.60 that could cause a similar issue.
    I have seen a hard fault with SDK 2.20 when sl_Stop() was called while another sl_ API was still executing. This was easy to solve with an additional critical section around the sl_ API calls (it looks like this is solved in the current 2.30).
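
    One way to picture such a guard is sketched below. It is an assumption about the general idea, not the fix that was actually applied: api_gate_try_enter(), api_gate_leave(), try_guarded_restart() and fake_restart() are hypothetical names, and a non-blocking C11 atomic_flag try-lock stands in for the blocking mutex or semaphore an RTOS application would normally use.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate, held for the duration of any sl_ call. */
static atomic_flag g_api_busy = ATOMIC_FLAG_INIT;

/* Try to take the gate. Returns false if another caller holds it. */
static bool api_gate_try_enter(void)
{
    return !atomic_flag_test_and_set(&g_api_busy);
}

/* Release the gate after the guarded call has returned. */
static void api_gate_leave(void)
{
    atomic_flag_clear(&g_api_busy);
}

/* Host-side stub so the sketch runs without the SimpleLink driver. */
static int g_fake_restarts;
static int fake_restart(void)
{
    g_fake_restarts++;
    return 0;
}

/* Runs `restart` (standing in for the sl_Stop()/sl_Start() pair) only
 * if no other guarded call is in flight. Returns false when the gate is
 * busy or the restart itself fails, so the caller can retry later. */
bool try_guarded_restart(int (*restart)(void))
{
    if (!api_gate_try_enter()) {
        return false;
    }
    int rc = restart();
    api_gate_leave();
    return rc == 0;
}
```

    Every other task would take the same gate around its own sl_ calls, so a restart can never overlap with a call that is still executing.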

    Jan


  • Hi, Jan,

    In recent days, I have completed the upgrade of the SDK and ServicePack. The following is the printed information:
    ============================================

    CHIP: 0x31000010
    MAC: 2.0.0.0
    PHY: 2.2.0.6
    NWP: 3.9.0.6
    ROM: 0
    HOST: 3.0.1.41
    MAC address: 90:70:65:02:e1:af

    ============================================

    I unplugged the antenna and powered on the device. After running for a while, NWP abort errors still occur. For now, I am not trying to avoid the NWP abort error completely, although of course it would be better if it could be solved.

    I hope you can help me reset the NWP on its own without resetting the entire CPU, because that causes other problems. Please give me some specific steps, code examples, etc. Thanks a lot!

    yours,
    paul

  • Hi paul,

    OK, it looks like you have successfully ported your project to the latest SDK version.

    The NWP restart is done by sl_Stop()/sl_Start(), as you know. The code flow should work like this:
    - When you detect an abort event, set a flag (e.g. a global variable or an OS synchronisation object).
    - Other tasks need to react to this flag. A task doing network communication needs to break out of its loop and wait until the NWP is ready again.
    - There should also be one task (a control task) that performs the NWP reset.
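
    The three steps above might look roughly like this. It is a single-threaded, host-runnable sketch under stated assumptions, not SDK code: the state values and the functions mark_nwp_faulted(), network_task_step() and control_task_step() are hypothetical, and counters stand in for the socket traffic and for the sl_Stop()/sl_Start() calls.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical application-level view of the network processor. */
enum { NWP_READY = 0, NWP_FAULTED = 1 };

static atomic_int g_nwp_state;  /* zero-initialised, i.e. NWP_READY */
static int g_sends;             /* stand-in for socket traffic */
static int g_restarts;          /* stand-in for NWP restarts */

/* Step 1: called from the fatal-error handler; only records the fault. */
void mark_nwp_faulted(void)
{
    atomic_store(&g_nwp_state, NWP_FAULTED);
}

/* Step 2: one pass of a network task. It skips its socket work while
 * the NWP is not ready. Returns true if work was done. */
bool network_task_step(void)
{
    if (atomic_load(&g_nwp_state) != NWP_READY) {
        return false;  /* on target: block on a semaphore or sleep here */
    }
    g_sends++;         /* stand-in for a socket send */
    return true;
}

/* Step 3: one pass of the control task, the only place that restarts
 * the NWP. Returns true if a restart was performed. */
bool control_task_step(void)
{
    if (atomic_load(&g_nwp_state) != NWP_FAULTED) {
        return false;
    }
    g_restarts++;      /* stand-in for sl_Stop(200), sl_Start(0, 0, 0) */
    atomic_store(&g_nwp_state, NWP_READY);
    return true;
}
```

    After a real restart the network task would also have to re-create its sockets before resuming, as the restart code posted earlier in this thread does with SocketCreat().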

    Jan
  • Hi, Jan,

    I have adjusted and optimized the code according to your suggestion and re-tested, but NWP errors still occur. I tried adding the sl_Stop and sl_Start code, but the following error occurred!

    [ERROR] - FATAL ERROR: Abort NWP event detected: AbortType=1, AbortData=0x263df

    [line:4256, error code:-2005] Device error, please refer "DEVICE ERRORS CODES" section in errors.h

    Is the restart code in the wrong place? For the NWP error (SimpleLinkFatalErrorEventHandler()), which thread should perform the NWP restart? Thank you!

    yours,

    paul

  • Hi paul,

    I am sorry, I am not familiar with this kind of abort event.

    I don't know how your code is designed and where you call NWP restart.

    Jan
  • Hi Paul,

    If you're still having this issue, I wanted to clarify that SimpleLinkFatalErrorEventHandler() is an event or interrupt handler and not a thread. You can check out an SDK example like cloud_ota to see how we use SignalEvent(APP_EVENT_RESTART) to reset after a fatal event outside of the handler.
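
    As a rough picture of that event-driven approach, here is a small sketch of posting a restart event and dispatching it later from the application thread. It is not the cloud_ota source: app_event_t, signal_event(), next_event() and app_thread_step() are hypothetical, and the plain ring buffer is single-threaded for illustration only. A real application would use an RTOS message queue that is safe to post to from the handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event set for the application thread. */
typedef enum { EVT_NONE, EVT_RESTART, EVT_OTHER } app_event_t;

#define QUEUE_LEN 8u

/* Plain ring buffer. NOT interrupt-safe; illustration only. */
static app_event_t g_queue[QUEUE_LEN];
static size_t g_head;   /* index of the oldest event */
static size_t g_count;  /* number of events waiting */

static int g_restarts;  /* stand-in for NWP restarts */

/* Post an event. Returns false if the queue is full. */
bool signal_event(app_event_t evt)
{
    if (g_count == QUEUE_LEN) {
        return false;
    }
    g_queue[(g_head + g_count) % QUEUE_LEN] = evt;
    g_count++;
    return true;
}

/* Take the oldest event, or EVT_NONE when the queue is empty. */
app_event_t next_event(void)
{
    if (g_count == 0) {
        return EVT_NONE;
    }
    app_event_t evt = g_queue[g_head];
    g_head = (g_head + 1) % QUEUE_LEN;
    g_count--;
    return evt;
}

/* One pass of the application thread: dispatch a single event. */
void app_thread_step(void)
{
    switch (next_event()) {
    case EVT_RESTART:
        g_restarts++;  /* stand-in for sl_Stop() then sl_Start() */
        break;
    default:
        break;         /* other events are ignored in this sketch */
    }
}
```

    The fatal-error handler would only call signal_event(EVT_RESTART); the restart itself happens when the application thread reaches that event in its queue.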

    Best regards,
    Sarah