CC3220SF: HTTP Server all threads blocked, bug appears sporadically

Part Number: CC3220SF
Other Parts Discussed in Thread: SYSCONFIG

Hello

I am using the CC3220SF on a custom board with an MCU device. The CC3220SF (referred to below as the WiFi device) runs an HTTP server based on the one in the portable example with TI-RTOS. The MCU and the WiFi device are connected through UART.
Because the board was WiFi certified with SDK 5.20, we are using that SDK.

The problem is the following:
I connect and open a page served by the WiFi device. The server returns the page and all the scripts and images needed to display it. Once the JS has loaded, it automatically sends a request for the WiFi device and the MCU device to enter UART communication. Approximately 1 in 10 or 20 times, this request fails. When that happens, all subsequent requests that go through the HTTP server fail. We can still get files and images,
and I can reload the page, but mq_receive in the httpServerThread never triggers.

Once this bug happens, the object view shows 5 threads. The Idle task is running and never stops. httpServerThread is blocked on Task_sleep(2), which I cannot find anywhere in the code.
The UART TX thread is blocked waiting on an event, which is as intended. The UART RX task is blocked on "Unknown"(?). It should be blocked waiting for an interrupt on UART RX (a blocking UART2_read call from the driver), so the "Unknown" part may be suspicious. The sl_Task is blocked on a semaphore. I cannot find the TI-RTOS implementation of the sl_Task source code anywhere; could you tell me where to find it in the SDK?

All the tasks remain blocked indefinitely, and we cannot figure out why. Without giving away too many details, as I'm not sure what I'm allowed to share: the UART TX thread will not trigger without receiving data from httpServerThread, and the UART RX thread will not trigger without the MCU receiving data from the WiFi device. All requests to the HTTP server are relatively small and happen automatically and sequentially on load, one after another. As far as we can see, the very first request never triggers the mq_receive, but due to the sporadic and unpredictable nature of the bug it's hard to confirm this.

Refreshing the page reloads it, but HTTP server requests still never go through. The only way to bring it out of this bugged state is to power it off and back on.

All HW initialization was done using the SysConfig GUI from Code Composer 12.
All in all, it seems to me that I have a breakdown of communication between the network processor and the M4 on the CC3220SF, but I have been unable to figure out the reason (for more than 2 weeks now). It's obvious I'm missing some crucial information needed to fix it.

Please advise

  • To add some questions to this: How does the automatic sending of files work? Does the network processor handle that?

  • Update: I have found that the Task_sleep is actually from the Network_IF_InitDriver function, in the last else block.
    The device starts in AP mode, but the g_ulStatus bit for STATUS_BIT_IP_AQUIRED is not set whenever this bug happens.
    The SET_STATUS_BIT for this particular bit is only called in SimpleLinkNetAppEventHandler, which I don't explicitly call anywhere in my code, and I can't find it called anywhere in the server code on which I based this.

    The bug is still sporadic, happening once in every 10 or 20 restarts. Please advise
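
For illustration, the status-bit wait described in this update can be sketched in plain C. This is a minimal sketch under assumptions, not the SDK code: `DEMO_STATUS_BIT_IP`, `wait_for_status_bit`, and the two poll hooks are hypothetical names, and the hook stands in for whatever sleep call the real wait uses. It assumes the wait is a polling loop on the bit, and shows how a bounded wait can return an error to log, rather than leaving the thread parked in a sleep when the event never arrives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit index; the real value comes from the SDK example headers. */
#define DEMO_STATUS_BIT_IP 3u

#define DEMO_SET_BIT(word, bit) ((word) |= ((uint32_t)1u << (bit)))
#define DEMO_GET_BIT(word, bit) ((((word) >> (bit)) & 1u) != 0u)

/* Called once per poll; stands in for the sleep call in the real wait loop. */
typedef void (*PollHook)(volatile uint32_t *status, unsigned poll_index);

/* Polls `bit` in `*status` at most `max_polls` times.
 * Returns 0 once the bit is set, -1 if it was never set. */
int wait_for_status_bit(volatile uint32_t *status, unsigned bit,
                        unsigned max_polls, PollHook hook)
{
    assert(status != NULL && bit < 32u);
    for (unsigned i = 0; i < max_polls; ++i) {
        if (DEMO_GET_BIT(*status, bit)) {
            return 0;
        }
        if (hook != NULL) {
            hook(status, i);
        }
    }
    return DEMO_GET_BIT(*status, bit) ? 0 : -1;
}

/* Simulates the async event handler setting the bit during the third poll. */
void hook_event_on_third_poll(volatile uint32_t *status, unsigned poll_index)
{
    if (poll_index == 2u) {
        DEMO_SET_BIT(*status, DEMO_STATUS_BIT_IP);
    }
}

/* Simulates the failure case: the event is never delivered. */
void hook_event_never_arrives(volatile uint32_t *status, unsigned poll_index)
{
    (void)status;
    (void)poll_index;
}
```

With a bounded wait like this, the failing restarts would surface as a -1 return at a known place in the code, which may be easier to log and investigate than a thread that stays blocked.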

  • Yes, the network processor handles direct access to the file system.

    Only if it can't find a match in the ROM or in the file system will it forward the request to the host (see details in chapter 9 of the programmer's guide).

    Then it should be received by SimpleLinkNetAppRequestEventHandler() - if you didn't change the code there should be a print there ("[Http server task] NetApp Request Received ...."). Can you see this one?

    The sl_Task is implemented in the host driver, in source/ti/drivers/net/wifi/source/spawn.c (see _SlInternalSpawnTaskEntry).
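
The routing rule described in this reply can be sketched as follows. This is only a simplified model, not the network processor's implementation: the type and function names, the demo tables, and the exact-string matching are all assumptions made for illustration. It shows why static files can keep loading even when nothing reaches the host application: only requests with no ROM or file-system match are forwarded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    HANDLED_BY_NWP,    /* matched a ROM page or a file-system entry */
    FORWARDED_TO_HOST  /* no match: delivered to the host application */
} RequestRoute;

/* Hypothetical contents, for the demo only. */
static const char *const demo_rom_pages[] = { "/demo_rom_page.html" };
static const char *const demo_fs_files[]  = { "/index.html", "/logo.png" };

/* Returns 1 if `uri` equals one of the `count` entries in `table`. */
static int uri_in_table(const char *uri, const char *const *table, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(uri, table[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Models the decision: serve locally on any match, otherwise forward. */
RequestRoute route_demo_request(const char *uri)
{
    assert(uri != NULL);
    size_t rom_count = sizeof demo_rom_pages / sizeof demo_rom_pages[0];
    size_t fs_count  = sizeof demo_fs_files / sizeof demo_fs_files[0];

    if (uri_in_table(uri, demo_rom_pages, rom_count) ||
        uri_in_table(uri, demo_fs_files, fs_count)) {
        return HANDLED_BY_NWP;
    }
    return FORWARDED_TO_HOST;
}
```

In this model a request such as the hypothetical `/api/uart` is the only kind that depends on the host side being alive.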

  • If the IP address is lost (the flag should be handled in SimpleLinkNetAppEventHandler; you can add log messages there), you can expect issues in the networking. But the reason is not clear.

  • Kobi, thank you for your responses. I have more information. I have found the sl_Task in the location you mentioned.
    The Display driver and that type of UART usage are not currently available on the custom board, as the UART is connected to the MCU device. I could try running it on the LaunchXL board and report my findings, since we now know that the UART tasks themselves are not the problem.

    For reference, I adapted the portable example using TI SimpleLink Academy.

    Can you confirm a few things: is sl_Task the one in charge of communication between the main and network processors? Are most of the Network_IF functions triggered by sl_Task?

    I have questions about the HTTP server thread:
    sl_Start is called, then sl_Stop, then the status bits are cleared in Network_IF_ResetMCUStateMachine, then sl_Start is called again in Network_IF_InitDriver. What is the reason for the multiple starts of the network processor? (As far as I understand, that is what sl_Start does.)

    From what I see, SimpleLinkNetAppEventHandler is called, the IPV4 event happens, and STATUS_BIT_IP_AQUIRED is set. I'm assuming this happens as a consequence of the first sl_Start. After that, ResetMCUStateMachine happens. Then, when the bug doesn't happen and everything is normal, STATUS_BIT_IP_AQUIRED is set again. However, when the bug happens, the bit is never set again, and the IP addresses (both gateway and station) read 0. I have no idea why this could happen.

    However, the SSID still starts broadcasting either way and I can connect to it. As mentioned before, the files from the file system can still be fetched, but the server requests never reach the server.

    What is the exact meaning of STATUS_BIT_IP_AQUIRED, given that the device is in AP mode?

  • sl_Task handles the communication (command-response, async event and rx data) from the NWP to the host. The thread will invoke all the async event handlers.

    Host commands (and data) are executed in the caller's (i.e. application) thread context. Every command blocks until its immediate response is received; blocking commands like sl_Recv will block until the relevant event (e.g. data reception) happens.

    Some configurations of the device (specifically the NetApp service configuration) will not take effect until the NWP is reset, hence the sl_Stop/sl_Start.

    You can read about it in the programmer's guide (Specifically in appendix B.1).

    In AP mode, STATUS_BIT_IP_AQUIRED will be set as soon as the AP becomes functional.

    I'll need to see some logs to help with the exact issue, but the information here should help you check the application code to eliminate possible race conditions.
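
The dispatch model described in this reply can be illustrated with a small single-threaded sketch. It is not the host driver's code (the real driver uses RTOS threads and synchronization objects), and every name here (`DemoSpawnQueue`, `demo_post`, `demo_dispatch_pending`, `demo_on_ip_event`) is hypothetical. It shows one property that matters when hunting for races: a status bit set by an async handler only changes once the dispatcher context actually runs that handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QUEUE_LEN 4u

typedef void (*DemoHandler)(uint32_t arg);

typedef struct {
    DemoHandler handler;
    uint32_t arg;
} DemoEvent;

typedef struct {
    DemoEvent slots[DEMO_QUEUE_LEN];
    size_t head;   /* index of the next event to dispatch */
    size_t count;  /* events queued but not yet dispatched */
} DemoSpawnQueue;

/* Producer side: stands in for the driver queueing an async event.
 * Returns 0 on success, -1 if the queue is full. */
int demo_post(DemoSpawnQueue *q, DemoHandler handler, uint32_t arg)
{
    assert(q != NULL && handler != NULL);
    if (q->count == DEMO_QUEUE_LEN) {
        return -1;
    }
    size_t tail = (q->head + q->count) % DEMO_QUEUE_LEN;
    q->slots[tail].handler = handler;
    q->slots[tail].arg = arg;
    q->count++;
    return 0;
}

/* Consumer side: stands in for one pass of the dispatcher task's loop.
 * Handlers run here, one at a time. Returns how many were dispatched. */
size_t demo_dispatch_pending(DemoSpawnQueue *q)
{
    assert(q != NULL);
    size_t dispatched = 0;
    while (q->count > 0) {
        DemoEvent ev = q->slots[q->head];
        q->head = (q->head + 1u) % DEMO_QUEUE_LEN;
        q->count--;
        ev.handler(ev.arg);
        dispatched++;
    }
    return dispatched;
}

/* Demo application state, updated only from the handler. */
uint32_t demo_status = 0u;

void demo_on_ip_event(uint32_t bit)
{
    demo_status |= (uint32_t)1u << bit;
}
```

In this model, if the dispatcher never gets to run, the event stays queued and any code polling `demo_status` keeps waiting, which is one reason it can be worth checking what the dispatcher task is blocked on.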

  • So I have a somewhat accidental and unexpected resolution to this, I think.

    Because we started from the portable example, the declarations for the temperature mutex and temperature thread were left in, as well as the initialization of the parameters for those two. However, the thread was never even created (its definition is missing; temperature.c is not even part of the project anymore), and the thread is not visible in ROV.

    Deleting these declarations and parameter initializations seems to have fixed the issue. I have no explanation; I was hoping one of the gurus might, maybe something in TI-RTOS I was not aware of. I'm not sure whether it's resolved or I just made the bug even less common, but so far it has not reappeared. Either way, I will mark this thread as resolved.