RTOS/TM4C1294NCPDT: NDK Http server stops responding until TCP accept()

Part Number: TM4C1294NCPDT

Tool/software: TI-RTOS

I'm having a problem where the NDK's HTTP server will occasionally (after about 5 hours of runtime) stop responding until the TCP daemon accepts a new connection, which somehow changes the network stack state so that the HTTP server starts responding again. Can anyone suggest how I should start debugging this issue? I've attached the debugger, but haven't noticed anything unusual in terms of task preemption or memory usage.

Along similar lines, does anyone have a suggestion of where I might be able to hook into the NDK to kick the watchdog timer? If the NDK does get into a hung state I would like to reset my system.

Thanks,

Nick

  • Hi Nick,

    What is the lease time on your DHCP-supplied IP address? I don't know of any issues related to this, but it might be something to look at. It might also be interesting to have Wireshark running to see if anything fishy is happening around the same time.

    I need to think about the watchdog question a bit more, but you could always make a task that just pings something (e.g. the default gateway).

    Todd
  • Hi Todd,

    The IP is static and reserved by the DHCP server, so there shouldn't be any conflicts. There won't be a DHCP server at all in a production environment. I will run Wireshark today and report back with the results.

    Thanks for the suggestion about the task that pings the default gateway. Can I be sure that this task will hang when the HTTP server hangs? The reason I ask is because the TCP listener task continues to respond normally, otherwise I would just switch to a non-blocking socket and kick the watchdog from that task. I will try it out and see.

    Thanks,

    Nick

  • One thing I immediately noticed using Wireshark is that a bug in one of our JavaScript files was causing it to make HTTP requests to a CGI function at a fairly high rate (10 requests/sec or so). I don't think this should hang the NDK, but I mention it here in case it's related.
  • Regarding the ping task, does the NDK have a way to send an ICMP Ping that doesn't involve setting up a console?
  • Hi Nick,

    Take a look at ti/ndk/tools/console/conping.c in the NDK for an example.

    Todd
  • Thanks, I had looked at that function but was hoping there was something more convenient.

    Is there no way to hook into the stack event loop with a callback? The fact that the TCP server still works when the HTTP server stops responding makes me doubt that the ping method will actually accomplish my goal.

    I haven't been able to re-create the issue with Wireshark open so far, but I will try again today.
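
    One way to address the concern that one task can keep running while another hangs is to kick the watchdog only after every monitored task has checked in. The following is a minimal sketch of that idea in plain C, not an NDK or TI-RTOS API: the names Wdg_checkIn, Wdg_shouldKick and the WDG_TASK_* bits are hypothetical, and the HTTP check-in is assumed to come from something that actually exercises the HTTP server (for example, a task that requests a page from it).

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical bit assignments, one bit per monitored task. */
    #define WDG_TASK_HTTP  0x1u  /* set after a successful HTTP liveness check */
    #define WDG_TASK_TCP   0x2u  /* set by the TCP daemon task */
    #define WDG_TASK_ALL   (WDG_TASK_HTTP | WDG_TASK_TCP)

    /* Check-ins collected since the last watchdog kick. */
    static volatile uint32_t wdgCheckins = 0;

    /*
     * Called by a monitored task to report that it is alive.
     * Assumption: on a real RTOS this read-modify-write must be protected
     * against preemption (for example by briefly disabling interrupts).
     */
    void Wdg_checkIn(uint32_t taskBit)
    {
        wdgCheckins |= taskBit;
    }

    /*
     * Called periodically by the supervising task. Returns true only when
     * every monitored task has checked in since the last kick, and clears
     * the check-ins so that each task has to report again. The caller
     * performs the actual hardware watchdog kick when this returns true.
     */
    bool Wdg_shouldKick(void)
    {
        if ((wdgCheckins & WDG_TASK_ALL) == WDG_TASK_ALL) {
            wdgCheckins = 0;
            return true;
        }
        return false;
    }
    ```

    With this arrangement a hung HTTP path stops the kicks even if the TCP daemon keeps checking in, so the watchdog eventually resets the system.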

  • Hi Nick,

    Were you able to get a Wireshark capture yet?

    I'm particularly interested in seeing what Wireshark shows while you try connecting to the HTTP server and it doesn't respond.

    It could also be useful to print the TCP socket table during this time. I have an idea for how to do this.

    Since you have a TCP daemon that doesn't have this response problem, then you could modify it to call a new function that prints the socket table out.

    The idea is this:

    1. Run the app, wait for ~5 hours
    2. Run Wireshark (after 5 hours and before trying the HTTP server)
    3. Try to connect to the HTTP server.
      1. Anything interesting showing in Wireshark?
    4. If it doesn't respond, then try connecting to your daemon server (which should respond).
    5. With the update to the daemon thread, it should now print out the sockets.
    6. Is the HTTP socket there listening?

    You can print out the socket table with the code in the attached file. You should be able to just add this file to your project and build it in.

    Then, you can call it as follows to print the TCP sockets:

    SockUtils_printSockTable(IPPROTO_TCP);

    // optionally call System_flush() afterward to get the output to the console.

    Steve

    SockUtils.h

    SockUtils.c
    /*
     * Copyright (c) 2015-2017, Texas Instruments Incorporated
     * All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without
     * modification, are permitted provided that the following conditions
     * are met:
     *
     * *  Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     * *  Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the distribution.
     *
     * *  Neither the name of Texas Instruments Incorporated nor the names of
     *    its contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
     * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
     * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
     * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
     * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
     * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
     * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
     * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
     * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
     * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
     * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     * */
    /*
     *  ======== sockutils ========
     */
    #include <xdc/std.h>
    #include <xdc/runtime/System.h>
    
    #include <ti/ndk/inc/netmain.h>
    #include <_stack.h>
    
    static char *States[] = { "CLOSED","LISTEN","SYNSENT","SYNRCVD",
                              "ESTABLISHED","CLOSEWAIT","FINWAIT1","CLOSING",
                              "LASTACK","FINWAIT2","TIMEWAIT" };
    
    /*
     *  ======== SockUtils_printSockTable ========
     */
    void SockUtils_printSockTable(unsigned int protocol)
    {
        unsigned char *pBuf;
        int entries;
        int i;
        SOCKPCB *ppcb;
        char str[40];
    
        /* check argument passed and translate */
        if (protocol == IPPROTO_TCP) {
            protocol = SOCKPROT_TCP;
        }
        else if (protocol == IPPROTO_UDP) {
            protocol = SOCKPROT_UDP;
        }
        else {
            /* Unknown protocol: report it and bail out before querying the stack */
            System_printf("SockUtils_printSockTable: Error: invalid protocol\n");
            return;
        }
    
        pBuf = mmBulkAlloc(2048);
        if (!pBuf) {
            System_printf("SockUtils_printSockTable: Error: out of memory\n");
            return;
        }
    
        /* Use llEnter / llExit since we're calling into the stack */
        llEnter();
        entries = SockGetPcb(protocol, 2048, pBuf);
        llExit();
    
        System_printf("\nLocal IP         LPort  Foreign IP       FPort  State\n");
        System_printf("---------------  -----  ---------------  -----  -----------\n");
    
        for (i = 0; i < entries; i++) {
            ppcb = (SOCKPCB *)(pBuf + (i * sizeof(SOCKPCB)));
    
            NtIPN2Str(ppcb->IPAddrLocal, str);
            System_printf("%-15s  %-5u  ", str, NDK_htons(ppcb->PortLocal));
    
            NtIPN2Str(ppcb->IPAddrForeign, str);
            System_printf("%-15s  %-5u", str, NDK_htons(ppcb->PortForeign));
    
            if (protocol == SOCKPROT_TCP) {
                System_printf("  %s\n", States[ppcb->State]);
            }
            else {
                System_printf("\n");
            }
        }
        System_printf("\n");
    
        mmBulkFree(pBuf);
    }
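
    The listing above indexes States[] directly with ppcb->State, so an unexpected state value would read past the end of the array. A small bounds-checked lookup avoids that. This is only a sketch: SockUtils_stateName and the "UNKNOWN" fallback are hypothetical additions and not part of the attached file, and the table simply repeats the state names from the listing.

    ```c
    #include <stddef.h>

    /* Same state names, in the same order, as the States[] table above. */
    static const char *const StateNames[] = {
        "CLOSED", "LISTEN", "SYNSENT", "SYNRCVD",
        "ESTABLISHED", "CLOSEWAIT", "FINWAIT1", "CLOSING",
        "LASTACK", "FINWAIT2", "TIMEWAIT"
    };

    /*
     * Hypothetical helper: returns the name for a TCP state value, or
     * "UNKNOWN" when the value is outside the table.
     */
    const char *SockUtils_stateName(unsigned int state)
    {
        size_t count = sizeof(StateNames) / sizeof(StateNames[0]);

        return (state < count) ? StateNames[state] : "UNKNOWN";
    }
    ```

    In the print loop, the TCP branch could then pass SockUtils_stateName(ppcb->State) to System_printf instead of States[ppcb->State].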
    

  • Nick,

    Did this get resolved?

    Todd
  • Thanks for the help. Fortunately I have not been able to reproduce the issue since fixing the bug in our JavaScript that essentially flooded the HTTP service with requests. I'm not sure why accepting a socket connection from the TCP daemon got the HTTP service to start responding again, but it's hard to imagine a non-malicious reason for this to occur (provided our app is bug-free). I'm going to add some integration tests to load test the HTTP server and see if I can re-create the issue, and if I can, I'll get the data you requested.

    Should have it within a day or two.

    Cheers,
    Nick
  • I have been unable to reproduce the issue. Under load, the HTTP server seems to drop connections in a way that is consistent with the connection limit defined in the .cfg file.