RTOS/TM4C129XNCZAD: NDK IP change with HTTP connection

Part Number: TM4C129XNCZAD
Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Trying to change the IP address when using the HTTP server gives me the error: 00010.933 llExit: Illegal call to llExit()

The code changes the IP in response to an HTTP GET which contains the new IP value. I am trying to remove the CFGTAG_IPNET configuration in the same CGI function that received the HTTP GET message.

The CGI function gives the error when calling CfgRemoveEntry(), and CfgAddEntry() is then not able to run. I can't tell for sure because the chip hits loader_exit() when I attempt to single-step over the functions.

Other posts have traced the llExit error to a task priority issue. There is a task called dchild that gets priority 9 (kernel), but I think this task is related to the HTTP server. I did not create this task in XGCONF.

Am I removing IPs in an unreasonable way? What is causing my error?

  • There's a similar situation here: e2e.ti.com/.../2050531

    But Leonardo Muricy's response does not make sense (to me). When I try to add a new IP configuration entry without removing the old one, I get a callback error and a message that says "00006.623 BindNew: Duplicate bindings ignored"

  • Instead of trying to change the IP address in the CGI function, I created a low priority SWI.

    The CGI posts the SWI and it still crashes the stack. The chip ends up at loader_exit(), but the console output is different:
    00010.186 llExit: Illegal call to llExit()
    Network Removed: If-1:192.168.0.1
    00010.186 llEnter: Illegal reentrant call to llEnter()
    Network Added: If-1:192.168.0.100

    When is it safe for the stack to change an IP address?

    Edit: This is actually a bad idea. The CGI function gets interrupted by the SWI, so I haven't actually changed the timing of anything; lateral progress.

  • I made a low priority task which successfully set the new IP address. The CGI function posts this task's semaphore and the task runs after the CGI function. The IP address is changed, but the chip runs into loader_exit() in response to any network activity. Even pinging unrelated IP addresses will put the chip at loader_exit().
  • I set a button press to change the IP address. This worked very well with the HTTP server.

    This seems to be a timing issue. Something happens after the CGI function to make changing the IP address safe.

    The CGI can set some timer for changing the IP address, but this is poor engineering. There must be some authoritative source that tells me when changing the IP is safe...
  • I used the clock module to create a half second delay. The CGI function triggers the clock to start, and then the clock function changes the IP address with no problems.

    This feels more like a "make it fit" fix as opposed to understanding the root cause of a problem and creating a solution.

    Why does the stack crash when the IP is changed within a CGI function?
    How much time has to pass before the IP address should be changed?
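
A host-side sketch of the deferral pattern described in this post: the CGI handler only records the requested address, and a periodic tick applies it later. The names `IpChange`, `ip_change_request`, and `ip_change_tick` are hypothetical, and the `strcpy` into `active` merely stands in for removing and re-adding the CFGTAG_IPNET entry on the target. It models the timing of the workaround only and says nothing about the root cause.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IP_STR_MAX 16  /* "255.255.255.255" plus the terminating NUL */

/* Hypothetical bookkeeping for one deferred IP change (not an NDK type). */
typedef struct {
    int      pending;               /* 1 while a change is waiting */
    unsigned ticks_left;            /* ticks to wait before applying */
    char     requested[IP_STR_MAX]; /* address taken from the HTTP GET */
    char     active[IP_STR_MAX];    /* address currently in use */
} IpChange;

/* Called from the CGI handler: record the request and return at once,
 * so the HTTP reply still goes out from the old address.
 * Returns 0 on success, -1 if the string does not fit. */
int ip_change_request(IpChange *c, const char *ip, unsigned delay_ticks)
{
    assert(c != NULL && ip != NULL);
    if (strlen(ip) >= IP_STR_MAX) {
        return -1;
    }
    strcpy(c->requested, ip);
    c->ticks_left = delay_ticks;
    c->pending = 1;
    return 0;
}

/* Called once per timer tick, outside the CGI handler.
 * Returns 1 on the tick where the change is applied, 0 otherwise. */
int ip_change_tick(IpChange *c)
{
    assert(c != NULL);
    if (!c->pending) {
        return 0;
    }
    if (c->ticks_left > 0) {
        c->ticks_left--;
        return 0;
    }
    /* Assumption: on the target, the real reconfiguration happens here. */
    strcpy(c->active, c->requested);
    c->pending = 0;
    return 1;
}
```

With a delay of 2, the change is applied on the third call to `ip_change_tick`, well after the request function has returned.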

  • Hi Peter,

    Sorry for the delay. The hand-off on handling forum threads did not go smoothly during a vacation.

    Is this still a problem?

    Todd
  • Hi Todd,

    Yes, I still need help. I have more of a workaround than a fix. Everything slows down this time of year, including me! It is vacation season.

    The goal is to change IP address according to a user input. I chose to do configuration through a browser interface by using the HTTP server module. An HTML page is loaded on the chip's firmware. The web browser loads the page and responds to the HTML form with an HTTP GET message. The chip handles the message in a .CGI function. The chip then sends the browser another HTML page that reads "Success!".

    The problem is that the stack breaks when the IP is changed during the CGI function or immediately after it. I get the error "llExit: Illegal call to llExit()" on the console and the chip hits a breakpoint at loader_exit(). In other threads I have seen this error come from tasks that exceed kernel priority, but that is not the case here.

    The workaround is the CGI function sets a 0.5 second timer. The timer ISR then changes the IP. I think the ISR is technically called a HWI in the context of your RTOS.

    This workaround seems dangerous because I am guessing that something changes 0.5 seconds after the HTTP communication to make the IP address change safe. There's a risk that the same something can occur again after 0.5 seconds to make the change unsafe during the ISR.

    Could you tell me what this something is and how to check for it?

  • Hi Peter,

    I just wanted to let you know that I'm looking into this and will get back to you tomorrow.

    Steve

  • Hi Peter,

    I wanted to update you again. I was able to reproduce the problem you are seeing. I'm still digging into it and will reply back here once I have some more insight.

    Steve

  • Steven,

    Did you ever get to the bottom of this problem?
  • Hi Peter,

    Fyi...Steve is out this week. He'll respond when he gets back into the office next week.

    Todd
  • Peter,

    My apologies for the lack of response. I'm still looking at this and will report back shortly. Thanks for your patience.

    Steve

  • How have you been Steven?
  • Hi Peter,

    I have to again apologize for the lack of response, but I have not forgotten about this problem and have been working on it behind the scenes.

    It turns out that this issue is due to a race condition, which has been pretty tough to pinpoint.

    The good news is I have a pretty good handle on it now and am testing out a possible solution. Once I have it working, I'll pass it on so you can try it on your end.

    Steve
  • Peter,

    Just wanted to update you again. The solution I had didn't pan out and actually brought more (related) issues to light regarding this race condition.

    I'm still working through it and will update you again soon.

    Steve
  • Hi Peter,

    I finally have something for you! Please find the attached tarball which addresses the following 4 issues:

      1.  NDK-36 netsrv.c missing llEnter()/llExit() pair around socket clean up code (?)

      2.  NDK-225 Prevent lingering sockets from being freed in SockCleanPcb

      3.  NDK-230 Prevent lingering sockets from being freed when TCP_RST received

      4.  NDK-232 Call Sock6CleanPcb() when an IPv6 address is removed and/or IPv6 is deinitialized

    Issues 2 - 4 were found as a result of fixing #1. They all involved race conditions to free sockets between different threads when an IP address was removed (the issue you hit) or when the stack was rebooted or shut down.
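
As a rough illustration of how unbalanced llEnter()/llExit() calls could produce the two console messages quoted earlier in the thread, here is a host-side model. It assumes the gate behaves like a simple non-reentrant flag; the `Gate` type and the `gate_enter`/`gate_exit` functions are hypothetical and are not taken from the NDK source, whose actual implementation may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a non-reentrant gate (not the NDK implementation). */
typedef struct {
    int inside;  /* 1 while a caller holds the gate */
    int errors;  /* number of illegal calls observed */
} Gate;

/* Enter the gate. A second enter without an intervening exit is reported
 * and otherwise ignored. */
void gate_enter(Gate *g)
{
    assert(g != NULL);
    if (g->inside) {
        g->errors++;
        puts("llEnter: Illegal reentrant call to llEnter()");
        return;
    }
    g->inside = 1;
}

/* Leave the gate. An exit without a matching enter is reported
 * and otherwise ignored. */
void gate_exit(Gate *g)
{
    assert(g != NULL);
    if (!g->inside) {
        g->errors++;
        puts("llExit: Illegal call to llExit()");
        return;
    }
    g->inside = 0;
}
```

In this model a balanced enter/exit pair leaves `errors` at zero, while a stray exit or a nested enter each increment it once.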

    You should be able to drag and drop the packages folder from the attached tarball onto your NDK installation's packages folder (e.g. C:/ti/tirtos_tivac_2_16_00_08/products/ndk_2_25_00_09/packages on your PC) and say "yes to all". Please make sure to back up your NDK installation first, or try this on a copy of the NDK install.

    Once you've updated the files in your NDK with the attached, you will need to rebuild the NDK stack, and then rebuild your application.

    Just reply back here if you need any more help with this or have any questions.

    -Steve

    ndk_2.25.00.09_raceConditionFixes.tar.gz

  • To be cautious, I am building in an ndk copy folder. The copied files appear to have compiled, but I do not know which library to copy back over to my main stack install.

    The NDK user guide says "If you configure the NDK in CCS with the XGCONF configuration tool, the appropriate NETCTRL library is automatically selected based on the modules you enable".

    How can I tell what library is selected by CCS? How can I tell that this library isn't being recompiled by CCS (undoing my changes)?

    The wiki gives instructions for building in debug mode. Do I need to build a copy with and without debug mode?

  • I can see a setting in CCS project properties to generate a makefile in the project's Debug/ folder. There is no reference to an NDK file in this makefile.

    In the console output, I see CCS creating another makefile for SYS/BIOS: "making ../src/sysbios/sysbios.aem4f ...". I do not see any setting in CCS project properties for this makefile. There is no reference to an NDK file here either.

    I found a _linkinfo.xml file that calls out packages\ti\ndk\hal\timer_bios\lib\hal_timer_bios.aem4f and packages\ti\ndk\netctrl\lib\netctrl_ipv4.aem4f. So the NDK libraries are being specified somehow.

    What is telling CCS to make the sys/bios library? How can I tell what NDK libraries are selected by CCS? When would it make NDK libraries?

  • Hi Peter,

    Peter Borenstein said:
    To be cautious, I am building in an ndk copy folder. The copied files appear to have compiled, but I do not know which library to copy back over to my main stack install.

    There are a lot of libraries to copy ... It might be easier to make your project point to the updated/copied/rebuilt NDK.

    Or, you could "trick" your project into using the updated version by renaming the NDK folders.

    For example, you have 2 versions of the NDK install right now. Something like:

    a. ndk_2.25.00.09

    - the original NDK installation

    b. ndk_2.25.00.09_copy

    - the rebuilt/patched NDK installation

    You could rename these to swap them:

    1. rename a. to be "ndk_2.25.00.09_orig"

    2. rename b. to be "ndk_2.25.00.09"

    (Just make sure that "b." is located in the same parent folder as "a.")

    Since your project is just pointing to "ndk_2.25.00.09", it should now bring in the updated libraries.

    Finally, rest assured that you can always go back to the original version of ndk_2.25.00.09 by re-downloading it and dropping it into your TIRTOS products folder.

    Peter Borenstein said:
    How can I tell what library is selected by CCS? How can I tell that this library isn't being recompiled by CCS (undoing my changes)?

    Your CCS project won't rebuild the NDK libraries the way it rebuilds the SYS/BIOS libraries.

    The NDK libraries will be linked into your app based on your *.cfg settings. And depending on those settings, it will choose appropriate libraries for you.

    You can see exactly which NDK libraries are being linked in by looking at your project's generated linker command file (*.xdl). It's located under the "Debug" or "Release" folder of your project. Please refer to this post for a screen shot that will show you this.
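
For anyone who wants to filter the generated linker command file programmatically, here is a minimal sketch of a line filter. It assumes each library appears in the file as a path containing a `ti/ndk` package directory (in either slash style) and the `.aem4f` extension seen elsewhere in this thread; the function name `is_ndk_library_line` is hypothetical and the exact layout of the *.xdl file is not confirmed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the line mentions a .aem4f library under a ti/ndk package
 * directory, accepting forward or backward slashes; returns 0 otherwise. */
int is_ndk_library_line(const char *line)
{
    assert(line != NULL);
    int in_ndk = strstr(line, "ti/ndk/") != NULL ||
                 strstr(line, "ti\\ndk\\") != NULL;
    int is_lib = strstr(line, ".aem4f") != NULL;
    return in_ndk && is_lib;
}
```

Applied to every line of the file, this keeps entries such as the netctrl and hal_timer_bios libraries and skips the SYS/BIOS library.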

    Peter Borenstein said:
    The wiki gives instructions for building in debug mode. Do I need to build a copy with and without debug mode?

    The NDK ships release mode libraries, so I think you should just build for release mode. The only reason you would want to build the debug versions of the libraries is if you want to single step through the actual stack code. So if you want to do that, then you should build them in debug mode with optimization turned off as described in the wiki.

    Hopefully this clears things up for you, but let me know if you have more questions on this.

    Steve

  • Hi Peter,

    Just checking back in.

    Were you able to try the above steps?

    Steve
  • I will shortly. I have been sidetracked.

    Review meetings go smoother when you announce new features. There isn't much love given for fixing a bug everyone forgot about months ago.
  • Steven,

    Your changes worked! I was able to re-create the problem, compile your changes, and then the problem went away. Some of the files named in the .xdl file had their "date modified" changed in Windows File Explorer after running gmake.

    Changing the static IP of my Netgear switch is done through a web interface too. I believe your fix will be celebrated by many.

  • Great! Glad you were finally able to move past this.

    Steve