NDK 1.94: halt on IP_DROP_MEMBERSHIP or fdClose() after DHCP renew


This question is answered
Posted by tomeko
on Jun 21 2012 11:08 AM

Hello,

While working with NDK 1.94 on a C64+ platform, I've stumbled upon a problem that I think is related to this library.

My test device uses DHCP and joins a multicast group (IP_ADD_MEMBERSHIP) on startup. When the address from DHCP changes (either the router was changed, or the same router assigned a different IP after lease timeout), I want to shut down part of the application and relaunch it. Unfortunately, when I try to close the socket that previously joined the multicast group, some task at priority 9 (I believe one of the NDK tasks) hangs, consuming all CPU time indefinitely and starving all lower-priority tasks.

This happens only with a multicast-enabled socket. If I drop the multicast membership before the network removed/network added events occur, then everything works fine. The task halts when the socket is closed or when the IP_DROP_MEMBERSHIP option is set after DHCP renew. It happens even if the delay between DHCP renew and socket closing is long and, in my test, even if the renewed address is the same as the previous one (although normally I don't want to reopen/rebind the socket in this case).

I would be glad if someone could confirm the problem or send any suggestions.

Edit: after a few more tries I'm not sure this can happen when the renewed address is the same as the previous one, but changing the router seems to be a reliable way to reproduce the effect. The task at priority 9 seems to be the NDK kernel in my configuration (default NDK priorities). The halting of this task may be just a random effect: at the same time, using TSK_stat, I'm observing strange stack usage (either 100% or some seemingly random, out-of-range number) for two other tasks that use the NDK, which may suggest some incorrect memory access.

All Replies
  • Steven Connell
    Posted by Steven Connell
    on Jun 22 2012 12:53 PM

    Hi tomeko,

    Which version of BIOS and CCS are you using?

    This certainly could be due to a task stack overflow and this is the easiest thing to try first.  Can you try increasing the stack size of the tasks that you see having close to 100% stack usage?

    Also, as a side note, there have been several NDK releases since 1.94.  I checked the release notes but did not see any bugs related to this listed as being fixed.  But, it's possible that something causing this issue was fixed but not tracked in the release notes.

    But before going the update route, let's see what the results are with increased stack size.

    Steve

  • tomeko
    Posted by tomeko
    on Jun 23 2012 12:17 PM

    Thanks for the answer.

    Stack overflow seemed unlikely: the mentioned tasks have stack margins of 8 kB (or a little more) out of 16 kB. After removing the single setsockopt with IP_ADD_MEMBERSHIP from the code and repeating the test, this margin is kept, so stack overflow would be possible only if using multicast consumed more than 7 kB (taking the DSP/BIOS required stack margin into account). If the router was not swapped (and the DHCP lease time expired) between IP_ADD_MEMBERSHIP and fdClose() or IP_DROP_MEMBERSHIP, or if multicast was not enabled, then no significant increase in stack usage is observed.

    I'm using DSP/BIOS 5.33.03, cgtools 6.1.7, CCS 3.3. Unfortunately, updating is not a viable option (at least not just because of this problem) due to the project size and the risk involved. If I don't find a solution, I will probably just block the functionality that requires multicast when DHCP is configured.

     

    Edit: I've tested with 256 kB stacks and the effect is the same.

    Also, I suppose the IP_DROP_MEMBERSHIP action might be the source of the problem, since it seems to be executed when calling fdClose(). An IGMP Leave message is sent when the socket is closed under normal conditions (DHCP router not changed); I didn't check whether it is actually sent in this situation.

  • Steven Connell
    Posted by Steven Connell
    on Jun 26 2012 20:50 PM

    Hi tomeko,

    Very glad that you checked and didn't find any stack overflow issues ... by the way:

    tomeko
    at the same time I'm observing using TSK_stat strange (either 100% or some seemingly random, out of range number)  stack usage of two other tasks that are using NDK - that may suggest some incorrect memory access.

    ... this is the only reason I suggested that you check the stacks.

    I looked at the code of setsockopt() for IP_DROP_MEMBERSHIP but nothing stood out immediately.

    Do you have the NDK sources with your 1.94 release? Can you try adding the file igmp.c to your project and rebuilding?  This should allow you to set a breakpoint in the function:

    int IGMPLeave (HANDLE hSock, struct ip_mreq* ptr_ipmreq)

    Hopefully you can check to see which part of the code is run for this failure case and then we can go from there.

    Steve

  • tomeko
    Posted by tomeko
    on Jun 27 2012 04:02 AM

    No, I don't have the NDK sources (at least not the stack.lib sources, just the usual 1.94 package). I tried linking with my own (empty) IGMPLeave, but that gives me a linker conflict.

  • Steven Connell
    Posted by Steven Connell
    on Jun 27 2012 12:56 PM

    tomeko,

    Ok, yes, the NDK didn't ship sources until version 2.0.0.  I see that the igmp.c file is the same between 1.94 and 2.0.0.  I've attached it for you.  Can you try again with the attached file?

    Steve

    6786.igmp.c

  • tomeko
    Posted by tomeko
    on Jun 28 2012 08:15 AM
    Verified Answer
    Verified by David Friedland

    Thanks, I didn't know that NDK 2.x comes with the source code for stack.lib.

    I think I see the problem. Here is a log fragment:

    Link Status: 100Mb/s Full Duplex on PHY 15
    Network Added: If-1:10.0.0.125
    Service Status: DHCPC : Enabled : Running : 017
    DHCP Server 1 = '10.0.0.1'
    Router 1 = '10.0.0.1'
    Mask 1 = '255.255.255.0'
    IGMP: Join hSock = -2121356932, ptr_ipmreq = 80403064
    IGMP: JoinHostGroup
    IGMP: joined
    IGMP: Timer
    ...
    IGMP: Timer
    // eth detached
    Link Status: No Link on PHY 15
    // second router attached
    Link Status: 100Mb/s Half Duplex on PHY 15
    Network Removed: If-1:10.0.0.125
    Service Status: DHCPC : Enabled : Running : 018
    Network Added: If-1:192.168.0.202
    Service Status: DHCPC : Enabled : Running : 017
    DHCP Server 1 = '192.168.1.6'
    Router 1 = '192.168.1.1'
    Mask 1 = '255.255.254.0'
    // closing socket after Network Added event
    IGMP: Leave hSock = -2121356932, ptr_ipmreq = 818e3d14
    IGMP: EINVAL bind
    IGMP: Leave hSock = -2121356932, ptr_ipmreq = 818e3d14
    IGMP: EINVAL bind
    IGMP: Leave hSock = -2121356932, ptr_ipmreq = 818e3d14
    IGMP: EINVAL bind
    ... 

    In SockClose():

    /* If the socket is being closed; we need to ensure that all the multicast group
    * this socket has joined are left. */
    ptr_mcast_rec = (MCAST_SOCK_REC *)list_get_head ((LIST_NODE**)&ps->pMcastList);
    while (ptr_mcast_rec != NULL)
    {
       /* Leave the multicast group */
       IGMPLeave (h, &ptr_mcast_rec->mreq);
       /* Get the head; since the IGMPLeave will have deleted the entry from the list. */
       ptr_mcast_rec = (MCAST_SOCK_REC *)list_get_head ((LIST_NODE**)&ps->pMcastList);
    }

    And IGMPLeave() in my test returns immediately without touching the multicast list:

    hIf = BindIPHost2IF ((IPN)ptr_ipmreq->imr_interface.s_addr);
    if (!hIf)
    {
       return (EINVAL);
    }

    In effect, SockClose() is stuck in the loop: IGMPLeave() returns EINVAL without removing the entry, so list_get_head() keeps returning the same record.
    I'm not sure if my previous report with setsockopt + IP_DROP_MEMBERSHIP was correct. Looking at the source code I don't see a problem there (it would not leave the group, but it would not get stuck in a loop), and I did this test only once, so I may have missed that it got stuck inside fdClose() rather than inside setsockopt().

    Anyway, I'll probably have to set this problem aside for now, and in the next software release I'll just disable multicast (IGMP) when DHCP is configured.
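The failure mode described in this post can be modelled in isolation. The following is a simplified, self-contained model, not NDK code: sim_sock, mcast_rec and the sim_* functions are hypothetical stand-ins for the socket, its multicast list and IGMPLeave()/SockClose(). It also shows one possible guard, offered only as a sketch: if the leave call fails, the close loop unlinks the head record itself so that every iteration makes progress. Whether that is appropriate for the real SockClose() (for instance, no IGMP Leave would be sent for such a record) is an assumption that would need to be checked against the NDK sources.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of one multicast membership record. */
typedef struct mcast_rec {
    struct mcast_rec *next;
    unsigned          iface_addr; /* interface address given at join time */
} mcast_rec;

/* Hypothetical model of a socket plus the currently bound interface address. */
typedef struct {
    mcast_rec *mcast_list;
    unsigned   bound_addr;
} sim_sock;

/* Add a membership record at the head of the list. */
int sim_join(sim_sock *s, unsigned iface_addr)
{
    mcast_rec *r = malloc(sizeof *r);
    if (r == NULL)
        return ENOMEM;
    r->iface_addr = iface_addr;
    r->next = s->mcast_list;
    s->mcast_list = r;
    return 0;
}

/* Models the early return quoted above: when the recorded address no longer
 * maps to an interface, fail with EINVAL and leave the list untouched. */
int sim_leave(sim_sock *s, mcast_rec *r)
{
    mcast_rec **pp;
    if (r->iface_addr != s->bound_addr)
        return EINVAL;
    for (pp = &s->mcast_list; *pp != NULL && *pp != r; pp = &(*pp)->next)
        ;
    if (*pp != NULL)
        *pp = r->next;
    free(r);
    return 0;
}

/* Guarded drain loop. Returns how many records had to be force-removed
 * because the leave call failed. Without the guard, a failing leave would
 * return the same head record forever. */
int sim_close(sim_sock *s)
{
    int forced = 0;
    assert(s != NULL);
    while (s->mcast_list != NULL) {
        mcast_rec *head = s->mcast_list;
        if (sim_leave(s, head) != 0) {
            s->mcast_list = head->next; /* unlink so the loop progresses */
            free(head);
            forced++;
        }
    }
    return forced;
}
```

In this model, joining with address 10.0.0.125, changing bound_addr to 192.168.0.202 and then calling sim_close() terminates with the list empty, whereas a loop that only retried sim_leave() would never exit.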






