CCS/AM3359: TCP/IP drops connection

Part Number: AM3359
Other Parts Discussed in Thread: SYSBIOS

Tool/software: Code Composer Studio

Hi!

I am currently having an issue with TCP/IP dropping the connection after more than 100K messages, right after an ARP request. The mmCheck output looks fine too. I have attached a Wireshark log.

Thanks for any help!!

Below is the environment (are these versions compatible?):

ndk 3.6

pdk 1.0.15 

bios 6.75.2.0

XDC 3.51.1.18_core

21:48 ( 32%) 18:96 ( 56%) 1:128 ( 4%) 2:256 ( 16%)

2:512 ( 33%) 0:1536 0:3072

(15360/49152 mmAlloc: 510669/0/510628, mmBulk: 2/0/0)

2 blocks alloced in 512 byte page

16 blocks alloced in 96 byte page

1 blocks alloced in 128 byte page

20 blocks alloced in 48 byte page

2 blocks alloced in 256 byte page

Data received 108

disconnecting tcp: link 1 connection 0

0334.arpmsgcausedropconnect.zip
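
One way to sanity-check the mmCheck summary above is to compare allocations against frees. The sketch below assumes the three mmAlloc numbers are allocation calls, failed allocations, and frees, and that the leading pair is bytes used over bytes total. MmSummary, mm_parse_summary and mm_outstanding are hypothetical helpers for illustration, not NDK APIs.

```c
#include <stdio.h>

/* Hypothetical holder for the mmCheck summary line; not an NDK type. */
typedef struct {
    long used_bytes, total_bytes;
    long alloc_calls, alloc_fails, frees;
} MmSummary;

/* Parse a line such as "(15360/49152 mmAlloc: 510669/0/510628, ...)".
 * Returns 1 on success, 0 if the line does not match. */
int mm_parse_summary(const char *line, MmSummary *out)
{
    return sscanf(line, " (%ld/%ld mmAlloc: %ld/%ld/%ld",
                  &out->used_bytes, &out->total_bytes,
                  &out->alloc_calls, &out->alloc_fails, &out->frees) == 5;
}

/* Blocks still allocated, assuming the fields mean calls/fails/frees. */
long mm_outstanding(const MmSummary *s)
{
    return s->alloc_calls - s->alloc_fails - s->frees;
}
```

Under that reading, the log above gives 510669 - 0 - 510628 = 41 outstanding blocks, which matches the per-page counts (2 + 16 + 1 + 20 + 2 = 41), so the small-block allocator does not appear to be leaking.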

  • A little more info:

    The TCP connection drops with an error code of 60, which seems to come from the socket itself. The socket flag is 0x21 and then, after 20-30 seconds, changes to 0x301. If I read this correctly, the socket can't send or receive more data to or from the peer. All of this happens after about 25 minutes, following an ARP request that does get a response! It seems the problem is moving down into the socket itself and how it handles messages in general.

    Could there be a configuration I am missing somewhere?

    Any help where to look would be great!

    Rob

  • Rob,

    Sorry! I thought this is a continued discussion about a TCP/IP connection drop after ~30 minutes on AM335x BBB.

    The environment we used for AM335x RTOS 1.0.15 is:

    REM Version of XDC
    set XDC_VERSION=3.55.02.22

    REM Version of BIOS
    set BIOS_VERSION=6.76.02.02

    REM Version of CG-Tools for ARM
    set CGT_VERSION_ARM=GNU_7.2.1:Linaro

    REM Version of the PDK

    set PDK_VERSION=1.0.15

    REM Version of the NDK
    set NDK_VERSION=3.61.01.01

    This is inside pdk_am335x_1_0_15\packages\pdkProjectCreate.bat. I am not sure whether using an earlier SYSBIOS and XDC will cause any compatibility issue.

    Please update the toolset and re-test. Also let us know whether the AM335x is the TCP server or the client. What example is running on it? What is on the other side? We ran the NIMU_FtpExample_<board>AM335x_armExampleproject for a long time and didn't see any failure.

    Please point out which function/code returns the "socket flag = 0x21 and then after 20-30sec goes to 0x301" values.

    Regards, Eric

  • I will go and do the updates. I see the same issue with UDP. With UDP, NDK_send returns an error code of 55, which means no buffer available; that was confirmed by the SndNoPacket count incrementing. I don't believe this is a TCP issue; I believe it is much lower in the stack, probably in the function that tries to get a buffer. I believe that is "*SockCreatePacket( void *hSock, uint32_t Payload )". Is that possible, and where does the buffer get freed?
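
The two error codes mentioned so far (60 for TCP, 55 for UDP) match the BSD errno numbering, where 55 is ENOBUFS and 60 is ETIMEDOUT. A minimal lookup sketch, assuming the NDK follows that numbering for these two values; sock_err_name is a hypothetical helper, and the values should be checked against the NDK's own error header before relying on them.

```c
#include <stddef.h>

/* Hypothetical helper: names for the two error codes seen in this thread.
 * Values follow BSD errno numbering; verify against the NDK error header. */
const char *sock_err_name(int err)
{
    switch (err) {
    case 55: return "ENOBUFS (no buffer space available)";
    case 60: return "ETIMEDOUT (operation timed out)";
    default: return NULL; /* not covered by this sketch */
    }
}
```

If 60 is indeed a timeout, it would be consistent with the peer becoming unreachable for a while rather than with the application closing the socket.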

  • I can't seem to locate the NDK 3.61.01.1. That's the only thing missing to rebuild the libs. Can you please provide a link?

    Also, client, server, or UDP all do the same thing. Regarding the 0x21 going to 0x301: it is the open socket flag in the data structure.

    One more thing for clarity: target = gnu.targets.arm.A8F and platform = ti.platform.evmAM3359.

  • Here is the structure in BMP and Excel format:

    sockflag.zip

  • I just noticed that when I build the NDK lib 3.60.00.13, an error is generated that I did not notice earlier; it may not matter. slnetifndk.c won't build because it can't find:

    #ifdef SLNETIFNDK_ENABLEMBEDTLS
    #include "mbedtls/ssl.h"
    #include "mbedtls/entropy.h"
    #include "mbedtls/ctr_drbg.h"
    #include "mbedtls/certs.h"
    #include "mbedtls/x509.h"
    #include "mbedtls/ssl_cookie.h"
    #include "mbedtls/timing.h"
    #include "mbedtls/net_sockets.h"
    #include <entropy_alt.h>
    #endif

    The ifdef seems to be defined in the mak file, but this directory does not exist???

  • Hi,

    All the packages can be downloaded and installed from the AM335x Processor SDK RTOS package, 6.0 release: http://software-dl.ti.com/processor-sdk-rtos/esd/AM335X/latest/index_FDS.html .

    " I build the NDK lib 3.60.00.13" =====> Why do you need to build the NDK library?

    "*SockCreatePacket( void *hSock, uint32_t Payload )" ======> What is your test application? Is it one of the TI NDK/NIMU examples provided under pdk_am335x_1_0_15\packages\ti\transport\ndk\nimu\example, or did you write your own socket application?

    I tried our NDK examples above and set a breakpoint at SockCreatePacket(); it was never hit. There is a gap in our NDK/NIMU test applications: we didn't develop any socket programming examples in the Processor SDK RTOS package.

    Regards, Eric

  • Good morning Eric!

    I'm building the lib to help debug this issue. The link you provided installs NDK 3.60.00.13, not 3.61.03. Thoughts?

    The example I started with was the BeagleBone Black "NIMU_BasicExample_bbbAM335x_armExampleproject", and I built on that. If you recall, I had issues with ping on the unmodified code: it would work, but if I paused for a period of time the BBB would stop responding, or come back after a couple of failures. It sounds like a development environment issue, but everything looks correct.

    Is there any recommendation you could make (counters, flags, or structures to check)?

    Do you have a known-good project for the BBB that I can run?

    Note: the weird thing is that, UDP or TCP/IP, it stops after 25-30 minutes of running reliably.

    Thanks for any help!

  • Hi,

    1) For the NDK package, I re-installed the latest PROCESSOR-SDK-RTOS-AM335X, 06_00_00_07. The development environment is defined in pdk_am335x_1_0_15\packages\pdkProjectCreate.bat. The NDK is 3.60.00.13, PDK is 1.0.15, SYSBIOS is 6.75.02.00 and XDC is 3.51.01.18. Sorry, I was wrong about those. I have many versions of the Processor SDK installed, including the next one, 6.1, and I mixed things up!

    2) If you want to debug something in the NDK, you can either rebuild the library or add the NDK source files to your CCS project, change them there, and still link the original library. The functions in the source files take precedence over the same functions in the library, so you don't have to rebuild the library every time. When you add the source files to the CCS project, there are typically header/include paths from those new files to resolve.

    3) The same is true if you want to debug the NIMU library, just as in 2).

    4) For the NDK, there are some global variables you can add to the CCS expression window (see below) to check how many packets were transmitted and received.

    5) For NIMU, we typically add global counters for Tx and Rx inside nimu_eth.c to track packets.

    6) All the NIMU BBB examples are under pdk_am335x_1_0_15\packages\ti\transport\ndk.

    But it may not be easy for you to debug. A better way is to provide a test application with source code (based on a TI example) and explain how you test it, so we can reproduce the issue and debug it.

    Regards, Eric

  • Thanks for the info! I'm glad I'm not going crazy about NDK 3.61.03! In addition, I went back to the original unmodified NDK libs, and it seems to be running correctly now (crossing my fingers, though). I'm providing the modified make file for you to look at. Is there something wrong in it for the AM335x processor?

    ndk_3_60_00_13.zip

  • OK, I went back to the original libs and the error takes longer to appear; however, 1 hour later it stopped. In the NDK_tcp structure, SndNoBuf is now 1, and the board will not respond to ping or ARP requests.

    Rob

  • I will try to have a scaled-down version that can be set for UDP or TCP (client) by Monday or Tuesday. If there is anything you can recommend that could remedy the SndNoBuf incrementing, that would be great.

  • Hi,

    In the CCS project .cfg file, there is: Tcp.transmitBufSize = 16384; ======> try increasing it to 32768 or 65536 to see if it helps.

    If not, in pdk_am335x_1_0_15\packages\ti\transport\ndk\nimu\src\v4\cpsw_nimu_eth.c, declare a global variable such as dropCounter = 0; then check whether any packet is dropped in the Tx direction in the code below (you need to rebuild the NIMU library and then rebuild your application; another way is to add this source file to your CCS project and rebuild the application):

    csl_send_pkt.PktChannel = 0;
    csl_send_pkt.PktLength = PBM_getValidLen(hPkt);
    csl_send_pkt.PktFrags = 1;

    sendResult = emac_send(0, &csl_send_pkt);

    if (sendResult)
    {
        /* Count every packet the EMAC refused to transmit */
        dropCounter++;
        /* Log the actual return code of emac_send() */
        NIMU_drv_log1("CPSW_sendPacket() returned error %08x\n", sendResult);

        /* Free the packet as the packet did not go on the wire */
        PBM_free( (PBM_Handle)csl_send_pkt.AppPrivate );
    }

    If dropCounter != 0, it means some packets were not sent out due to an EMAC Tx problem. How heavy is the traffic when this happens?

    Regards, Eric
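
The drop-counter idea can be tried off-target first. The sketch below is a host-side model only: fake_emac_send and send_and_count are hypothetical stand-ins for the driver's send path, not the PDK functions, and a nonzero return is assumed to mean the frame did not go on the wire.

```c
#include <stdint.h>

/* Global drop counter, as suggested above; inspect it in the debugger. */
volatile uint32_t dropCounter = 0;

/* Hypothetical stand-in for the driver send call: fails when len is 0. */
static int fake_emac_send(const uint8_t *frame, uint32_t len)
{
    (void)frame;
    return (len == 0) ? -1 : 0;
}

/* Send one frame; on failure count the drop and report it to the caller. */
int send_and_count(const uint8_t *frame, uint32_t len)
{
    int result = fake_emac_send(frame, len);
    if (result != 0) {
        dropCounter++;   /* frame never reached the wire */
    }
    return result;
}
```

On the target, the same pattern would wrap the real send call, and a counter that stays at zero while the connection still drops would point away from the EMAC Tx path.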

  • Good Morning Eric,

    Sorry, I was really busy yesterday. I'm attaching a scaled-down version of the project (BeagleBone Black). gTCPConnectionType sets the type of connection: server = 0, client = 1, and UDP = 2. The remote and local addresses are defined at the end of stackfunctions.c.

    TCP server mode echoes the received bytes back. UDP mode just transmits data; however, I had to add a sleep of about 25 ms or more so that it does not give me error 55 (no buffer available; I truly don't understand that one).

    What I am seeing:

    The TCP server seems to work fine until the bytes received and echoed back reach about 250 bytes (at 210 it seems to work fine). Then it works for a while, the ARP request is sent out, no more data is received, and some time later the connection is dropped.

    The transmission rate does not matter!

    I did notice that the default PBM buffer pool is 192 frames of 1536 bytes each. I don't think this is the issue; it just seems like a lot of buffer memory.

    Thanks for any help you can provide.

    Rob

    Lam_ndk.zip
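
For the UDP case, a fixed 25 ms sleep before every send could be replaced by retrying only when the stack reports that it has no buffers. This is a sketch under assumptions: SendFn and SleepFn are hypothetical callback types standing in for the real send and task-sleep calls, 55 is taken as the no-buffer code reported earlier in this thread, and the demo_ functions are stubs that exist only to exercise the loop.

```c
#include <stddef.h>

#define ERR_NOBUFS 55   /* "no buffer available", as reported in this thread */

/* Hypothetical callback types standing in for the real send/sleep calls. */
typedef int  (*SendFn)(const void *buf, size_t len);  /* 0 or an error code */
typedef void (*SleepFn)(unsigned ms);

/* Retry a send while the stack is out of buffers, backing off each time.
 * Returns 0 on success, otherwise the last error code. */
int send_with_backoff(SendFn send_fn, SleepFn sleep_fn,
                      const void *buf, size_t len, int max_tries)
{
    unsigned delay_ms = 1;
    int err = ERR_NOBUFS;
    for (int i = 0; i < max_tries; i++) {
        err = send_fn(buf, len);
        if (err != ERR_NOBUFS) {
            return err;          /* success, or a different error */
        }
        sleep_fn(delay_ms);      /* give the stack time to free buffers */
        if (delay_ms < 32) {
            delay_ms *= 2;       /* cap the back-off at 32 ms */
        }
    }
    return err;
}

/* Demo stubs: the first two sends fail with ERR_NOBUFS, the third succeeds. */
int demo_send_calls = 0;
unsigned demo_slept_ms = 0;
int demo_send(const void *buf, size_t len)
{
    (void)buf; (void)len;
    return (++demo_send_calls < 3) ? ERR_NOBUFS : 0;
}
void demo_sleep(unsigned ms) { demo_slept_ms += ms; }
```

Retrying this way only hides a transient shortage. If the buffers are never returned, as the SndNoBuf counter suggested, the loop simply runs out of tries and the underlying cause still has to be found.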

  • Hi Eric!

    It looks like there is an issue with rtable.c. By disabling the timer ("_RtNoTimer = 1"), the extra ARP requests are gone and the connection drop seems to be gone too.

    What I believe is happening: regardless of whether data is flowing to a given IP address, the stack clears the table entry when the timer expires, which triggers the ARP request. However, if a send is requested before the table is repopulated, a problem occurs. What I believe should happen is that, while data is flowing, the timer is reset so that the IP entry is not deleted from the table.

    In addition, if data is not flowing and the timer expires, an ARP request should be sent, and the entry should be deleted if and only if there is no response to that request.

    I hope this makes sense.

    -thoughts?

    Rob
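
The behavior proposed above can be written down as a small state machine. This is only a model of the proposal, not the NDK's rtable.c logic; all type and function names are hypothetical, time is a plain tick count, and counter wrap-around is ignored.

```c
#include <stdint.h>

typedef enum { RT_VALID, RT_PROBING, RT_DELETED } RtState;

/* Hypothetical route/ARP cache entry. */
typedef struct {
    RtState  state;
    uint32_t expires_at;   /* tick at which the entry times out */
    uint32_t lifetime;     /* ticks added on each refresh */
} RtEntry;

/* Traffic seen for this address, or an ARP reply arrived: keep the entry. */
void rt_refresh(RtEntry *e, uint32_t now)
{
    if (e->state != RT_DELETED) {
        e->state = RT_VALID;
        e->expires_at = now + e->lifetime;
    }
}

/* Periodic timer. Returns 1 when the caller should send an ARP request. */
int rt_tick(RtEntry *e, uint32_t now, uint32_t probe_timeout)
{
    if (e->state == RT_VALID && now >= e->expires_at) {
        e->state = RT_PROBING;              /* keep entry usable, ask peer */
        e->expires_at = now + probe_timeout;
        return 1;
    }
    if (e->state == RT_PROBING && now >= e->expires_at) {
        e->state = RT_DELETED;              /* no reply: only now drop it */
    }
    return 0;
}
```

In this model a send issued while the entry is in RT_PROBING still finds a usable address, which is the gap described in the post above.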

     

  • BTW: I will retest; however, I don't believe this will solve the UDP data rate issue.

    RB

  • Good morning Eric!

    I don't know if you have had a chance to look at the project. It seems the changes I have made to the stack have fixed the TCP/IP drops and the UDP timing as well. I'm attaching the files I modified and would like to get your response to the changes, especially the change to rtable.c and my comment about the table.

    Beyond that, I would like to have everything in C and centralize the libs; however, without the line "var Global = xdc.useModule('ti.ndk.config.Global');" in the cfg file, NC_NetStart returns -1. Why is that?

    Thoughts

    Rob

    files mod in stack_ndk 3.60.00.13.zip

  • Hi Eric!

    I have not heard back from you in a while. I hope all is well; please let me know your thoughts.

    Rob

  • Hi Rob,

    I will try to find a BBB with JTAG emulator to try your stack changes.

    For the NC_NetStart() returning -1 issue, you can probably dig into this part of the function:

    /* Boot the configuration */
    if (!(TaskCreate(NS_BootTask, "ConfigBoot", OS_TASKPRINORM, bootStkSz,
            (uintptr_t)hCfg, 0, 0))) {
        /* Couldn't create boot task, don't start scheduler, close down stack */
        NetHaltFlag = 1;
        NetReturnCode = -1;
    }

    Regards,
    Garrett

  • Hi Garrett!

    Thanks for looking into this. As of today, UDP is sending messages out every 2.5 ms (per my task sleep time), and TCP/IP has not stopped either. If you go back to my explanation of what was going on, does that make sense?

    On the NC_NetStart function: I have looked at the code, and the underlying question is why it works fine with 'ti.ndk.config.Global' in the cfg file, yet returns -1 once that is removed and the project points to the same libs, now local. What is 'ti.ndk.config.Global' actually doing or adding to the code, beyond the timer below?

        /* Create the NDK heart beat */
        ti_sysbios_knl_Clock_Params_init(&clockParams);
        clockParams.startFlag = TRUE;
        clockParams.period = 200;
        ti_sysbios_knl_Clock_create(&llTimerTick, clockParams.period, &clockParams, NULL);

    Thanks for any help you can provide!!

    Rob

  • If I may ask, could it be that IPv6 is disabled in the cfg by "ti.ndk.config.Global"? Note that when I compiled the code (without the lines below) I did have to add stk6.aa8fg to the build, because it wants the IPv6 init function. How do you disable IPv6 without the method below? If you have a simple project that I could look at, that would be great!

    Version that works:

    var Global = xdc.useModule('ti.ndk.config.Global');

    Global.ndkTickPeriod = 200;

    Global.enableCodeGeneration = false;

    Global.IPv6 = false;

  • Rob,

    >>What is the 'ti.ndk.config.Global actually doing or adding to the code

    Please refer to the NDK API user guide, sections 4.4 - 4.5, for a detailed explanation of the initialization procedure with and without XGCONF.

    For your code change, why do you put the Tx memory free in the 'Shutdown read' path instead of the write path?

    I just tried commenting out 'ti.ndk.config.Global' in the .cfg file of your project. Instead of the 'ipv6' error you mentioned, there are a few llTimer-related errors and undefined references to ti_ndk_config_global_taskCreateHook and ti_ndk_config_global_taskExitHook.

    You don't seem to have to add the auto-generated header file LamNDK_pa8fg.h.

    Regards,

    Garrett

  • Yes! You will get those errors until you add the llTimer extern and tasks. In addition, yes, you can remove the .h reference; it was added while I was debugging the ARP/TCP issue. I will review sections 4.4-4.5 again to see if I missed something. You are right, I put the Tx memory free in the wrong place - oops. I guess I'm not perfect yet!

  • Garrett,

    Here is a project that removes the NDK Global module and builds correctly. Once you run it, NC_NetStart returns -1. If you remove the lib in the link area, the linker reports that IPv6Init is undefined, which does not seem right; IPv6 is not defined anywhere I can see.

    Lam_ndk_C.zip

    Thanks for all your help too!!

    Rob

  • Hi Rob,

    I could see the issues you reported with the project Lam_ndk_C.zip, but can't continue to look into or debug it until late next week as I am out of the office.

    Regards,

    Garrett

  • Garrett, thanks for your help!! No worries at all. I have it working with the Global NDK module defined (I just want to get to true C only). It also seems like the example defined in the manual 523 did not seem right: it uses a POSIX timer and references an ndk.c file that I can't find anywhere. Please let me know your thoughts about the changes I made to the stack, especially _RtNoTimer = 1 versus 0: 1 works, 0 (the default) does not.

    Thanks once again for all your help

    Rob

  • Rob,

    I just returned and was able to revisit the issue.

    For the POSIX timer, the functions are from bios_6_7x\packages\ti\posix\tirtos\timer.c.

    Instead of setting _RtNoTimer=1, can you please try to configure the expiration timer with the following routine? I am trying to get a BBB with JTAG and reproduce the issue....

       int rc = <timeout value in seconds>;
       CfgAddEntry(hCfg, CFGTAG_IP, CFGITEM_IP_RTKEEPALIVETIME,
                  CFG_ADDMODE_UNIQUE, sizeof(uint32_t), (unsigned char *)&rc, 0);

    Regards,
    Garrett

  • Hi Garrett!

    I hope you had a great time off.

    I will try this; however, how will extending the time change the overall dynamics, beyond delaying when the issue occurs? Also, regarding using raw C with no NDK reference in the cfg file: does my code seem right?

    In "void networkStack(UArg arg0, UArg arg1)", I'm wondering if the llTimer is right; it is different from the example.

    Rob 

  • BTW: I have two sockets open and am running a probe on each without issue. I have also cleaned up the full-blown code. I will update Lam_NDK_C as a baseline here. If you are interested I can re-zip and send you a copy: it supports UDP and TCP/IP (client and server) with up to 4 sockets (I have only tested two running, though).

    rob

  • Rob,

    Thanks! Ideally you should not need to manually link each library as shown in your project. If you refer to the other NIMU/NDK examples in the PDK, there is no servers_ipv4.aa8fg, netctrl.aa8fg... in the --library area.

    We have had some internal discussion regarding _RtNoTimer. As we haven't reproduced the issue on our end, enabling the debug trace in the route module may help:

    int rc = 1;
    CfgAddEntry(hCfg, CFGTAG_IP, CFGITEM_IP_RTCENABLEDEBUG,
            CFG_ADDMODE_UNIQUE, sizeof(uint32_t), (unsigned char *)&rc, 0);

    Good to know you have the code fully working! Yes, please share your baseline code here so other people can benefit as well.

    Regards,

    Garrett

     

  • Thanks! I'm not sure I understand. The last message you sent before you left said that you were able to see the issue. Which issue were you able to see, or were you mistaken after retesting?

  • Rob,

    I was able to see the IPv6 init issue with your CCS project. For the TCP/IP connection drop with the default _RtNoTimer, I got a BBB with JTAG soldered this evening, but it always gave the error "E_RPCENV_IO_ERROR(-6) No connection" while connecting to the XDS200 emulator. I will continue to look into this tomorrow.

    Regards,
    Garrett

  • Good morning Garrett!!

    OK, sounds good! I should have the testing of your suggestion done today or Monday. On the lib front, the goal is to isolate the project from the full NDK and reference libs local to the project(s). Can I assume the project should be able to create a socket without the Global NDK reference (noting the missing libs)?

    Rob

     

  • Rob,

    Can you also try the attached .out file on your board? This is the default NIMU project from PDK 1.0.15 without modification. I have run this on a BBB and pinged it from a PC with '-n 10000' for a few hours.

    4062.NIMU_BasicExample_bbbAM335x_armExampleproject.out

    Regards,

    Garrett

  • Hi!

    Yes! I'll have it for you Monday. I have a lot of other things to do today. Pinging continuously works for me too; it is when you stop for some time that the original issue develops.

  • Hi Garrett,

    I'm sending you a zip file with the project and two .out files. One is built with the new libs and the other with the old. The IP addresses are 192.168.1.10 for the BBB and 192.168.1.100 for the host. Two UDP sockets are created that send out 1400 bits. The stackfunction.c file has the config info. The CfgAddEntry you asked me to add did not solve the issue. I do believe my earlier message on what could be going on during an ARP request still makes sense.

    Rob

     3704.projects.zip

  • Also, within the files you can configure the number and type of sockets - see below:

    //Server = 0, client = 1, UDP = 2 NO type 0xff

    Connectiontype

  • Hi Garrett,

    Have you had a chance to look at the project? If so, two questions:

    Were you able to duplicate the issue (ARP message causing lockup when _RtNoTimer = 0)?

    Any idea why the project can't remove the lines below and add var Settings = xdc.useModule('ti.posix.tirtos.Settings'); instead? It compiles but will not create a socket, and NC_NetStart errors out with -1.

    It looks like it drops out of NetScheduler when NetHaltFlag == 1. I'm wondering if the PHY chip is not connecting to the stack. However, I don't see any structure or pointer that is NULL.

    "var Global = xdc.useModule('ti.ndk.config.Global');

    Global.enableCodeGeneration = false;

    Global.IPv6 = false;

    Global.ndkTickPeriod = 200;"

    Thanks for any info you can provide!!

    Rob

  • Rob,

    I downloaded your new project but haven't had a chance to look into it yet. I have accumulated a few e2e threads due to recent travel. My colleague will be helping to look into this as well, and we will update you on any progress.

    Thanks,
    Garrett

  • Thanks!

    no worries, I'll be on vacation all next week

    rob

  • Hi Rob,

    I have tested the two OUT files from your zip file on a BBB with a Win10 PC. They both stop responding to the ping command from the PC after about 4 minutes idle, at which point the CCS console starts to display "send data = -1".

    I also tested the NIMU_BasicExample_evmAM335x_armExampleproject in Processor SDK RTOS 6.0.0. The result is about the same.

    Then I changed "_RtNoTimer = 0" to "_RtNoTimer = 1" in C:\ti_am3_600\ndk_3_60_00_13\packages\ti\ndk\stack\route\rtable.c and rebuilt the NDK with ndk.mak. After rebuilding the NIMU_BasicExample_evmAM335x_armExampleproject, I loaded and ran the newly generated OUT file. It worked as expected. I pinged the BBB from the PC after 30 minutes, 2 hours and 10 hours idle. They all worked.

    I did notice that the first ping after startup, or the first ping after a long idle, returned two "Request timed out" results before going back to normal.

    Please try the attached OUT file on your platform.

    Ming

  • Hi Ming,

    Thanks for confirming what I'm seeing - very awesome. Earlier within this thread I stated what I believe is happening and how I believe ARP should work within the routing table. In addition, I provided a project that supports multi-sockets and what happens with the UDP msg and TCP/IP with the value set to 1 and 0. I'm on vacation this coming week, so I won't have any setup to run anything; however, I will have access to answer any question about my testing.

    I don't know if you can help, but I want to encapsulate the libs without the cfg file having var Global = xdc.useModule('ti.ndk.config.Global'). To this end, I do have it compiling without issue; however, it seems like the layer between the PHY and the NDK is not being connected without the Global definition in the cfg file (just a guess). The llTimer is running and the PHY chip is found too. Would you have a basic project that is straight C, per the NDK manual?

    Rob

  • Hi Rob,

    Unfortunately, _RtNoTimer in the NDK cannot be set using Global = xdc.useModule('ti.ndk.config.Global') in the CFG file.

    On the other hand, since _RtNoTimer is a global variable in the NDK, you may be able to change its value by declaring it as an external variable in your application code and then assigning a value to it directly:

    extern uint32_t _RtNoTimer; /* type assumed here; it must match the definition in rtable.c */

    ...

    _RtNoTimer = 1;

    I know it is a bit of a hack, but it avoids rebuilding the NDK.

    Meanwhile, we have informed our NDK team about this issue and the workaround. They will make changes as needed in the next release.

    I will close this post.

    Thank you so much for your efforts in using and improving TI products.

    Ming

  • Thanks Ming! I don't have a problem with rebuilding the libs. What I want is to be able to build a project without the cfg reference, as the example states. However, when I build the project with the libs isolated and without the Global CFG reference, a socket can't be established and NC_NetStart returns -1.

    Does the Global reference act as an intermediary?

  • Hi Rob,

    The CFG file is an interface file to the RTSC components like BIOS, NDK, PDK, XDC etc. It is necessary for BIOS-based projects. If you want to get rid of the CFG file in general, the simple answer is no. You can go to bare metal, which does not need a CFG file, but the NDK depends on BIOS. If you just want to get rid of any NDK references in the CFG file, the answer is also no, because the NDK depends on other RTSC components.

    Ming

  • OK!! Thanks..

    Then the reference to "NDK with C code" (section 2.1) implies, to me, that the Global NDK reference should not be present. I would hope that in the future the user manual and the examples will be clearer. For instance, the ndkHeartbeat (llTimerTick) automatically gets added once you add the NDK Global reference in the cfg file.

    rob

  • Hi Rob,

    I will inform the NDK team of your suggestion for the NDK user manual. Would you mind closing this thread by clicking on "resolved"?

    Thanks!

    Ming

  • Resolved

    Thanks