CC3200 sending invalid SPI data

Ray Froehlich

Intellectual 295 points

Other Parts Discussed in Thread: CC3300

I am [now] running the latest CC3300 firmware (1.24) and host software (1.11.1). The general scenario I am running is described here. After upgrading to this latest software, I am no longer seeing the issue described by this earlier post, but rather encountering a new problem. What I am observing now is invalid data being sent by the CC3300. I have attached a logic analyzer capture that clearly illustrates this (named InvalidEvent_0x0404.logicdata). Free software can be downloaded from here to view this capture.

As can be seen by the capture, roughly 17 connect-send-recv-closesocket sequences complete successfully prior to failure. One example of a good scenario is as follows:

timestamp (ms) within
logic analyzer trace     observed traffic
---------------------    ------------------
11,369.1                 socket CMD
11,369.4                 socket EVT
11,369.4                 connect CMD
11,372.9                 connect EVT
11,373.0                 setsocketopt CMD
11,373.2                 setsocketopt EVT
11,373.2                 send CMD
11,373.4                 send EVT
11,373.5                 recv CMD
11,378.2                 recv EVT
11,378.3                 recv DATA
11,392.9                 freebuffer EVT
11,392.9                 closesocket CMD
11,392.9                 TCP close wait EVT
11,492.3                 closesocket EVT

At timestamp 12,216.6ms, the last command is sent out by the host - a closesocket CMD. Almost immediately following this, an EVT with opcode 0x0404 is received from the CC3300. This appears to be an invalid event, and is the last bit of communication that occurs on the SPI bus. The host continues to wait indefinitely for the EVT corresponding to the closesocket CMD that was sent out.

I am running SPI at ~5.3MHz, and see slightly better behavior when this is slowed down (similar issues can still be observed, just less frequently).

InvalidEvent_0x0404.zip

over 11 years ago

0 Martin Maurer1 over 11 years ago

Prodigy 40 points

Hello Ray,

can you check if I read your trace correctly, this is the data i see in your trace (as last receive data):

02 00 00 00 09 04 04 04 04 00 88 05 00 00

Best regards,

Martin

0 Ray Froehlich over 11 years ago in reply to Martin Maurer1

Intellectual 295 points

Martin,

That's correct. That data is the last bit of MISO traffic on the SPI bus.

-Ray

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

Hi Ray,

I would like to see network stack logs in order to understand why you see this problem.

Can you please provide network stack logs according to the following link: http://processors.wiki.ti.com/index.php/CC3000_Logger

Please also add the following command after init, this is in order to achieve maximum level of debug: netapp_set_debug_level(0x1FF);

Thanks,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Attached is a binary Teraterm log that represents a failure. Note, this trial took a little longer before a failure was observed.

0 Marie Maurer over 11 years ago in reply to Ray Froehlich

Genius 4895 points

When you receive something wrong, can you print the complete received data packet?

From which pin have you recorded the log? Have you used a level shifter? Have you checked the communication parameters (921600, if i remember correctly)?

0 Ray Froehlich over 11 years ago in reply to Marie Maurer

Intellectual 295 points

Sorry, yeah, I did exchange the terms in my previous post. Just to avoid confusion, I deleted that text from the post. Also, looking more at the TI API code, I don't think my conclusion about expected and received opcode was entirely correct. At this point, it seems like between the serial capture, the logic analyzer capture, and the code snippet, TI should have enough to debug this problem. Honestly, I am hoping to avoid having to spend more of my own time debugging this problem, since based on the logic analyzer capture alone, it definitely seems like a module problem and not an issue with my software. Would you agree with this statement?

0 Marie Maurer over 11 years ago in reply to Ray Froehlich

Genius 4895 points

Sorry, i had only a quick look into your teraterm.log

My logfiles (drv debug pin) always start with 0x0B 0x56 0x00, 0x0B 0x55 0x00. But i just saw, this sequence comes just almost at the end of the file. So ignore my comment to check logfile, could be ok!

Only TI can decode the logfile, so only thing is to wait till they answer. Sorry, can't help here.

0 Tomer Kariv over 11 years ago in reply to Marie Maurer

TI__Expert 5295 points

Hi,

According to the log file, I can see that you issue the "netapp_set_debug_level(0x1FF)" command only at the end of the log file, right after the reset takes place. All the logs prior to the reset were taken without this command.

Unfortunately, from the basic log file (without max debug level), there's not enough information to tell if something is wrong. I would need the enhanced level of debug to see whether incorrect opcode was sent to the host.

Can you please repeat the test with enhanced level of debug from the initial boot (reset)?

Thanks,

Tomer


0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

I issue the netapp_set_debug_level command any time I take the module out of reset and initialize it. So, while it would have been issued at the end of the capture, it was also issued at the beginning. The end of the capture represents the point where a failure has occurred, software has detected it, and the workaround of reset and re-initializing kicks in. Here is the snippet of code I am using for *every* initialization, including the first:

void WiFi_CC3300::Initialize (void)
{
    long result;

    CyclePower ();

    // WLAN On API Implementation
    wlan_init (CC3000_UsynchCallback, NULL, NULL, NULL, ReadWlanInterruptPin, WlanInterruptEnable, WlanInterruptDisable, WriteWlanPin);

    // Trigger a WLAN device
    wlan_start (0);

    // Mask out all non-required events from CC3000
    result = wlan_set_event_mask (HCI_EVNT_WLAN_KEEPALIVE | HCI_EVNT_WLAN_UNSOL_INIT | HCI_EVNT_WLAN_UNSOL_DHCP | HCI_EVNT_WLAN_ASYNC_PING_REPORT);
    MEMCO_ASSERT (result == ESUCCESS);

    netapp_set_debug_level (0x1FF);

    CC3300Updater ().UpdateCheck ();
    ConfigureIP ();

    ProtectedChangeState (Disconnected);
}

0 Ray Froehlich over 11 years ago in reply to Ray Froehlich

Intellectual 295 points

Hmmm... Does that setting not persist across calls to wlan_stop()/wlan_start()? If that is the case, that would explain why the log did not reflect it, since the ConfigureIP() function (shown above, shortly after the call to netapp_set_debug_level()), also calls wlan_stop() / wlan_start().

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

Hi,

You need to verify that there is "set debug level" command after each wlan_start. This is the reason why I can't see the enhanced level of logs.
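One way to make sure the debug level is never missed is to route every start through a single helper. The sketch below is only an illustration: `StartWithDebug` is a hypothetical name, and `wlan_start` / `netapp_set_debug_level` are stubbed here with simplified, assumed signatures so the snippet compiles on its own. In a real build the TI host driver's own functions would be used instead of the stubs.

```cpp
#include <cassert>

// Stand-in stubs so this sketch compiles without the TI host driver.
// Signatures are simplified assumptions; the counters exist only for the demo.
int g_startCalls = 0;
int g_debugCalls = 0;
unsigned long g_debugLevel = 0;

void wlan_start (unsigned short /*patchesAvailableAtHost*/)
{
    ++g_startCalls;
    g_debugLevel = 0;   // assumption taken from this thread: the level is lost on restart
}

long netapp_set_debug_level (unsigned long level)
{
    ++g_debugCalls;
    g_debugLevel = level;
    return 0;           // the stub always reports success
}

// Hypothetical helper: start the device and immediately re-apply the debug level,
// so no code path can call wlan_start() without the matching debug command.
void StartWithDebug (unsigned long level = 0x1FF)
{
    wlan_start (0);
    const long rc = netapp_set_debug_level (level);
    assert (rc == 0);
    (void)rc;
}
```

With this shape, a routine that restarts the device (such as the ConfigureIP() step mentioned above) would call `StartWithDebug ()` rather than `wlan_start ()` directly.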

Thanks,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Let's try this again. New log added, this time with debug level set after every wlan_start() call.

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

Hi,

Now I can see the debug traces, thanks.

Here's my analysis: There is a successful sequence of opening a socket, connecting to remote, setting the socket option, sending data and receiving from remote.

Then after some time, the CC3000 device opens a socket (bsd_socket command successfully), and then, upon bsd_connect command, it sends SYN to remote server, but it doesn't receive SYN ACK from remote. During this time, the socket is considered open (the application should be blocked at this stage for TCP timeout - by default 60 seconds). The CC3000 is totally functioning and there are also keep alive messages. After some time, the host issues bsd_close_socket command, and the first event sent to host is "bsd connect" event, to notify the host tcp connection is unsuccessful. Right after, the "bsd_closesocket" event is sent.

Then you issue reset from the host, but from CC3000 perspective this is not required.

There is nothing wrong sent from the CC3000 over the SPI lines.

How come you call closesocket command when connect is not finished? connect should be a blocking command.

This might be related to what you see.

Is it possible that you have some memory leakage in your program?

What processor are you using? Have you tried to reproduce this on our reference platforms?

Regards,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Tomer,

That's very interesting, but I am wondering if this is a catch-22. My theory is that I close the socket *because* of the detection of the CC3300 defect by my software. I will remove this detection, capture another log file, and re-post. I will be sure to wait a couple of minutes after the error case before I stop the TeraTerm capture.

-Ray

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

ok, but to understand better, when you call connect, the system is blocked in that state for TCP timeout (by default it is 60 seconds). Do you say that this "error" detected by your SW releases the host from this blocking state?

To clarify my question, how come you call close socket command when application is still blocked waiting for the connect command?

Thanks,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

In the error case, I do not believe the connect command will *ever* return. The workaround I implemented was a result of my having observed this to be the case. Hypothetically, if there is a defect within the CC3300 that prevents the TI CC3300 API from returning from a call to connect, it is better to eventually detect it and reset everything than to just sit there indefinitely. At least, that is my logic. I have removed this workaround and am recapturing now...

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

ok, looking for your inputs.

How long have you waited? More than 60 seconds?

btw, this timeout can be decreased to a lower value by netapp timeout...

Regards,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

In the next case, where my workaround is removed, after I observe the application stop responding, I will be sure to wait longer than 60 seconds before stopping everything, including the log. (I will wait several minutes, just to be safe.) I understand that 60 seconds is the default timeout value for socket connection.

0 Dmitry Bodnya over 11 years ago in reply to Tomer Kariv

Expert 1140 points

Hi, everybody.

I think I have seen this problem with the CC3000. Sometimes the CC3000 sends 04 04 after the SPI header (the number of '04' bytes is not fixed; it varies). After the extra '04' bytes came exactly the data I expected to follow a single '04'.

I spent a lot of time thinking and debugging. After receiving several '04' bytes, the driver crashes because it computes the data size incorrectly.

What is '04'? '04' marks an event. Why do I receive it multiple times? I don't know, but I suppose that the CC3000 somehow delays its output and sends '04' while the rest of the data is not ready.

So I slowed down my 'read' operation from the CC3000 by adding a pause after each byte I read. Since then I have not run into this situation.
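A per-byte pause of this kind could be sketched roughly as follows. This is not code from the TI host driver: `ReadWithPause` and both callbacks are hypothetical names, and the actual SPI byte transfer and delay routine are left to the caller so the snippet stays independent of any particular MCU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical helper: read `count` bytes one at a time, calling `pause`
// after every byte so the device gets time to prepare the next one.
// `readByte` is assumed to clock exactly one byte out of the device.
std::size_t ReadWithPause (const std::function<std::uint8_t ()>& readByte,
                           const std::function<void ()>& pause,
                           std::uint8_t* dest,
                           std::size_t count)
{
    assert (dest != nullptr || count == 0);

    for (std::size_t i = 0; i < count; ++i)
    {
        dest[i] = readByte ();
        pause ();   // e.g. a short busy-wait or timer delay on the real target
    }

    return count;
}
```

The length of the pause is a tuning parameter; the thread gives no value for it.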

I saw this behavior on different patches. Now I am using 1.11.1.

It's not graceful, but it helps.

Hope that will help.

P.S. Anyway, I am still having other problems with cc3000).

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Tomer,

I've attached yet another log, this one represents a failure case where my workaround is removed. Here, upon failure, the TI CC3300 host API sat and waited forever. I waited roughly 4 minutes (well over the 60-second timeout) after the failure was observed before stopping the log. I am interested to hear what you see from this log.

-Ray

P.S. Dmitry, thanks for the contribution. Another thing I observed that helped, although apparently still does not guarantee this problem will not occur, is slowing down the SPI clock. I saw this problem much more frequently when I ran at 10.6 MHz rather than the 1.3 MHz I am currently operating at.

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

Hi,

This time, the last sequence ends successfully. Meaning, that I can see the close_socket command issued from the host and then right after an event is sent from the CC3000 device to the host.

What do you mean the host API sat and waited forever? Did you issue a command from the host and didn't get an event back from the CC3000 device?

According to the logs, for each command from the host, there is a reply (event) back. The log file looks ok. There are also keep alive messages every 10 seconds, which should be sent to the host if they are not masked.

Did you try to issue new commands after the failure you observe?

Regards,

Tomer

0 Tomer Kariv over 11 years ago in reply to Tomer Kariv

TI__Expert 5295 points

In addition, since in the previous log, I don't see any wrong message/event sent from the device to the host, but according to your analysis some incorrect data is sent, I'd like to understand whether the CC3000 device starts a correct transaction, meaning, does it first assert the IRQ line, or you just see some data on the SPI line?

is the 0x0404 data consistent? do you always see it?

Thanks,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Tomer,

In every failure that I've observed, the API is stuck waiting for a response from the module. Literally, the hci_event_handler() function never returns. This is what I mean when I say the "API sits and waits forever." Therefore, I did not try to issue any new commands after failure, either.

IMHO, I believe the SPI traffic is more useful to examine than the debug serial output from the module, since that at least isolates the problem to the module. I do not always look at the SPI traffic every time I observe a failure, but I have observed multiple cases where the SPI response sent from the module looked incorrect. I have observed the module [incorrectly?] send an 04 on multiple occasions as well, although I am unsure whether or not this is always the case.

Have you looked at the logic analyzer capture? Would you agree that the end of that SPI traffic represents incorrect behavior by the module? I'm glad we took the time to examine debug serial output, but I am wondering if the next step is for you to:

1. examine the logic analyzer trace and determine if it represents a module problem
2. if so, you attempt to reproduce the problem. If you'd like, I can send you IAR workbench code (I'm running on an STM32F407IG)

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

Can you tell to which event the application is waiting/stuck?

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

It varies.

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

According to the logs, it is not propagated from the chip.

Do you see such behavior over the SPI lines even when the device is just in idle mode?

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Personally, I trust the SPI data more than the logs, simply due to the fact that the logs are originating from a node running potentially flawed software.

I have not observed any issue while the device is idle, but most of my testing has been of the nature described earlier - continuously opening a socket, sending, receiving, then closing.

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

In your SPI capture, did you take only the last transaction? was it the close socket?

Does the MOSI in your capture refer to data sent from the CC3000 device to the host?

0 Tomer Kariv over 11 years ago in reply to Tomer Kariv

TI__Expert 5295 points

please ignore the first question, I could just see all the trace...

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

As I described in my initial post to this thread, in the logic analyzer capture there are roughly 17 "normal" open-send-receive-close sequences before the failure case.

Nope, MOSI represents master-out-slave-in, on this SPI bus where the master is the host. So, MOSI represents data from the host to the CC3300.

0 Marie Maurer over 11 years ago in reply to Ray Froehlich

Genius 4895 points

Just learned you can also export SPI data with Saleae Logic Analyzer.

Here is the exported SPI data (exported to excel, then saved exchanged data as text file) from trace of first entry of this thread:

Hope it helps and makes analysis easier...

0 Ray Froehlich over 11 years ago in reply to Marie Maurer

Intellectual 295 points

That's awesome! Thanks Martin!

0 Marie Maurer over 11 years ago in reply to Ray Froehlich

Genius 4895 points

In case you want to do it on your own:

0 Marie Maurer over 11 years ago in reply to Marie Maurer

Genius 4895 points

Hi Tomer, hi Ray,

just had a look on the bad frame again:

02 00 00 00 09 04 04 04 04 00 88 05 00 00

The "88 05" looks to be something which can be used to search for similar frames. And really, there is

02 00 00 00 09 04 00 88 05 00 00 00 00 00

which seems to be a HCI_EVNT_BSD_TCP_CLOSE_WAIT (is this correct? It is some time ago, since i decoded the SPI data the last time...). What looks interesting, is that first part of frame ("02 00 00 00 09") is ok and signals 9 bytes to host, and after 9 bytes are read by host, the CC3000 puts WLAN_IRQ to HIGH, so also CC3000 seems to think, data is done, no more bytes left to read. So this could really be a problem of CC3000...?

Perhaps some kind of race condition where two events (from different origin both write to SPI peripheral of CC3000) collide? As far as i know the HCI_EVNT_BSD_TCP_CLOSE_WAIT event is something fairly new, perhaps there was an error implementing it (missing semaphore or similar)?
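The comparison of the two frames can be made mechanical with a small decoder. The sketch below assumes a layout inferred only from the frames quoted in this thread: a 5-byte SPI header whose last two bytes carry the payload length (big-endian), followed by an event header of type (1 byte), opcode (2 bytes, little-endian) and argument length (1 byte). `DecodedEvent`, `DecodeFrame` and `DemoThreadFrames` are hypothetical names, not part of the TI host driver.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DecodedEvent
{
    std::uint16_t payloadLength;  // SPI header bytes 3..4, big-endian (assumed)
    std::uint8_t  hciType;        // first payload byte, 0x04 for an event
    std::uint16_t opcode;         // next two payload bytes, little-endian (assumed)
    std::uint8_t  argLength;      // declared number of argument bytes
    bool          consistent;     // true if payloadLength == 4 + argLength
};

// Decode the leading fields of a received frame. Returns false if the frame
// is too short to contain the SPI header plus the 4-byte event header.
bool DecodeFrame (const std::vector<std::uint8_t>& frame, DecodedEvent& out)
{
    const std::size_t kSpiHeaderSize = 5;
    const std::size_t kEventHeaderSize = 4;

    if (frame.size () < kSpiHeaderSize + kEventHeaderSize)
        return false;

    out.payloadLength = static_cast<std::uint16_t> ((frame[3] << 8) | frame[4]);
    out.hciType       = frame[5];
    out.opcode        = static_cast<std::uint16_t> (frame[6] | (frame[7] << 8));
    out.argLength     = frame[8];
    out.consistent    = (out.payloadLength == kEventHeaderSize + out.argLength);
    return true;
}

// Usage: the good and bad frames quoted above.
void DemoThreadFrames ()
{
    DecodedEvent good{}, bad{};
    const bool okGood = DecodeFrame ({0x02, 0x00, 0x00, 0x00, 0x09, 0x04, 0x00,
                                      0x88, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00}, good);
    const bool okBad  = DecodeFrame ({0x02, 0x00, 0x00, 0x00, 0x09, 0x04, 0x04,
                                      0x04, 0x04, 0x00, 0x88, 0x05, 0x00, 0x00}, bad);

    assert (okGood && good.opcode == 0x8800 && good.consistent);
    assert (okBad && bad.opcode == 0x0404 && !bad.consistent);
    (void)okGood;
    (void)okBad;
}
```

Under these assumptions the good frame is self-consistent (4 header bytes plus 5 argument bytes equals the announced 9), while the bad frame decodes to opcode 0x0404 with a length that does not add up, which matches the invalid event reported in the first post. A host driver could use a check like this to reject such a frame instead of waiting on it.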

Best regards,

Martin

0 Tomer Kariv over 11 years ago in reply to Marie Maurer

TI__Expert 5295 points

Hi Ray,

I have analyzed the logs just now. Martin's analysis looks correct. Thanks Martin.

The event is indeed sent incorrectly over the SPI lines. I will look further to see whether it is caused by some race condition in our firmware.

One question: in your scenario, does the server initiate a socket close (send FIN) in parallel with a close socket command from the host? This is just a question to understand whether some workaround could be applied for the time being.

Regards,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Tomer,

I believe the answer to your question is 'yes,' since the server and the host are operating in parallel, and fairly independently of one another.

For the time being, the workaround I am using is to start a timer within the hci_event_handler() function and if it expires while we're waiting for the response from the module, return an error code. At a higher level, when this error code is encountered, I power cycle the module. It is pretty ugly, and not acceptable from a finished product standpoint given the time it takes to reconnect to the network, etc., but has run overnight.
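The shape of that workaround, reduced to its core, is a bounded wait in place of an unbounded one. The sketch below is an illustration only: `WaitForEvent`, `WaitResult` and the `eventReady` callback are hypothetical names, and `std::chrono::steady_clock` stands in for whatever hardware timer the target actually provides.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

enum class WaitResult { EventReceived, TimedOut };

// Hypothetical helper: poll `eventReady` until it reports true or until
// `timeout` has elapsed, whichever comes first.
WaitResult WaitForEvent (const std::function<bool ()>& eventReady,
                         std::chrono::milliseconds timeout)
{
    assert (timeout.count () >= 0);

    const auto deadline = std::chrono::steady_clock::now () + timeout;

    while (!eventReady ())
    {
        if (std::chrono::steady_clock::now () >= deadline)
            return WaitResult::TimedOut;   // caller decides how to recover
    }

    return WaitResult::EventReceived;
}
```

A caller that receives `WaitResult::TimedOut` would then take the recovery path described above, i.e. report an error upward and power cycle the module.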

-Ray

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

Hi Ray,

I've sent you over email a patch to try. Please let me know the results.

Please provide network stack logs and SPI if problem is still reproducible at your end.

Thanks,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Tomer,

After having patched to what reports 1.27, I see improved behavior, but have observed a failure. I will run my test again, collect another logic analyzer trace, and post.

-Ray

0 Ray Froehlich over 11 years ago in reply to Ray Froehlich

Intellectual 295 points

Tomer,

I'll be honest with you - it seems like the model here is that users (such as myself) are doing the majority of debugging for the CC3300 module. However, as a user, I would rather spend my time implementing new features for my product than debugging the WiFi module that I otherwise expect to just work.

In this case, the test I am running that continues to produce failures is fairly straightforward. I have a private network using static IPs, with two nodes on it, one of which is the CC3300. The CC3300 is doing what is described in my initial post. The other node is a PC running the following .NET application code:

using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;
using System.Text;

public class SynchronousSocketListener
{
    public static int Main (String[] args)
    {
        StartListening ();
        return 0;
    }

    private static void StartListening ()
    {
        Stopwatch m_stopwatch = new Stopwatch ();

        // Data buffer for incoming data.
        byte[] rxBuf = new Byte[1024];

        // Establish the local endpoint for the socket.
        IPAddress ipAddress = new IPAddress (new byte[] { 192, 168, 1, 100 });
        IPEndPoint localEndPoint = new IPEndPoint (ipAddress, 80);

        // Create a TCP/IP socket.
        Socket listener = new Socket (AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

        // Bind the socket to the local endpoint and 
        // listen for incoming connections.
        try
        {
            listener.Bind (localEndPoint);
            listener.Listen (1000);

            // Start listening for connections.
            while (true)
            {
                // Program is suspended while waiting for an incoming connection.
                Socket handler = listener.Accept ();

                m_stopwatch.Restart ();

                // An incoming connection needs to be processed.
                while (m_stopwatch.ElapsedMilliseconds < 20000)
                {
                    if (handler.Available == 0)
                        continue;

                    int rxCount = handler.Receive (rxBuf);

                    if (rxCount == 0)
                        continue;

                    Handle (rxBuf, rxCount);

                    // Echo the data back to the client.
                    handler.Send (rxBuf, rxCount, SocketFlags.None);

                    m_stopwatch.Restart ();

                    break;
                }

                handler.Shutdown (SocketShutdown.Both);
                handler.Close ();
            }
        }
        catch (Exception e)
        {
            Console.WriteLine (e.ToString ());
        }

        Console.WriteLine ("\nPress ENTER to continue...");
        Console.Read ();
    }

    static int m_count = 0;

    private static void Handle (byte[] rxBuf, int numBytes)
    {
        string wholeAsString = Encoding.ASCII.GetString (rxBuf, 0, numBytes);

        int endOfheader = wholeAsString.IndexOf ("\r\n\r\n");

        if (endOfheader == -1)
            return;

        string httpHeader = wholeAsString.Substring (0, endOfheader);
        Console.WriteLine ("Packet " + m_count++);
        Console.WriteLine (httpHeader);
        Console.WriteLine ();
    }
}

Here are my IP addresses:

192.168.1.1 - wireless router
192.168.1.100 - PC
192.168.1.101 - CC3300 node

Given all this information, my hope is that it is possible for you to re-create this test and reproduce the failure(s) yourself. That way, you can test, fix, and re-test as many times as it takes before you are confident in the fix. Does TI perform similar stress testing? If so, are you unable to reproduce the failures I am seeing? Since I have everything all set up, I will perform another logic analyzer capture of the failure using this new 1.27 firmware, but unfortunately after this, I cannot afford to spend much more time debugging CC3300 firmware.

-Ray

0 Ray Froehlich over 11 years ago in reply to Ray Froehlich

Intellectual 295 points

Tomer,

Attached is the 1.27 failure case logic analyzer trace. As can be seen from the trace, after some time, the module fails to respond to a socket close command. I was sure to collect a couple minutes' worth of data after the failure as well. It is also worth noting that with 1.27, the only failures I have observed up to this point are ones where closesocket is not responded to.

-Ray

1.27.logicdata.zip

0 Tomer Kariv over 11 years ago in reply to Ray Froehlich

TI__Expert 5295 points

We will try to reproduce the issue and update.

Regards,

Tomer

0 Ray Froehlich over 11 years ago in reply to Tomer Kariv

Intellectual 295 points

Thank you, Tomer!

-Ray

0 Raghavendra Shenoy over 11 years ago in reply to Ray Froehlich

TI__Mastermind 27095 points

Hi Ray,

I am trying to reproduce this issue at my end. I wanted to collect some more observations from your side. Can you please help me with the queries below?

1. When you issue 'send' over TCP socket, what is the return value each time you send? Is it matching the bytes you actually send? I want to understand if these two numbers are matching and still you see missing/corrupt packets.

2. What is the percentage of reproduction of this issue? Does this happen every time you run the test, or are there cases when it has passed for you?

3. Does this happen only when you assign a static IP?

4. I want to recreate the exact environment that you are running. Hence, I tried to run your C# TCP server application in Visual C#. I am running into the error below; have you come across this error?

"System.Net.Sockets.SocketException (0x80004005): The requested address is not valid in its context
at System.Net.Sockets.Socket.DoBind(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Bind(EndPoint localEP)
at SynchronousSocketListener.StartListening() in line 33"

Thanks & Regards,

Raghavendra

0 Ray Froehlich over 11 years ago in reply to Raghavendra Shenoy

Intellectual 295 points

Raghavendra,

Thanks for helping. I'll attempt to answer your questions...

1. After upgrading to 1.27, the send() function always seems to work. When I call it, it returns the number of bytes sent, and this matches what I would expect based on the number of bytes I asked it to send. My software does check this, and considers it a failure when the return value does not equal the number of bytes requested to be sent.

2. Mean time to failure is maybe one or two minutes in the scenario where I am transmitting and receiving as fast as possible. I have seen scenarios where it will run for 10 minutes before failure, and I have seen scenarios where it fails after only a couple seconds.

3. Good question. To test this, I left my router at 192.168.1.1, left my server at 192.168.1.100, but configured my CC3300 node to use DHCP rather than the static address of 192.168.1.101. After 1 minute 37 seconds, I observed a failure where a connect message (opcode 0x1007) was never responded to. So, yes, this seems to be an issue regardless of whether IP is static or dynamic.

4. The .NET source that was posted assumes there is an active local endpoint with IP address 192.168.1.100. If you don't have your wireless configured on your PC to use this static IP, you would see this exception. Here's a picture of my PC wireless adapter IPv4 configuration:

-Ray

0 Marie Maurer over 11 years ago in reply to Ray Froehlich

Genius 4895 points

@TI

One week is over, no reply, how is it going? Were you able to reproduce the problem? Is there perhaps already a new (test) patch available?

0 Jimmy Pedersen over 11 years ago in reply to Tomer Kariv

Intellectual 430 points

Hi Tomer.

I'm in the same position as Ray, with boards up and running and ready for fabrication, but with serious issues, including the one that's described in this thread.

Could you please send a newer firmware (>= v1.27) that I could use, at least for testing?
The mentioned issues make it tough (nearly impossible) to do the tests we need to do.

Kind regards

Jimmy

0 Tomer Kariv over 11 years ago in reply to Jimmy Pedersen

TI__Expert 5295 points

Hi Jimmy

I'm no longer working on the CC3000. Raghavendra will continue to support.

Unfortunately, we don't have a newer version you can test at the moment.

Regards,

Tomer

0 Raghavendra Shenoy over 11 years ago in reply to Tomer Kariv

TI__Mastermind 27095 points

Hi Ray,

I have been trying to get this issue reproduced at my end, but unfortunately I am not able to hit this issue yet.
I am wondering what I may be missing here.

Can you please help me narrow down the observations here? I would like you to run the scenario below and see if this workaround helps us avoid the issue.

- On the CC3000, run the TCP client as usual. Send the data to the server and close the socket.
- On the remote side, run the TCP server as usual again, i.e. accept the data and loop it back. But do not close the socket.

Let us check only by closing the socket at the CC3000 side, and see if the race condition occurs or not.

Thanks & Regards,

Raghavendra

0 Ray Froehlich over 11 years ago in reply to Raghavendra Shenoy

Intellectual 295 points

Raghavendra,

Lines 65 and 66 of the C# server close the socket. Note the 'break' statement at the end of the while loop on line 62. The server is definitely closing the sockets.

One thing I've recently noticed here is that this is *much* more reproducible when there are other things going on. At this point in my code, in addition to the TCP socket to the server that gets built and torn down for every "transaction," there is also UDP traffic (both send and receive) that is occurring. Here is pseudocode that attempts to more fully describe the scenario under which I see module issues most frequently:

//step 0 - initialize CC3300, configure for static IP, connect to network
//wait for all initialization to complete

//step 1 - get a socket address for remote IP 192.168.1.100, port 80
//here is the function used:
sockaddr WiFi_CC3300::SocketAddress (uint8_t ip[4], uint16_t port)
{
    sockaddr sa;
    uint16_t nboPort = htons (port);

    memset (&sa, 0U, sizeof (sockaddr));

    sa.sa_family = AF_INET;
    memcpy (&sa.sa_data[0], &nboPort, sizeof (nboPort));

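    // ip decays to a pointer here, so sizeof (ip) would be the pointer size;
    // use the IPv4 address length explicitly.
    for (int i = 0; i < 4; i++)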
    {
        sa.sa_data[sizeof (nboPort) + i] = ip[i];
    }

    return sa;
}

//step 2 - using the socket address from step 1, write and read from a fresh TCP socket
WiFi_CC3300::readWriteStatus_e WiFi_CC3300::WriteRead (Packet& packet, const sockaddr sa)
{
    static unsigned long timeoutInMs = 1000U;
    readWriteStatus_e status = RW_Success;
    int32_t hSocket = -1;

    FOREVER
    {
        //get a connected socket and send

        if ((hSocket = socket (AF_INET, SOCK_STREAM, IPPROTO_TCP)) == -1 ||
            connect (hSocket, &sa, sizeof (sa)) != ESUCCESS ||
            setsockopt (hSocket, SOL_SOCKET, SOCKOPT_RECV_TIMEOUT, &timeoutInMs, sizeof (timeoutInMs)) < 0 ||
            send (hSocket, packet, packet.Size (), 0) != packet.Size ())
        {
            status = RW_WriteFail;
            break;
        }

        //receive

        packet.Reset ();
        packet.Grow (MEMCO_PACKET_MAX_SIZE);

        int rxBytes = recv (hSocket, packet, packet.Size (), 0);
        if (rxBytes <= 0)
        {
            status = RW_ReadFail;
            break;
        }

        packet.ShrinkBack (packet.Size () - rxBytes);

        break;
    }

    //teardown

    if ((hSocket != -1 && closesocket (hSocket) < 0) || status != RW_Success)
    {
        ProtectedChangeState (Uninitialized);
    }

    return status;
}

//step 3 - get a sockaddr corresponding to the local CC3300, IP 192.168.1.101, port 11012
//same function as step 1 used, but with these arguments

//step 4 - using the sockaddr from step 3, poll for received UDP data
void WiFi_CC3300::Poll (Packet& packet, sockaddr sa, uint16_t timeoutInMs)
{
    static uint8_t wildcardIp[4] = {0, 0, 0, 0};
    unsigned long timeoutAsULong = timeoutInMs;
    int hSocket = -1;
    int rxBytes = 0;
    socklen_t salen = sizeof (sa);

    packet.Reset ();

    hSocket = socket (AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    if (hSocket == -1)
    {
        return;
    }

    sockaddr localSa = SocketAddress (wildcardIp, ntohs (*((uint16_t*)sa.sa_data)));
    (void)bind (hSocket, &localSa, sizeof (localSa));
    (void)setsockopt (hSocket, SOL_SOCKET, SOCKOPT_RECV_TIMEOUT, &timeoutAsULong, sizeof (timeoutAsULong));
    packet.Grow (MEMCO_PACKET_MAX_SIZE);
    rxBytes = recvfrom (hSocket, packet, packet.Size (), 0, &sa, &salen);
    packet.ShrinkBack (packet.Size () - rxBytes);

    (void)closesocket (hSocket);
}

//step 5 - get a sockaddr corresponding to a remote IP 192.168.1.102, port 11011
//same function as step 1 used, but with these arguments

//step 6 - using the sockaddr from step 5, fire off a UDP packet to a node
void WiFi_CC3300::FireAndForget (Packet& packet, const sockaddr sa)
{
    int hSocket = socket (AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    if (hSocket == -1)
    {
        return;
    }

    (void)sendto (hSocket, packet, packet.Size (), 0, &sa, sizeof (sa));
    (void)closesocket (hSocket);
}

//repeat steps 1-6 ad infinitum

//while all this is going on, remote node with IP 192.168.1.102, port 11011 is
//periodically sending UDP packets to this local CC3300 (these are the packets
//received in step 4 above)

0 Ray Froehlich over 11 years ago in reply to Ray Froehlich

Intellectual 295 points

Another variable that seems to make a significant difference in how reproducible this failure is, is the type of wireless router used. I have two that I use - a TP-Link model TL-WR740N and a Verizon Jetpack MiFi. Using the TP-Link, if I remove the UDP activity and only perform the TCP stuff described in my previous post, things typically work. However, in this same scenario but using the Verizon MiFi rather than the TP-Link router, the module hangs more often than not. Both routers are configured identically - same SSIDs, same security, same IPs, etc.