Ethernet connection losses with C++ API

Our software uses the C++ API for the LightCrafter (DLP LightCrafter API Sample Code-Windows v2.0). Everything works fine on my machine, but on the client's machine they experience sudden connection losses. Here are the symptoms:

  1. The software connects to the projector and sets the LED current to 0. It then runs a number of cycles: send a BMP to the projector, turn on the light (LED current to 0x3ff).
  2. In one of these cycles (at a random cycle and a random operation), TCP_send returns an error. The WSA error is 10054. All subsequent calls return WSA error 10038. The RNDIS network adapter disappears completely from the ipconfig /all output.
  3. To get the projector working again, you have to cycle its power.
  4. When I manually try to reproduce step 1 in GUI 4.0.1 (obviously with much longer pauses between actions), it works without problems.
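
For reference, the two codes in symptom 2 are standard Winsock values: 10054 is WSAECONNRESET (connection reset by peer) and 10038 is WSAENOTSOCK (operation on something that is not a socket), presumably because the handle is no longer valid after the reset. Below is a minimal sketch of how a caller could tell these apart and stop reusing a dead socket. The names `LinkState`, `ClassifyWsaError` and `MustReconnect` are hypothetical and not part of the TI sample code.

```cpp
// Winsock error values as plain integers so the sketch compiles on any
// platform; on Windows the same numbers are defined in <winsock2.h>.
constexpr int kWsaEConnReset = 10054;  // WSAECONNRESET: connection reset by peer
constexpr int kWsaENotSock   = 10038;  // WSAENOTSOCK: handle is not a valid socket

enum class LinkState { Ok, ResetByPeer, InvalidHandle, OtherError };

// Hypothetical helper: classify the value reported by WSAGetLastError().
inline LinkState ClassifyWsaError(int wsaError) {
    switch (wsaError) {
        case 0:              return LinkState::Ok;
        case kWsaEConnReset: return LinkState::ResetByPeer;
        case kWsaENotSock:   return LinkState::InvalidHandle;
        default:             return LinkState::OtherError;
    }
}

// After a reset or an invalid handle, further sends on the same socket cannot
// succeed, so the caller should close it and reconnect instead of retrying.
inline bool MustReconnect(LinkState state) {
    return state == LinkState::ResetByPeer || state == LinkState::InvalidHandle;
}
```

With a check like this, the first 10054 would be logged as the real failure and the later 10038 results would be recognized as a consequence of it rather than as separate errors.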

Please find the communication log attached (unfortunately, it lacks the exact bytes sent to the projector, and I can't get a better log right now): 4572.log7_toTI.LOG

We have replaced the USB cable and tried different USB ports (both front and back ones) and a USB hub, without success. There is no IP address conflict between the RNDIS adapter (192.168.1.99/24) and other networks on the machine.

Both the client machine, where the problem occurs, and my machine, where it does not, run Windows 7 x64, and I couldn't find any significant difference between them at a glance. On both my LC and theirs, the firmware versions are 4.0 for DM365, 2.6.43 for FPGA, and 2.7 for MSP430. The resolution is 608*684.

I don't have easy or regular access to the client machine now, so if more information is needed, I'd appreciate multiple suggestions that I can try during the next remote session.

Thank you.

  • I've realized that the LCR_CMD_DisplayStaticBuffer function is my own addition. Here are its contents:

    ErrorCode_t LCR_CMD_DisplayStaticBuffer(uint8 *data, unsigned long int size)
    {
    	_DebugPrint("LCR_CMD_DisplayStaticBuffer in");
    	/* Generate packet */
    	LCR_CMD_PKT_CommandInit(LCR_CMD_PKT_TYPE_WRITE, 0x0105);
    	
    	/*TBD - Check for return error*/
    	LCR_CMD_PKT_PutData(data, size);
    
    	if(LCR_CMD_PKT_SendCommand()) {
    		_DebugPrint("LCR_CMD_DisplayStaticBuffer out fail");
    		return FAIL;
    	}
    
    	_DebugPrint("LCR_CMD_DisplayStaticBuffer out success");
    	return SUCCESS;
    }
  • Dear TI,

    My question is still unanswered after a month. I've tried to contact ASKTEXAS for support, but they redirect me to the DLP departments, and I have never received a reply from their support forum.

    Can you please suggest something about my issue?

    Best regards,

    Georgy.

  • Hi Georgy,

    We sincerely apologize for the oversight on our part in missing your query.

    This is an interesting issue. From your description, we cannot rule out either the LC Kit or the specific PC hardware configuration. First, we are trying to reproduce the issue here in our lab. We'll let you know what we find.

    A couple of questions for you, to expedite resolving the issue:

    1. You mentioned that it works on your PC. Did you try recreating the same scenario with the problematic LC Kit on your PC? It would help us understand the problem better.

    2. Are they closing the application running on the PC? And are they then trying to reconnect at a later point in time?

    3. One quick workaround I would suggest is to keep the TCP traffic alive, for example by reading the DM365 version periodically in a loop when the link is idle. See if it helps.
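
The keep-alive idea in point 3 can be sketched as a small idle timer. This is only an illustration under stated assumptions: `KeepAliveTimer` is a hypothetical helper that is not part of the TI sample code, and the actual version-read call is left to the application.

```cpp
#include <chrono>

// Hypothetical helper, not part of the TI sample code: decides when an idle
// link is due for a small keep-alive request (for example a version read).
class KeepAliveTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit KeepAliveTimer(std::chrono::milliseconds idleLimit,
                            Clock::time_point now = Clock::now())
        : idleLimit_(idleLimit), lastActivity_(now) {}

    // Call after every command sent to the projector, keep-alives included.
    void NoteActivity(Clock::time_point now = Clock::now()) { lastActivity_ = now; }

    // True once the link has been idle for at least idleLimit.
    bool IsDue(Clock::time_point now = Clock::now()) const {
        return now - lastActivity_ >= idleLimit_;
    }

private:
    std::chrono::milliseconds idleLimit_;
    Clock::time_point lastActivity_;
};
```

In the application's idle loop, whenever `IsDue()` returns true the program would issue the version read and then call `NoteActivity()`. The time points are passed in explicitly so the logic can be checked without waiting on a real clock.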

    Regards,

    Sanjeev

  • Dear Sanjeev,

    Thank you for your reply.

    I've got a new report from our client: a new LC kit has arrived and they have tried it in place of their previous one - with the same result. The connection is lost within a minute after they start working with the program (that is, once the program starts sending images to the LC).

    I have also asked the client to try a completely new PC, but they haven't done it yet (I hope they will do it today or tomorrow).

    As for your questions:

    1. I couldn't work with that particular LC Kit, as I am in Russia and they are in India, but it seems that the particular LC Kit isn't the problem, as I mentioned above. I've tried to recreate the scenario on a couple of my machines with my LC without success - the connection loss didn't occur.

    2. They close (actually kill) the application and then cycle power on the LC, and after it boots they start the app again (so the communication begins with LCR_CMD_Open).

    3. Thank you, I will try this as soon as I can.

    Best regards,

    Georgy.

  • Just for your information: the client tried to connect the projector to other machines. He tried 2 machines: Win7 64 bit and XP Service Pack 3 32 bit, but the problem remained exactly the same.

  • Hello Georgy,

    You can refer to the USB/RNDIS discussion to check whether reinstalling the driver helps: http://e2e.ti.com/support/dlp__mems_micro-electro-mechanical_systems/f/850/t/183854.aspx

    How did our suggested workaround go, i.e., keeping the TCP link between the application and the board active? Also, can you please get the UART dump of the non-working setup?

    What make of PC does your customer use?

    Regards,
    Sanjeev

     

  • Hi Sanjeev,

    Please forgive the delay in my answer. Unfortunately, these two weeks are very intense both for us and for our customer, so we still haven't found a chance to try your workaround. We are planning to try it at the end of this week or next week.

    My customer uses Hewlett-Packard computers.

    Best regards, Georgy.

  • Hi Sanjeev,

    We tried the workaround you proposed (asking the projector for its version every 100 ms), but the situation didn't change.

    I have also just learned that when the problem happens, the projector's LEDs start blinking about 5 times per second. The hue of the light is slightly rose. (I really wonder why they didn't mention it to me earlier.) Another blinking pattern was also observed: from rose through red to violet.

    Hope that this gives you any clues.

    Best regards, Georgy.

  • It seems that the rose blinking occurs only on old firmware (they also said that it didn't blink from rose to red to violet; rather, the color of the blinking was somewhere in between).

  • We have finally found a scenario in which the projector fails in a program written by TI, LCR_Cmd_Interface.exe.

    Please see the screenshot below.

    This command sequence works on my machine with my projector, but on their PC with their projector it fails, as you can see in the screenshot.

    Now I hope that it will be much easier for you to reproduce the problem.

    The code was compiled from DLP LightCrafter API Sample Code-Windows v2.0

  • Hello Georgy,

    From the log above the error seems to appear when Demo#2 is running.

    The error is described here; error code 10054 means connection reset by peer (http://msdn.microsoft.com/en-us/library/windows/desktop/ms740668(v=vs.85).aspx).

    You can do the following -

    1. Install Wireshark http://www.wireshark.org/ on the computer to see what is happening on the TCP network.

    2. Error code 10054 (connection reset by peer) indicates either that the socket connection was closed from the host (which is the LightCrafter) or that some keep-alive activity detected errors and the connection was therefore broken. You can refer to the Microsoft help for possible root causes.

    3. You can also collect the UART log output from the LightCrafter Kit (http://e2e.ti.com/support/dlp__mems_micro-electro-mechanical_systems/f/850/t/197591.aspx). When the problem occurs, look in the UART command log for any reported errors. If no unusual activity is seen in the UART log, just type the "ps" command to check whether the cmdhandler task is running.

    4. Another simple suggestion: I read in some forum that the PC/laptop power settings sometimes matter. If they are configured for an aggressive power-saving mode, the network cards often end up being powered down, which might lead to the problem.

    Regards,

    Sanjeev

     

  • Dear Sanjeev,

    Thank you for your thorough answer. Idea 4 did not help, but I've got Wireshark logs from the good and the bad systems; please find them attached.

    I tried to look at them, but my networking knowledge is too limited. Maybe you or your colleagues will have more success.

    2626.WiresharkLogs.zip

    Best, Georgy.

  • Dear Sanjeev,

    I have conducted a couple of other tests, including a complete rewrite of the TCP connection module in the lcr_cmd code. Nothing helped. But here are my conclusions:

    1. When you unplug the USB cable during data transfer, the symptoms are similar to the symptoms on our "wrong" machines. So you can easily reproduce the issue by unplugging the USB cable during static image transmission.

    2. In Wireshark, when the error occurs, everything ends with TCP Retransmission packets from the PC to the projector. It looks like either the LightCrafter can't handle TCP Retransmission events, or this is the result of a connection loss caused by an unstable or slow USB connection.

    3. I have now seen the bug on machines that were previously "clean" - probably after connecting the USB cable to another port or something similar.

  • Hello Georgy,

    Thank you for simple steps to reproduce the issue.

    YES. It is a bug in the system.

    It turns out to be a bug in the Linux kernel that causes the system to hang. We are debugging the issue. We have made a kernel change that seems to avoid the hang, but the CommandHandler application running in user space needs more changes.

    Please share your personal ID via a friend request. We can then discuss how we can help you with early beta software for you to test.

    Many folks are on vacation for the rest of the year; I'll let you know how we can help you in the interim.

    Regards,

    Sanjeev