RTOS/PROCESSOR-SDK-AM335X: NIMU project issue

Part Number: PROCESSOR-SDK-AM335X


Tool/software: TI-RTOS

I imported this project into a clean workspace and environment and built it. The resulting APP was then copied to a microSD card and booted on a BeagleBone Black.

All booting looks ok. Ethernet negotiates, etc.

Then I PING 192.168.1.4 successfully, wait a minute or two, PING again successfully, and so on. After a few minutes the PING fails and the destination is reported as unreachable.
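
The ping / pause / ping sequence above can be scripted so that every attempt is timestamped. This is only a minimal sketch in Python, not part of the SDK: the helper names (`build_ping_command`, `reply_received`, `run_cycle`) are made up for illustration, and treating a `TTL=` field in the output as "reply received" is a heuristic, chosen because the ping exit code alone may not flag a "destination unreachable" answer.

```python
import platform
import subprocess
import time
from datetime import datetime


def build_ping_command(host, system=None):
    """Return a one-shot ping command for the given (or current) OS."""
    system = system or platform.system()
    # Windows uses -n for the request count; Linux and macOS use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def reply_received(ping_output):
    """Heuristic (assumption): an echo-reply line carries a TTL field."""
    return "ttl=" in ping_output.lower()


def ping_once(host):
    """Send one echo request and report whether a reply line was printed."""
    result = subprocess.run(
        build_ping_command(host), capture_output=True, text=True
    )
    return reply_received(result.stdout)


def run_cycle(host, pauses_s, ping=ping_once, sleep=time.sleep, log=print):
    """Ping, pause, ping, ... and log each attempt with a timestamp.

    Returns the pause (in seconds) that preceded the first failed ping,
    or None if every ping was answered.
    """
    for pause in [0] + list(pauses_s):
        sleep(pause)
        ok = ping(host)
        stamp = datetime.now().isoformat(timespec="seconds")
        log(f"{stamp} pause={pause}s ok={ok}")
        if not ok:
            return pause
    return None
```

For example, `run_cycle("192.168.1.4", [60, 120, 300])` pings the address from this post after idle gaps of 1, 2 and 5 minutes (the gap values are arbitrary) and returns the gap that preceded the first failure. The timestamps in the log can then be lined up against a Wireshark capture.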

So that is the symptom/problem, here is the environment/background:

I started with this example project, built up some customized code and http server and got to the point of everything working the way I wanted it BUT had this type of symptom. So to eliminate any 'interference' from my code/changes I uninstalled EVERYTHING TI and started with a clean Windows 10 laptop. No c:/ti directory etc.

I then installed: CCS v7.3.0.00019. It created/installed the c:/ti directory and several standard subdirectories. CCS was installed for Sitara development only, no other processors.

I then installed: ti-processor-sdk-rtos-am335x-evm-04.01.00.06-Windows-x86-Install, followed by pdksetupenv and gmake, then started CCS so it could 'install' the packages.

Then I created all project examples, then created an empty workspace, imported NIMU_BasicExample_bbbAM335x_armExampleProject and built it.

The microSD is formatted as bootable and contains an MLO binary image that was included in the BBB Patch tar gz. The binary is dated 1-09-14 and is 26,420 bytes in size. I copied the APP to it and everything boots 'OK'. Standard outputs appear on UART0.

I've been pulling my hair out for about a week trying to solve this and can't find any pattern. My original app had some traces coming out on UART0 to help isolate it, but I thought my code was causing the problem and tried to get back to square one. I STILL have the problem.

If I boot from eMMC I can load Linux and everything works for hours and hours, so I believe the hardware is fine.

Questions:

Does the current PDK include all the BBB patches, or do I have to install them somehow inside the PDK tree? Could that be the problem?

Any other ideas?

-Ed

  • The RTOS team have been notified. They will respond here.
  • Ed,

    Have you tried using bootloader_boot_mmcsd_a8host_debug_ti.bin under C:\ti\pdk_am335x_1_0_8\packages\ti\starterware\binary\bootloader\bin\am335x-evm\gcc as the MLO to boot your image? I am not sure which BBB Patch tar gz you are referring to.

    Also, do you see the 'ping' issue when loading the image with JTAG/CCS?

    Regards,
    Garrett
  • Garrett -

    Thanx for the suggestion. I had been using the one from an old Starterware BBB Patch. I did not know that the AM335X-EVM in the PDK included the BBB etc.

    I did switch to using the newest one as you indicated. It detects the BBB and loads the APP. Normal Ethernet messages come up and pings work. Then after 5-10 minutes they stop working.

    I don't have a setup for JTAG/CCS, just for MLO/APP booting from microSD with UART0 monitoring.

    I have watched the traffic with Wireshark: normal ARP, TCP, ICMP, and all the OTHER garbage that Windows likes to put out on any Ethernet port. The ARP is done on the first ping, then ICMP for the pings. When it dies, 2-3 ICMP requests go out unanswered to the MAC of the board, then a NEW ARP goes out to try to re-find the destination.

    Next idea?

    When I ran my app before, it included UART0 messages from a heartbeat task. The heartbeat task toggled the LEDs once a second and output a task CPU-use table once every 5 minutes. The important thing is that the heartbeat and the 5-minute output would continue, so I believe TI-RTOS was still stable but the stack wasn't responsive. Since the stock example doesn't do any LED output, I pulled the Ethernet cable and plugged it back in. Two messages about Phy 0 and the negotiated connection come up, so TI-RTOS and parts of the stack seem alive. But ping doesn't start working again.

    Any chance that the default stacks for the NDK need to be bumped up? Maybe something in an update uses more memory than in the past?

    Again, my APP showed these symptoms so I went down to JUST the example with NO changes to try to prove its stability and still found this symptom.

    -Ed

  • follow-up: 1478 continuous pings WORK (about 20min). but then i stopped the ping. waited minutes, ok, waited minutes, ok, ... waited minutes, failed

    so continuous ping seems to work. but ping, pause, ping, pause, seems to fail.

    -Ed
  • Ed,

    It is weird that the ping fails after a pause. By default, Ip.maxReassemblySize = 3020; can you try adjusting this to see if it behaves differently? You can add this in nimu_bbbam335x.cfg.

    Regards,
    Garrett
  • Garrett -

    Thanx for help. sorry if replies seem delayed. I try to do ALL possible testing before replying.

    New info: I rushed in a couple more BBB and they arrived this morning. same symptom on original and new #1. just covering all possibilities.

    Info: I power my boards over the USB cable from the laptop for convenience. I inserted a USB power meter: voltage 4.95 V, 0.13 A average. So power seems stable and within what USB can supply.

    I copied the .cfg file to a new file so I wouldn't pollute the example source. It was a link to a file, not a copy (even though I asked for "copy files" when importing). I then changed Ip.maxReassemblySize to 6040 (2x the original).

    All tests on all boards, with the old config or the new config, still fail the same way. I tested the new board in Linux mode and again it works "forever".

    Note: any chance my jumper hardware for BOOT could be 'wrong'? I made a small header on a piece of perfboard and ran Pin 44 (LCD_DATA3) to a 1x2 jumper; the other side of the jumper goes through a 130 ohm resistor to Pin 2 (DGND). That way I can install/remove the jumper to select microSD booting. It always seems to work OK. I even tried installing the jumper AFTER boot under Linux to confirm it didn't hurt Linux running. I used 130 ohm to get a STRONG pull-down since the pin is already pulled HIGH on the BBB.

    As I said, I'm not sure what causes this symptom or why. I've run the boards direct to the laptop and also via WIFI to my LAN with the board on the LAN. So Windows 10 ARP is only used on the direct connection; TP-LINK WIFI ARP is used on the LAN.

    Losing a ping here or there would be acceptable. protocols should cover. but this is 'locking up' the stack and does not respond to anything after it locks up. Meanwhile TI-RTOS seems to be running. Original full app symptoms would be to hit a few web pages and then freeze trying to hit another.

    -Ed

  • meant to say: all BBB bought from Mouser, Rev C, Mfg by GHI Electronics.
  • So I removed my 'boot jumper header' and used the BOOT pushbutton and THOUGHT it was acting better, but then it still locked up. My theory was that LCD_DATA3 might be pinmuxed as an output in the default board setup, and that I was pulling it down and hurting the MCU. Still not solved.
  • Hi Ed,

    I am out of office today, will try to get a BBB Monday and test it out with JTAG to narrow down the issue.

    Regards,

    Garrett

  • Thanx.

    Based on your "increase re-assembly" suggestion I've been playing around some more. I first upped re-assembly AGAIN to 16384, no joy. Added the ICMP module, no joy. Increased max socket conns in IP (?) from 8 to 32, no joy.

    Then went crazy and increased most memory in NDK that I could get my hands on:

    Re-Assembly: 3020 to 16384

    NetCtrl (?) Stack: 8192 to 16384

    Lo/Med/Hi Stacks: doubled each to 6144, 8192, 10240

    # of Frames: 384 to 762

    Page Size: 3072 to 6144

    # of Pages: 16 to 32

    All changes made, no joy.

    I've been testing with the original board, #2 and #3. The original and #2 have been 'abused' by installing my header; #3 never has, I always just use the boot push button. I'm a little worried that the 130 ohm resistor could have drawn 25 mA from LCD_DATA3 if it was pinmuxed to an output by software. If so, my bad.

    I'm going to get back to work on my software and hope that Garrett has an answer. It all COULD be a Win 10 stupid stack thing, but that is the real world we live in and I need the NDK to play nice with Win 10 like Linux does.

    All of this is in preparation for porting to a Dev board from my SOM provider and then to our custom board that accepts the SO-DIMM SOM module. I'm trying to get everything working on the BBB first so I KNOW when problems are generated by me and the other boards.
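
    For reference, the 25 mA worry works out as follows. This is a rough worst-case estimate that assumes the pin was driven high as a 3.3 V output straight into the 130 ohm pull-down described earlier in the thread, and it ignores the output driver's own resistance (both are assumptions, not measurements):

    ```latex
    \[
    I = \frac{V}{R} = \frac{3.3\,\mathrm{V}}{130\,\Omega} \approx 25.4\,\mathrm{mA}
    \]
    ```

    The real current would be somewhat lower once the driver's internal resistance is counted. The AM335x datasheet is the place to check the rated output current for that pin.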

    Sometimes symptoms change a little. Sometimes after the pause it dies like before. Sometimes it misses two pings and then takes off but later dies. could just be funny timing etc. Even on the continuous ping that 'worked' I didn't drop any packets but one took 64ms to complete. that was on my WIFI/LAN setup and I just assumed congestion.

    All later testing is on my direct Ethernet cable.

    -Ed

  • Hi Ed,

    I just received the BBB Rev C board and ran the PING test this morning. The intermittent ping failure you observed is not reproducible on my setup. I tested continuous ping with the -n option and also tried pausing and then resuming. Have you tried another Windows PC, or killing some network-related tasks? My laptop/PCs are all running Windows 7 though.

    Regards,

    Garrett

  • OK, I will try several other environments (Win 7, Linux, etc.) and advise.

    I know it COULD be Win10 and my environment, but if some garbage is locking up the NDK stack I need to know that. I will also send Wireshark traces to show what works and what doesn't. The traces will also show the timing of WHEN failures occur. You may be waiting a different period or not holding your tongue right ;^)

    I am famous for having bad luck with stuff. If I order 50 boards made, I'll grab one and the first one will fail testing, then I test the other 49 with no problems ...

    -Ed