66AK2G12: system crashes when enough network packets are sent to the assigned interface during NDK initialization

Part Number: 66AK2G12

To whom it may concern,

We've observed and consistently reproduced this NDK issue across different TI-RTOS + NDK versions.
Could you please take a look and let us know whether a fix is possible in the NDK?
We haven't been able to fix it in the NDK ourselves.

Please note that:
(1) This issue is not related to this post: e2e.ti.com/.../2766061
(2) This system crash does not happen only during NDK initialization.
We've consistently observed the same crash whenever too many incoming packets are not
processed in time at the application level (by recv() / recvnc() / recvfrom() / recvncfrom()).
We've been looking for a way to detect such a crash and automatically recover the NDK interface, but with no luck so far.

The details are described below. Thanks in advance for your time!

YW

Environment:
EVMK2G hardware numbers: FY160626000104 (SN: 15164P540011) ASSY REV: D1.3
ti-processor-sdk-rtos-k2g-evm-06.03.00.106-Windows-x86-Install.exe + CCS9.3.0.00012_win64.zip (NDK 3.61.01.01)
or ti-processor-sdk-rtos-k2g-evm-05.00.00.15-Windows-x86-Install.exe + ccs_setup_8.0.0.00016.exe (NDK 2.26.00.08)

Example project generation (using the latest TI-RTOS version as an example):
1. Open command prompt
cd /D "C:/ti/pdk_k2g_1_0_16/packages/"
pdksetupenv.bat
pdkProjectCreate.bat K2G evmK2G little nimu all dsp C:/ti/pdk_k2g_1_0_16/packages

2. Import `C:/ti/pdk_k2g_1_0_16/packages/MyExampleProjects/NIMU_BasicExample_evmK2G_c66xExampleProject` into CCS workspace.

3. Edit helloWorld_k2g.c:
Change "useDhcp" from TRUE to FALSE
Change "LocalIPAddr" and "GatewayIP" according to your PC's network setup.

4. Build the NIMU_BasicExample_evmK2G_c66xExampleProject example.

Procedure to reproduce the issue:
1. Download Packet Sender from packetsender.com/download
Fill the "ASCII" field with 1024 ASCII characters.
Set the "Address" to the LocalIPAddr used in helloWorld_k2g.c
Set the "Port" to any port you like. It doesn't have to be the echo port used in this example.
Set the "Resend Delay" to 0.1 (ms).
Select "UDP"
Press "Send"
An example screenshot of my Packet Sender setup is attached.

2. Run the NIMU_BasicExample_evmK2G_c66xExampleProject output binary file on the DSP of the EVMK2G.

3. You should see the system abort with the following error messages in K2GEVM.ccxml:CIO
        [C66xx] StackTest: using localIp
        xdc.runtime.Main: "src/tirtos/SemaphoreP_tirtos.c", line 245: assertion failure
        xdc.runtime.Error.raise: terminating execution
instead of the usual
        "Network Added: If-1:<LocalIPAddr>"
An example of my CCS screenshot is attached.
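
For readers without Packet Sender, the traffic from step 1 can be approximated with a small host-side program. The following is only a minimal sketch, assuming a POSIX host (the setup above uses Windows, where the Winsock equivalents would be needed). `flood_udp` is a hypothetical helper name, not part of Packet Sender or the NDK, and the actual gap between datagrams depends on the host's timer resolution, so it may be longer than the 0.1 ms requested.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Packet Sender or the NDK): send `count`
 * UDP datagrams of `len` bytes to ip:port, sleeping `delay_ns` nanoseconds
 * between sends. Returns the number of datagrams accepted by the host OS,
 * or -1 if the arguments are invalid or the socket cannot be created.
 */
long flood_udp(const char *ip, unsigned short port,
               const char *payload, size_t len,
               long count, long delay_ns)
{
    struct sockaddr_in dst;
    struct timespec gap;
    long sent = 0;
    long i;
    int fd;

    if (ip == NULL || payload == NULL || count < 0 ||
        delay_ns < 0 || delay_ns >= 1000000000L) {
        return -1;
    }

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1) {
        return -1;  /* not a valid dotted-quad IPv4 address */
    }

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }

    gap.tv_sec = 0;
    gap.tv_nsec = delay_ns;

    for (i = 0; i < count; i++) {
        /* Count only datagrams that were handed to the OS in full. */
        if (sendto(fd, payload, len, 0,
                   (const struct sockaddr *)&dst, sizeof(dst)) == (ssize_t)len) {
            sent++;
        }
        if (delay_ns > 0) {
            nanosleep(&gap, NULL);  /* 100000 ns corresponds to 0.1 ms */
        }
    }

    close(fd);
    return sent;
}
```

With a 1024-byte buffer of any ASCII characters, a call such as `flood_udp("<LocalIPAddr>", 7, buf, 1024, 100000, 100000L)` would roughly mirror the Packet Sender settings above. The port value 7 is only a placeholder, since any port can be used in this test.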

  • Hi,

    From the call stack and runtime error, it appears that a NULL handle is being passed to the SemaphoreP_post() API.

    It appears to me that some kind of memory corruption may be corrupting the handle in the application.

    To catch and debug this issue, you could change the SEMOSAL_Assert implementation, which is Osal_DebugP_assert() (in the ti\osal\src\tirtos\utils_tirtos.c file, around line 98).

    /*
    * ======== Osal_DebugP_assert ========
    */
    void Osal_DebugP_assert(int32_t expression, const char *file, int32_t line)
    {
        if (expression != 0) {
            /* Replace the call below with a while (1) loop and a print stating that this case was hit.
             * This helps to debug which function is passing the NULL pointer as the handle
             * (you can get it from the call stack). */
            xdc_runtime_Assert_raise__I(Module__MID, file, line, Assert_E_assertFailed);
        }
    }

    /*
    * ======== SemaphoreP_post ========
    */
    SemaphoreP_Status SemaphoreP_post(SemaphoreP_Handle handle)
    {
        SEMOSAL_Assert((handle == NULL_PTR));  /* <-- The error is happening here. */
        SemaphoreP_tiRtos *semaphore = (SemaphoreP_tiRtos *)handle;

        Semaphore_post((Semaphore_Handle)&semaphore->sem);
        return (SemaphoreP_OK);
    }
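
The suggested change to Osal_DebugP_assert() (spin instead of raising, and print that the case was hit) could look roughly like the sketch below. It is a host-compilable illustration under stated assumptions, not TI's code: `AssertTrapInfo`, `g_assertTrap`, `g_assertTrapHold`, `assert_trap_record` and `Osal_DebugP_assert_trap` are hypothetical names, and on the target the `printf` call would presumably be replaced by whatever print routine the project already uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record of the most recent failed assertion. */
typedef struct {
    const char *file;
    int32_t     line;
    uint32_t    hits;
} AssertTrapInfo;

/* Volatile so the values stay visible in a debugger watch window. */
volatile AssertTrapInfo g_assertTrap = { NULL, 0, 0u };

/* Set this to 0 from the debugger to leave the trap loop and step out. */
volatile int g_assertTrapHold = 1;

/*
 * Record where the assertion fired. Returns 1 if `expression` is non-zero
 * (the same failure condition as the quoted Osal_DebugP_assert), else 0.
 */
int assert_trap_record(int32_t expression, const char *file, int32_t line)
{
    if (expression == 0) {
        return 0;
    }
    g_assertTrap.file = file;
    g_assertTrap.line = line;
    g_assertTrap.hits++;
    return 1;
}

/*
 * Debug variant of the assert handler: print once, then spin so the call
 * stack can be inspected while the offending caller is still on it.
 */
void Osal_DebugP_assert_trap(int32_t expression, const char *file, int32_t line)
{
    if (assert_trap_record(expression, file, line) != 0) {
        printf("assert trap hit at %s:%d\n",
               (file != NULL) ? file : "?", (int)line);
        while (g_assertTrapHold != 0) {
            /* Spin here; halt the core and read the call stack. */
        }
    }
}
```

Halting the core while it spins in the loop keeps the caller of SemaphoreP_post() on the call stack, which is the information the suggestion above is after.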

  • Hi Aravind,

    Our debugging has already found that, in this NDK initialization scenario, the system fails in the NDK API call NC_NetStart() in C:\ti\ndk_3_61_01_01\packages\ti\ndk\netctrl\netctrl.c, which is the original TI NDK source code. Could you please try to reproduce the issue on your end?

    Thank you,
    YW

  • Hi YW,

    I tried to reproduce the issue on my end.

    The issue showed up only once or twice, and since then it has not happened on my setup. I even tried filling the entire MSMC, DDR, and L2 memory with a known pattern, then reloading the program and running it.

    It did not happen. However, I agree with you that there is an issue to be addressed. I need to investigate this further.

    How frequently does this happen on your side? Can you please share the contents of the "ASCII" field (the 1024 ASCII characters) that you are using in your test?

  • Hi Aravind,

    Thanks for your effort and for taking this issue seriously.

    I actually tried to upload the ASCII file, which can be loaded with the "Load File" button in Packet Sender, but this forum doesn't seem to accept such a file through either "Insert File" or "Insert Media". However, you can create the file content yourself: it's the sequence "0123456789abcdef" repeated 64 times, with no EOF in the file. The UDP payload content really doesn't matter in this test, though, as long as the payload is long enough to produce enough traffic and mess up the internal memory/buffers for some reason.

    The frequency of this issue on my end is "always", given the right sequence and timing of steps 1 and 2 in the procedure. Basically, you can let Packet Sender keep resending with a 0.1 delay and never stop it. Then I start "Launch Selected Configuration" for K2GEVM.ccxml in CCS, launch the C66x core, load the example binary onto the core, and run it.

    One possible reason for your inconsistent results, where the NDK does not crash, is that your OS didn't send the UDP packets to the target K2GEVM. This can be caused by multiple network interfaces in the OS and can be checked with Wireshark. For more details, you can refer to the whole discussion in this post:
    e2e.ti.com/.../2766061

    Thank you,
    YW
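
The payload described above (the 16-character sequence repeated 64 times, 1024 bytes in total) can be regenerated with a few lines of C. This is a minimal sketch: `build_payload` and `write_payload_file` are hypothetical helper names, and "no EOF" is taken here to mean that nothing, such as a trailing newline, follows the last character.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PATTERN      "0123456789abcdef"
#define PATTERN_LEN  16u
#define REPEATS      64u
#define PAYLOAD_LEN  (PATTERN_LEN * REPEATS)   /* 16 * 64 = 1024 bytes */

/*
 * Fill `buf` with the pattern repeated 64 times. No terminator or newline
 * is appended. Returns the number of bytes written, or 0 if `buf` is NULL
 * or `cap` is smaller than 1024.
 */
size_t build_payload(char *buf, size_t cap)
{
    size_t i;

    if (buf == NULL || cap < PAYLOAD_LEN) {
        return 0;
    }
    for (i = 0; i < REPEATS; i++) {
        memcpy(buf + i * PATTERN_LEN, PATTERN, PATTERN_LEN);
    }
    return PAYLOAD_LEN;
}

/*
 * Write the payload to `path` in binary mode so the host does not add or
 * translate any bytes. Returns 0 on success, -1 on failure.
 */
int write_payload_file(const char *path)
{
    char buf[PAYLOAD_LEN];
    FILE *fp;
    size_t written;

    if (path == NULL || build_payload(buf, sizeof(buf)) != PAYLOAD_LEN) {
        return -1;
    }
    fp = fopen(path, "wb");
    if (fp == NULL) {
        return -1;
    }
    written = fwrite(buf, 1, PAYLOAD_LEN, fp);
    if (fclose(fp) != 0 || written != PAYLOAD_LEN) {
        return -1;
    }
    return 0;
}
```

A file produced by `write_payload_file` should be usable with the "Load File" button mentioned above, although that has not been verified here.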

  • Hi YW,

    Thanks for the ASCII file input data that you are using.

    I noticed that the Event Combiner group interrupts are not set in the helloWorld.cfg file for this example.

    Can you try updating the cfg file to enable the Ecm event group interrupts as shown below, then rebuild the project and let me know your observations?

    Update the ti/nimu/example/helloworld/k2g/c66/bios/helloworld.cfg file:

    /*
    * Enable Event Groups here and registering of ISR for specific GEM INTC is done
    * using EventCombiner_dispatchPlug() and Hwi_eventMap() APIs
    */

    Ecm.eventGroupHwiNum[0] = 7;
    Ecm.eventGroupHwiNum[1] = 8;
    Ecm.eventGroupHwiNum[2] = 9;
    Ecm.eventGroupHwiNum[3] = 10;

  • Hi Aravind,

    Sorry for the late reply.

    The result is the same on my end, and this setting was actually already used in my main project (where I first found the system crashing during NDK initialization).
    However, I have some new findings regarding the steps to reproduce this issue. I'm still testing them and will get back to you as soon as I can.

    Regards,
    YW

  • Hi Aravind,

    Although the originally posted issue does still happen, its frequency is actually "rarely" rather than "always" on my end.
    It was "always" happening when I didn't do a cold reboot (power cycle) before rerunning.

    I'm not sure whether the above issue still needs to be investigated,
    but I don't think it's related to the other issue in our customized system, which is our main focus.
    Therefore, I'd like to set this issue aside for now until I'm able to reproduce our issue with limited code changes in TI's example.

    Thank you very much for your time! And sorry for the incomplete information in the first place.

    YW