NDK application to fit in Internal RAM?

We are developing the EtherNet/IP industrial automation protocol on AM335x using the NDK. We previously developed it on the lwIP stack, where it could run without DDR (not tested, but the memory map looked good).

The NDK driver is up now, but the memory map (6746.ethernetIP_adapter_NDK.txt) doesn't look promising. We have to fit the read/write sections of the program into 128K (the internal memory of AM335x), and the .far section alone amounts to more than 350K. Is this possible with the NDK? To what extent can we reduce the memory footprint?

Thanks,
Vinesh 

  • Hi Vinesh,

    We are working on a Wiki page that helps to clear this up, but it's not ready for publication yet.

    But to summarize, there are some configuration settings you can change to reduce the footprint.  The NDK ships with some application size benchmarks that were made for the M3 on Concerto.

    The benchmarks section is found in the NDK release notes and also contains the configuration files (which show how the footprint is reduced).  Once you bring up the benchmark results page, the configurations can be seen by clicking on the name of the app (which is actually a link to the config file) under the column 'Static Module Applications'.

    Please have a look at it.  The 'tcpSocket' app may be a good one to look at for you, but there are others as well.

    Steve

  • Hi Steve,

    Thanks for the response. I could find only this document http://www.ti.com/lit/an/spraaq5a/spraaq5a.pdf linked in ndk_2_21_00_32_ReleaseNotes.html . I couldn't find the config files you were mentioning in it. Am I looking at the wrong doc?

    Thanks,
    Vinesh 

  • The benchmark examples were added in a newer version. You need to get an NDK 2.22 release. Then look in <path-to-product>/packages/ti/ndk/benchmarks/sizing/tcpSocket

    Mark

  • Steve/Mark,

    Thanks for the response. I'll be working on optimizing the application now.

    Regards,
    Vinesh 

  • Mark,

    I modified the .cfg file to reduce the memory footprint. These are the lines I added,

    Global.pktNumFrameBufs = 2;
    Global.memRawPageCount = 8;
    Global.memRawPageSize = 2048;

    But I get this at the startup of NDK

    [CortxA8] Using MAC Address: bc-6a-29-65-41-1b
    [CortxA8] Got packet back, Enabling RXError: Unable to register the EMAC

    I tried increasing the values, still the same error comes up. What is going wrong?

    Regards,
    Vinesh 

  • Vinesh,

    Can you make a backup of the current state of your app and then try unwinding the changes you made for optimizing?

    Once you get the app back to a working state, could you then try adding just those config lines back and see whether the error occurs?

    Steve

  • Steve,

    The only changes I made for optimizing are the lines I mentioned in the above post. It seems Global.pktNumFrameBufs = 2; is the one causing issues.

    • With the default value for pktNumFrameBufs (196), everything works fine.
    • If I reduce it to 128, NDK initializes properly, but ping doesn't work.
    • If I reduce it further (64), I get the same error I mentioned above.

    Thanks,
    Vinesh 

  • Vinesh,

    The configuration parameter pktNumFrameBufs is used to set the number of frames in the PBM buffer pool (ti_ndk_config_Global_pBufMem[] in your generated "big.c" file).  This pool is used to get space to hold an Ethernet frame for PBM_alloc() calls.

    Also, PBM_alloc() will first try to obtain a frame for the caller from the ti_ndk_config_Global_pBufMem[] array.  But if there isn't one available, it will then call mmAlloc(), in order to get the frame from the NDK's memory pool (governed by config params Global.memRawPageSize and Global.memRawPageCount).

    Usually the driver will allocate some Ethernet frames it needs via PBM_alloc().  Are you calling PBM_alloc() in your driver?  If so, with what values?  I wonder if you're running out of frame buffers.

    Steve
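To put rough numbers on the two pools described above, here is a minimal sizing sketch. It assumes each pool is a flat block of count × size bytes, and it uses a 1536-byte frame, which is only one of the typical values quoted later in this thread. The function names are hypothetical and are not part of the NDK.

```c
#include <stddef.h>

/* Hypothetical helper: bytes taken by the PBM frame pool,
 * assuming one fixed-size slot per frame buffer
 * (count corresponds to Global.pktNumFrameBufs). */
size_t frame_pool_bytes(size_t num_frame_bufs, size_t frame_buf_size)
{
    return num_frame_bufs * frame_buf_size;
}

/* Hypothetical helper: bytes taken by the memory manager pool,
 * assuming a fixed number of equal-sized pages
 * (Global.memRawPageCount pages of Global.memRawPageSize bytes). */
size_t raw_page_pool_bytes(size_t page_count, size_t page_size)
{
    return page_count * page_size;
}
```

Under these assumptions, 196 frames of 1536 bytes come to 301056 bytes (about 294 KB), which would go a long way toward explaining a .far section above 350K, while 8 pages of 2048 bytes come to 16384 bytes.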

  • Steve,

    Thanks a lot for the input. I was able to bring down the RAM section of the application considerably. I have one more query. I noticed that whenever an NDK test (or any connection) is initiated, the SYS/BIOS function HeapMem_alloc() is called twice: once for the Task stack (which I have configured to 2048), and once for another chunk of size 8200. The second chunk limits how far I can reduce the heap size, and I'm not sure how to configure it.

    Can I reduce the size of this allocation? If yes, where should I look for it?

    Thanks,
    Vinesh

  • These must be TCP sockets. There are buffers for these stream sockets. The default sizes are set by the Tcp.defaultTxBufSize and Tcp.defaultRxBufSize configuration parameters, which are 8192 bytes. This is the 8200 allocation you see. Change these config parameters to whatever makes sense for your application. You can also change an existing socket's buffer sizes via the SO_SNDBUF and SO_RCVBUF socket options (set with setsockopt()).

    Mark
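The SO_SNDBUF and SO_RCVBUF options mentioned above are set per socket with setsockopt(). The sketch below uses POSIX sockets as a stand-in so that it can be compiled on a desktop host; the NDK provides a BSD-style setsockopt() through its own headers, and its types may differ. The helper name and the buffer sizes in the usage note are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: shrink the send and receive buffers of an
 * existing socket. Returns 0 on success, -1 if either call fails. */
int shrink_socket_buffers(int sock, int tx_bytes, int rx_bytes)
{
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   &tx_bytes, sizeof tx_bytes) != 0)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                   &rx_bytes, sizeof rx_bytes) != 0)
        return -1;
    return 0;
}
```

A call such as shrink_socket_buffers(s, 2048, 2048) on a freshly created TCP socket would request 2048-byte buffers. The value that actually takes effect depends on the stack, so it is worth reading it back with getsockopt() before relying on it.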

  • Thanks once again Mark. Now the application size looks promising!

    Vinesh

  • Vinesh,

    You must have figured this out, but I made a mistake above. The actual config settings are Tcp.transmitBufSize and Tcp.receiveBufSize. The default values for these are given by the constants above.

    Mark

  • Vinesh,

    I'm glad you seem to have solved your issue.  However, I had some info wrong in one of my previous posts:

    Steven Connell said:
    Also, PBM_alloc() will first try to obtain a frame for the caller from the ti_ndk_config_Global_pBufMem[] array.  But if there isn't one available, it will then call mmAlloc(), in order to get the frame from the NDK's memory pool (governed by config params Global.memRawPageSize and Global.memRawPageCount).

    It actually works slightly differently than this.  Looking at the code of PBM_alloc():

    PBM_Handle PBM_alloc( uint MaxSize )
    {
    ...
        /* Allocate Buffer off the Free Pool is size is OK */
        if( MaxSize <= ti_ndk_config_Global_sizeFrameBuf )
            pPkt = (PBM_Pkt *)PBMQ_deq( &PBMQ_free );
        else
        {
            /* Allocate header from memory */
            pPkt = (PBM_Pkt *)mmAlloc( sizeof(PBM_Pkt) );

    You can see that the behavior depends on the allocation size that was passed in.  If that size is less than or equal to the Ethernet frame size (ti_ndk_config_Global_sizeFrameBuf, typically 1514 or 1536 bytes, depending on your architecture) then it will get it off the free queue.

    But if it's greater than the frame size, it will then call mmAlloc and attempt to allocate from the memory manager pool.

    Steve
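The size-based decision described above can be modelled in a few lines. This is only an illustrative sketch and not NDK code: the type and function names are hypothetical, the 1536-byte constant stands in for ti_ndk_config_Global_sizeFrameBuf, and it assumes that a request finding the free queue empty simply fails, which the excerpt suggests but does not show.

```c
#include <stddef.h>

/* Stand-in for ti_ndk_config_Global_sizeFrameBuf (assumed value). */
#define FRAME_BUF_SIZE 1536u

/* Where a request is served from in this model. */
typedef enum { SRC_NONE, SRC_FREE_POOL, SRC_MEM_MANAGER } alloc_source;

/* Hypothetical model of the free queue: only a count of frames left. */
typedef struct {
    unsigned free_frames;
} pool_model;

/* Requests up to one frame in size take a frame from the free queue.
 * Larger requests go to the memory manager instead. */
alloc_source model_alloc(pool_model *pool, size_t max_size)
{
    if (max_size <= FRAME_BUF_SIZE) {
        if (pool->free_frames == 0)
            return SRC_NONE;    /* assumption: no fallback when the queue is empty */
        pool->free_frames--;
        return SRC_FREE_POOL;
    }
    return SRC_MEM_MANAGER;
}
```

In this model, a driver that requests frame-sized buffers competes only for the frames in the free queue, so reducing the frame count too far can starve it even when the memory manager pool still has room.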

  • Hi Vinesh and Steven,

    I'm going to try my luck here, since this thread is 3 years old. I'm working with the OMAPL137 and I have the hello world, config and client examples working. This is under SYS/BIOS and NDK 2.24.3.35. I'm interested in reducing the footprint as much as I can and, for now, have successfully run the HelloWorld example. Throughout the thread you mention a KB that will help or explain how to reduce the footprint. Was it ever created? If yes, could it be shared?

    Jaime

  • Jaime,

    There are several articles on the TI wiki that might help you. Try googling "ndk memory site:processors.wiki.ti.com".  If you are running the NDK on the DSP of the OMAPL137, I think you can fit it inside the 256KB of DSP RAM, possibly also using the 128KB of shared RAM.

    Mark

  • Thanks, I found the one for the Concerto series. It was extremely helpful. Sadly the DSP is at capacity. Although I can trim a few bytes here and there, it does not seem like I will be able to fit my DSP code and the NDK in the OMAPL137 without external memory. After playing with the number of packets, stack size and buffers, I was able to trim the NDK down to around 400K. Do you think something like lwIP might fit in the 128K shared RAM?

  • Hoffiz,

    I just wanted to point out that there are some example config files (*.cfg) that show various footprint scenarios in the NDK. These benchmarks are for ARM, but the configuration settings would be similar for the DSP.

    You can find this in the NDK's release notes. Check under the "benchmarks" section, and there should be a link to "timing and sizing benchmarks". Click on that, then on "M3", which will bring you to a footprint table. The names of each application in the table are links to the *.cfg file used for the footprint benchmark app. Each app is based on different "sockets usage scenarios." I'm not sure how involved your app is, but perhaps one of them might match your use case, more or less.

    Steve