CC2640R2F - Optimized RAM Usage

Hi,


I am using TI BLE stack v3.0.1 with Simple BLE peripheral as example base.

Our project requires additional RAM to include features.

The stack's read/write data takes 1039 bytes of memory (taken from the .map file of the stack build).

In the application, ICALL_RAM0_START is set to 0x20003fe8 in IAR_Boundary.xcl.

To use the space the stack leaves unused, we tried changing the ICALL_RAM0_START address, but the application then no longer works as expected.

Is it possible to make this RAM available to the application by changing ICALL_RAM0_START?

Many thanks in advance!

  • Hi Krithiga,

    No, you shouldn't modify the boundary file. Instead, use RAM from the Sensor Controller (if you aren't using it) or use the Cache as RAM feature.

    software-dl.ti.com/.../platform.html

    Or reduce static RAM usage and use the ICall heap via dynamic memory allocation.

    Regards,
    Rebel
  • Hi Rebel,

    Thanks for your reply. As I am working together with Krithiga, I'd like to comment on your answer:

    Sensor Controller RAM
    Already in use, despite its slower access times, which keep the CC2640R2 active for longer and thus reduce battery life.

    Cache
    Doesn't sound very appealing as (AFAIK) it has slower access times than RAM, and the CC2640R2 is additionally slowed down when fetching instructions from its internal Flash. Again, I'm expecting a negative impact on our battery life.

     

    Reduce static RAM usage
    This is *exactly* what I suggest for BLE Software Stack v3.0.1! On the application side, reducing RAM usage is not feasible as we need to implement some data processing for our customer's use case. The more memory-hungry use case is yet to be implemented.

     

    Heap usage
    This is a good idea in general. In our concrete case, however, we have to cater for two kinds of sensor data, that are very 'asymmetric' in volume:

    sensor1: need to buffer < 200 bytes
    sensor2: need to buffer 3072 bytes, later 12 kbytes

    Therefore, IMO it doesn't really help to free sensor1's buffer before allocating sensor2's buffer.

    Best regards,
    Frank.

  • Hi Frank,

    The stack linker file (in the case where you are using separate stack and app images) reserves around 3k for data segments used by the code in ROM: globals and such that are initialized by the ROM code's cinit. This is, I suppose, hidden from view in the .map files as it's simply reserved and not linked in or even referenced by symbols to any great extent.

    You have a point with regard to Cache as GPRAM: it will make the application run slightly slower, and if you need to retain the contents while in standby it will also add some constant current. A colleague measured 0.2µA more current drawn on average for a 100ms advertisement interval scenario. Your mileage may vary, but it's worth looking into and profiling. If you can live with the extra current draw, it's an easy 8kB of extra RAM. I wasn't aware that it's slower than regular RAM, though. Do you have a reference or context for that?

    It is also an option to use the GPRAM dynamically. This can be done both with and without retention in sleep, by
    1. Calling VIMSModeSet(VIMS_BASE, VIMS_MODE_DISABLED); // Set as GPRAM [driverlib/vims.h]
    2. Optionally calling Power_setConstraint(PowerCC26XX_SB_VIMS_CACHE_RETAIN); // Retain cache/GPRAM contents in sleep [ti/drivers/power/PowerCC26XX.h]

    One issue with this is that the linker has no concept of transient memory, so you are best off using raw pointers into this RAM area, or possibly declaring variables there as NOINIT. Another issue with GPRAM is that if you use only 1 sector for SNV instead of 2, the SNV driver will blindly use the GPRAM area for compaction when it needs to.
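
    The raw-pointer approach mentioned above could look roughly like the following bump allocator. This is only a sketch under stated assumptions: `gpram_arena_t` and its functions are hypothetical helpers, not part of any TI SDK, and a static array stands in for the 8 kB GPRAM region so the code can be tried on a host. On the target, the base pointer would instead point at the GPRAM block after it has been switched out of cache mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8 kB, the GPRAM size quoted in the thread. A static array
 * stands in for the real region so this sketch runs on a host. */
#define GPRAM_SIZE 8192u
static uint32_t gpram_standin[GPRAM_SIZE / 4u];

/* Hypothetical helper type, not a TI API. */
typedef struct {
    uint8_t *base;  /* start of the region */
    size_t   size;  /* region size in bytes */
    size_t   used;  /* bytes handed out so far */
} gpram_arena_t;

static void gpram_arena_init(gpram_arena_t *a, uint8_t *base, size_t size)
{
    assert(a != NULL && base != NULL);
    a->base = base;
    a->size = size;
    a->used = 0;
}

/* Hand out n bytes at a 4-byte aligned offset, or NULL if the region is full. */
static void *gpram_arena_alloc(gpram_arena_t *a, size_t n)
{
    size_t start = (a->used + 3u) & ~(size_t)3u;
    if (start > a->size || n > a->size - start) {
        return NULL;
    }
    a->used = start + n;
    return a->base + start;
}

/* Forget all allocations, e.g. when the contents were not retained in standby. */
static void gpram_arena_reset(gpram_arena_t *a)
{
    a->used = 0;
}
```

    With such an arena, the < 200 byte buffer and the 3072 byte buffer from earlier in the thread would both fit, while a 12 kB request would be rejected because it exceeds the 8 kB region.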

    As for static RAM usage, you'll save ~200 bytes if you disable secure connections. You can also, if you haven't already, reduce the size of the stack for the various tasks, including the idle task stack and the Hwi/main stack. This should be done after profiling your potential worst-case usage, but there could be a few hundred bytes to gain this way. You can see the usage in IAR using Debugger->Plugins->TI-RTOS, and then the menu items TI-RTOS->Hwi->Module and TI-RTOS->Task->Detailed. In CCS this is found under Tools -> RTOS Object View.
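
    The idea behind such peak-usage figures can be illustrated with a stack "painting" sketch: fill the stack with a known pattern, let the code run, then count how much of the pattern survived. The helper names below are hypothetical, the 0xBE fill value is an assumption, and a plain buffer stands in for a task stack so the sketch runs on a host. It assumes a stack that grows downward, as on Cortex-M.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xBEu  /* assumed fill pattern */

/* Paint the whole stack buffer with the fill pattern. */
static void stack_paint(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL, size);
}

/* The stack grows from the high end downward, so untouched bytes sit at
 * the low end. Peak usage is the size minus the untouched run. */
static size_t stack_peak_usage(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL) {
        untouched++;
    }
    return size - untouched;
}
```

    A stack that never shows more than, say, 300 of its 512 bytes in use under worst-case testing is a candidate for shrinking, with some margin left in place.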

    As a last resort, is it at all possible to compress the sensor data in memory? I'd guess the entropy isn't super high so even something simple like a differential encoding could maybe save quite a bit, but again your mileage may vary.
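
    A minimal sketch of such a differential encoding, assuming 16-bit samples whose sample-to-sample change fits in a signed 8-bit value. The sample width, the byte layout, and the function names are assumptions for illustration, not anything known about the actual sensor data. The first sample is stored in full (little endian), and every following sample as a one-byte delta, so n samples take n + 1 bytes instead of 2n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode n samples into out. Returns the number of bytes written, or 0 if
 * n is 0, out is too small, or a delta does not fit in a signed 8-bit value. */
static size_t delta_encode(const int16_t *in, size_t n, uint8_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    if (n == 0 || out_cap < n + 1) {
        return 0;
    }
    out[0] = (uint8_t)((uint16_t)in[0] & 0xFFu);
    out[1] = (uint8_t)((uint16_t)in[0] >> 8);
    for (size_t i = 1; i < n; i++) {
        int32_t d = (int32_t)in[i] - (int32_t)in[i - 1];
        if (d < -128 || d > 127) {
            return 0;  /* caller would fall back to storing raw samples */
        }
        out[i + 1] = (uint8_t)(int8_t)d;
    }
    return n + 1;
}

/* Decode len bytes produced by delta_encode. Returns the sample count, or 0 on error. */
static size_t delta_decode(const uint8_t *in, size_t len, int16_t *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    if (len < 2 || out_cap < len - 1) {
        return 0;
    }
    out[0] = (int16_t)(uint16_t)(in[0] | ((uint16_t)in[1] << 8));
    for (size_t i = 2; i < len; i++) {
        out[i - 1] = (int16_t)(out[i - 2] + (int8_t)in[i]);
    }
    return len - 1;
}
```

    Whether this saves anything in practice depends on how smooth the real signal is; a single large jump makes this simple variant give up.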

    Best regards,
    Aslak

  • Hi Aslak,

    You unfortunately did not comment on whether and how the BLE Software Stack's RAM usage can be reduced from >14 kBytes out of the available 20480 bytes.
    Could you please give me some information on this?

     

    Re your comment 'stack linker file (in the case where you are using separate stack and app images) reserves around 3k for data segments used by the code in ROM'.

    I understand from your comment that as it's ROM code, there is little (read: nothing) to be done here. Is my understanding correct?

    BTW: Yes, we are using separate stack and app images. If we didn't, we'd have to re-certify our software after each change we make in our app. By keeping app and stack separate, we have to run a re-certification only after changes to the stack configuration.

     

    Re your question on slower program execution in case Cache is used as GPRAM

    My understanding of a cache is that the CM3 core implements some strategy of pre-fetching instructions from internal Flash memory. The instructions can then be loaded into the CM3's processing unit from the cache, which is RAM and thus has shorter access times than internal Flash.

    If we now use the cache as GPRAM, each instruction needs to be read from internal Flash (slower access means longer execution), and the fetch cannot start until it is certain which instruction comes next (that is, after the processing unit has finished evaluating a conditional branch).

    Thanks and best regards,
    Frank.

  • Hi Frank,

    I'm not sure I understand the 14kB. When you say stack, do you mean everything that is not in your application's C files, or the stack image?

    Per your statement about RAM0_START, you should have the memory range from 0x20000000-0x20003fe8 left over for the application image, or 16.3kB. So the stack image uses the 3k I mentioned, about which there is nothing you can change, and another kB, for a total of 0x5000-0x3fe8 = 4120 bytes.
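
    The split described above can be checked with a few lines of C. The RAM base and size are the figures quoted in this thread (0x20000000 and 0x5000, i.e. 20480 bytes); the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_BASE         0x20000000u  /* start of system RAM */
#define RAM_SIZE         0x5000u      /* 20480 bytes in total */
#define ICALL_RAM0_START 0x20003fe8u  /* value quoted from IAR_Boundary.xcl */

/* Bytes below the boundary: available to the application image. */
static uint32_t app_ram_bytes(void)
{
    return ICALL_RAM0_START - RAM_BASE;
}

/* Bytes from the boundary to the end of RAM: used by the stack image. */
static uint32_t stack_ram_bytes(void)
{
    return (RAM_BASE + RAM_SIZE) - ICALL_RAM0_START;
}

/* Sanity check: the two parts must add up to the whole RAM. */
static void check_ram_split(void)
{
    assert(app_ram_bytes() + stack_ram_bytes() == RAM_SIZE);
}
```

    This gives 16360 bytes for the application image and 4120 bytes for the stack image.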

    Out of those 16kB for the application the space is allocated as indicated in the .map file, for local variables, stacks etc. Anything left over is then given to the ICall heap.

    In simple_peripheral, 8 kB is used by default, where 1kB is Hwi stack, 1.6kB is BLE Stack stack/task structure (BIOS heap), 440 bytes is GAPRole task stack, 640 bytes is application stack and 512 bytes is Idle task stack. In total around 4k. These stacks can be reduced.

    You are right that the cache in fact caches, and it does not when it's RAM. There is still an instruction line prefetch that is active, which is 64 bits wide. With Thumb-2 instructions that would equate to ~4 instructions.

    I would recommend testing the GPRAM to see if you can live with it. And if you can give some more details about those 14kB, that would also be helpful.

    Best regards,
    Aslak

  • Hi Aslak,

    Until I received your posting, my understanding was based on information from my colleagues in software development, i.e. the following numbers:

      20480 bytes of RAM provided by the CC2640R2
    - 14859 bytes allocated by the TI BLE software stack v3.0.1 .hex file (separate image)
    =======
    =  5621 bytes of RAM left for the application when not using the dual-ported RAM between CM3 and Sensor Controller
    +  2048 bytes of dual-ported RAM between CM3 and Sensor Controller
    =======
    =  7669 bytes of RAM left for the application when using the dual-ported RAM between CM3 and Sensor Controller

     

    From your posting I gather that our application in fact has 16.3 kBytes of RAM. The next thing I'll do is discuss your posting with my colleagues from development and scrutinize the calculation above.

    Until then, I'll try to follow the figures you have presented:

    You say our application in fact has 16.3kByte of RAM available.

    As our application is based on TI's Simple Peripheral example (for the IAR toolchain), approx. 8 kBytes are used, out of which approx. 4 kBytes are used by the Hwi stack, BIOS heap, GAPRole task, app task, and idle task.

    This means roughly 4kBytes are used by Simple Peripheral data (variables and structures) that we can prune out if not needed for our purposes.

    At this point, approx. 16.3kBytes - (8kBytes - 4kBytes) = approx. 12.3kBytes are left for our application data. Please do let me know if this is correct. Else, please let me know where I am erring.

    Thanks and best regards,
    Frank.

  • Frank,

    I really just listed the main contributors to RAM usage, so your optimistic subtraction is a bit off. Of the remaining 4kB I didn't account for, you also have

    • ~570 bytes from the Board file (objects for the drivers) of which 400 bytes are Display and UART
    • 580 bytes for the RF driver
    • 497 bytes for the Device Info Service (this can be reduced a lot, because depending on the Profiles used, these characteristics are all optional)
    • 400 bytes simple_gatt_profile, which can be removed entirely
    • 1.5kB misc kernel things
    • 577 bytes drivers globals.
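
    Adding up the figures quoted in this thread shows how the roughly 8 kB come about. The sketch below is only a tally of those numbers; the struct and function names are hypothetical, and the rounded figures are taken as 1 kB = 1024 bytes, 1.6 kB = 1600 bytes and 1.5 kB = 1536 bytes, which is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    bytes;
} ram_item_t;

/* Figures as quoted in the thread for simple_peripheral (rounded values assumed). */
static const ram_item_t simple_peripheral_ram[] = {
    { "Hwi stack",                        1024u },
    { "BLE stack task / BIOS heap",       1600u },
    { "GAPRole task stack",                440u },
    { "Application task stack",            640u },
    { "Idle task stack",                   512u },
    { "Board file driver objects",         570u },
    { "RF driver",                         580u },
    { "Device Info Service",               497u },
    { "simple_gatt_profile",               400u },
    { "Misc kernel",                      1536u },
    { "Driver globals",                    577u },
};

#define RAM_ITEM_COUNT (sizeof simple_peripheral_ram / sizeof simple_peripheral_ram[0])

/* Sum the byte counts of all items. */
static uint32_t ram_total(const ram_item_t *items, size_t count)
{
    assert(items != NULL);
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += items[i].bytes;
    }
    return total;
}
```

    The total comes to 8376 bytes. Out of the 16360 bytes below ICALL_RAM0_START, that leaves about 7984 bytes, which is the remainder that is handed to the ICall heap and shared with the stack's dynamic allocations.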

    Important to note, you don't really have the full 8k remaining of System RAM to yourself, because some is needed dynamically for heap operations by the stack and application. This varies by application.

    I would invite you to familiarize yourself with the .map file produced by IAR, as it shows you (with the exception of RTOS and ICall heap) exactly where the money is spent.

    Best regards,
    Aslak