Program Execution from Ram?

memoryleak

Other Parts Discussed in Thread: MSP430F5438A, CC430F5137

I hadn't noticed it before, but some of the 5xx devices (such as the msp430f5438a) spec a different operating current when executing from ram vs flash. I don't really have a use case for this yet, but putting a few ISRs in RAM seems like something interesting that I'd love to play with :)

Maybe my google-fu is rusty, but I haven't come across many examples that do this. SLAA103 (Flash self-programming guide) has some assembly that copies a chunk of flash code into RAM and then jumps to it (Figure 3), but I'd love to stay in my nice, civilized C environment because writing things in assembly makes it feel like work :(

Does anyone have a C example of how to do this, or should I set out to write my own?

over 14 years ago

0 Dung Dang over 14 years ago

TI__Genius 17802 points

Hi memoryleak,

We have created some sample code that demonstrates execution from RAM. Hopefully it will be a good starting point for you.

Regards,

Dung

4035.RunningfromRAM_IAR.zip

4276.RunningfromRAM_CCE.zip


0 Jens-Michael Gross over 14 years ago

Guru 227245 points

One main reason for executing something in ram is that you cannot do burst flash writing when the code that does it is in flash itself.

Some MSPs allow moving the vector table into RAM, so if you are flashing data and need an ISR to run while the flash is busy erasing or writing, it might be useful to move that ISR into RAM (and the vector table too, as it is inaccessible in flash during erase/write).

I don't think that executing code from RAM instead of flash will save that much energy, but in theory the whole application can be copied into RAM and executed there. If there is enough RAM, that is.

There are several ways of putting code into RAM. One way is to move the function into the data segment. Then it is stored in the flash area along with the variable init values and copied into RAM at startup. How to do this depends on the compiler (#pragma, function attributes etc.).
The drawback is that the function can be overwritten, and even if it is not, it consumes RAM all the time. Moving just the required code to RAM temporarily might be the better solution.

It should also be possible to generate fully relocatable code in C, as long as you don't use any class stuff (though you usually cannot rely on this unless the compiler explicitly supports it), so you can keep the function in flash and copy it to RAM when needed. Fully relocatable means that inside the code chunk only relative jumps are used, and calls go only to functions that will never move, or through function pointers.
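The "copy it to RAM when needed" step can be sketched as a small bounds-checked copy helper. This is only an illustration: `code_image_t` and `load_code_image` are hypothetical names, and on a real target the image address and size would come from linker-generated symbols rather than from plain arrays.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor for a code image: where it is stored (flash)
 * and how many bytes it occupies. Assumption: on a target these values
 * would be provided by the linker, not typed in by hand. */
typedef struct {
    const uint8_t *load_addr;  /* image location in flash */
    size_t         size;       /* image size in bytes */
} code_image_t;

/* Copy a code image into a RAM window.
 * Returns 0 on success, -1 if an argument is NULL or the image
 * does not fit into the window. */
int load_code_image(uint8_t *ram_window, size_t window_size,
                    const code_image_t *img)
{
    if (ram_window == NULL || img == NULL || img->load_addr == NULL)
        return -1;
    if (img->size > window_size)
        return -1;                 /* refuse to overrun the RAM window */
    memcpy(ram_window, img->load_addr, img->size);
    return 0;
}
```

After a successful copy, the code would be entered through a function pointer set to its RAM address. That only works if the code was linked to run at that address or is fully relocatable in the sense described above.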

0 memoryleak over 14 years ago in reply to Dung Dang

Intellectual 345 points

Dung,

Thanks for the examples! The procedure looks pretty straightforward, I can't wait to play with these later today :)

Jens,

As always, I appreciate your thoughts. I also wouldn't have thought that executing out of RAM would save much power, but apparently it cuts CPU-related power by a decent bit (at least in the 5438A). This is a purely academic pursuit for me right now, so I have no real power target other than "as low as I can possibly make it". Right now I spend most of my time in LPM3 (running on VLO), waking only to service an interrupt, do some math (running on DCO) and go back to sleep. If large portions (CPU-time-wise) of my code can fit in RAM, I might be able to reduce power significantly, provided I have the extra RAM to devote to code storage.

I hadn't thought of moving the ISR vector table to ram or even modifying the ISR vectors, I had just planned to have a one-liner ISR that calls my volatile function. Once I get that working I think I'll look at moving the ISR vectors around :)

Thank you both for the help!
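The one-liner ISR idea can be sketched as a flash-resident stub that dispatches through a function pointer. This is a variation on a direct call, shown here in plain C so it also builds on a host. All names (`timer_isr_stub`, `timer_set_handler`, `timer_work`) are hypothetical, and the compiler-specific interrupt decoration and RAM placement are deliberately left out.

```c
#include <stddef.h>

typedef void (*isr_handler_t)(void);

static volatile unsigned int tick_count;   /* touched by the handler */

/* Stand-in for the real work. Assumption: on a target this function
 * would be placed in RAM by a compiler-specific mechanism. */
void timer_work(void)
{
    tick_count++;
}

unsigned int timer_ticks(void)
{
    return tick_count;
}

/* Pointer the stub dispatches through; NULL means "no handler yet". */
static isr_handler_t timer_handler = NULL;

void timer_set_handler(isr_handler_t h)
{
    timer_handler = h;
}

/* Flash-resident stub. On a target this would carry the interrupt
 * decoration of the toolchain in use (for example a vector pragma). */
void timer_isr_stub(void)
{
    isr_handler_t h = timer_handler;   /* read once */
    if (h != NULL)
        h();
}
```

The vector fetch and the stub itself still come from flash, so only the body of the handler benefits from running in RAM.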

0 memoryleak over 14 years ago in reply to memoryleak

Intellectual 345 points

Executing the majority of my ISR out of RAM reduced my CPU-related power consumption by about 20% (on my FE4272). My margin of error is pretty high (quickie experiment, handheld DMM), but it does seem to be a viable way to squeeze a few more electrons out of a design if you're feeling hardcore about it :)

Of course since I don't touch the CPU but every 16ms and only for a handful of cycles, the savings is somewhere around the difference between picking XCAP14PF vs XCAP0PF for my crystal. Still a fun optimization though :)

0 Jens-Michael Gross over 14 years ago in reply to memoryleak

Guru 227245 points

memoryleak said:
Still a fun optimization though

Indeed it is. I never used the MSP because of its low power consumption. My predecessor selected the MSP because its hardware modules were superior to the PICs while still being cheap and easy to handle. Then I inherited all the PIC projects and some started MSP migrations and had to do all the new firmware. It was easier to write the new code than to understand the condensed and complicated (because of the hardware limitations) PIC code.

Anyway, many people pick the MSP because of its low power consumption, and making it even lower is a nice thing.

For larger projects, it still needs to be calculated whether the reduced power consumption of RAM execution pays for the increased runtime required for the setup and copying, especially if the code needs to be moved more than once (there's so much more flash than RAM, so larger projects might need to exchange the code in RAM more or less often).
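That trade-off can be estimated with a very rough model. The sketch below assumes a fixed supply voltage and clock, treats current in µA/MHz as charge per CPU cycle, and assumes the copy loop itself runs from flash. The function name `break_even_runs` and every number used with it are hypothetical and not taken from any datasheet.

```c
#include <stdint.h>

/* Number of executions after which copying code to RAM has paid for
 * itself, or UINT64_MAX if RAM execution never wins in this model.
 *
 * copy_cycles      : CPU cycles spent copying (assumed to run from flash)
 * run_cycles       : CPU cycles per execution of the copied code
 * flash_ua_per_mhz : active current when executing from flash
 * ram_ua_per_mhz   : active current when executing from RAM */
uint64_t break_even_runs(uint32_t copy_cycles, uint32_t run_cycles,
                         uint32_t flash_ua_per_mhz, uint32_t ram_ua_per_mhz)
{
    if (run_cycles == 0 || ram_ua_per_mhz >= flash_ua_per_mhz)
        return UINT64_MAX;         /* nothing to gain */

    uint64_t copy_cost = (uint64_t)copy_cycles * flash_ua_per_mhz;
    uint64_t saving    = (uint64_t)run_cycles *
                         (flash_ua_per_mhz - ram_ua_per_mhz);

    /* Ceiling division without risking overflow in the numerator. */
    return copy_cost / saving + (copy_cost % saving != 0);
}
```

With made-up inputs of 3000 copy cycles, 500 cycles per run, and 300 versus 200 µA/MHz, the model gives 18 runs before the copy is amortized. If the code has to be swapped in again before that many runs, the copy costs more than it saves.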

0 Wim De Kimpe over 14 years ago in reply to Dung Dang

Prodigy 20 points

Hello

I have been using your example to program a small bootloader program that can update the flash via an RF link.

I run into problems now because I erase some functions I use from a library, and although the function itself is also copied to RAM first, at some point the code still jumps to Flash rather than to RAM.

How can I make sure that library functions are also called from RAM?

Is this an IAR problem or is there some linker option to force it to use the functions in memory?

I have not copied all my code into RAM. To save memory, the initialisation of my system runs from flash first; once initialised, I jump to running from RAM.

Is this ok to do? Or should I copy all functions to RAM first?

thanks

0 Jens-Michael Gross over 14 years ago in reply to Wim De Kimpe

Guru 227245 points

Only move to RAM the code that is needed while the flash is not available.

However, if you use a library, functions of the library may call other library functions and might not know that these have been moved to RAM. If the library was precompiled, the linker will link these 'internal' calls to the flash location.
Even if the library is available in source code, either the calls inside it or the segment definitions need to be changed so that they assume the code is in RAM rather than flash. (If you change the segment definitions, the linker will automatically adjust the calls and also copy the functions from flash to RAM on startup, but that's not trivial.)

This kind of code tweaking usually only works with code you have written yourself or at least adjusted yourself.

About the segment definitions: on mspgcc, you simply declare in your source code that the function belongs to the data segment. Then it is treated like an initialized variable (copied to RAM on startup) and all references to it point to RAM.
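A minimal sketch of that data-segment approach for a GCC-based toolchain follows. It assumes the compiler defines `__MSP430__` and that the startup code copies `.data` from flash to RAM. The macro name `RUN_FROM_RAM` and the function `adc_to_deg_c` are hypothetical; the arithmetic is the Celsius conversion from the ADC example posted later in this thread. On any other compiler the macro expands to nothing, so the function stays ordinary code.

```c
#include <stdint.h>

/* Assumption: GCC for MSP430 defines __MSP430__ and initializes .data
 * from flash at startup. Elsewhere RUN_FROM_RAM is a no-op. */
#if defined(__GNUC__) && defined(__MSP430__)
#define RUN_FROM_RAM __attribute__((section(".data"), noinline))
#else
#define RUN_FROM_RAM
#endif

/* Convert a raw ADC reading to whole degrees Celsius. */
RUN_FROM_RAM int32_t adc_to_deg_c(int32_t adc)
{
    return ((adc - 1857) * 666) / 4096;
}
```

Callers use `adc_to_deg_c()` like any other function; the linker resolves the call to the RAM address. The cost is the one mentioned earlier in the thread: the function occupies RAM permanently.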

I don't know how to do it with IAR. Maybe some sort of #pragma.

0 Rahul over 11 years ago in reply to Jens-Michael Gross

Expert 1770 points

Hi All,

I want to run a function from RAM on the CC430F5137. I saw the RAM example project.

I have one question: how do I decide how much memory to allocate to the function?

For example, in the linker file of the example I found this:

RAM_MEM : origin = 0x1C00, length = 0x0200 ????

FLASH_MEM : origin = 0x5C00, length = 0x0200 ???

How are those numbers (the lengths marked with question marks) decided?

Thanks,

Rahul

0 Jens-Michael Gross over 11 years ago in reply to Rahul

Guru 227245 points

For obvious reasons, FLASH_MEM and RAM_MEM must be the same size.
The size of 0x200 was likely chosen because 0x200 is the size of a main flash segment. In general, the size must of course be large enough to hold your code. Aligning the segment sizes to flash segment borders allows runtime updates of this part of flash without affecting other code in flash. AFAIK this is not a requirement for the simple case of a monolithic compiled/linked application.
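If you do want the region length aligned to flash segment borders, the rounding is simple arithmetic. The helper below is a sketch; `round_up_to_segment` is a hypothetical name, and the 0x200 segment size is the value discussed above, which may differ on other devices.

```c
#include <stdint.h>

/* Main flash segment size as discussed above (assumption: 512 bytes). */
#define FLASH_SEGMENT_SIZE 0x200u

/* Round a code size in bytes up to a whole number of flash segments. */
uint32_t round_up_to_segment(uint32_t code_bytes)
{
    uint32_t segments = code_bytes / FLASH_SEGMENT_SIZE
                      + (code_bytes % FLASH_SEGMENT_SIZE != 0u);
    return segments * FLASH_SEGMENT_SIZE;
}
```

For instance, 100 bytes of code would need a length of 0x200, and 0x201 bytes would need 0x400. The same value would then be used for both FLASH_MEM and RAM_MEM.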

0 Rahul over 11 years ago in reply to Jens-Michael Gross

Expert 1770 points

Thanks Michael,

I want to run a piece of code from RAM. How do I determine the size of that piece of code? I need to specify the size in the linker command file, right? Please refer to this source code:

1565.msp430x54xA_adc12_10_4.c

//******************************************************************************
//  MSP430F543xA Demo - ADC12_A, Sample A10 Temp and Convert to oC and oF
//
//  Description: A single sample is made on A10 with reference to internal
//  1.5V Vref. Software sets ADC12SC to start sample and conversion - ADC12SC
//  automatically cleared at EOC. ADC12 internal oscillator times sample
//  and conversion. In Mainloop MSP430 waits in LPM4 to save power until
//  ADC12 conversion complete, ADC12_ISR will force exit from any LPMx in
//  Mainloop on reti.
//  ACLK = n/a, MCLK = SMCLK = default DCO ~ 1.045MHz, ADC12CLK = ADC12OSC
//
//  Uncalibrated temperature measured from device to device will vary due to
//  slope and offset variance from device to device - please see datasheet.
//
//                MSP430F5438A
//             -----------------
//         /|\|              XIN|-
//          | |                 |
//          --|RST          XOUT|-
//            |                 |
//            |A10              |
//
//   Andreas Dannenberg
//   Texas Instruments Inc.
//   November 2009
//   Built with CCSv4 and IAR Embedded Workbench Version V4.21
//******************************************************************************
// Filename:
//   msp430x54xA_adc12_10_3.c
// Description:
//   ULP Optimizations Step 4
// Modifications (compared to original code example from TI web):
//   - (ULP 4.1) Terminate unused GPIO pins
//   - (ULP 2.1) Use timer module for delays
//   - (ULP 7.1) Moved global variables as locals into main
//   - (ULP 5.1) Moved execution of divide operations into RAM
// Current consumption:
//   246.3uA - 247.3uA
//******************************************************************************
#include "msp430x54xA.h"
#include <string.h>                         // memcpy

volatile long temp;
//volatile long IntDegF;
//volatile long IntDegC;

#pragma CODE_SECTION(CalculateTempInCandF,".FLASHCODE")
void CalculateTempInCandF(void)
{
  volatile long IntDegF;
  volatile long IntDegC;

  IntDegC = ((temp - 1857) * 666) / 4096;
  IntDegF = ((temp - 1857) * 1199) / 4096 + 32;
}

void main(void)
{
//  volatile long IntDegF;
//  volatile long IntDegC;

  WDTCTL = WDTPW + WDTHOLD;                 // Stop WDT

  // added per ULP Advisor info
  PAOUT  = 0;
  PADIR  = 0xFFFF;                          // P1 & P2
  PASEL  = 0;
  PBOUT  = 0;
  PBDIR  = 0xFFFF;                          // P3 & P4
  PBSEL  = 0;
  PCOUT  = 0x4000;                          // AMP_SD with R to VCC
  PCDIR  = 0xFFFF;                          // P5 & P6
  PCSEL  = 0;
  PDOUT  = 0;
  PDDIR  = 0xFFFF;                          // P7 & P8
  PDSEL  = 0;
  PEOUT  = 0;
  PEDIR  = 0xFFFF;                          // P9 & P10
  PESEL  = 0;
//  P11OUT = 0;
//  P11DIR = 0xFF;                            // P11
//  P11SEL = 0;
  PFOUT = 0;
  PFDIR = 0xFFFF;                           // ULP Advisor knows only 'letter' Ports
  PFSEL = 0;
  PJOUT  = 0;
  PJDIR  = 0xFF;                            // PJ
  
  // copy processing intensive routines into RAM
  memcpy((void*)0x1C00,(const void*)0x5C00,0x0200);

  /* Initialize the shared reference module */ 
  REFCTL0 |= REFMSTR + REFVSEL_0 + REFON;    // Enable internal 1.5V reference
  
  /* Initialize ADC12_A */ 
  ADC12CTL0 = ADC12SHT0_8 + ADC12ON;		    // Set sample time 
  ADC12CTL1 = ADC12SHP;                     // Enable sample timer
  ADC12MCTL0 = ADC12SREF_1 + ADC12INCH_10;  // ADC input ch A10 => temp sense 
  ADC12IE = 0x001;                          // ADC_IFG upon conv result-ADCMEMO
  
//  __delay_cycles(75);                       // 35us delay to allow Ref to settle
                                            // based on default DCO frequency.
                                            // See Datasheet for typical settle
                                            // time.

  TA1CTL = TASSEL_1 + TACLR;                // ACLK, clear TAR
// TA1CCR0 = 32768-1;                       // 1 second
  TA1CCR0 = (32768*2)-1;                    // 2 seconds
  TA1CCTL0 = CCIE;                          // CCR0 interrupt enabled
  TA1CTL |= MC_1;                           // Start timer in up-mode
  __bis_SR_register(LPM4_bits + GIE);       // LPM4 with interrupts enabled

  ADC12CTL0 |= ADC12ENC;

  while(1)
  {
    ADC12CTL0 |= ADC12SC;                   // Sampling and conversion start

    __bis_SR_register(LPM4_bits + GIE);     // LPM4 with interrupts enabled
    __no_operation();

/*
    // Temperature in Celsius
    // ((A10/4096*1500mV) - 680mV)*(1/2.25mV) = (A10/4096*666) - 302
    // = (A10 - 1857) * (666 / 4096)
    IntDegC = ((temp - 1857) * 666) / 4096;

    // Temperature in Fahrenheit
    // Tf = (9/5)*Tc + 32
    IntDegF = ((temp - 1857) * 1199) / 4096 + 32;
*/

    CalculateTempInCandF();

//    __no_operation();                       // SET BREAKPOINT HERE
    __bis_SR_register(LPM4_bits + GIE);       // LPM4 with interrupts enabled
  }
}

#pragma vector=ADC12_VECTOR
__interrupt void ADC12ISR (void)
{
  switch(__even_in_range(ADC12IV,34))
  {
  case  0: break;                           // Vector  0:  No interrupt
  case  2: break;                           // Vector  2:  ADC overflow
  case  4: break;                           // Vector  4:  ADC timing overflow
  case  6:                                  // Vector  6:  ADC12IFG0
    temp = ADC12MEM0;                       // Move results, IFG is cleared
    __bic_SR_register_on_exit(LPM4_bits);   // Exit active CPU
    break;
  case  8: break;                           // Vector  8:  ADC12IFG1
  case 10: break;                           // Vector 10:  ADC12IFG2
  case 12: break;                           // Vector 12:  ADC12IFG3
  case 14: break;                           // Vector 14:  ADC12IFG4
  case 16: break;                           // Vector 16:  ADC12IFG5
  case 18: break;                           // Vector 18:  ADC12IFG6
  case 20: break;                           // Vector 20:  ADC12IFG7
  case 22: break;                           // Vector 22:  ADC12IFG8
  case 24: break;                           // Vector 24:  ADC12IFG9
  case 26: break;                           // Vector 26:  ADC12IFG10
  case 28: break;                           // Vector 28:  ADC12IFG11
  case 30: break;                           // Vector 30:  ADC12IFG12
  case 32: break;                           // Vector 32:  ADC12IFG13
  case 34: break;                           // Vector 34:  ADC12IFG14
  default: break;
  }
}

// Timer1_A0 (TA1 CCR0) interrupt service routine
#pragma vector=TIMER1_A0_VECTOR
__interrupt void TIMER1_A0_ISR(void)
{
  __bic_SR_register_on_exit(LPM4_bits);     // Exit any LPM
}

In linker command file,

In sections,

.FLASHCODE : load = FLASH_MEM, run = RAM_MEM
/* CODE IN FLASH AND WILL BE COPIED
TO RAM AT EXECUTION HANDLED BY
USER */

In Memory,

RAM_MEM : origin = 0x1C00, length = 0x0200
RAM : origin = 0x1E00, length = 0x3E00

FLASH_MEM : origin = 0x5C00, length = 0x0200

FLASH : origin = 0x5E00, length = 0xA180

FLASH2 : origin = 0x10000,length = 0x25C00

How was that 0x200 decided?

Thanks,

Rahul

0 Jens-Michael Gross over 11 years ago in reply to Rahul

Guru 227245 points

The size of the segments must be equal to or larger than the size of the code you want to place in them. That's obvious.

You can figure out the size of the functions by looking at the map file before placing them into the RAM segment. Or just try it: if the code is too large, the linker will tell you so and also give you the required size. You can then increase the segment size.

0 Rahul over 11 years ago in reply to Jens-Michael Gross

Expert 1770 points

Thanks Michael :)

0 tml over 11 years ago in reply to Dung Dang

Expert 1140 points

Thank you Dung for this example!

Does anyone know how to handle this in mspgcc? The syntax of the TI linker's .cmd file is not supported by GNU ld.

Best Regards,

tml
