MSP430FR4133: AM power consumption depends on code location in memory

Part Number: MSP430FR4133

In line with implementing a workaround for PMM32 errata, I wanted to know
how much current my MSP430FR4133 draws in AM at 16 MHz when spinning in an
endless loop w/o any interrupts or other "disturbances".

So I removed the "eint" which normally comes after the initialisation part and
before the main loop and checked the current. It was 1.15 mA which seems to be
OK for an FR4133 at 16 MHz in AM running entirely from the cache. Then I
changed some minor stuff in my initialisation code which does not have any
influence on the current consumption. But suddenly the supply current jumped
to 1.35 mA.

After a bit of searching it turned out that the current drawn depends on where
the main loop is running (it actually got moved around a bit while adding or
removing code in the initialisation part). The main loop is simply

forever:        jmp     forever

If it is located 2 bytes before a modulo 8 address (0x???6 or 0x???E) it
consumes 1.35 mA. On every other address we'll have 1.15 mA. That means, if
we assume the following (gas) code

.balignw        8, 0x4303
                nop
                nop
                nop
forever:        jmp     forever

it will draw 1.35 mA. When removing any number of nops, we get 1.15 mA.
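
The address rule above can be sketched as a small check. This is only an illustration: the 8-byte line size comes from the post, while the function names and the example addresses are hypothetical.

```python
LINE_BYTES = 8  # 64-bit cache line, as described in the post


def jmp_in_last_word(addr: int) -> bool:
    """True if the 16-bit word at addr is the last word of its 8-byte line
    (i.e. the address ends in 0x6 or 0xE)."""
    return addr % LINE_BYTES == LINE_BYTES - 2


def next_word_in_other_line(addr: int) -> bool:
    """True if the word following addr lies in a different 8-byte line."""
    return addr // LINE_BYTES != (addr + 2) // LINE_BYTES


if __name__ == "__main__":
    # Hypothetical addresses, chosen only to show both cases.
    for addr in (0xC400, 0xC404, 0xC406, 0xC40E):
        print(hex(addr), jmp_in_last_word(addr), next_word_in_other_line(addr))
```

Both functions agree for word-aligned addresses; the second one only spells out that the word after the jmp then belongs to the next line.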

The first thing that comes to mind is that the jmp sits on the last word of a
cache line and the prefetcher fetches the next 4 words from FRAM. But as we
have a 2-way set-associative cache with 4 lines of 64 bits each, it should be
possible to cache two consecutive, 8-byte-aligned memory blocks with 8 bytes
of size each.

To make things more interesting: The code

.balignw        8, 0x4303
                nop
                nop
                nop
forever:        jmp     forever
.word           0x0000

consumes 1.36 mA. But this one

.balignw        8, 0x4303
                nop
                nop
                nop
forever:        jmp     forever
.word           0xFFFF

only 1.28. And last but not least:

.balignw        8, 0x4303
                nop
                nop
                nop
forever:        jmp     forever1
forever1:       jmp     forever

uses 1.24 mA (irrespective of what comes after).

Any explanations?

  • I'd guess that the following word is prefetched (and even decoded), but because it is not actually used, it is not cached either.
  • That's what I thought as well. But this does not explain the higher current for
    my last example. forever1 should definitely make it into the next cache line.
    So in this case the code again should run entirely cached (even with the
    next word being cached) and just switching between two cache lines can't
    be that expensive...

  • Hello Andre,
    please keep in mind that there are multiple components which influence the active current of the MCU/CPU. The basic point is the nature of the CPU current: it is the current of clocked digital logic, meaning it consists of current pulses that result from charging and discharging the capacitances within this logic (the CPU, the address bus, etc.). So while it is correct to assume that executing code from cache draws less current than accessing memory directly on a cache miss, the executed instruction, including its data, also affects the actual current or, better said, the charge needed. That's why it is difficult to specify the actual active current; it is usually an "average" value. Of course one could be more specific, but then you would have to specify the required charge for all the millions of possible options.
    Thus I would recommend approaching it from a different point of view. Do you have a specific concern, in the sense not achieving the optimum current, or do you suspect a malfunction of the device?

    Best regards
    Peter
  • Yes, I know the principles.

    > Do you have a specific concern, in the sense not achieving the optimum current, or do you suspect a malfunction of the device?

    I did not observe any malfunction. I just wondered about this behaviour and would like to understand it. The whole caching and esp. the "intelligent logic" (which selects the cache lines) is not very well documented so I am curious. At least we have an almost 20% increase of power consumption only because of the location of a piece of code which should entirely fit within the cache...
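
The "almost 20%" figure can be checked against the two readings quoted earlier in the thread (1.15 mA and 1.35 mA). A minimal sketch, with a hypothetical helper name:

```python
def relative_increase(baseline_ma: float, measured_ma: float) -> float:
    """Increase of measured_ma over baseline_ma, in percent."""
    return (measured_ma - baseline_ma) / baseline_ma * 100.0


if __name__ == "__main__":
    # Readings from the thread: 1.15 mA (most addresses), 1.35 mA (0x???6 / 0x???E).
    print(f"{relative_increase(1.15, 1.35):.1f} %")  # prints "17.4 %"
```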

  • Andre,
    understood. The 20% variation is, due to the points I mentioned, well within the possible range of variations. Most users do not look into the current consumption in that much detail. If writing code in C, it is the task of the C compiler, based on the selected optimization settings, to generate the code with the lowest possible current consumption. The result is one of the quality factors of a C compiler. In assembler it is of course up to the engineer to create the optimum code.
    Do you have difficulties achieving a certain power budget target?

    Best regards
    Peter
  • No, I just want to understand what's going on. And, apart from that, I have my doubts that any compiler will consider the word following the "jmp" in order to optimise power consumption ;-).

    And, as I wrote: Initially, I just wanted to see how much the power rises when switching SMCLK from 8 to 16 MHz to work around PMM32. Of course, nobody who is concerned about power will have the CPU run in AM doing nothing. But in case one has a small loop that does something useful, it depends on where it sits within the cache line and what data is following.

  • All,

    there exists an App Note which describes all these kinds of effects and gives power-focused users some ideas on how to optimize Active Mode power consumption:

  • After some investigation done by TI (thanks!) it turns out that:

    1. The word after the jmp gets (pre-)fetched and even cached (but on a different cache line if it sits on a modulo-8 address!)

    2. If it sits on a modulo-8 address (and therefore in a different cache line) even though it is cached, the current consumption depends on the bits switching. This is due to the prefetch which crosses the cache line border. So even if all bits are in the cache, switching cache lines may be more expensive than staying in a single one.
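
As a rough illustration of "bits switching", one can count how many bits differ between the jmp opcode and the word that follows it. This is only a sketch under an assumption that the thread does not confirm: that the relevant toggling is between these two 16-bit words. The value 0x3FFF is the standard MSP430 encoding of a jump to itself; the function name is hypothetical.

```python
JMP_SELF = 0x3FFF  # MSP430 encoding of "forever: jmp forever" (offset -1)


def toggled_bits(a: int, b: int) -> int:
    """Number of bit positions that differ between two 16-bit words."""
    return bin((a ^ b) & 0xFFFF).count("1")


if __name__ == "__main__":
    # The two trailing words tried in the original post.
    for following in (0x0000, 0xFFFF):
        print(f"0x{following:04X}: {toggled_bits(JMP_SELF, following)} bits differ")
```

Under that assumption, the 0x0000 case differs from the opcode in 14 bits and the 0xFFFF case in only 2, which at least points in the same direction as the 1.36 mA and 1.28 mA readings.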
