This thread has been locked.

Illegal opcode instruction actually executed...

Hi everybody,

I've been working for weeks to find a bug in a piece of MSP430 firmware. It finally turned out that there was an unfortunate branch to a hard-coded address that was supposed to contain executable code; instead, it held only initialisation values for some registers.

From reading the datasheet, I would have expected the Reset Interrupt to be raised for any illegal instruction fetch.
Instead, these values were interpreted as opcodes and executed as instructions, although some of them were clearly not in the instruction map.

It seems that the values were decoded as if they came from the CPUX instruction set, although the micro is an MSP430F123 and should not implement it. Is it possible that the fetch/decode part of the MSP hardware decodes these values as CPUX instructions, even if the actual chip does not support them?

For instance, here are some examples taken from the IAR disassembly that should be illegal opcodes for my MSP:

0100 bra @SP

0029 012A mova &FCTL2,R9

Moreover, the actual behavior does not match the decoding of these instructions; the chip does something else (for instance, not a branch to @SP but a "push PC"...). Is there somebody from TI (or anywhere else) who could confirm how the decoding really works in such cases?

We have already fixed this bug to avoid such a situation, but we really need to understand what's going on inside the chip to minimize any risk in the field.

Thanks a lot for your help.

Sebastien

3 Replies

  • The mainly orthogonal structure of the MSP doesn't know many 'illegal' instructions.
    Each instruction word is a combination of instruction and parameters, even if this combination doesn't make sense in some cases.

    BRA @SP, as your disassembler shows it, is actually an alias for a MOV instruction with PC as target.

    So BRA @SP is actually a MOVA @SP, PC, whose instruction code, surprise, is 0x0100. Well, this instruction moves the value above top of stack to PC, which doesn't make much sense, but it is a valid instruction.

    The second instruction, MOVA &FCTL2, R9, is a totally valid operation too. It takes a 32-bit value from memory location FCTL2 and moves its lower 20 bits into R9.

    The MOVA instruction is available on MSP430X cores only, but the disassembler doesn't know whether you have an MSP430 core or an MSP430X core.

    The MSP itself won't trigger a reset when fetching an instruction that isn't defined. It just executes something unpredictable that somehow fits into the interpretation matrix.
    Such 'illegal opcodes' have often proven useful in the past and were, for example, commonly used on the 6510 processor to significantly speed up calculations such as packing algorithms. However, since they were not defined, they were subject to change in later revisions, and indeed, the superseding 65SC816 processor re-used these opcodes for completely different tasks.
    That's why you experience a behavior that doesn't match the disassembly: you apparently don't have an MSP430X-core MSP (you didn't write which one you have), so the opcode does 'something', while on the newer MSP430X cores, where it is a defined opcode, it would do what the disassembler tells you.

    The mentioned illegal instruction fetch reset refers to an instruction fetch from an illegal position, such as trying to execute the content of the SFRs or vacant memory. This is done by the memory controller and triggered by the fetch, independently of the content fetched (so it would even trigger if it were a 'valid' instruction).
    It is the invalid memory access (or access mode), not the invalid value, that causes the invalid instruction fetch reset.
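
To make the decoding described in this reply concrete, here is a minimal Python sketch of how a disassembler could arrive at the two listings quoted in the question. The bit layout used (top nibble 0000, then source register or address bits 19..16, then a sub-opcode nibble, then the destination register) is an assumption based on the MSP430X address-instruction format, and the function name `decode_mova` is made up for illustration. It says nothing about what a non-X core such as the F123 actually does with these words.

```python
# Sketch of the two MSP430X "address instruction" forms quoted in this thread.
# Assumed layout of the first word: bits 15..12 = 0000, bits 11..8 = source
# register or address bits 19..16, bits 7..4 = sub-opcode, bits 3..0 = destination.
REGS = ["PC", "SP", "SR", "R3"] + [f"R{n}" for n in range(4, 16)]

def decode_mova(words):
    """Decode MOVA @Rsrc,Rdst and MOVA &abs20,Rdst; return None for anything else."""
    first = words[0]
    if first >> 12 != 0x0:
        return None                      # not in the 0x0xxx group
    hi = (first >> 8) & 0xF              # Rsrc, or address bits 19..16
    sub = (first >> 4) & 0xF             # selects the addressing form
    dst = REGS[first & 0xF]
    if sub == 0x0:                       # MOVA @Rsrc,Rdst (one word)
        if dst == "PC":
            return f"BRA @{REGS[hi]}"    # MOVA with PC as target is listed as BRA
        return f"MOVA @{REGS[hi]},{dst}"
    if sub == 0x2 and len(words) >= 2:   # MOVA &abs20,Rdst (two words)
        addr = (hi << 16) | words[1]
        return f"MOVA &0x{addr:05X},{dst}"
    return None                          # other forms are not handled in this sketch

print(decode_mova([0x0100]))             # BRA @SP
print(decode_mova([0x0029, 0x012A]))     # MOVA &0x0012A,R9
```

The second result carries the raw address 0x0012A, which is the location the IAR listing labels FCTL2. The decoder never asks which core is present, which matches the point that the listing reflects the MSP430X interpretation only.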


  • In reply to Jens-Michael Gross:

    Hi Jens-Michael,

    Thanks a lot for your answer.

    The micro I'm using is an MSP430F123 (8 KB flash / 256 B RAM). It does not support the CPUX instruction set.

    I understand that the disassembler (IAR) may decode something different from what the CPU itself does (although IAR knows it's an F123 and should decode accordingly...).

    I checked the User's Guide for x1xx Family (SLAU049F) and did not find anything about illegal instruction fetch. It was in the x2xx Family User's Guide (which I'm using on other projects) that I saw that a reset could be raised by an illegal instruction fetch. I hadn't noticed this difference before; I'm using F2350 chips most of the time.

    So I guess that there is no way to know what the CPU would do with illegal opcodes? Except that it would "do something" in an unpredictable way.

    What I was trying to figure out was why some of the products would suffer from that bug and some others wouldn't. That is, every time the CPU would go through these instructions, it would end up jumping to an address in the 0x9600 range for some products, or in the 0x2200 range for some others.
    Both point to nowhere: there is physically nothing on this F123 between 0x1100 and 0xE000. In the end, though, those that had jumped to 0x960C would get stuck and need a hardware reset (which is not possible on finished products because of potting + soldered cover), whereas those that had jumped to 0x2200 would eventually find their way back to the real code and continue execution.

    What really makes the difference between the two cases could be some register values, or RAM values, that change the way these illegal instructions are interpreted by the CPU. The funny thing is that the unpredictable execution is very repeatable: several products (same hardware, same firmware, same lot) in the same state behave in exactly the same way. I just don't know exactly which register or value makes the difference in the end.

    That was more for the context - I think there is no way to go further in the root-cause analysis.

    Unless someone knows in depth how the instruction decoding is done in the CPU?

    (TI developers?)
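
As a rough illustration of the address ranges discussed in this reply, the following Python sketch classifies a fetch address by region alone, which is how the first reply describes the illegal-fetch reset. The vacant range is taken from the post. All other boundaries, and the names `REGIONS`, `region_of` and `fetch_is_suspicious`, are assumptions for illustration and should be checked against the device datasheet. Whether the F123 implements such a check at all is not established in this thread.

```python
# Hypothetical memory map for an 8 KB flash / 256 B RAM part. The vacant range
# comes from the post above; the other boundaries are assumptions.
REGIONS = [
    (0x0000, 0x01FF, "SFR/peripherals"),   # assumed
    (0x0200, 0x02FF, "RAM"),               # assumed: 256 B
    (0x1100, 0xDFFF, "vacant"),            # from the post: nothing between 0x1100 and 0xE000
    (0xE000, 0xFFFF, "flash"),             # assumed: 8 KB main flash
]

def region_of(addr):
    """Return the name of the region containing addr, based on the address only."""
    for lo, hi, name in REGIONS:
        if lo <= addr <= hi:
            return name
    return "unclassified"                  # ranges this sketch does not model

def fetch_is_suspicious(addr):
    """True when an instruction fetch from addr targets vacant or SFR/peripheral space."""
    return region_of(addr) in ("vacant", "SFR/peripherals")

for target in (0x960C, 0x2200):
    print(hex(target), region_of(target), fetch_is_suspicious(target))
```

Both landing addresses fall into the same region under this map, so an address-only check would treat the two product groups identically. That suggests the difference in outcome comes from what the CPU reads and executes after the jump, not from the jump target's region.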

  • In reply to Sébastien Dubail:

    Sébastien Dubail
    although IAR knows it's an F123 and should decode accordingly...

    I don't think that the disassembly part knows or cares about the CPU. It gets the binary value and produces an output.
    The part-aware components are the compiler and the linker, and of course the part of the debugger that handles the physical connection to the MSP. The rest (symbols, locations) is just taken from the linker output file. Everything found there and in MSP memory simply 'is', and is not questioned.

    Sébastien Dubail
    checked the User's Guide for x1xx Family (SLAU049F) and did not find anything about illegal instruction fetch.

    Well, the 1x family is quite old and the user's guide hasn't been updated much lately. Not everything that has been added to the later user's guides was ported back to the 1x guide (and of course much of what is found on newer devices isn't on 1x devices at all).
    So it's difficult to say whether something is there but not documented, documented elsewhere (some of the information in guide and datasheet has swapped position on later families and is not consistently found in the same place across families), or not present at all.

    Sébastien Dubail
    So I guess that there is no way to know what the CPU would do with illegal opcodes? Except that it would "do something" in an unpredictable way.

    One could do extensive testing :) However, it isn't certain that the results are consistent. Maybe the behavior changes for a different batch of chips (different silicon revision).

    Sébastien Dubail
    That is, every time the CPU would go through these instructions, it would end up jumping to an address in the 0x9600 range for some products, or in the 0x2200 range for some others.

    Since the MSP supports indirect and also indexed addressing modes, it could depend on the current content of a memory cell, on a return value on the stack, whatever.
    Based on the other opcodes, you could try to figure out a pattern (remember, parts of the instruction contain source and destination addressing modes and register numbers).

    However, the important question is: why does your MSP reach these instructions at all? It never should. So you have a bug in your code. If you fix it, the whole issue becomes unimportant.
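
For anyone who wants to try the pattern hunt suggested in this reply, here is a small Python sketch that splits a 16-bit word into the fields of the standard MSP430 double-operand (Format I) layout: opcode in bits 15..12, source register in 11..8, Ad in bit 7, B/W in bit 6, As in bits 5..4, destination register in 3..0. The names `Fields` and `split_format1` are made up for illustration. Applying this layout to words whose top nibble is 0 is only a way to line up bit positions, since that nibble is not a defined double-operand opcode and how the F123 treats those fields is unknown.

```python
from collections import namedtuple

# Field positions of the MSP430 double-operand (Format I) instruction word.
Fields = namedtuple("Fields", "opcode src ad bw as_mode dst")

def split_format1(word):
    """Split a 16-bit word into Format I bit fields without judging validity."""
    return Fields(
        opcode=(word >> 12) & 0xF,   # 0x4..0xF are the defined double-operand opcodes
        src=(word >> 8) & 0xF,       # source register number
        ad=(word >> 7) & 0x1,        # destination addressing mode
        bw=(word >> 6) & 0x1,        # byte/word select
        as_mode=(word >> 4) & 0x3,   # source addressing mode
        dst=word & 0xF,              # destination register number
    )

# 0x4031 is a defined word (MOV with immediate source, SP as destination);
# the other two are the words from the original question.
for w in (0x4031, 0x0100, 0x0029):
    print(f"0x{w:04X}", split_format1(w))
```

Tabulating the stray words this way makes it easier to see which register numbers and addressing-mode bits they would select if the core reused the normal field positions. Those are then the registers and memory cells to compare between units that end up at different addresses.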

