MSP430FR2355: MCU Lockup, Clock issue?

Part Number: MSP430FR2355

We are facing the following issue: during batch production involving several hundred units, we identified three or four MSP430FR2355 microcontrollers exhibiting very strange behavior. Upon further analysis, we observed that these devices enter some kind of lock-up state.

The setup involves three MSPs connected on one board as follows:

  • A main FPGA is connected to each MSP via SPI.
  • The FPGA generates a single-ended 24MHz clock signal, which is supplied to the first MSP.
  • The first MSP forwards the 24MHz clock to the second MSP, and the second MSP forwards it to the next, creating a daisy-chained clock configuration using the XIN and MCLK_OUT pins.
  • For programming, we use the Spy-Bi-Wire interface. The FPGA programs the first MSP, the first MSP programs the second MSP via the Spy-Bi-Wire interface, and so on, forming a daisy-chained Spy-Bi-Wire programming configuration.
  • We want the clock output from the MSP to be stable and to keep the next MSP in reset to avoid any clock-related issues during transitions or transient conditions.
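
The ordering requirement in the last bullet can be sketched as a fixed sequence that stops at the first failed step. This is only a host-side sketch and not the firmware discussed in this thread: `chain_hooks_t`, `bring_up_next` and the `sim_*` helpers are hypothetical names, and on the target each hook would have to wrap the corresponding GPIO or clock-system call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stages of the clock hand-off, in the order they are meant to happen. */
typedef enum {
    STAGE_HOLD_NEXT_IN_RESET,
    STAGE_WAIT_INPUT_CLOCK,
    STAGE_ENABLE_CLOCK_OUT,
    STAGE_RELEASE_NEXT,
    STAGE_DONE
} stage_t;

/* Hypothetical board hooks; each returns true on success. */
typedef struct {
    bool (*hold_next_in_reset)(void *ctx);
    bool (*input_clock_stable)(void *ctx);
    bool (*enable_clock_out)(void *ctx);
    bool (*release_next)(void *ctx);
    void *ctx;
} chain_hooks_t;

/* Runs the stages in order and stops at the first failure.
   Returns STAGE_DONE on success, otherwise the stage that failed. */
static stage_t bring_up_next(const chain_hooks_t *h)
{
    assert(h != NULL);
    if (!h->hold_next_in_reset(h->ctx)) return STAGE_HOLD_NEXT_IN_RESET;
    if (!h->input_clock_stable(h->ctx)) return STAGE_WAIT_INPUT_CLOCK;
    if (!h->enable_clock_out(h->ctx))   return STAGE_ENABLE_CLOCK_OUT;
    if (!h->release_next(h->ctx))       return STAGE_RELEASE_NEXT;
    return STAGE_DONE;
}

/* Host-only simulation: records the call order in a short trace string. */
typedef struct {
    bool clock_ok;   /* what the simulated clock check reports */
    char trace[8];   /* one letter per hook call */
    size_t n;
} sim_t;

static bool sim_log(sim_t *s, char c)
{
    if (s->n < sizeof s->trace - 1) s->trace[s->n++] = c;
    return true;
}
static bool sim_hold(void *c)    { return sim_log(c, 'H'); }
static bool sim_stable(void *c)  { sim_t *s = c; sim_log(s, 'W'); return s->clock_ok; }
static bool sim_enable(void *c)  { return sim_log(c, 'E'); }
static bool sim_release(void *c) { return sim_log(c, 'R'); }
```

With a simulated bad input clock, the sequence stops before the clock output is enabled and before the next MSP is released, which is the property the bullet asks for. The returned stage also tells the caller where the hand-off stopped.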



With this setup, we have full control and the ability to program and manage each MSP. However, if the first or second MSP in the chain fails, the subsequent devices become sensitive to the failure. For example, if the first or second MSP stops generating the clock signal for the next MSP in the chain, the entire system is affected.

In the 3–4 affected units, we observed the following behavior:

  • During startup, the MSP sometimes enters a lock-up state and remains stuck.
  • When in this state, it does not generate the output clock for the next MSP in the chain.
  • Resetting the device via the Spy-Bi-Wire interface is either impossible or very difficult.
  • In most cases, the devices work correctly without any issues.

Upon further investigation, we suspect that the device enters a lock-up state during the transition of MCLK from the internal 800kHz clock to the external 24MHz clock. Please review the following code snippet for potential issues and provide suggestions on how to detect and prevent this state.

This issue does not occur on every device; we have several hundred MSPs without this problem.


void main(void)
{
    /* Stop watchdog timer */
    WDT_A_hold(WDT_A_BASE);

    __delay_cycles(20000);

    init_ports();

    SPI_init();
    UART_init();

    // Activate port configuration
    PMM_unlockLPM5();

    memcpy((void*)ram_iv, (void*)fram_iv, sizeof(ram_iv));
    SysCtl_enableRAMBasedInterruptVectors();

    init_CS();
    RTClock_init();
    RTClock_start();
    ADCC_init();

    GPIO_setAsPeripheralModuleFunctionOutputPin(GPIO_PORT_P2, GPIO_PIN6, GPIO_PRIMARY_MODULE_FUNCTION);
    //GPIO_setAsInputPinWithPullUpResistor(SBW_OUT_PORT, SBW_OUT_DAT_PIN);
    GPIO_setOutputHighOnPin(SBW_OUT_PORT, SBW_OUT_DAT_PIN);

    reset_MSP_SBW();

    get_ID();
    init_ctrl();

    init_registers();
    SPI_enable(1);

    _EINT();
}

void init_ports(void) {
    // XIN and MCLK_OUT
    GPIO_setAsPeripheralModuleFunctionInputPin(GPIO_PORT_P2, GPIO_PIN7, GPIO_SECONDARY_MODULE_FUNCTION);
    //GPIO_setAsPeripheralModuleFunctionOutputPin(GPIO_PORT_P2, GPIO_PIN6, GPIO_PRIMARY_MODULE_FUNCTION);
    GPIO_setAsOutputPin(GPIO_PORT_P2, GPIO_PIN6);
    GPIO_setOutputLowOnPin(GPIO_PORT_P2, GPIO_PIN6);

    // SBW pins
    //GPIO_setAsInputPinWithPullUpResistor(SBW_OUT_PORT, SBW_OUT_DAT_PIN);
    GPIO_setAsOutputPin(SBW_OUT_PORT, SBW_OUT_DAT_PIN);
    GPIO_setOutputLowOnPin(SBW_OUT_PORT, SBW_OUT_DAT_PIN);
    //GPIO_setAsInputPinWithPullDownResistor(SBW_OUT_PORT, SBW_OUT_CLK_PIN);
    GPIO_setAsOutputPin(SBW_OUT_PORT, SBW_OUT_CLK_PIN);
    GPIO_setOutputLowOnPin(SBW_OUT_PORT, SBW_OUT_CLK_PIN);
}

void init_CS(void) {
    // Configure two FRAM wait states, as required by the device datasheet for MCLK
    // operation at 24MHz (beyond 8MHz), _before_ configuring the clock system.
    FRAMCtl_configureWaitStateControl(FRAMCTL_ACCESS_TIME_CYCLES_2);

    //Enable HF/LF mode
    HWREG16(CS_BASE + OFS_CSCTL6) |= XTS_1;

    //Switch OFF XT1 oscillator and enable BYPASS mode
    HWREG16(CS_BASE + OFS_CSCTL6) |= (XT1BYPASS_1 | XT1AUTOOFF_1 | XT1HFFREQ_3);

    do
    {
        GPIO_setOutputLowOnPin(GPIO_PORT_P4, GPIO_PIN1);
        //Clear XT1 and DCO fault flags
        HWREG8(CS_BASE + OFS_CSCTL7) &= ~(XT1OFFG | DCOFFG);

        // Clear the global fault flag. In case the XT1 caused the global fault
        // flag to get set this will clear the global error condition. If any
        // error condition persists, global flag will get set again.
        HWREG8(SFR_BASE + OFS_SFRIFG1) &= ~OFIFG;
        GPIO_setOutputHighOnPin(GPIO_PORT_P4, GPIO_PIN1);
    } while (HWREG8(SFR_BASE + OFS_SFRIFG1) & OFIFG);              // Test oscillator fault flag

    GPIO_setOutputLowOnPin(GPIO_PORT_P4, GPIO_PIN1);
    CS_initClockSignal(CS_ACLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_640);    // ACLK = 37500Hz
    CS_initClockSignal(CS_MCLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_1);      // MCLK = 24MHz
    CS_initClockSignal(CS_SMCLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_1);     // SMCLK = 24MHz

}
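
The do-while loop in init_CS() has no upper bound and accepts the first poll that reads OFIFG clear. One possible hardening, sketched here for a host build only, is to require several consecutive fault-free polls and to give up after a fixed number of attempts. All names (`osc_hooks_t`, `wait_for_stable_clock`, `sim_osc_t`) are hypothetical; on the target the two hooks would wrap the CSCTL7 and SFRIFG1 accesses shown above, and the poll counts would need to be chosen experimentally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks around the oscillator fault flags. */
typedef struct {
    void (*clear_fault_flags)(void *ctx);  /* clear XT1OFFG/DCOFFG and OFIFG */
    bool (*fault_flag_set)(void *ctx);     /* true while OFIFG reads as set  */
    void *ctx;
} osc_hooks_t;

/* Returns true once `required_clean` consecutive polls saw no fault,
   false if `max_polls` polls are used up first. */
static bool wait_for_stable_clock(const osc_hooks_t *h,
                                  uint32_t required_clean,
                                  uint32_t max_polls)
{
    assert(h != NULL && required_clean > 0);
    uint32_t clean = 0;
    for (uint32_t i = 0; i < max_polls; ++i) {
        h->clear_fault_flags(h->ctx);
        if (h->fault_flag_set(h->ctx)) {
            clean = 0;                     /* fault came back: start over */
        } else if (++clean >= required_clean) {
            return true;
        }
    }
    return false;
}

/* Host-only simulation: the fault flag re-asserts after each of the
   first `faulty_polls` clears, then stays clear. */
typedef struct {
    uint32_t faulty_polls;
    bool flag;
} sim_osc_t;

static void sim_clear(void *ctx)
{
    sim_osc_t *s = ctx;
    if (s->faulty_polls > 0) {
        s->faulty_polls--;
        s->flag = true;
    } else {
        s->flag = false;
    }
}

static bool sim_fault(void *ctx)
{
    return ((sim_osc_t *)ctx)->flag;
}
```

On the target, a false return could be used to leave MCLK on the internal clock and keep MCLK_OUT disabled instead of switching. Whether that would avoid the lock-up described in this post is not established.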

  • Hello zero_cool,

    The setup looks fine.

    Is the failure repeatable on these devices?  These failures are at room temp or other?  What is the VDD voltage?

    "first or second MSP stops generating the clock signal for the next MSP in the chain"

    I see in your code you are setting P4.1 = 1 during the oscillator test, then clearing it upon exiting the Do-While loop.  Have you looked at this pin to see what state it is in when the failure occurs? If high it would suggest something in the clock system is not working.  Are there any other signs of life, such as other pins that might toggle during normal operation?

    I also see you have commented out the following line. Is this true for the code that is running on these devices?
    //GPIO_setAsPeripheralModuleFunctionOutputPin(GPIO_PORT_P2, GPIO_PIN6, GPIO_PRIMARY_MODULE_FUNCTION);

  • Hi Dennis, thanks for the quick reply.

    The failure is repeatable on this MSP, occurring frequently, especially when the power supply is unplugged and then plugged back in after a couple of minutes. However, performing a quick power cycle (plugging in and out) or a software reset allows the MSP to usually start without any problems, and everything works fine. We have identified two more units with the same behavior out of a couple of hundred MSPs that we tested in batch.

    As you correctly noticed, during our bug testing, we monitored pin P4.1, which toggles during the oscillator test in the do-while loop. In the failure scenario, pin P4.1 toggles about 5-8 times before going low, suggesting that the oscillator test itself is functioning correctly. It appears that the MCU fails and gets locked at this point:

        GPIO_setOutputLowOnPin(GPIO_PORT_P4, GPIO_PIN1);
        CS_initClockSignal(CS_ACLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_640);    // ACLK = 37500Hz
        CS_initClockSignal(CS_MCLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_1);      // MCLK = 24MHz
        CS_initClockSignal(CS_SMCLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_1);     // SMCLK = 24MHz

    The device is at room temperature and is powered by a 3.3V supply.

    The line GPIO_setAsPeripheralModuleFunctionOutputPin(GPIO_PORT_P2, GPIO_PIN6, GPIO_PRIMARY_MODULE_FUNCTION); has been moved after the init_CS() function. This is because, when configured at the beginning, the MSP generates an 800kHz signal on MCLK_OUT. The MCU needs to switch from the internal RC clock to the 24MHz clock, and we think the clock should be stable before it is forwarded to the next MSP.

    Code from main():

    init_CS();
    RTClock_init();
    RTClock_start();
    ADCC_init();
    
    GPIO_setAsPeripheralModuleFunctionOutputPin(GPIO_PORT_P2, GPIO_PIN6, GPIO_PRIMARY_MODULE_FUNCTION);

    We don't know how to prevent this or why it's happening, so we have currently halted production until we identify the cause of this behavior.

    Regards

  • Ok, this is good information.  I would like to focus on the power supply.  Do you know, or have you measured with an oscilloscope, how fast the +3.3 V decays after the power supply is unplugged?  From your description, when the power supply is unplugged for a long period of time, I'm guessing all the caps in the system have dropped to at or near zero volts. Then when power is re-applied, VDD starts from 0 V and rises to +3.3 V, the MSP430 goes through a POR, and this is when you see the issue.  Strange!

    Then in the other case, if you unplug the power supply and plug it back in quickly, or perform a SW reset, you don't see this issue.  This tells me the system caps do not drop below the MSP430's minimum VDD, so the device continues to operate normally and doesn't go through a BOR or POR.  BTW, which SW reset do you use, PMMSWPOR or PMMSWBOR?

    Can you tell from the markings on the device if the failing units have same LOT/DATE code?

    Has the MSP430FR2355 been used in your production for some time now, or is this the first time?  In other words, have you been building product successfully for a while and are only now seeing this issue?

    Is it possible to share the schematic showing the MSP430 and its connections?  If proprietary, you can maybe snip only the MSP430 section to share, or you can send me a "friendship" request and can share with me privately.

  • I’m sending you a friendship request. I’ll share more details in your inbox.

  • It seems the issue is related to device temperature. We conducted more tests today and placed a "good" device into a cooled/heated chamber. We observed the same behavior on those devices. The temperature in the chamber was around -20°C. It appears the MSP gets stuck when switching from the internal 800kHz clock to the external 24MHz clock, staying in that state for a while. Usually, the MCU resumes normal operation when the temperature rises slightly. It can remain in this state for up to a couple of minutes, depending on the ambient temperature.

    We suspect this is the point where the MCU gets locked.


    CSCTL4 = SELMS__XT1CLK | SELA__REFOCLK;

    init_CS function:

    void init_CS(void) {
        // Configure two FRAM wait states, as required by the device datasheet for MCLK
        // operation at 24MHz (beyond 8MHz), _before_ configuring the clock system.
        FRAMCtl_configureWaitStateControl(FRAMCTL_ACCESS_TIME_CYCLES_2);
    
        //Enable HF/LF mode
        //HWREG16(CS_BASE + OFS_CSCTL6) |= XTS_1;
    
        //Switch OFF XT1 oscillator and enable BYPASS mode
        //HWREG16(CS_BASE + OFS_CSCTL6) |= (XT1BYPASS_1 | XT1AUTOOFF_1 | XT1HFFREQ_3);
    
        HWREG16(CS_BASE + OFS_CSCTL6) = (XTS_1 | XT1BYPASS_1 | XT1AUTOOFF_1 | XT1HFFREQ_3);
    
        do
        {
            GPIO_setOutputLowOnPin(GPIO_PORT_P4, GPIO_PIN1);
            //Clear XT1 and DCO fault flags
            HWREG8(CS_BASE + OFS_CSCTL7) &= ~(XT1OFFG | DCOFFG);
    
            // Clear the global fault flag. In case the XT1 caused the global fault
            // flag to get set this will clear the global error condition. If any
            // error condition persists, global flag will get set again.
            HWREG8(SFR_BASE + OFS_SFRIFG1) &= ~OFIFG;
            GPIO_setOutputHighOnPin(GPIO_PORT_P4, GPIO_PIN1);
        } while (HWREG8(SFR_BASE + OFS_SFRIFG1) & OFIFG);              // Test oscillator fault flag
    
        GPIO_setOutputLowOnPin(GPIO_PORT_P4, GPIO_PIN1);
    
        CSCTL4 = SELMS__XT1CLK | SELA__REFOCLK;            // MCLK/SMCLK from XT1CLK (bypass input), ACLK from REFOCLK
        GPIO_setOutputHighOnPin(GPIO_PORT_P4, GPIO_PIN1);
        //CS_initClockSignal(CS_ACLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_640);    // ACLK = 37500Hz
        //CS_initClockSignal(CS_MCLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_1);      // MCLK = 24MHz
        //CS_initClockSignal(CS_SMCLK, CS_XT1CLK_SELECT, CS_CLOCK_DIVIDER_1);     // SMCLK = 24MHz
    /*
        CS_bypassXT1_HS();
    */
    }

    We also tried using the internal 24MHz clock, and the device is working properly for now. The temperature in the chamber was around -20°C.

  • I responded to your friendship request.

    I did take a look at the errata for this device and there is one in particular, CS13, that may explain what you are seeing.

    Workaround #4 appears to be something worth trying to see if it mitigates this issue.

  • Regarding erratum CS13, we saw this, but we do not change modes. The MCU should always run in active mode (AM) after reset, and we do not enter LPM3/4 mode. Therefore, I cannot currently connect it to our issue, though I might be wrong. I’ll send you a private message so I can share the full code with you privately.

    Additionally, we noticed that the MSP gets stuck for a while and then resumes working from the point where it got stuck. This issue also seems to be temperature-dependent, occurring more frequently at lower device temperatures.

  • At this point it may be worth submitting a Failure Analysis request.  You could send in 2 or 3 devices and we can verify what you are seeing and track down the issue.
