RM44L920: nERROR Remains Low After nRST Pulsed - Cannot Reset MCU (Sometimes)

Part Number: RM44L920
Other Parts Discussed in Thread: HALCOGEN

I have two RM44L920 MCUs on a single board for redundancy, with a GPIO from each going to the other's nRST pin via an inverter. Due to a net naming error, we had to pull the external power monitoring IC from one of the two MCUs, and occasionally that MCU will not come up at power-on, with its nERROR pin low. This is due to a dirty power glitch that the hardware team is working on. (Before the voltage to the regulators comes up to full value, it fluctuates a bit and sometimes exceeds the turn-on voltage for the regulators for a very short period.)

In the meantime, I have been using the cross reset functionality to pulse the nRST pin to recover the MCU in the nERROR condition to continue normal operation & testing. The cross reset works about 80% of the time. When it doesn't, the nERROR pin remains low.

I've tried longer duration pulses, but the 15 milliseconds I started with should be plenty of time.

How is the nERROR condition being maintained over a physical reset? I realize it's a warm reset, but still. I'm sure I'm missing something simple here as usual.

I read through the ESM section of the RM44 reference manual, but it seems as though the esmInit() function called in system_main.c should be clearing things up for me, though again I'm sure I'm missing something.

As a side note I am using RM46 LaunchXL boards for some of the development work, and was attempting to force an nERROR condition using esmREG->EKR = 0xAU, but I couldn't figure out how to get into privileged mode to do so. This is low priority.

Any help with clearing out the (occasionally) hanging nERROR condition would be greatly appreciated.

Thanks!

- Tom

  • Hi Tom,

    When nERROR is asserted, which bits of the ESM status registers are set? An uncorrectable ECC error will unconditionally cause the nERROR pin to toggle low.

    Once the nERROR pin is driven low, either an nPORRST reset or a write of 5h to ESMEKR is required to release the pin back to its normal state. nRST doesn't change the state of the nERROR pin.

  • I haven't been able to hook up a debugger to check ESMSR3. I don't have any of the Group 1 or Group 2 error sources connected, so it's definitely a Group 3 error and related to the power glitch.

    I'm looking into adding a jumper to make the cross reset line use nPORRST instead of nRST, but that feels a bit heavy handed.

    What I don't understand is why the cross reset works most of the time, but not all the time. HALCoGen generated esmInit() gets called in sys_main.c, and contains:

        /** - Reset error pin */
        if (esmREG->EPSR == 0U)
        {
            esmREG->EKR = 0x00000005U;
        }
        else
        {
            esmREG->EKR = 0x00000000U;
        }

    This should clear the error and reset the nERROR pin to high, if I'm understanding things correctly.
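As a quick check of what that branch does, here is a minimal host-side sketch that mirrors the quoted logic. The function name `esm_init_ekr_write` is hypothetical, and the reading that `EPSR == 0` corresponds to nERROR being asserted (low) is an assumption to confirm against the ESM chapter of the reference manual.

```c
#include <stdint.h>

/* Hypothetical helper: returns the value the quoted esmInit() branch
 * would write to ESMEKR for a given ESMEPSR reading. */
static uint32_t esm_init_ekr_write(uint32_t epsr)
{
    /* Assumption: EPSR == 0 means the error pin is currently asserted (low),
     * so the generated code writes the 0x5 key to release it. */
    return (epsr == 0U) ? 0x00000005U : 0x00000000U;
}
```

With `epsr = 0x0` the branch writes `0x5`; with `epsr = 0x1` it writes `0x0`.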

    I just realized that the MCU is probably in User mode by the time it gets to sys_main.c, but esmInit() is also called in sys_startup.c. Ha! Maybe calling it twice is throwing things off.

    Thanks again for looking at this.

    - Tom

  • I just realized that the MCU is probably in User mode

    After reset, the CPU is in supervisor mode. After _coreInitRegisters_() is called in _c_int00_(), the CPU switches to system mode, which is a privileged mode.
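To confirm which mode the code is running in, the mode field of the CPSR can be decoded. The sketch below is a host-side illustration only: the function and macro names are hypothetical, and on the target the CPSR value would have to be read with an `MRS` instruction (not shown). The mode encodings are the standard ARM ones (User 0x10, Supervisor 0x13, System 0x1F).

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_MODE_MASK 0x1FU /* M[4:0] mode field */
#define CPSR_MODE_USER 0x10U /* the only unprivileged mode */
#define CPSR_MODE_SVC  0x13U /* supervisor: mode after reset */
#define CPSR_MODE_SYS  0x1FU /* system: privileged, shares User registers */

/* Hypothetical helper: true if the given CPSR value denotes a privileged mode. */
static bool cpsr_is_privileged(uint32_t cpsr)
{
    return (cpsr & CPSR_MODE_MASK) != CPSR_MODE_USER;
}
```

If this returns true at the point where ESMEKR is written, the write is not being blocked by the CPU mode.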

    On another device, sometimes a CPU compare error (ESM Group 2 channel 2) is generated when the debugger connects to device.

    In that case nERROR is asserted and is not cleared before the system reset. After the system reset, writing 0x5 to ESMEKR alone cannot clear nERROR and restore the ESM to its normal state. The ESM must first be switched into error forcing mode (write 0xA to ESMEKR) before nERROR can be cleared.
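The two-step sequence described above can be sketched as follows. This is a minimal illustration, not a verified fix: the function and macro names are hypothetical, the register is passed in as a pointer so the sketch compiles on a host, and whether a delay is needed between the two writes is not stated in this thread.

```c
#include <stdint.h>

#define ESM_EKR_FORCE_ERROR 0x0000000AU /* key for error forcing mode */
#define ESM_EKR_NORMAL      0x00000005U /* key that releases the error pin */

/* Hypothetical helper: enter error forcing mode, then return to normal mode.
 * Must run in a privileged CPU mode on the target. */
static void esm_release_nerror(volatile uint32_t *ekr)
{
    *ekr = ESM_EKR_FORCE_ERROR; /* step 1: switch ESM into error forcing mode */
    *ekr = ESM_EKR_NORMAL;      /* step 2: write the normal key to clear nERROR */
}
```

On the target, with the HALCoGen register definitions in scope, the call would presumably be `esm_release_nerror(&esmREG->EKR);`.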

  • Thank you for the clarification re: System Mode - I may try the 0xA trick. I'm guessing that's the Device#60 errata thing I've come across while researching this topic.

    I hooked up the debugger and there are two cases that occur.

    1 - The MCU is able to be recovered via a cross reset. In this case the following ESM registers have non-zero values:

    • ESMEPSR - 0x00000001
      • But the other MCU, with a GIO pin connected directly to nERROR, indicates that nERROR is low, which is throwing me off
      • The boards are conformal coated, so it's not easy to probe pins directly - I'll have to see if I can dig out a non-coated board
    • ESMLTCR - 0x00003FFF
    • ESMLTCPR - 0x00003FFF

    2 - The MCU is not able to be recovered via a cross reset. Registers of interest (ESMEPSR is all zeros; the others are non-zero):

    • ESMEPSR - 0x00000000
    • ESMSR3 - 0x00000002
      • If I'm reading the datasheet correctly, I believe this indicates "eFuse Farm - autoload error"
    • ESMLTCPR - 0x00003FFF
      • Apparently the Low Time Counter counted down since ESMLTCR is zero
    • ESMSR4 - 0x00000200
      • Again, if I'm reading the datasheet (Table 6-31) correctly, this indicates an eFuse Controller Error, which meshes with the ESMSR3 indication

    The eFuse faults make sense for a power glitch of a few milliseconds on startup - making progress on fixing that. Now I just need to be able to reliably cross reset once power is stable.

    Thanks again. I will be adding the "esmREG->EKR = 0xA" line to see if that helps.

    - Tom

  • If I'm reading the datasheet correctly, I believe this indicates "eFuse Farm - autoload error"

    Yes, ESM 3.1 and ESM 4.9 are the eFuse error flags.
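For logging these two flags at startup, a small decode helper might look like the sketch below. The function and macro names are hypothetical; the bit positions are taken from the values reported in this thread (ESMSR3 = 0x00000002, ESMSR4 = 0x00000200) and should be checked against the datasheet table.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESMSR3_EFUSE_AUTOLOAD_ERR ((uint32_t)1U << 1) /* ESM 3.1 */
#define ESMSR4_EFUSE_CTRL_ERR     ((uint32_t)1U << 9) /* ESM 4.9 */

/* Hypothetical helper: true if ESMSR3 reports an eFuse autoload error. */
static bool esm_efuse_autoload_error(uint32_t sr3)
{
    return (sr3 & ESMSR3_EFUSE_AUTOLOAD_ERR) != 0U;
}

/* Hypothetical helper: true if ESMSR4 reports an eFuse controller error. */
static bool esm_efuse_controller_error(uint32_t sr4)
{
    return (sr4 & ESMSR4_EFUSE_CTRL_ERR) != 0U;
}
```

Both helpers return true for the register values captured in the non-recoverable case above.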

    The eFuse autoload happens during the boot process, before the CPU is released from reset. If the autoload fails, we cannot guarantee the behavior of the CPU. The eFuse autoload error can also be generated by the autoload self-test at run time. I hope the eFuse issue is gone on your updated PCB.