UCD3138A: Reset/Power-off Loop possibly causing corruption of Checksum (or Program FLASH)?

Part Number: UCD3138A
Other Parts Discussed in Thread: UCD3138

UCD3138A Reset problem.

I've got a project that just suffered a very mysterious problem. Hoping to get some insight into possibilities and explanation. Will add relevant code at end of post.

Symptom:
Finished code with no backdoor to erase/remove the checksum suffered a restart/reset loop for 20-40 cycles and then stopped functioning. Checksum verification by the resident bootloader is now failing (full reason still unknown), and the jump to c_int00(void) -> main() no longer happens.

Detail Description:
Finished code with no backdoor to erase the checksum. pmbus_handler() is still called each time at the top of main's for(;;) loop.

During susceptibility testing, one particular frequency, at high power levels, appeared to cause a repeated reset loop. The theory is that either the !RESET line is so noisy that the micro sees it as a reset, or the 3.3V rail is swinging above 3.6V or below 3.0V and bouncing across a boundary.

After many cycles of this (trying to isolate and debug the root cause), the micro appeared to stop functioning. After teardown we were able to confirm the 3.3V supply to the micro and put the programming/debug header on. There was no UCD3138A response from the UCD3XXX Device GUI's "Scan for Device in Program Mode", but a positive response from "Scan Device in ROM Mode".

The only way I know for this symptom to occur is that the checksum has been erased, or a byte within program flash has been corrupted (such that the checksum is incorrect).
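To make the two failure paths concrete, here is a minimal host-side sketch of a boot-time checksum comparison. It assumes a simple additive byte checksum and an erased word that reads as 0xFFFFFFFF; the actual boot ROM algorithm should be checked against the device documentation. The function names `image_checksum` and `image_is_valid` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed scheme: add every byte of the program image into a 32-bit sum. */
uint32_t image_checksum(const uint8_t *image, size_t len)
{
    uint32_t sum = 0u;
    for (size_t i = 0; i < len; i++) {
        sum += image[i];
    }
    return sum;
}

/* Boot-time decision: jump to the program only if the stored checksum
 * matches the checksum computed over the image. */
bool image_is_valid(const uint8_t *image, size_t len, uint32_t stored_checksum)
{
    return image_checksum(image, len) == stored_checksum;
}
```

Under these assumptions, either an erased checksum word or a single changed byte in the image makes `image_is_valid` return false, which would leave the device in ROM mode as observed.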

Question:
What are possible causes for this series of events? Obviously this is an unacceptable behavior for the project, so need to get some clarity.

Relevant Code:

void main(void){

    Uint32 vsense_adc_val;

    // initialize some variables
    InputBits.aim9_launch_cnt = 0;
    InputBits.aux_fire_cnt = 0;
    InputBits.master_arm_cnt = 0;
    PrevRegLoadState = RSTATE_BAD_UNKNOWN;


    // this allows MASTER_ARM to be flipped to active for a powercycle, and when this is
    // done, the checksums will both be erased and thus the micro will stay in boot (PMBUS capable)
    // remove the CHKSUM from FLASH, forcing a hold in RAM + reprogram cycle
    //if(GioRegs.FAULTIN.bit.TMS_IN == 1)
    //{
    //    clear_integrity_word();
    //}

    // ================================================================

    // ================================================================
    // Setup Hardware / Peripherals / DPPs
    SystemStartup();

    TmrMSecSet(GENERAL_PURPOSE_DLY_TMR, 200);
    Event_Add(EVT_STARTUP, Ram_EADC_DAC0);

    for(;;)
    {
        // DWH: process PMBus traffic
        pmbus_handler();

        // ..... procedural project code ......

    }

}

  • It sounds like you have some kind of rise time/noise issue at power up.  I would strongly suggest that you look at the UCD3138 Practical Design Guideline at:

    It has descriptions specifically about timing and rise time for start up, as well as layout and other guidelines to increase noise immunity in the difficult environment of power supplies.  

    Most of the time noise on power up causes a reset, as you mention.  The ARM7 core has exception detection for accessing illegal addresses and undefined instructions, and generally when noise affects the processor, it gets reset.  However, once in a while it can run for a little while after a noise hit and before the reset.  Normally this doesn't have any lasting effect, but if it happens to get lost and go to the checksum clear function, it will have the effect you observe.  

    We recommend two steps to fix this:

    1. Fix the startup timing and the PC layout.

    2. You can also go to what we call a RAM flash key.  Flash erases and writes require a write of a special 32 bit key to a lock register to permit the write or erase.

    This is done to prevent the kinds of issues you are seeing.  With a hardware backdoor, it is necessary to hard code the key to give it a better chance if there is a code bug.  However, with production code, presumably there is no bug to cause a lockup, so the hardware backdoor can be removed.

    Normally you use a PMBus command to clear the checksum in production code, if there is a clear at all.  

    So it's not hard to add another PMBus command to write the flash key to RAM before sending the PMBus command to clear the checksum.

    If the correct flash key is not in the RAM variable, even if the ARM7 finds its way to the clear checksum function, it won't be able to write to the flash.  
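The RAM flash key idea described above can be sketched roughly as follows. This is a host-side simulation under stated assumptions, not TI's implementation: the lock register and checksum word are ordinary variables, `FLASH_UNLOCK_KEY` is a made-up placeholder rather than the real UCD3138 key, and `pmbus_write_flash_key`, `clear_checksum` and `read_checksum_word` are hypothetical names. Clearing the RAM key after one use is an extra design choice added for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; NOT the real device key. */
#define FLASH_UNLOCK_KEY 0xA5C3F00Du
#define ERASED_WORD      0xFFFFFFFFu

/* Stand-ins for hardware so the sketch can run on a host PC. */
static uint32_t sim_lock_register;                /* simulated flash lock register */
static uint32_t sim_checksum_word = 0x12345678u;  /* simulated checksum in flash   */

/* RAM copy of the key: stays zero unless the host has sent it over PMBus. */
static volatile uint32_t ram_flash_key;

/* Handler for a hypothetical extra PMBus command: the host supplies the key. */
void pmbus_write_flash_key(uint32_t key_from_host)
{
    ram_flash_key = key_from_host;
}

/* Simulated hardware: the erase only takes effect if the lock register
 * holds the correct key. In real firmware the key constant would not
 * appear anywhere in the program image. */
static bool sim_flash_erase_checksum(void)
{
    bool unlocked = (sim_lock_register == FLASH_UNLOCK_KEY);
    sim_lock_register = 0u;            /* re-lock after every attempt */
    if (unlocked) {
        sim_checksum_word = ERASED_WORD;
    }
    return unlocked;
}

/* Clear-checksum routine: the key is copied from RAM, never hard coded. */
bool clear_checksum(void)
{
    sim_lock_register = ram_flash_key;
    ram_flash_key = 0u;                /* one-shot: key must be re-sent each time */
    return sim_flash_erase_checksum();
}

uint32_t read_checksum_word(void)
{
    return sim_checksum_word;
}
```

In this sketch, a stray jump into `clear_checksum()` after a noise hit finds no valid key in RAM, so the erase is refused. Only the sequence "write key command, then clear checksum command" from the host erases the checksum.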

  • Ian,

    Thanks for the information!  Digging through it now. This board/design dates back prior to 2016 (believe it or not), so the guide is probably going to be VERY helpful indeed. I'll look into the idea of a FLASH lock key as well. Not sure if it helps or stimulates the discussion, but here is a quick expansion of details. The test injects modulated high-power noise on the AC leads (HI and LOW). The AC is split off and used to create the 5V and 3.3VDC rails via transformers.

    Injected Noise Freq                 Description

    *20 - 40 MHz (peaking at 33 MHz)    Regulated output voltage is pulled upward. At 33 MHz this deflection peaks at 1.2V (regulating a 25VDC target).

    89.962 MHz                          Microprocessor appears to be held in RESET; when the injected noise is switched OFF, the micro is still in RESET. When a control signal is removed that shuts 3.3VDC off, the micro comes back up.

    91.77 MHz                           Looping RESETs of the micro (or a reset to power control - IPK, OVER-CURRENT, etc. - is occurring). Continues until the injected noise is switched OFF.

    93.614 MHz                          Back to correct behavior, with a small deflection (0.1VDC) of the 25VDC output when injected noise is present.

    * Not sure if related, or a different problem??

    Will update if I find the proper problem/solution, or if I stumble into more useful information!

    Thanks again, -D

  • If you have an update, feel free to put in a new post.  You can do a discussion, rather than a question if you want to.  I'm going to close this one, since we seem to have answered it.