This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

  • Resolved

RTOS/EK-TM4C1294XL: MCUs are burning out like XMAS lights

Guru 27280 points

Part Number: EK-TM4C1294XL

Tool/software: TI-RTOS

A launch pad MCU may randomly jump to 85% CPU usage and hit 75 °C or more, yet continue to run the application (unattended) with a temperature fault. Other launch pads allow a POR but then jump to 85% CPU usage and slowly climb to 75 °C until unplugged from USB power, still running the application. Launch pads like the first one described will, after cooling down, pull down the +3V3 LDO, and PWRLED won't light. With JP2 pulled, those MCUs show a very low resistance (42 ohms). A good MCU with JP2 pulled reads 900 ohms to 1 k.

Nobody touched the launch pad as it went into meltdown. Oddly, the two MCUs that burned out did so shortly after the RTOS kernel had been installed and the application began randomly crashing. Both had recently been flashed back to the basic application when, a few days later, the MCU did the XMAS light burnout thing.

  • During the past week (perhaps even longer) the (near) record (and sustained) cold temperatures have drastically reduced the Humidity (inside) our nicely heated (US Midwest) homes/offices.      As such - despite your "disclaimer" (didn't do nothing) - ESD "hogs the Police Line-UP" as, "Prime Suspect."     Your town's news headline, "Multiple inanimate objects (LPads) - "Throw themselves" - from tall cliff" - proves impossible to miss!

    At home - I've noted during (incidental/slight) "dog-cat" contact  - "shocks occur!"     (never past noted!)    At the office - if we don't "strap in" (our ESD wrist-bands) - "NUM Lock" (on the keyboard) goes out.

    I would weigh "ESD" as far more likely - than (any) RTOS as "cause agent!"        Do confirm date/time for MCU's wake.     Flowers (via FTD) are, "On the wing!"    (Weeping Willow appears (most) appropriate...)

    And - should a "dim" (even unlit) tree - next Xmas - prove too depressing ... might you switch to "candles" (long (past) employed) - and likely to survive (even) the "extreme" ESD  - you (aka: nobody) regularly generate yet disclaim.         For those interested in their (very own) "ESD Resistant" Candles:        https://e2e.ti.com/group/helpcentral/f/301/t/653117

  • Guru 27280 points

    In reply to cb1_mobile:

    A Honeywell bulk furnace humidifier placed on the outgoing airflow keeps every room at a nice 36% relative humidity. Water drops forming on the inside of the window glass do tend to keep static zapping at bay. Yet the other launch pad failed similarly, as reported on this forum last summer, also close to an RTOS flashing and later removal of RTOS. It was no great mystery at the time, amid a few other forum LP failures that have gone unanswered in the meantime.

    The odd thing is that on BOTH afflicted launch pads (same lot) the MCU internal temp sensor was detecting 52-54 °C at 10-11% CPU usage (NoRTOS). Earlier lots report a 45-48 °C temperature range at 24% CPU usage (NoRTOS).
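For anyone comparing readings like these against raw ADC codes, here is a minimal sketch of the conversion. It assumes the internal temperature sensor formula given in the TM4C129x datasheet, TEMP = 147.5 - 75 * (VREFP - VREFN) * ADCCODE / 4096, and a 12-bit ADC result. The function name `internal_temp_c` and the 3.3 V reference are illustrative assumptions, not part of TivaWare or the original posts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a TivaWare API): convert a 12-bit ADC code from
 * the internal temperature sensor to degrees Celsius.
 * Assumes the datasheet formula TEMP = 147.5 - 75 * Vref * code / 4096,
 * where Vref = VREFP - VREFN in volts. */
double internal_temp_c(uint32_t adc_code, double vref_volts)
{
    assert(adc_code < 4096u);   /* a 12-bit conversion result is expected */
    return 147.5 - (75.0 * vref_volts * (double)adc_code) / 4096.0;
}
```

Since this reading comes from the MCU itself, it is only as trustworthy as the ADC and reference on the part being measured.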

    Notice that CPU usage dropped to 10-11% on the MCUs reporting 54 °C, which indicates the PLL was perhaps running much faster than 480 MHz and SYSCLK was well above 120 MHz. So the RTOS Clock_tick was running faster than its nominal 1000 us period and aborting tasks with an (if) directive.
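The reasoning in this post can be put into rough numbers. The sketch below assumes a deliberately naive model: the same workload costs the same number of cycles, so a drop in CPU load implies a proportionally faster clock, and a tick timer whose reload value was computed for the assumed clock fires proportionally sooner. Both function names are hypothetical, and the model ignores that the load measurement itself is timed by the same clock.

```c
#include <assert.h>
#include <stdint.h>

/* Naive estimate of actual/assumed clock ratio from a change in CPU load,
 * assuming identical work per second on both boards. */
double implied_clock_ratio(double baseline_load_pct, double observed_load_pct)
{
    assert(observed_load_pct > 0.0);
    return baseline_load_pct / observed_load_pct;
}

/* Real tick period when the timer reload was computed for assumed_hz but
 * the timer is actually clocked at actual_hz. */
uint32_t actual_tick_us(uint32_t nominal_tick_us,
                        uint32_t assumed_hz, uint32_t actual_hz)
{
    assert(actual_hz != 0u);
    return (uint32_t)(((uint64_t)nominal_tick_us * assumed_hz) / actual_hz);
}
```

With the figures quoted above (24% versus 10-11%), the implied ratio is roughly 2.2 to 2.4. Under the same model, a nominal 1000 us tick configured for 120 MHz would fire every 500 us if the clock were really 240 MHz.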

    Quote of the day; "Fool me once shame on you, fool me twice shame on me."  

    Seemingly this MCU silicon (RA2) was marginal to begin with, and loading RTOS somehow pushed it over the edge.

    Anyone up for an MCU extraction and replacement party? BYORP, "Bring your own reflow paste," LOL.

  • In reply to BP101:

    Long ago - famed "Tech Cookbook author" Don Lancaster advised - "Blame yourself FIRST - blame the IC LAST!"    Firm/I have LONG found this TRUE!

    Should you (really) seek to monitor MCU's temperature - via a (more correct, "MCU independent means") - a small, temperature sensor may be "thermally bonded" atop the MCU. (and monitored via separate means)

    As you know - it proves (always "unwise") to, "Employ the DUT* to make/provide its (own) - always suspect - measurements."

    * DUT (Device Under Test)

    Might your use of the "DUT" to "make & report" key findings, echo, "He who serves as his (own) attorney - has a "fool" for a client?"

  • Guru 27280 points

    In reply to cb1_mobile:

    Had expected to give green stamps to all PLL overclocking enthusiasts.

    Two in the hand is better than 2 in the burning bush. Never could get CPU% to match between all of them. Of 4 launch pads total (RA2 silicon), 2 behaved exactly alike in CPU% after POR, and the other 2 had the PLL overclocking right out of the static bag/box. Those 2 had a 10 °C higher MCU temperature and double the CPU%. In hindsight, the SYSCLK makes sense if the PLL register indicated MOSC/2 but the PLL was not actually dividing as indicated in the SYSCTL register.

    The question is perhaps one of the PLL divisor (errata SYSCTL #22) not being completely the same between all RA2 MCUs. TI Lisa's customer on this forum, with TM4C1294ENC RA2 silicon, may have been reporting the same issue, but it was more intermittent.

  • In reply to BP101:

    I'd "pray" - but (somehow) - have "forgotten the words."     Mr. Lancaster's - (very) correct "Tech" guideline - likely warrants further (i.e. some) consideration...

    BTW - my Xmas Tree lights (those not (yet) destroyed by shelter dog/kat) - continue to "glow brightly" - apparently "immune" to BP misfortune...

  • Guru 27280 points

    In reply to cb1_mobile:

    More seriously, does TI have any plans to deal with errata SYSCTL #22 at the silicon level, and if so, how? Otherwise, after connecting the serious overclocking dots, I am starting to wonder how an external 120 MHz OSC with the PLL bypassed can produce 25 MHz for EMAC0 use.

    The one MCU launch pad put back in its box last July still works. Seemingly the PLL will no longer divide (/2), as the MCU temp quickly reaches 60 °C and thus requires a heat sink to keep it from thermal runaway. If that doesn't describe an overclocking PLL, then my XMAS tree lights (all 700) don't require a wall switch dimmer either.

  • Guru 27280 points

    In reply to BP101:

    What's this, a kernel exploit that also affects devices with ARM processors? Not an overclocked PLL, but equally interesting.

    https://www.digitaltrends.com/computing/intel-cpus-suffer-bug-requiring-performance-reducing-fix/?utm_content=bufferac5da&utm_medium=socialm&utm_source=facebook.com&utm_campaign=DT-FB

  • In reply to BP101:

    Hi BP101,
    I am concerned with your findings. Use of the updated function SysCtlClockFreqSet() from TivaWare instead of ROM_SysCtlClockFreqSet() "should" have resolved erratum SYSCTL #22. I have not heard of reports of "over clocking" once the new function is used, but do have a valid case of "under clocking". Do you have any direct evidence of the actual system clock speed, such as UART, TIMER, PWM or CAN operating at the incorrect frequency? Or do you output a divided version of the system clock? Sorry, but it would be very helpful to know if the system is running at 240 MHz instead of the expected 120 MHz, or if it is running at some other frequency.

    Best Regards,
    Bob Crosby
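The kind of direct evidence requested here can be turned into a clock estimate with simple proportionality. The sketch below assumes a peripheral (for example a UART or a timer output) whose rate is derived by a fixed integer divider from the system clock, so that the measured rate scales linearly with the true clock. The function name `inferred_sysclk_hz` is hypothetical, and the approach does not apply if the peripheral is clocked from a separate source such as PIOSC.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate the true system clock from a peripheral rate measured on a scope
 * or logic analyzer. configured_rate_hz is what the software asked for,
 * assuming the clock was assumed_sysclk_hz. */
uint32_t inferred_sysclk_hz(uint32_t assumed_sysclk_hz,
                            uint32_t configured_rate_hz,
                            uint32_t measured_rate_hz)
{
    assert(configured_rate_hz != 0u);
    return (uint32_t)(((uint64_t)assumed_sysclk_hz * measured_rate_hz)
                      / configured_rate_hz);
}
```

For instance, a UART configured for 115200 baud at an assumed 120 MHz that measures 230400 baud on the wire would point to a 240 MHz system clock.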

  • Guru 27280 points

    In reply to Bob Crosby:

    Hi Bob,

    Think it was Todd or another TI engineer on this very forum who recently stated SYSCTL #22 is not consistent across all RA2 MCUs. The MOSC function calls affect TivaWare 2.1.2 and earlier, yet later libraries still seem to affect some but not all RA2 MCUs. Check again: SYSCTL #22 is a library-version-specific erratum, not that ROM loading affects the PLL.

    In earlier RTOS versions prior to 2.16.1.14, the PLL was not being divided by 2 even though RTOS was configured to do so. Not until a second SysCtlClockFreqSet() was added in (main.c) did the PLL divider actually get programmed prior to the Bios_start directive. Besides, the application has MAP_SysCtlClockFreqSet() in both RTOS and no-RTOS projects.

    These two RA2 launch pads were seemingly running a randomly throttled PLL out of the box; the PLL seems to lose lock then relock synchronously. The MCU temperature was typically 10 °C or more above the other two launch pads (purchased 8 months apart); all 4 launch pads have RA2 silicon. The (identifiable) trouble with both MCUs started after running the RTOS analyzer CPU load monitor and seeing the CPU load bounce between 20% and 90%, building meandering load graphs and eventually bus faulting (precise, 0x3). Confirmed in the debug register that SYSCTL PLL/2 was indeed programmed. I suspect the PLL in the RA2 silicon of certain MCUs becomes more damaged after loading RTOS and reprogramming the MOSC before the BIOS start command is asserted.

  • Guru 27280 points

    In reply to Bob Crosby:

    Bob Crosby
    have not heard of reports of "over clocking" once the new function is used, but do have a valid case of "under clocking".

    If the VCO is set to 480 MHz and the PLL is not being divided by 4 after invoking SysCtlClockFreqSet(), do we not end up with a 240 MHz SYSCLK?

    The odd thing is, after flashing RTOS and calling SysCtlClockFreqSet() in (main.c), I checked the debug register = PLL/2; in hindsight, would that not produce a 240 MHz SYSCLK? I became aware of the MCU overclocking only after loading RTOS, when debug ROV indicated the PLL was not being enabled. Yet the debug register PLL/2 seemed OK since the application configured MOSC in (main.c), not RTOS. Only after becoming aware that the PLL was not being enabled in RTOS did I notice that PLL/2 was not right; it should be PLL/4.
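The arithmetic behind the question in this post is a single division. The sketch below assumes the system clock is simply the VCO frequency divided by the effective PLL output divisor, and that the divisor read back from the debug register is the one actually in effect. The function name is illustrative only and not a TivaWare call.

```c
#include <assert.h>
#include <stdint.h>

/* SYSCLK as VCO frequency over the effective PLL output divisor.
 * 'divisor' is the actual divide ratio (2, 4, ...), not a raw register
 * field value. */
uint32_t sysclk_from_vco(uint32_t vco_hz, uint32_t divisor)
{
    assert(divisor != 0u);
    return vco_hz / divisor;
}
```

With a 480 MHz VCO, a divisor of 4 gives the expected 120 MHz, while a divisor of 2 gives 240 MHz, which is the overclocked case suspected above.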
