AM620-Q1: ECC TCM self‑test failed

Part Number: AM620-Q1

We deployed SDL on our project.
Regarding the ECC TCM self‑test function, a power‑on self‑test failure rate of approximately 10% has been observed in specific individual LiDAR units.
Our LiDAR project has just entered mass production, and we have not yet received any large-scale customer feedback.
We require support from TI to investigate the failure mode of this individual LiDAR unit, so as to determine whether the issue stems from individual hardware defects or other root causes.
The chip type is AM6204BTGFHIALWRQ1.
The attached logs contain ten failure logs and one normal log from this failing unit.
  • Dear Zheng,

    Could you please check the following?

    1. Which MCU+ SDK / Linux SDK version are you using? I believe it is 10.1; please correct me if that is wrong.

    2. We saw the error log in the attachment. Please also provide a log from a good (passing) run.

    3. Please fill in this table for the problem silicon.

    SEYOND_AM6204_ECC_TCM_RETURN_ISSUE.xlsx

    thanks a lot!

    yong

  • Hi Zheng,

    According to the source code, the log message "UC-2: Got Low priority ESM Interrupt" should appear in the log. Since it does not, I suspect that the R5F crashed before it could log the message.

    I am not sure whether 2-bit error injection is necessary for your project. To my understanding, the system cannot automatically recover from 2-bit errors.

    Here are my suggestions:

    1. Remove UC-1 and try again.
    2. Connect to CCS to check the context for further debugging.

    Regards,

    Linjun

  • Could we provide a patch to recover the context after the 2-bit TCM error injection test?

  • Hi Wang,

    Can you connect the debugger and let me know where the R5F core is stuck in the failure case? Is the core actually stuck, or does the test fail while the application continues to run?

    Regards,

    Nihar Potturu

  • /*
     *  Copyright (C) 2018-2024 Texas Instruments Incorporated
     *
     *  Redistribution and use in source and binary forms, with or without
     *  modification, are permitted provided that the following conditions
     *  are met:
     *
     *    Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     *    Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the
     *    distribution.
     *
     *    Neither the name of Texas Instruments Incorporated nor the names of
     *    its contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
     *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
     *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
     *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
     *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
     *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
     *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     */
    
    #include <stdlib.h>
    #include <kernel/dpl/DebugP.h>
    #include <kernel/dpl/ClockP.h>
    #include "ti_drivers_config.h"
    #include "ti_board_config.h"
    #include "ti_drivers_open_close.h"
    #include "ti_board_open_close.h"
    #include "FreeRTOS.h"
    #include "task.h"
    #include <drivers/device_manager/sciserver/sciserver_init.h>
    #include "dcc_uc1.h"
    #include "ecc_main.h"
    #include "ecc_app_main.h"
    
    #define TASK_PRI_MAIN_THREAD  (2)
    #define TASK_SIZE (16384U/sizeof(configSTACK_DEPTH_TYPE))
    
    #define RESET_REG_ADDR 0x43018178
    
    StackType_t gMainTaskStack[TASK_SIZE] __attribute__((aligned(32)));
    StaticTask_t gMainTaskObj;
    TaskHandle_t gMainTask;
    DM_LPMData_t gDMLPMData __attribute__((section(".lpm_data"), aligned(4)));
    
    
    uint8_t  g_u8SdlFaultIndex = 0U;
    int32_t rom_checksum_test_main(void *args);
    void ipc_rpmsg_echo_main(void *args);
    
    
    StackType_t gR5SelfTestStack[TASK_SIZE] __attribute__((aligned(32)));
    StaticTask_t gR5SelfTestTaskObj;
    TaskHandle_t gR5SelfTestTask;
    
    void r5_selftest_thread(void *args)
    {
        int32_t status = SystemP_SUCCESS;
        uint32_t u32ResetRegStatus = 0U;
    
        u32ResetRegStatus = CSL_REG32_RD(RESET_REG_ADDR);
        DebugP_log("ResetReg:0x%08x\r\n", u32ResetRegStatus);
        if (u32ResetRegStatus == 0U)
        {
            /*Bit0 ROM Checksum status */
            status = rom_checksum_test_main(NULL);
            if(status != SystemP_SUCCESS)
            {
                g_u8SdlFaultIndex |= (1 << 0);
            }
    
            /*Bit1 DCC Test status */
            status = dcc_test_main();
            if(status != SystemP_SUCCESS)
            {
                g_u8SdlFaultIndex |= (1 << 1);
            }
    
            /*Bit2 ECC TCM Test status */
            status = ecc_tcm_main();
            if(status != SystemP_SUCCESS)
            {
                g_u8SdlFaultIndex |= (1 << 2);
            }
    
            /*Bit3 ECC RAM Test status */
            status = ecc_app_main();
            if(status != SystemP_SUCCESS)
            {
                g_u8SdlFaultIndex |= (1 << 3);
            }
        }
        else
        {
            g_u8SdlFaultIndex = 0U;
        }
        DebugP_log("u8SdlFaultIndex:0x%02x\r\n", g_u8SdlFaultIndex);
        vTaskDelete(NULL);
    }
    
    void main_thread(void *args)
    {
        int32_t status = SystemP_SUCCESS;
    
        /* Open drivers */
        Drivers_open();
        /* Open flash and board drivers */
        status = Board_driversOpen();
        DebugP_assert(status==SystemP_SUCCESS);
        DebugP_log("Enter into R5 App 20260312\n\r");
    
        /* Init LPM specific data */
        Sciclient_initDeviceManagerLPMData(&gDMLPMData);
        sciServer_init();
    
        gR5SelfTestTask = xTaskCreateStatic( r5_selftest_thread,   /* Pointer to the function that implements the task. */
                                       "r5_selftest_thread", /* Text name for the task.  This is to facilitate debugging only. */
                                       TASK_SIZE,  /* Stack depth in units of StackType_t typically uint32_t on 32b CPUs */
                                       NULL,            /* We are not using the task parameter. */
                                       TASK_PRI_MAIN_THREAD,   /* task priority, 0 is lowest priority, configMAX_PRIORITIES-1 is highest */
                                       gR5SelfTestStack,  /* pointer to stack base */
                                       &gR5SelfTestTaskObj ); /* pointer to statically allocated task object memory */
    
        ipc_rpmsg_echo_main(NULL);
    
        /* Close board and flash drivers */
        Board_driversClose();
    
        vTaskDelete(NULL);
    }
    
    
    int main()
    {
        /* init SOC specific modules */
        System_init();
        Board_init();
    
        gMainTask = xTaskCreateStatic( main_thread,   /* Pointer to the function that implements the task. */
                                      "main_thread", /* Text name for the task.  This is to facilitate debugging only. */
                                      TASK_SIZE,  /* Stack depth in units of StackType_t typically uint32_t on 32b CPUs */
                                      NULL,            /* We are not using the task parameter. */
                                      TASK_PRI_MAIN_THREAD,   /* task priority, 0 is lowest priority, configMAX_PRIORITIES-1 is highest */
                                      gMainTaskStack,  /* pointer to stack base */
                                      &gMainTaskObj ); /* pointer to statically allocated task object memory */
        configASSERT(gMainTask != NULL);
    
        /* Start the scheduler to start the tasks executing. */
        vTaskStartScheduler();
    
        /* The following line should never be reached because vTaskStartScheduler()
        will only return if there was not enough FreeRTOS heap memory available to
        create the Idle and (if configured) Timer tasks.  Heap management, and
        techniques for trapping heap exhaustion, are described in the book text. */
        DebugP_assertNoLog(0);
    
        return 0;
    }
    

    The main.c is shown above.

    The failure log shows that main_thread is still running, but r5_selftest_thread appears to be stuck.

    I say this because the log never shows the output of DebugP_log("\r\nESM_ECC_Example_run: UC-1 has failed...\r\n");
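
For reading the "u8SdlFaultIndex" line in a log from a run that does complete, the bit assignments in r5_selftest_thread() can be decoded as in the minimal sketch below. The macro and function names here are hypothetical and are not part of SDL or the MCU+ SDK; only the bit positions are taken from the posted main.c. Note that in the stuck case described above this line is never printed, so the sketch only helps for runs where every test returns.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions copied from r5_selftest_thread() in the posted main.c.
 * The macro names themselves are illustrative only. */
#define SDL_FAULT_ROM_CHECKSUM  (1U << 0)
#define SDL_FAULT_DCC           (1U << 1)
#define SDL_FAULT_ECC_TCM       (1U << 2)
#define SDL_FAULT_ECC_RAM       (1U << 3)

/* Hypothetical helper: returns 1 if the given fault bit is set, else 0. */
int sdl_fault_is_set(uint8_t faultIndex, uint8_t mask)
{
    return ((faultIndex & mask) != 0U) ? 1 : 0;
}

/* Hypothetical helper: counts how many of the four self-tests failed. */
unsigned sdl_fault_count(uint8_t faultIndex)
{
    unsigned count = 0U;
    for (unsigned bit = 0U; bit < 4U; bit++) {
        if (((unsigned)faultIndex >> bit) & 1U) {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: prints one line per failed self-test. */
void sdl_fault_print(uint8_t faultIndex)
{
    static const char *const names[4] = {
        "ROM checksum", "DCC", "ECC TCM", "ECC RAM"
    };
    for (unsigned bit = 0U; bit < 4U; bit++) {
        if (((unsigned)faultIndex >> bit) & 1U) {
            printf("self-test failed: %s (bit %u)\n", names[bit], bit);
        }
    }
}
```

For example, a logged value of 0x04 would mean that only the ECC TCM test returned a failure, and 0x00 would mean that all four tests returned SystemP_SUCCESS or that the tests were skipped because the reset register was non-zero.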

  • Dear Mr. Wang,

    After aligning with Nihar: UC-1 does not corrupt TCM data directly; it only modifies the ECC aggregator register. So my previous assumption was not correct.

    I also dumped the R5FSS0_COMMON0_EVNT_BUS_VBUSP_MMRS register on the EVM before and after executing UC-1 and UC-2, and no difference was found.

    For further debugging, a CCS connection is needed, as Nihar suggested. Thanks.

    Linjun

  • Yesterday I removed only the ECC TCM self-test (B0TCM0 Bank0 double-bit error injection test), kept everything else the same, and ran a 1869-cycle power-on test.

    No failures occurred.

    The failure may therefore be related to this test.
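
As a rough check on how strong this evidence is, the sketch below estimates the chance of seeing 1869 consecutive passes if the roughly 10% failure rate were still present. It assumes that the test was run on the same failing unit and that power-on failures are independent from boot to boot; both are assumptions, not facts established in this thread. The function name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: probability that every one of `cycles` independent
 * power-on cycles passes, given a per-cycle failure probability `failRate`.
 * Computed by repeated multiplication, so no math library is needed. */
double prob_all_pass(double failRate, unsigned cycles)
{
    double p = 1.0;
    for (unsigned i = 0U; i < cycles; i++) {
        p *= (1.0 - failRate);
    }
    return p;
}

/* Hypothetical helper: prints the estimate for the numbers in this thread. */
void print_thread_estimate(void)
{
    printf("P(1869 passes at 10%% failure rate) = %g\n",
           prob_all_pass(0.10, 1869U));
}
```

Under those assumptions the result is far below 1e-80, so a clean 1869-cycle run is very unlikely to be a coincidence. This supports the link to the double-bit injection test, but it does not by itself say whether the root cause is in the silicon or in the test sequence.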