
M3 video crashed after a long time running

Other Parts Discussed in Thread: SYSBIOS

Hi all,

 The DVR RDK version is 3.5.

My process:

 

    I got this error message:

216247366:!!!SLAVE CORE [VIDEO-M3] DOWN!!!

SystemLink_copySlaveCoreExceptionContext:120

mmap of [0xbe9c0000:36864]

mmap virt addresss:0x400ab000

munmap of [0x400ab000:36864]

SystemLink_copySlaveCoreExceptionContext:127

SystemLink_handleSlaveCoreException:149

216247370:!!!SLAVE CORE DOWN!!!.EXCEPTION INFO DUMP

 !!HW EXCEPTION ACTIVE (0/1): [0]

 !!EXCEPTION CORE NAME      : [VIDEO-M3]

 !!EXCEPTION TASK NAME      : []

 !!EXCEPTION LOCATION       : [ti.sysbios.knl.Semaphore: line 204: ]

 !!EXCEPTION INFO           : [assertion failure: A_badContext: bad calling context. Must be called from a Task.]

 [m3video] MSGQ:Warning!! Forcing waitAck = TRUE as waitAck = FALSE is not supported.Fix send cmd [0x6000] to linkId [0x10000021

 [m3video]  216270247: SYSTEM: Opening MsgQ [VIDEO-M3_MSGQ] ...

 [m3video] !!!XDC RUNTIME ASSERT FAILED

 [m3video] xdc.runtime.Error @ ti.sysbios.knl.Semaphore: line 204:

 [m3video] assertion failure: A_badContext: bad calling context. Must be called from a Task.

 !!EXCEPTION CCS CRASH DUMP FORMAT FILE STORED @ ./CCS_CRASH_DUMP_VIDEO-M3.txt

SystemLink_handleSlaveCoreException:154

8688.CCS_CRASH_DUMP_VIDEO-M3.txt

How can I solve this problem?

 

  • Please also attach the contents of the /dvr_rdk/build/dvr_rdk/bin/ti816x-evm folder.

  • The crash sequence is as below:

    EncLink_codecPrdCalloutFcn

               -- System_sendLinkCmd

                        -- System_getSelfProcId

                        -- System_ipcMsgQSendMsg

                                      -- MessageQ_open

                                     -- Semaphore_pend

                                           --- Crash, because Semaphore_pend cannot be invoked from SWI context, and SWI is the context from which EncLink_codecPrdCalloutFcn is called.

    System_ipcMsgQSendMsg is invoked only when the condition

    if ((procId != System_getSelfProcId()) && (procId != SYSTEM_PROC_INVALID))

    is true. The check is in the function

    Int32 System_sendLinkCmd(UInt32 linkId, UInt32 cmd)

    in file /dvr_rdk/mcfw/src_bios6/links_common/system/system_linkApi.c

    System_getSelfProcId reads the selfProcId from a static variable and returns it.

    Based on the call flow, it looks like a bit flip in DDR memory corrupted that variable, so the check wrongly took the remote-processor path and caused the crash.
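To make the reasoning concrete, here is a minimal host-side sketch of the routing check. It is not the MCFW source: `gSelfProcId`, `routeFor`, `flipSelfProcIdBit`, the value 1 for the VIDEO-M3 id, and the value of `SYSTEM_PROC_INVALID` are all assumptions made for illustration. It only shows that a single flipped bit in the stored id is enough to send a local command down the MessageQ path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in; the real value is defined in the RDK headers. */
#define SYSTEM_PROC_INVALID 0xFFFFu

typedef enum { ROUTE_LOCAL, ROUTE_IPC_MSGQ } Route;

/* Models the static variable returned by System_getSelfProcId().
 * The id 1 for VIDEO-M3 is an assumption of this sketch. */
static uint32_t gSelfProcId = 1u;

uint32_t getSelfProcId(void)
{
    return gSelfProcId;
}

/* Same test as the one quoted from System_sendLinkCmd. */
Route routeFor(uint32_t procId)
{
    if ((procId != getSelfProcId()) && (procId != SYSTEM_PROC_INVALID)) {
        /* Real code would go on to MessageQ_open and Semaphore_pend here. */
        return ROUTE_IPC_MSGQ;
    }
    return ROUTE_LOCAL;
}

/* Flip one bit of the stored id, as a DDR bit error might. */
void flipSelfProcIdBit(unsigned bit)
{
    assert(bit < 32u);
    gSelfProcId ^= (1u << bit);
}
```

With the id intact, `routeFor(1u)` stays local. After `flipSelfProcIdBit(1)` the stored id no longer matches, the same call is routed to the MessageQ path, and in SWI context that path ends in the `A_badContext` assert.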

    Are you always seeing the same assert, or do you see random failures? I would expect random failures if it is a DDR memory bit flip.

    If you have not made any changes in the MCFW links, it looks like there is a board issue which is causing DDR memory corruption.

    -- Make sure all the steps in http://processors.wiki.ti.com/index.php/DM816x_C6A816x_AM389x_DDR3_Init are followed and you have done byte-wise s/w leveling correctly.

    -- Check if reducing the DDR frequency improves stability.

    -- Check there is no voltage ripple on DDR 1V constant supply.

    -- Do a DDR memory stress test for at least a couple of days on the entire memory range.
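As a rough illustration of the kind of pattern test the last step refers to, here is a minimal sketch in C. It is not a TI tool; the function names and the pattern set are assumptions. A real stress run would cover the full physical DDR range (for example from u-boot or with a dedicated memory tester) and loop for days, whereas this only exercises whatever buffer it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write pattern XOR index to every word, then read back and count mismatches.
 * XOR-ing with the index makes each word distinct, which also helps catch
 * address-line faults. */
size_t memPatternTest(volatile uint32_t *buf, size_t words, uint32_t pattern)
{
    size_t errors = 0;
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < words; i++) {
        buf[i] = pattern ^ (uint32_t)i;
    }
    for (i = 0; i < words; i++) {
        if (buf[i] != (pattern ^ (uint32_t)i)) {
            errors++;
        }
    }
    return errors;
}

/* One pass over a few classic patterns and their complements.
 * Returns the total number of mismatching words. */
size_t memStressOnce(volatile uint32_t *buf, size_t words)
{
    static const uint32_t patterns[] = {
        0x00000000u, 0x55555555u, 0x0F0F0F0Fu, 0x00FF00FFu
    };
    size_t errors = 0;
    size_t p;

    for (p = 0; p < sizeof patterns / sizeof patterns[0]; p++) {
        errors += memPatternTest(buf, words, patterns[p]);
        errors += memPatternTest(buf, words, ~patterns[p]);
    }
    return errors;
}
```

Calling `memStressOnce` in a loop and logging any non-zero return, together with the DDR frequency in use, gives a simple way to compare stability between frequency settings.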

     

     

     

     

  • 1. We have not changed the MCFW links. For the steps in http://processors.wiki.ti.com/index.php/DM816x_C6A816x_AM389x_DDR3_Init, we use the default macro values.

    2. Our DDR frequency is 796 MHz.

    3. The voltage of the DDR constant supply is 0.75 V.

    4. The test records are as follows:

    2013-4-2:

    5556.4-2.CCS_CRASH_DUMP_VIDEO-M3.txt

    3362.4-2-M3videoCrashPrint.txt
    [m3video] Unhandled Exception:
    [m3video] Exception occurred in ThreadType_Task
    [m3video] handle: 0x9cca054c.
    [m3video] stack base: 0x3d989440.
    [m3video] stack size: 0x4000.
    [m3video] R0 = 0x00000000 R8 = 0xffffffff
    [m3video] R1 = 0x10000000 R9 = 0xffffffff
    [m3video] R2 = 0x00000001 R10 = 0xffffffff
    [m3video] R3 = 0x00000001 R11 = 0xffffffff
    [m3video] R4 = 0x00000001 R12 = 0x00000001
    [m3video] R5 = 0x9cbf5120 SP(R13) = 0x3d996150
    [m3video] R6 = 0x00000000 LR(R14) = 0xfffffffd
    [m3video] R7 = 0x000000b1 PC(R15) = 0x9cc5421a
    [m3video] PSR = 0x0100000e
    [m3video] ICSR = 0x00426003
    [m3video] MMFSR = 0x00
    [m3video] BFSR = 0x00
    [m3video] UFSR = 0x0001
    [m3video] HFSR = 0x40000000

    2013-4-9:

    0878.4-9-.CCS_CRASH_DUMP_VIDEO-M3.txt

    7750.4-9.M3videoCrashPrint.txt
    12815240:!!!SLAVE CORE [VIDEO-M3] DOWN!!!
    SystemLink_copySlaveCoreExceptionContext:120
    mmap of [0xbe9c0000:36864]
    mmap virt addresss:0x40006000
    munmap of [0x40006000:36864]
    SystemLink_copySlaveCoreExceptionContext:127
    SystemLink_handleSlaveCoreException:149
    12815244:!!!SLAVE CORE DOWN!!!.EXCEPTION INFO DUMP
    !!HW EXCEPTION ACTIVE (0/1): [1]
    !!EXCEPTION CORE NAME : [VIDEO-M3]
    !!EXCEPTION TASK NAME : [IPC_BITS_OUT0]
    !!EXCEPTION LOCATION : []
    !!EXCEPTION INFO : [H/W EXCEPTION]

  • Sorry, the working voltage of our DDR constant supply is 1.5 V.

  • Hi all,

        The three crashes have one thing in common: each log contains "mmap of [0xbe9c0000:".

        I think 0xbe9c0000 is a physical address, but I don't know exactly what it means.

        I guess this address is the source of the problem.

  • Please check the following in dvr_rdk/build/dvr_rdk/bin/ti816x-evm/dvr_rdk_m3video_release.xem3.map

    MEMORY CONFIGURATION

             name            origin    length      used     unused   attr    fill
    ----------------------  --------  ---------  --------  --------  ----  --------
      L2_ROM                00000000   00004000  0000018c  00003e74  RWIX
    ...
      VIDEO_M3_EXCEPTION_CT be9c0000   00020000  000088bc  00017744  RWIX
    ...

    The location you mentioned is used for saving core dump data when an exception happens.
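The numbers in the log and the map line up, which can be checked with a little arithmetic. The sketch below assumes a 4096-byte page size on the A8 side and uses hypothetical helper names. Rounding the region's used size (0x88bc) up to a whole page gives 36864, the mmap length printed in the logs, and the mapped range falls inside VIDEO_M3_EXCEPTION_CT. That the A8 code sizes the mapping this way is an inference from the numbers, not something confirmed from the source.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed Linux page size on the A8 */

/* Round a byte count up to the next multiple of the page size. */
uint32_t roundUpToPage(uint32_t bytes)
{
    return (bytes + PAGE_SIZE_BYTES - 1u) & ~(PAGE_SIZE_BYTES - 1u);
}

/* Return 1 if [addr, addr + len) lies entirely inside
 * [origin, origin + regionLen), else 0. Written to avoid overflow. */
int inRegion(uint32_t addr, uint32_t len, uint32_t origin, uint32_t regionLen)
{
    assert(regionLen != 0u);
    return (addr >= origin) &&
           (len <= regionLen) &&
           ((addr - origin) <= (regionLen - len));
}
```

For example, `roundUpToPage(0x88bcu)` yields 36864, and `inRegion(0xbe9c0000u, 36864u, 0xbe9c0000u, 0x20000u)` is true for the region shown in the map file.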

     

  • Sorry, the above path is for RDK 3.0.

    For RDK 3.5, it should be ./dvr_rdk/build/dvr_rdk/bin/ti816x-evm/dvr_rdk_m3video_release_1024M_256M.xem3.map

     

  • The address is the exception context address and is unrelated to the actual crash. When an exception occurs, processor state such as register values is copied to a shared memory location, which is mmapped on the A8 side and printed. Every exception prints an mmap of the same address, so it says nothing about the reason for the crash.

    Both crashes are due to a jump to an invalid PC value. Considering that the previous crash was also due to memory corruption in DDR, all of them seem to be related to a DDR issue. This can be a board issue, in which case you will have to contact your FAE to get your board layout reviewed.

    Also, have you done byte-wise s/w leveling? You cannot just use the macro values on your board. You should do the s/w leveling procedure and modify u-boot to use the DQS etc. values you obtain.

    Also try with AVS disabled and a constant 1.05 V supply to eliminate any issue due to board power supply ripple. Have you tried running DDR at a lower frequency? Is there an improvement in stability?
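The register dump posted on 2013-4-2 can be read against the Cortex-M3 fault status bit definitions. The sketch below uses the architectural bit positions for UFSR, HFSR and ICSR; the helper names are hypothetical. For UFSR = 0x0001, HFSR = 0x40000000 and ICSR = 0x00426003 it reports an undefined-instruction usage fault that was escalated to HardFault (active exception number 3), which fits the core executing from a bad or corrupted location.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* UsageFault Status Register bits (ARMv7-M). */
#define UFSR_UNDEFINSTR   (1u << 0)
#define UFSR_INVSTATE     (1u << 1)
#define UFSR_INVPC        (1u << 2)
#define UFSR_NOCP         (1u << 3)
#define UFSR_UNALIGNED    (1u << 8)
#define UFSR_DIVBYZERO    (1u << 9)
#define UFSR_DEFINED_BITS 0x030Fu

/* HardFault Status Register bits. */
#define HFSR_VECTTBL      (1u << 1)
#define HFSR_FORCED       (1u << 30)

/* Return 1 if the given flag is set in the UFSR value. */
int ufsrHas(uint16_t ufsr, uint32_t flag)
{
    return (ufsr & flag) != 0u;
}

/* Name the first usage-fault cause recorded in UFSR. */
const char *ufsrCause(uint16_t ufsr)
{
    assert((ufsr & ~UFSR_DEFINED_BITS) == 0u);  /* reserved bits read as zero */
    if (ufsr & UFSR_UNDEFINSTR) return "UNDEFINSTR: undefined instruction";
    if (ufsr & UFSR_INVSTATE)   return "INVSTATE: invalid EPSR state";
    if (ufsr & UFSR_INVPC)      return "INVPC: invalid PC load on exception return";
    if (ufsr & UFSR_NOCP)       return "NOCP: coprocessor access";
    if (ufsr & UFSR_UNALIGNED)  return "UNALIGNED: unaligned access";
    if (ufsr & UFSR_DIVBYZERO)  return "DIVBYZERO: divide by zero";
    return "no usage fault recorded";
}

/* FORCED means a configurable fault was escalated to HardFault. */
int hfsrIsForced(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0u;
}

/* ICSR.VECTACTIVE (bits 8:0) is the active exception number; 3 is HardFault. */
unsigned icsrVectActive(uint32_t icsr)
{
    return (unsigned)(icsr & 0x1FFu);
}

void printFaultSummary(uint16_t ufsr, uint32_t hfsr, uint32_t icsr)
{
    printf("UFSR: %s\n", ufsrCause(ufsr));
    printf("HFSR: %s\n", hfsrIsForced(hfsr) ? "FORCED (escalated fault)" : "not forced");
    printf("ICSR: active exception %u\n", icsrVectActive(icsr));
}
```

Calling `printFaultSummary(0x0001u, 0x40000000u, 0x00426003u)` summarizes the 2013-4-2 dump. The 2013-4-9 log does not include these registers, so it cannot be decoded the same way from what was posted.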

  • Our DDR3 frequency is 1333, but we overclocked it to 1600.

    I've tested at DDR 800 with no problems.

    Now I'm testing DDR 1066 and 1350.