M3 video crashed after a long time running

Michael Chen

Prodigy 190 points

Other Parts Discussed in Thread: SYSBIOS

Hi all,

The DVR RDK version is 3.5.

My process:

I got this error message:

216247366:!!!SLAVE CORE [VIDEO-M3] DOWN!!!
SystemLink_copySlaveCoreExceptionContext:120
mmap of [0xbe9c0000:36864]
mmap virt addresss:0x400ab000
munmap of [0x400ab000:36864]
SystemLink_copySlaveCoreExceptionContext:127
SystemLink_handleSlaveCoreException:149

216247370:!!!SLAVE CORE DOWN!!!.EXCEPTION INFO DUMP

!!HW EXCEPTION ACTIVE (0/1): [0]

!!EXCEPTION CORE NAME : [VIDEO-M3]

!!EXCEPTION TASK NAME : []

!!EXCEPTION LOCATION : [ti.sysbios.knl.Semaphore: line 204: ]

!!EXCEPTION INFO : [assertion failure: A_badContext: bad calling context. Must be called from a Task.]
[m3video] MSGQ:Warning!! Forcing waitAck = TRUE as waitAck = FALSE is not supported.Fix send cmd [0x6000] to linkId [0x10000021
[m3video] 216270247: SYSTEM: Opening MsgQ [VIDEO-M3_MSGQ] ...
[m3video] !!!XDC RUNTIME ASSERT FAILED
[m3video] xdc.runtime.Error @ ti.sysbios.knl.Semaphore: line 204:
[m3video] assertion failure: A_badContext: bad calling context. Must be called from a Task.

!!EXCEPTION CCS CRASH DUMP FORMAT FILE STORED @ ./CCS_CRASH_DUMP_VIDEO-M3.txt
SystemLink_handleSlaveCoreException:154

8688.CCS_CRASH_DUMP_VIDEO-M3.txt

How can I solve this problem?

over 12 years ago

0 Badri Narayanan over 12 years ago

TI__Guru 59700 points

Please also attach the contents of the /dvr_rdk/build/dvr_rdk/bin/ti816x-evm folder.

0 Michael Chen over 12 years ago in reply to Badri Narayanan

Prodigy 190 points

0116.ti816x-evm.rar

0 Badri Narayanan over 12 years ago in reply to Michael Chen

TI__Guru 59700 points

The crash sequence is as below:

EncLink_codecPrdCalloutFcn

-- System_sendLinkCmd

-- System_getSelfProcId

-- System_ipcMsgQSendMsg

-- MessageQ_open

-- Semaphore_pend

--- Crash, because Semaphore_pend cannot be invoked from SWI context, which is the context in which EncLink_codecPrdCalloutFcn is called.
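The rule behind the assert can be modeled on a host machine. This is a minimal sketch and not SYS/BIOS code: the `CallContext` enum and `blocking_pend_allowed` are hypothetical names that only mirror the message "Must be called from a Task".

```c
#include <stdbool.h>

/* Hypothetical model of the calling contexts named in this thread. */
typedef enum {
    CTX_HWI,   /* hardware interrupt */
    CTX_SWI,   /* software interrupt, e.g. a periodic callout */
    CTX_TASK   /* a schedulable task that is allowed to block */
} CallContext;

/* A blocking pend suspends the caller until the semaphore is posted.
 * Only a task can be suspended and resumed later, so the model rejects
 * every other context, as the A_badContext assert does. */
bool blocking_pend_allowed(CallContext ctx)
{
    return ctx == CTX_TASK;
}
```

In this model `blocking_pend_allowed(CTX_SWI)` is false, which corresponds to the callout reaching Semaphore_pend from SWI context.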

System_ipcMsgQSendMsg is invoked because the condition

if ((procId != System_getSelfProcId()) && (procId != SYSTEM_PROC_INVALID))

evaluates to true in the function

Int32 System_sendLinkCmd(UInt32 linkId, UInt32 cmd)

in the file /dvr_rdk/mcfw/src_bios6/links_common/system/system_linkApi.c

System_getSelfProcId reads the selfProcId from a static variable and returns it.

Based on the call flow it looks like there is a bit flip in DDR memory causing the crash.
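To illustrate why a single flipped bit is enough to take the wrong branch, here is a small host-side sketch. The processor ID values and the function names are hypothetical (the real ones live in the RDK headers); only the shape of the condition is taken from the code quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical processor IDs; the real values live in the RDK headers. */
#define PROC_SELF     1u
#define PROC_INVALID  0xFFFFu

/* Mirrors the shape of the quoted condition: the command goes over IPC
 * only when the target is a valid processor other than this one. */
bool takes_remote_path(uint32_t target_proc, uint32_t stored_self_proc)
{
    return (target_proc != stored_self_proc) && (target_proc != PROC_INVALID);
}

/* Simulate a single-bit upset in a word stored in memory. */
uint32_t flip_bit(uint32_t value, unsigned bit)
{
    return value ^ (UINT32_C(1) << bit);
}
```

With an intact self ID, `takes_remote_path(PROC_SELF, PROC_SELF)` is false and the command stays local. After `flip_bit(PROC_SELF, 3)` corrupts the stored ID, the same call returns true and the command is routed to the IPC path, which ends in MessageQ_open and Semaphore_pend. A flipped bit in the target ID would have the same effect.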

Are you always seeing the same assert, or do you see random failures? I would expect random failures if it is a DDR memory bit flip.

If you have not made any changes in the mcfw links, it looks like there is a board issue that is causing DDR memory corruption.

-- Make sure all the steps in http://processors.wiki.ti.com/index.php/DM816x_C6A816x_AM389x_DDR3_Init are followed and that you have done byte-wise s/w leveling correctly.

-- Check if reducing the DDR frequency improves stability.

-- Check that there is no voltage ripple on the DDR 1V constant supply.

-- Do a DDR memory stress test for at least a couple of days on the entire memory range.
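As a rough idea of what such a stress test does, here is a minimal pattern test in C. It is a sketch under assumptions, not a TI tool: the function names are hypothetical, and on the target it would have to run over a DDR range that nothing else is using, with caching arranged so that the accesses really reach DDR.

```c
#include <stddef.h>
#include <stdint.h>

/* Write a pattern derived from each word's index, then read it back.
 * Returns the number of mismatching words (0 means the pass was clean).
 * `seed` changes the pattern between passes so that stuck bits of
 * either polarity get exercised. */
size_t ddr_pattern_pass(volatile uint32_t *base, size_t words, uint32_t seed)
{
    size_t errors = 0;

    for (size_t i = 0; i < words; i++) {
        base[i] = (uint32_t)i ^ seed;
    }
    for (size_t i = 0; i < words; i++) {
        if (base[i] != ((uint32_t)i ^ seed)) {
            errors++;
        }
    }
    return errors;
}

/* Run four complementary patterns over the same range, `passes` times. */
size_t ddr_stress(volatile uint32_t *base, size_t words, unsigned passes)
{
    size_t errors = 0;

    for (unsigned p = 0; p < passes; p++) {
        errors += ddr_pattern_pass(base, words, 0x00000000u);
        errors += ddr_pattern_pass(base, words, 0xFFFFFFFFu);
        errors += ddr_pattern_pass(base, words, 0xAAAAAAAAu);
        errors += ddr_pattern_pass(base, words, 0x55555555u);
    }
    return errors;
}
```

A marginal board may only fail after hours, which is why the test should loop for days and log every non-zero error count together with the time it occurred.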

0 Michael Chen over 12 years ago in reply to Badri Narayanan

Prodigy 190 points

1. We have not changed the MCFW links. For the steps in http://processors.wiki.ti.com/index.php/DM816x_C6A816x_AM389x_DDR3_Init, we use the default macro values.

2. Our DDR frequency is 796 MHz.

3. The voltage of DDR constant supply is 0.75V

4.The test records are as follows:

2013-4-2:

5556.4-2.CCS_CRASH_DUMP_VIDEO-M3.txt

3362.4-2-M3videoCrashPrint.txt

 [m3video] Unhandled Exception:
 [m3video] Exception occurred in ThreadType_Task
 [m3video] handle: 0x9cca054c.
 [m3video] stack base: 0x3d989440.
 [m3video] stack size: 0x4000.
 [m3video] R0 = 0x00000000  R8  = 0xffffffff
 [m3video] R1 = 0x10000000  R9  = 0xffffffff
 [m3video] R2 = 0x00000001  R10 = 0xffffffff
 [m3video] R3 = 0x00000001  R11 = 0xffffffff
 [m3video] R4 = 0x00000001  R12 = 0x00000001
 [m3video] R5 = 0x9cbf5120  SP(R13) = 0x3d996150
 [m3video] R6 = 0x00000000  LR(R14) = 0xfffffffd
 [m3video] R7 = 0x000000b1  PC(R15) = 0x9cc5421a
 [m3video] PSR = 0x0100000e
 [m3video] ICSR = 0x00426003
 [m3video] MMFSR = 0x00
 [m3video] BFSR = 0x00
 [m3video] UFSR = 0x0001
 [m3video] HFSR = 0x40000000
 [m3video] DFSR = 0x00000000
 [m3video] MMAR = 0xe000ed34
 [m3video] BFAR = 0xe000ed38
 [m3video] AFSR = 0x00000000
 [m3video] Terminating Execution...


393042445:!!!SLAVE CORE [VIDEO-M3] DOWN!!!
SystemLink_copySlaveCoreExceptionContext:120
mmap of [0xbe9c0000:36864]
mmap virt addresss:0x4007f000
munmap of [0x4007f000:36864]
SystemLink_copySlaveCoreExceptionContext:127
SystemLink_handleSlaveCoreException:149


393042449:!!!SLAVE CORE DOWN!!!.EXCEPTION INFO DUMP

 !!HW EXCEPTION ACTIVE (0/1): [1]

 !!EXCEPTION CORE NAME      : [VIDEO-M3]

 !!EXCEPTION TASK NAME      : []

 !!EXCEPTION LOCATION       : []

 !!EXCEPTION INFO           : [H/W EXCEPTION]

 !!EXCEPTION CCS CRASH DUMP FORMAT FILE STORED @ ./CCS_CRASH_DUMP_VIDEO-M3.txt

2013-4-9:

0878.4-9-.CCS_CRASH_DUMP_VIDEO-M3.txt

7750.4-9.M3videoCrashPrint.txt

12815240:!!!SLAVE CORE [VIDEO-M3] DOWN!!!
SystemLink_copySlaveCoreExceptionContext:120
mmap of [0xbe9c0000:36864]
mmap virt addresss:0x40006000
munmap of [0x40006000:36864]
SystemLink_copySlaveCoreExceptionContext:127
SystemLink_handleSlaveCoreException:149

12815244:!!!SLAVE CORE DOWN!!!.EXCEPTION INFO DUMP

 !!HW EXCEPTION ACTIVE (0/1): [1]

 !!EXCEPTION CORE NAME      : [VIDEO-M3]

 !!EXCEPTION TASK NAME      : [IPC_BITS_OUT0]

 !!EXCEPTION LOCATION       : []

 !!EXCEPTION INFO           : [H/W EXCEPTION]
 [m3video] Unhandled Exception:
 [m3video] Exception occurred in ThreadType_Task
 [m3video] handle: 0x3cf21590.
 [m3video] stack base: 0x3d8fa3c0.
 [m3video] stack size: 0x8000.
 [m3video] R0 = 0x00010158  R8  = 0x00010158
 [m3video] R1 = 0x00000000  R9  = 0x3d8c95c8
 [m3video] R2 = 0x88888889  R10 = 0x3d8d0e34
 [m3video] R3 = 0xffffffff  R11 = 0x3d8cc608
 [m3video] R4 = 0xfffefea8  R12 = 0x9cc2ebf1
 [m3video] R5 = 0x00000000  SP(R13) = 0x3d9021b0
 [m3video] R6 = 0xbf40c5d8  LR(R14) = 0x0032302b
 [m3video] R7 = 0x00000001  PC(R15) = 0x9cc2ebfe
 [m3video] PSR = 0x81000200
 [m3video] ICSR = 0x0440f803
 [m3video] MMFSR = 0x00
 [m3video] BFSR = 0x00
 [m3video] UFSR = 0x0001
 [m3video] HFSR = 0x40000000
 [m3video] DFSR = 0x00000000
 [m3video] MMAR = 0xe000ed34
 [m3video] BFAR = 0xe000ed38
 [m3video] AFSR = 0x00000000
 [m3video] Terminating Execution...
 !!EXCEPTION CCS CRASH DUMP FORMAT FILE STORED @ ./CCS_CRASH_DUMP_VIDEO-M3.txt
SystemLink_handleSlaveCoreException:154

0 Michael Chen over 12 years ago in reply to Michael Chen

Prodigy 190 points

Sorry, the working voltage of our DDR constant supply is 1.5V.

0 lastshad0w over 12 years ago in reply to Michael Chen

Expert 1385 points

Hi all,

The three crashes have one thing in common: the logs all contain "mmap of [0xbe9c0000:".

I think 0xbe9c0000 is a physical address, but I don't know exactly what it means.

I guess this address is the source of the problem.

0 1330hayacool7102 over 12 years ago in reply to lastshad0w

Guru 10655 points

Please check the following in dvr_rdk/build/dvr_rdk/bin/ti816x-evm/dvr_rdk_m3video_release.xem3.map

MEMORY CONFIGURATION

         name            origin    length      used     unused   attr    fill
---------------------- -------- --------- -------- -------- ---- --------
L2_ROM                00000000   00004000 0000018c 00003e74 RWIX
...
VIDEO_M3_EXCEPTION_CT be9c0000   00020000 000088bc 00017744 RWIX
...

The location you mentioned is used for saving core dump data when an exception happens.
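The numbers in the log and in the map entry fit together. The used size of VIDEO_M3_EXCEPTION_CT is 0x88bc bytes, and rounding that up to a 4 KiB page gives 36864, the length shown in the mmap lines. Whether the A8 side really computes the length this way is an assumption, as is the 4 KiB page size. The helper names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed Linux page size on the A8 */

/* Round a length up to the next multiple of the page size. */
uint32_t round_up_to_page(uint32_t len)
{
    return (len + PAGE_SIZE_BYTES - 1u) & ~(PAGE_SIZE_BYTES - 1u);
}

/* True when [addr, addr + len) lies inside [origin, origin + length). */
bool range_in_section(uint32_t addr, uint32_t len,
                      uint32_t origin, uint32_t length)
{
    return addr >= origin && len <= length && (addr - origin) <= (length - len);
}
```

`round_up_to_page(0x88bc)` yields 36864, and `range_in_section(0xbe9c0000, 36864, 0xbe9c0000, 0x20000)` is true, so the mapped window lies entirely inside the exception context section.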

0 1330hayacool7102 over 12 years ago in reply to 1330hayacool7102

Guru 10655 points

Sorry, the above path is for RDK 3.0.

For RDK 3.5, it should be ./dvr_rdk/build/dvr_rdk/bin/ti816x-evm/dvr_rdk_m3video_release_1024M_256M.xem3.map

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

That address is the exception context area and is unrelated to the actual crash. When an exception occurs, the processor state (register values and so on) is copied to a shared memory location, which is mmapped on the A8 side and printed. Every exception prints an mmap of the same address, so it says nothing about the crash reason.

Both crashes are due to jumps to invalid PC values. Considering that the previous crash was also due to memory corruption in DDR, all of them seem to be related to a DDR issue. This can be a board issue, in which case you will have to contact your FAE to get your board layout reviewed.

Also, have you done byte-wise s/w leveling? You cannot just use the macro values on your board. You should do the s/w leveling procedure and modify U-Boot to use the obtained DQS and other values.

Also try with AVS disabled and a constant 1.05 V supply to eliminate any issue due to board power supply ripple. Have you tried running DDR at a lower frequency? Is there an improvement in stability?
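For readers who want to check the register dumps themselves, here is a small decoder sketch. The bit positions follow the ARMv7-M fault status register layout; the function names are hypothetical. Both dumps show UFSR = 0x0001 (undefined instruction) and HFSR = 0x40000000 (fault escalated to HardFault), which is what one would expect when execution lands on memory that does not hold valid code.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-M fault status bits used by this sketch. */
#define UFSR_UNDEFINSTR  (1u << 0)   /* undefined instruction executed */
#define UFSR_INVSTATE    (1u << 1)   /* invalid execution state */
#define UFSR_INVPC       (1u << 2)   /* invalid PC load on exception return */
#define HFSR_FORCED      (1u << 30)  /* configurable fault escalated to HardFault */
#define ICSR_VECTACTIVE  0x1FFu      /* active exception number field */

/* Exception number of the handler that was active when ICSR was read. */
uint32_t active_exception(uint32_t icsr)
{
    return icsr & ICSR_VECTACTIVE;
}

/* True when the dump shows an undefined-instruction usage fault that
 * was escalated to a HardFault. */
bool is_escalated_undef_instr(uint32_t ufsr, uint32_t hfsr)
{
    return (ufsr & UFSR_UNDEFINSTR) != 0u && (hfsr & HFSR_FORCED) != 0u;
}
```

For the ICSR values in the two dumps, 0x00426003 and 0x0440f803, `active_exception` returns 3 in both cases, which is the HardFault exception number.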

0 Michael Chen over 12 years ago in reply to Badri Narayanan

Prodigy 190 points

Our DDR3 is rated for 1333, but we overclocked it to 1600.

I have tested DDR 800 and saw no problem.

Now I'm testing DDR 1066 and 1350.
