question about demo code

1330hayacool7102

Other Parts Discussed in Thread: SYSBIOS

Hi TI friends,

DM8168, RDK 3.0

When we use the demo code, commands are executed one at a time. In a real application, some commands, such as "Core Status: Active/In-active", will be executed inside a periodic function to check that the status of every core is OK. My question is: is there a problem if we issue such a command from another function that may run at the same time as other commands?

over 12 years ago

0 Badri Narayanan over 12 years ago

TI__Guru 59700 points

There is no problem, because a mutex ensures that there is only one outstanding system_linkControl command at any time.
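A minimal sketch of the serialization described above, using a plain POSIX mutex. The names `send_control_cmd`, `status_poll_thread`, `run_two_pollers`, and the counters are hypothetical and are not taken from the RDK sources; the sketch only illustrates that two threads issuing commands concurrently still pass through the guarded section one at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the lock that guards the control path. */
static pthread_mutex_t g_ctrl_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_in_flight = 0;      /* commands currently inside the guarded section */
static int g_max_in_flight = 0;  /* highest value ever observed */
static int g_completed = 0;      /* total commands finished */

/* Hypothetical command sender: everything between lock and unlock
 * runs for one caller at a time, whichever thread it comes from. */
int send_control_cmd(uint32_t link_id, uint32_t cmd)
{
    (void)link_id;
    (void)cmd;
    pthread_mutex_lock(&g_ctrl_lock);
    g_in_flight++;
    assert(g_in_flight == 1);    /* never more than one outstanding command */
    if (g_in_flight > g_max_in_flight)
        g_max_in_flight = g_in_flight;
    /* ... the message would be sent and the ack awaited here ... */
    g_in_flight--;
    g_completed++;
    pthread_mutex_unlock(&g_ctrl_lock);
    return 0;
}

/* Periodic status poller, run from its own thread. */
void *status_poll_thread(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        send_control_cmd(0u, 1u);
    return NULL;
}

/* Starts two pollers concurrently and returns the peak number of
 * commands that were inside the guarded section at once. */
int run_two_pollers(int iterations)
{
    pthread_t a, b;
    pthread_create(&a, NULL, status_poll_thread, &iterations);
    pthread_create(&b, NULL, status_poll_thread, &iterations);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return g_max_in_flight;
}
```

Calling `run_two_pollers(1000)` should report a peak of one command in flight, which is the property the mutex provides to a periodic status check running beside other commands.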

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

hello Badri-SuperMan

We caught a signal and got the following backtrace:

/lib/libc.so.6(__default_rt_sa_restorer_v2+0) [0x2acd2630]

/lib/libpthread.so.0 [0x2abcc808]

/lib/libpthread.so.0(pthread_mutex_lock+0x1a0) [0x2abc5c20]

/opt/dvr_rdk/ti816x/bin/dvr_rdk_demo_mcfw_api.out [0x56898]

We checked 0x56898 and found:

00056864 <System_ipcMsgQSendMsg>:

56864: e92d4ff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}

56868: e24dd02c sub sp, sp, #44 ; 0x2c

5686c: e58d2014 str r2, [sp, #20]

56870: e1dd25b0 ldrh r2, [sp, #80]

56874: e3530801 cmp r3, #65536 ; 0x10000

56878: e1a08003 mov r8, r3

5687c: e1a09000 mov r9, r0

56880: e58d1018 str r1, [sp, #24]

56884: e58d2010 str r2, [sp, #16]

56888: 8a000095 bhi 56ae4 <System_ipcMsgQSendMsg+0x280>

5688c: e1a0ae29 mov sl, r9, lsr #28

56890: e59f0378 ldr r0, [pc, #888] ; 56c10 <$d>

56894: eb0009a9 bl 58f40 <OSA_mutexLock>

56898: e35a0003 cmp sl, #3 ; 0x3

5689c: 8a000082 bhi 56aac <System_ipcMsgQSendMsg+0x248>

568a0: e288b034 add fp, r8, #52 ; 0x34

568a4: e3a00000 mov r0, #0 ; 0x0

568a8: e1a0100b mov r1, fp

568ac: eb003b1e bl 6552c <MessageQ_alloc>

568b0: e3500000 cmp r0, #0 ; 0x0

568b4: e1a07000 mov r7, r0

568b8: e1a06000 mov r6, r0

568bc: 0a0000ad beq 56b78 <System_ipcMsgQSendMsg+0x314>

568c0: e59d2014 ldr r2, [sp, #20]

Do you have any further ideas about this?

0 1330hayacool7102 over 12 years ago in reply to 1330hayacool7102

Guru 10655 points

I have been running a long-term stability test. In my test, initially without HDMI output, it may hang several days later without printing any information.

According to our messages, it is always stuck right after OSA_mutexLock. I'm not sure whether a deadlock can happen in some cases.

Do you have any further ideas?

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

This looks like glibc code or data structures are corrupted. A deadlock will not cause a SEGFAULT. Also, what is the error reported by the segfault exception? Is it an invalid memory access? Are you using a NAND file system? If so, do you see the same issue with an NFS file system? We have seen issues on some customer boards where bit flips in NAND can cause crashes in libc code.

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

Hello Badri-SuperMan,

Thanks for your reply.

My environment is the 8168 EVM with NFS. After a long test, no log was displayed, but the video on screen was gone (I connected the output to a monitor to observe it). I had no idea what had happened, so I pressed Ctrl+C to go through the signal handler and used backtrace to look at the preceding steps, which gave the information above. There is no SIGSEGV, so it is not an invalid memory access. Any further ideas?
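For reference, a minimal sketch of a Ctrl+C handler that prints a backtrace, assuming glibc's `backtrace()` and `backtrace_symbols_fd()` from `<execinfo.h>`. The function names `dump_backtrace`, `on_sigint`, and `install_sigint_backtrace` are hypothetical and are not the demo's actual handler. Note that `backtrace()` is not formally async-signal-safe, so this is a debugging aid only.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Writes the calling thread's stack to stderr and returns the
 * number of frames captured. */
int dump_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}

/* SIGINT handler: print the stack of whichever thread received
 * the signal, then exit without running atexit handlers. */
static void on_sigint(int signo)
{
    (void)signo;
    dump_backtrace();
    _exit(1);
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int install_sigint_backtrace(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}
```

A trace produced this way shows only where the thread that received the signal happened to be at that moment. A frame inside `pthread_mutex_lock` therefore shows where that thread was waiting, and is not by itself evidence of a deadlock.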

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

I think this issue is unrelated to OSA_mutex. If you press Ctrl+C, it will just unblock threads that were blocked on MessageQ_get. If the display is blanked out, do you get "No signal" or just a black background color? Are all the displays showing blank, or only HDMI? Are you running the remote debug client? Do you see any M3 exception logs? A black background color indicates either a VPSS M3 exception or a display list hang. This is most likely due to an issue with your board. Check the posts below for things to verify on your board:

http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/717/p/250529/878774.aspx#878774

http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/717/p/264273/924280.aspx#924280

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

hello SuperMan-Badri,

thanks for your response.

If you press Ctrl+C it will just unblock threads which were blocked on MessageQ_get.

- mm... got it

If display is blanked out do you get "No signal" or just black background color ?

- just grey background color

Are all the displays showing blank or only HDMI

- we just turn on HDMI

Are you running remote debug client ?

- no

Do you see any M3 exception logs ?

- no

Display of black background color indicates either VPSS m3 exception or Display list hang.

- but I only see a grey background. What does a grey background mean?

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

It is not clear what you mean by grey screen. Please attach screenshots of the TV when you see the hang. Also check whether the graphics logo is displayed when you see the hang. Always run with remote_debug_client logs enabled and attach the logs when you see the hang. Also, when you see the hang, share the logs of Vsys_printBufferStatistics and Vsys_printDetailedStatistics.

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

Hello SuperMan-Badri,

Q1. It is not clear what you mean by grey screen. Please attach screenshots of the TV when you see the hang.

- see ..it's like what we see after "load.sh"

Q2. Also check if graphics logo is getting displayed when you see hang or not.

- we didn't use the logo in our application.

Q3. Always run with remote_debug_client logs enabled and attach the logs when you see hang. Also when you see hang share logs of Vsys_printBufferStatistics and Vsys_printDetailedStatistics

- What I saw is just the following messages, no more

videoSourceStatus.numChannels 8
DEMO: 0: Detected video at CH [0,0] (720x240@59Hz, 1)!!!
DEMO: 1: Detected video at CH [0,1] (720x240@59Hz, 1)!!!
DEMO: 2: Detected video at CH [0,2] (720x240@59Hz, 1)!!!
DEMO: 3: Detected video at CH [0,3] (720x240@59Hz, 1)!!!
DEMO: 4: Detected video at CH [1,0] (720x240@59Hz, 1)!!!
DEMO: 5: Detected video at CH [1,1] (720x240@59Hz, 1)!!!
DEMO: 6: Detected video at CH [1,2] (720x240@59Hz, 1)!!!
DEMO: 7: Detected video at CH [1,3] (720x240@59Hz, 1)!!!

I connected via telnet and ran the "top" command; see the attachment.

I also ran the "vmstat" command; see the attachment.

Any further ideas?

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

This looks like a VPSS M3 crash. Connect CCS+JTAG to the M3VPSS core and check the status. Also, I see remote_debug_client running in the process list. Check the last prints from [m3vpss] to see whether you get any error or exception message.

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

hello SuperMan-Badri,

Because I print too many messages, I missed the prints from [m3vpss]. But I got the attached file:

4048.CCS_CRASH_DUMP_VPSS-M3.txt

Can I get any information from the attachment?

I'll try again to capture the log and will update if it happens again.

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

Zip and attach the contents of /dvr_rdk/build/dvr_rdk/bin/ti816x-evm.

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

Hello SuperMan-Badri,

See the attachment, as requested:

http://e2e.ti.com/cfs-file.ashx/__key/communityserver-discussions-components-files/717/8171.ti816x_2D00_evm.7z

I had added more printf calls; I have now removed them, returned to the original state, recompiled, and provided the result to you. I'm not sure whether that is OK, so I'm telling you up front.

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

I need the exact same firmware image corresponding to the crash dump provided previously. Otherwise no analysis is possible.

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

hello Super-badri,

Please use the following files from the posts above, because I found that the extra prints I added were only on the A8 side, not on the DSP/M3 side.

4048.CCS_CRASH_DUMP_VPSS-M3.txt
8171.ti816x-evm.7z

Thanks.

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

Below is the crash dump backtrace:

It indicates that a s/w exception was raised by SharedRegion. The reason for the s/w exception should have been printed on the console if your log is correct. From the exception, this looks like a HeapMemMP buffer overflow, where some component is writing beyond its allocated memory. Check that your application does not write beyond the allocated size of the bitstream buffer, and check that your swms layout is correct. Also, if your application allocates a buffer using Vsys_allocBuf, ensure you are not writing beyond the allocated memory. You will have to connect CCS to determine the exact memory location that is corrupted in order to debug the issue further.

0 Vps_rprintf(unsigned char *) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29/ti_tools/hdvpss/hdvpss_01_00_01_37_patched/packages/ti/psp/vps/common/src/remote_debug_server.c:168 PC = 0x9DF03894 FP = 0x3F00DDDC
1 Utils_errorRaiseHook(struct xdc_runtime_Error_Block *) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29/dvr_rdk/mcfw/src_bios6/utils/src/utils_execp_trace.c:247 PC = 0x9DEFC68E FP = 0x3F00DDF0
2 ti_sysbios_BIOS_errorRaiseHook__I(struct xdc_runtime_Error_Block *) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29_ori/ti_tools/bios/bios_6_33_05_46/packages/ti/sysbios/BIOS.c:193 PC = 0x9DF13DD6 FP = 0x3F00DE18
3 xdc_runtime_Error_raiseX__F(struct xdc_runtime_Error_Block *, unsigned short, unsigned char *, int, unsigned int, int, int) at /db/rtree/install/trees/products/xdcprod/xdcprod-p47/product/Linux/xdctools_3_23_02_47/packages/xdc/runtime/Error.c:153 PC = 0x9DEF6994 FP = 0x3F00DE28
4 xdc_runtime_Error_raiseX__E(struct xdc_runtime_Error_Block *, unsigned short, unsigned char *, int, unsigned int, int, int) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29_ori/dvr_rdk/../dvr_rdk/build/dvr_rdk/obj/ti816x-evm/m3vpss/release/dvr_rdk_configuro/package/cfg/MAIN_APP_m3vpss_pem3.c:24897 PC = 0x9DF17742 FP = 0x3F00DE98
5 xdc_runtime_Assert_raise__I(unsigned short, unsigned char *, int, unsigned int) at /db/rtree/install/trees/products/xdcprod/xdcprod-p47/product/Linux/xdctools_3_23_02_47/packages/xdc/runtime/Assert.c:34 PC = 0x9DF1393A FP = 0x3F00DEB0
6 SharedRegion_getPtr(unsigned int) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29_ori/ti_tools/ipc/ipc_1_24_03_32/packages/ti/sdo/ipc/SharedRegion.c:305 PC = 0x00406D8A FP = 0x3F00DED0
7 ti_sdo_ipc_heaps_HeapMemMP_getStats__E(struct ti_sdo_ipc_heaps_HeapMemMP_Object *, struct xdc_runtime_Memory_Stats *) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29_ori/ti_tools/ipc/ipc_1_24_03_32/packages/ti/sdo/ipc/heaps/HeapMemMP.c:909 PC = 0x0040A1EE FP = 0x3F00DEF0
8 xdc_runtime_IHeap_getStats(struct xdc_runtime_IHeap___Object *, struct xdc_runtime_Memory_Stats *) at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29_ori/dvr_rdk/../ti_tools/xdc/xdctools_3_23_03_53/packages/xdc/runtime/IHeap.h:152 PC = 0x9DF186A0 FP = 0x3F00DF08
9 utils_sw_exception_copy_info() at /home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29/dvr_rdk/mcfw/src_bios6/utils/src/utils_execp_trace.c:193 PC = 0x9DEE984E FP = 0x3F00DF08
10 <symbol is not available> PC = 0x9DF02D90 FP = 0x3F00DF48
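One generic way to catch a component that writes beyond its allocated size is a guard word placed right after the payload. This is a minimal sketch on top of `malloc`, not an RDK or HeapMemMP facility: `GuardedBuf`, `guarded_alloc`, `guarded_check`, `guarded_free`, and the guard value are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD 0xDEADBEEFu   /* arbitrary pattern, assumed for this sketch */

/* Hypothetical wrapper: payload bytes followed by one guard word. */
typedef struct {
    uint8_t *data;   /* start of the payload */
    size_t   size;   /* payload size in bytes, excluding the guard */
} GuardedBuf;

/* Allocates size payload bytes plus the guard; returns 0 on success. */
int guarded_alloc(GuardedBuf *b, size_t size)
{
    uint32_t guard = GUARD_WORD;
    assert(b != NULL);
    b->data = malloc(size + sizeof guard);
    if (b->data == NULL)
        return -1;
    b->size = size;
    memcpy(b->data + size, &guard, sizeof guard);
    return 0;
}

/* Returns 1 while the guard word is intact, 0 after an overrun. */
int guarded_check(const GuardedBuf *b)
{
    uint32_t guard;
    assert(b != NULL && b->data != NULL);
    memcpy(&guard, b->data + b->size, sizeof guard);
    return guard == GUARD_WORD;
}

/* Releases the buffer and clears the wrapper. */
void guarded_free(GuardedBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

Calling `guarded_check` after each producer has filled the buffer narrows down which write went past the end, which is usually easier than working backwards from the point where the heap walk finally faults.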

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

hi super-badri,

thanks for your reply.

Check if your application is ensuring that it is not writing beyond allocated size of bitstream buffer

- I'll check

and check that your swms layout is correct.

- I'm not clear what this means. Could you describe it in more detail?

ALso if your application is allocating some buffer using Vsys_allocBuf ensure you are not writing beyond allocated memory.

- As far as I can tell, we don't use that.

0 1330hayacool7102 over 12 years ago in reply to Badri Narayanan

Guru 10655 points

hi badri,

From the following line inside CCS_CRASH_XX:

M 0 0x3f005f60 0x00008000

and the following lines in dvr_rdk_m3vpss_release.xem3.map:

3effdf60 00008000 : captureLink_tsk.oem3 (.bss:taskStackSection)
3f005f60 00008000 : systemLink_tsk_m3vpss.oem3 (.bss:taskStackSection)
3f00df60 00008000 : system_common.oem3 (.bss:taskStackSection)

I know that something is stuck in systemLink_tsk_m3vpss.c, and I found our added function inside SystemLink_cmdHandler().

Our added function is as below:
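The lookup done by hand here can be sketched in a few lines of C. The three entries are copied from the map excerpt quoted in this post; the `MapEntry` type and the `stack_owner` function are hypothetical helpers written for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One task stack section from the linker map. */
typedef struct {
    uint32_t    base;    /* start address of the section */
    uint32_t    size;    /* length in bytes */
    const char *owner;   /* object file that owns the section */
} MapEntry;

/* Entries copied from the .map excerpt quoted in this post. */
static const MapEntry g_task_stacks[] = {
    { 0x3effdf60u, 0x00008000u, "captureLink_tsk.oem3" },
    { 0x3f005f60u, 0x00008000u, "systemLink_tsk_m3vpss.oem3" },
    { 0x3f00df60u, 0x00008000u, "system_common.oem3" },
};

/* Returns the owner of the stack containing addr, or NULL if none.
 * The unsigned subtraction wraps for addr < base, so a single
 * comparison covers both ends of the range. */
const char *stack_owner(uint32_t addr)
{
    size_t n = sizeof g_task_stacks / sizeof g_task_stacks[0];
    for (size_t i = 0; i < n; i++) {
        if (addr - g_task_stacks[i].base < g_task_stacks[i].size)
            return g_task_stacks[i].owner;
    }
    return NULL;
}
```

For example, the frame pointer 0x3F00DDDC reported for frame 0 of the crash dump backtrace resolves to systemLink_tsk_m3vpss.oem3, which agrees with the stack base 0x3f005f60 in the crash dump line.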

case SYSTEM_COMMON_CMD_GET_FREE_SPACE:
{
    SystemCommon_GetFreeSpace *prm = (SystemCommon_GetFreeSpace *) pPrm;

    prm->framefreeSpace = Utils_memGetBufferHeapFreeSpace();

    prm->bitfreeSpace = Utils_memGetBitBufferHeapFreeSpace();

We just call the above two functions from TI. I checked these two functions in more detail and found the following:

UInt32 Utils_memGetBufferHeapFreeSpace(void)
{
    UInt32 size;
    Memory_Stats stats;

    Memory_getStats(gUtils_heapMemHandle[UTILS_MEM_VID_FRAME_BUF_HEAP], &stats);

    size = stats.totalFreeSize;

    return ((UInt32) (size));
}

UInt32 Utils_memGetBitBufferHeapFreeSpace(void)
{
    Memory_Stats stats;

    Memory_getStats(gUtils_heapMemHandle[UTILS_MEM_VID_BITS_BUF_HEAP], &stats);

    return ((UInt32) (stats.totalFreeSize));
}

Both functions query the heap status through Memory_getStats, and that behavior matches frame 7 of your backtrace.

I want to look further inside Memory_getStats(), but I don't know where the source code is. Could you help me trace it more deeply?

0 Badri Narayanan over 12 years ago in reply to 1330hayacool7102

TI__Guru 59700 points

The info is available in the backtrace I shared above:

Memory_getStats -> xdc_runtime_IHeap_getStats -> HeapMemMP_getStats -> SharedRegion_getPtr (s/w exception here).
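To illustrate why a statistics query can be the place where earlier corruption finally surfaces, here is a simplified toy model and not the IPC source. It assumes a heap whose free blocks are chained by offsets stored inside the heap itself, so that summing the free space means following every link. `FreeHdr`, `heap_free_bytes`, and `END_OF_LIST` are hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define END_OF_LIST 0xFFFFFFFFu   /* assumed list terminator */

/* Toy free-block header stored inside the heap itself. */
typedef struct {
    uint32_t next_off;   /* offset of the next free block, or END_OF_LIST */
    uint32_t size;       /* bytes in this free block */
} FreeHdr;

/* Sums the free bytes by walking the list. Returns -1 if a link
 * points outside the heap or the list is longer than the heap
 * could possibly hold (for example, a cycle). */
long heap_free_bytes(const uint8_t *heap, uint32_t heap_len, uint32_t first_off)
{
    long total = 0;
    uint32_t off = first_off;
    uint32_t steps = 0;

    assert(heap != NULL);
    while (off != END_OF_LIST) {
        FreeHdr h;
        if (heap_len < sizeof h || off > heap_len - sizeof h)
            return -1;                      /* corrupted link */
        if (++steps > heap_len / sizeof h)
            return -1;                      /* list too long, likely a cycle */
        memcpy(&h, heap + off, sizeof h);
        total += (long)h.size;
        off = h.next_off;
    }
    return total;
}
```

In this model, an overflow that overwrites one header leaves a bad link in place, and the walk fails at the first validation of that link, long after the write that caused it. That is consistent with the advice to locate the corrupted address rather than to debug the statistics function.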

The source code for HeapMemMP_getStats is present in

/home/medwin/Projects/TI-8168/DVRRDK_03.00.00.00_ori_2012-04-29_ori/ti_tools/ipc/ipc_1_24_03_32/packages/ti/sdo/ipc/heaps/HeapMemMP.c:909

As I mentioned previously, the issue is memory corruption caused by a buffer overflow. You will have to debug the cause of the corruption, not this function.

You can get the address that is corrupted and debug from there.

0 SuitJune Young over 12 years ago in reply to Badri Narayanan

Genius 3985 points

We have seen issues with some customer board where there are bit flips in nand which can cause crash in libc code.

Could you please explain it in detail?

What may cause the bit flips in NAND? A wrong ECC configuration? A wrongly configured UBIFS?

Thanks in advance!
