This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Codec engine sometimes hangs

Hello,

I apologize in advance if this is not the appropriate forum to post this, and would appreciate it if a TI employee could move it to a more appropriate forum if it exists.

I am having an intermittent problem with a codec engine algorithm I developed.  When it occurs, it happens on the first call to VIDANALYTICS_control after calling VIDANALYTICS_create.  The following is a CE_DEBUG=2 log of the error:

 

@5,104,675us: [+0 T:0x45a82460 S:0x45a81644] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_create> Enter (engine=0xf3fd0, name='vmd', params=0x45a818b0 (size=0x24))

@5,104,827us: [+0 T:0x45a82460 S:0x45a81614] CV - VISA_create(0xf3fd0, 'vmd', 0x45a818b0, 0x3060, 'ti.sdo.ce.vidanalytics.IVIDANALYTICS')

@5,104,888us: [+0 T:0x45a82460 S:0x45a81514] CV - VISA_create2(0xf3fd0, 'vmd', 0x45a818b0, 0x24, 0x3060, 'ti.sdo.ce.vidanalytics.IVIDANALYTICS')

@5,104,949us: [+0 T:0x45a82460 S:0x45a814fc] OM - Memory_alloc> Enter(0x34)

@5,105,010us: [+0 T:0x45a82460 S:0x45a814fc] OM - Memory_alloc> return (0x45b00c98)

@5,105,041us: [+0 T:0x45a82460 S:0x45a814fc] OM - Memory_alloc> Enter(0x4)

@5,105,102us: [+0 T:0x45a82460 S:0x45a814fc] OM - Memory_alloc> return (0x45b00c60)

@5,105,133us: [+0 T:0x45a82460 S:0x45a814f4] OC - Comm_alloc> Enter(poolId=0x0, msg=0x45b00c60, size=3060)

@5,105,224us: [+0 T:0x45a82460 S:0x45a814f4] OC - Comm_alloc> msg=0x44e76880, returning (0)

@5,105,285us: [+0 T:0x45a82460 S:0x45a8149c] CE - Engine_createNode(0xf3fd0, 'vmd', 3060, 0x45a818b0, 0x24, 0x45a8160c)

@5,105,316us: [+0 T:0x45a82460 S:0x45a8149c] CE - Engine> allocNode Enter(engine=0xf3fd0, impId='vmd')

@5,105,377us: [+0 T:0x45a82460 S:0x45a81484] OM - Memory_alloc> Enter(0x20)

@5,105,407us: [+0 T:0x45a82460 S:0x45a81484] OM - Memory_alloc> return (0x45b00eb0)

@5,105,468us: [+0 T:0x45a82460 S:0x45a8149c] CE - Engine> allocNode(). Calling (Comm_create(gppfromnode_1479_1, 0x45b00eb8, NULL)

@5,105,560us: [+0 T:0x45a82460 S:0x45a81474] OC - Comm_create> Enter(queueName='gppfromnode_1479_1', queue=0x45b00eb8, attrs=0x0)

@5,105,590us: [+0 T:0x45a82460 S:0x45a8145c] OM - Memory_alloc> Enter(0x4)

@5,105,651us: [+0 T:0x45a82460 S:0x45a8145c] OM - Memory_alloc> return (0x45b00c70)

@5,106,323us: [+0 T:0x45a82460 S:0x45a81474] OC - Comm_create> return (0x45b00c70)

@5,106,414us: [+0 T:0x45a82460 S:0x45a8147c] OC - Comm_put> Enter(queue=0x0, msg=0x44e75880)

@5,106,506us: [+0 T:0x45a82460 S:0x45a8147c] OC - Comm_put> return (0)

@5,106,567us: [+0 T:0x45a82460 S:0x45a81474] OC - Comm_get> Enter(queue=0x10000, msg=0x45a81524, timeout=-1)

@5,159,637us: [+0 T:0x45a82460 S:0x45a81474] OC - Comm_get> MSGQ_get() status=0x8000, return (0)

@5,159,759us: [+0 T:0x45a82460 S:0x45a8147c] OC - Comm_put> Enter(queue=0x0, msg=0x44e75880)

@5,159,820us: [+0 T:0x45a82460 S:0x45a8147c] OC - Comm_put> return (0)

@5,159,881us: [+0 T:0x45a82460 S:0x45a81474] OC - Comm_get> Enter(queue=0x10000, msg=0x45a81524, timeout=-1)

@5,160,003us: [+0 T:0x45a82460 S:0x45a81474] OC - Comm_get> MSGQ_get() status=0x8000, return (0)

@5,160,064us: [+4 T:0x45a82460 S:0x45a8149c] CE - Engine_createNode> created node(stdIn=0x2, stdOut=0x10001, msgq=0x45b00c70, algName='vmd', rmsNode=0x87f19098, algHandle=0x87f19180)

@5,160,125us: [+0 T:0x45a82460 S:0x45a81404] OC - Comm_put> Enter(queue=0x0, msg=0x44e75880)

@5,160,186us: [+0 T:0x45a82460 S:0x45a81404] OC - Comm_put> return (0)

@5,160,247us: [+0 T:0x45a82460 S:0x45a813fc] OC - Comm_get> Enter(queue=0x10000, msg=0x45a8149c, timeout=-1)

@5,160,552us: [+0 T:0x45a82460 S:0x45a813fc] OC - Comm_get> MSGQ_get() status=0x8000, return (0)

[DSP] @0,217,556tk: [+0 T:0x87f145f4 S:0x87f185b4] CR - processRmsCmd(0x873048a8, 4056): cmd = 0

[DSP] @0,217,622tk: [+0 T:0x87f145f4 S:0x87f1849c] OM - Memory_alloc> Enter(size=0x18)

[DSP] @0,217,676tk: [+0 T:0x87f145f4 S:0x87f1849c] OM - Memory_alloc> return (0x87f19098)

[DSP] @0,217,728tk: [+0 T:0x87f145f4 S:0x87f1849c] OM - Memory_alloc> Enter(size=0x6)

[DSP] @0,217,774tk: [+0 T:0x87f145f4 S:0x87f1849c] OM - Memory_alloc> return (0x87f190b0)

[DSP] @0,217,833tk: [+0 T:0x87f145f4 S:0x87f1845c] OM - Memory_alloc> Enter(size=0x20)

[DSP] @0,217,880tk: [+0 T:0x87f145f4 S:0x87f1845c] OM - Memory_alloc> return (0x87f190b8)

[DSP] @0,217,932tk: [+0 T:0x87f145f4 S:0x87f1842c] OM - Memory_alloc> Enter(size=0x24)

[DSP] @0,217,979tk: [+0 T:0x87f145f4 S:0x87f1842c] OM - Memory_alloc> return (0x87f190d8)

[DSP] @0,218,052tk: [+0 T:0x87f145f4 S:0x87f1843c] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_create> Enter (engine=0x0, name='vmd', params=0x873048c0 (size=0x24))

[DSP] @0,218,142tk: [+0 T:0x87f145f4 S:0x87f18404] CV - VISA_create(0x0, 'vmd', 0x873048c0, 0x3060, 'ti.sdo.ce.vidanalytics.IVIDANALYTICS')

[DSP] @0,218,217tk: [+0 T:0x87f145f4 S:0x87f182f4] CV - VISA_create2(0x0, 'vmd', 0x873048c0, 0x24, 0x3060, 'ti.sdo.ce.vidanalytics.IVIDANALYTICS')

[DSP] @0,218,305tk: [+0 T:0x87f145f4 S:0x87f1828c] CE - Engine_open> Enter('local', 0x87f182dc, 0x873048c0)

[DSP] @0,218,366tk: [+0 T:0x87f145f4 S:0x87f1826c] OM - Memory_alloc> Enter(size=0x2c)

[DSP] @0,218,415tk: [+0 T:0x87f145f4 S:0x87f1826c] OM - Memory_alloc> return (0x87f19150)

[DSP] @0,218,476tk: [+0 T:0x87f145f4 S:0x87f1828c] CE - Engine_open> return(-2014211760)

[DSP] @0,218,544tk: [+0 T:0x87f145f4 S:0x87f182d4] OM - Memory_alloc> Enter(size=0x34)

[DSP] @0,218,591tk: [+0 T:0x87f145f4 S:0x87f182d4] OM - Memory_alloc> return (0x87f19180)

[DSP] @0,218,646tk: [+0 T:0x87f145f4 S:0x87f182ac] ti.sdo.ce.alg.Algorithm - Algorithm_create> Enter(fxns=0x87e75268, idma3Fxns=0x87e758ec, iresFxns=0x0, params=0x873048c0, attrs=0x87f183f0)

[DSP] @0,218,743tk: [+0 T:0x87f145f4 S:0x87f1828c] OM - Memory_alloc> Enter(size=0x10)

[DSP] @0,218,791tk: [+0 T:0x87f145f4 S:0x87f1828c] OM - Memory_alloc> return (0x87f191b8)

[DSP] @0,297,064tk: [+5 T:0x87f145f4 S:0x87f18204] VMD - +VMD_TI_init(0x87400000, 0x87f19538, 0x00000000, 0x873048c0)

[DSP] @0,297,143tk: [+6 T:0x87f145f4 S:0x87f18204] VMD - +VMD_TI_init

[DSP] @0,297,184tk: [+6 T:0x87f145f4 S:0x87f18204] VMD - VMD: Parameters OK

[DSP] @0,297,235tk: [+6 T:0x87f145f4 S:0x87f181a4] VMD - +vmd_init_conv_mask()

[DSP] @0,297,277tk: [+6 T:0x87f145f4 S:0x87f181a4] VMD - -vmd_init_conv_mask() -> OK

[DSP] @0,297,323tk: [+6 T:0x87f145f4 S:0x87f181dc] VMD - +vmd_init_fg_mask()

[DSP] @0,305,072tk: [+6 T:0x87f145f4 S:0x87f181dc] VMD - -vmd_init_fg_mask() -> OK

[DSP] @0,305,171tk: [+6 T:0x87f145f4 S:0x87f18204] VMD - VMD: params->

[DSP]   .size=36

[DSP]   .maxHeight=96

[DSP]   .maxWidth=144

[DSP]   .maxFrameRate=60000

[DSP]   .maxRules==10

[DSP] @0,305,245tk: [+6 T:0x87f145f4 S:0x87f18204] VMD - VMD: params->

[DSP]   .mean_weight=0x03ff

[DSP]   .bg_global_thres=0x0800

[DSP]   .bg_thres_factor=0x7800

[DSP] @0,305,315tk: [+6 T:0x87f145f4 S:0x87f18204] VMD - -VMD_TI_init

[DSP] @0,305,453tk: [+6 T:0x87f145f4 S:0x87f1824c] VMD - +VMD_TI_dmaGetChannels()

[DSP] @0,305,497tk: [+6 T:0x87f145f4 S:0x87f1824c] VMD - -VMD_TI_dmaGetChannels()

[DSP] @0,305,576tk: [+6 T:0x87f145f4 S:0x87f1824c] VMD - +VMD_TI_dmaGetChannels()

[DSP] @0,305,620tk: [+6 T:0x87f145f4 S:0x87f1824c] VMD - -VMD_TI_dmaGetChannels()

[DSP] @0,305,819tk: [+6 T:0x87f145f4 S:0x87f1824c] VMD - +VMD_TI_dmaInit()

[DSP] @0,305,869tk: [+6 T:0x87f145f4 S:0x87f1824c] VMD - -VMD_TI_dmaInit()

[DSP] @0,305,918tk: [+0 T:0x87f145f4 S:0x87f182ac] ti.sdo.ce.alg.Algorithm - Algorithm_create> return (0x87f191b8)

[DSP] @0,305,983tk: [+5 T:0x87f145f4 S:0x87f182f4] CV - VISA_create> local codec created (name='vmd', handle=0x87f191b8)

[DSP] @0,306,051tk: [+0 T:0x87f145f4 S:0x87f1843c] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_create> return (0x87f19180)

[DSP] @0,306,125tk: [+4 T:0x87f145f4 S:0x87f18404] OT - Thread_create > name: "vmd#0", pri:  -1, stack size:  8448, stack seg: 0

[DSP] @0,307,154tk: [+0 T:0x87f145f4 S:0x87f185b4] CR - processRmsCmd(0x873048a8, 4056): cmd = 1

[DSP] @0,307,744tk: [+0 T:0x87f145f4 S:0x87f185b4] CR - processRmsCmd(0x873048a8, @5,164,367us: [+0 T:0x45a82460 S:0x45a81404] OC - Comm_put> Enter(queue=0x0, msg=0x44e75880)

@5,164,459us: [+0 T:0x45a82460 S:0x45a81404] OC - Comm_put> return (0)

@5,164,520us: [+0 T:0x45a82460 S:0x45a813fc] OC - Comm_get> Enter(queue=0x10000, msg=0x45a8149c, timeout=-1)

@5,164,642us: [+0 T:0x45a82460 S:0x45a813fc] OC - Comm_get> MSGQ_get() status=0x8000, return (0)

4056): cmd = 5

[DSP] @0,307,800tk: [+0 T:0x87f145f4 S:0x87f185b4] CR - remote time = 0x0, trace buffer size = 4032

[DSP] @0,314,896tk: [+0 T:0x87f145f4 S:0x87f185b4] CR - processRmsCmd(0x873048a8, 4056): cmd = 5

[DSP] @0,314,952tk: [+0 T:0x87f145f4 S:0x87f185b4] CR - remote time = 0x0, trace buffer size = 4032

@5,169,586us: [+0 T:0x45a82460 S:0x45a81434] CE - Engine_fwriteTrace> returning count [4650]

@5,169,647us: [+2 T:0x45a82460 S:0x45a8149c] CE - Engine_createNode> Returning 0x45b00eb0

@5,169,738us: [+5 T:0x45a82460 S:0x45a81514] CV - VISA_create> remote codec created (name='vmd', localQueueID=0x10001, remoteQueueID=0x0002)

@5,169,799us: [+0 T:0x45a82460 S:0x45a81644] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_create> return (0x45b00c98)

@5,169,891us: [+0 T:0x45a82460 S:0x45a8146c] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_control> Enter (handle=0x45b00c98, id=3, dynParams=0x41a84924 (size=0x62c), status=0x45a81538 (size=0x158)

@5,169,952us: [+4 T:0x45a82460 S:0x45a813dc] CV - VISA_getMaxMsgSize(0x45b00c98): returning 0x1000

@5,170,013us: [+5 T:0x45a82460 S:0x45a813e4] CV - VISA_allocMsg> Allocating message for messageId=0x0002fe01

@5,170,074us: [+0 T:0x45a82460 S:0x45a813c4] CV - VISA_call(visa=0x45b00c98, msg=0x44e76880): messageId=0x0002fe01, command=0x1

@5,170,135us: [+0 T:0x45a82460 S:0x45a8138c] OC - Comm_put> Enter(queue=0x2, msg=0x44e76880)

@5,170,227us: [+0 T:0x45a82460 S:0x45a8138c] OC - Comm_put> return (0)

@5,170,288us: [+0 T:0x45a82460 S:0x45a81384] OC - Comm_get> Enter(queue=0x10001, msg=0x45a8147c, timeout=-1)

 

 

At this point it hangs and requires a reboot to recover. As you can see, it fails very early in the VIDANALYTICS_control call.  When it works normally (I can post an example if requested), there are several more lines printed out before finally entering my algorithm's control function.

Other forum threads on the topic have suggested that this could occur if the code on the DSP crashes. I have instrumented my code with debug statements that show when a function I wrote is entered & exited and in my analysis, every entry into one of my functions is followed by an exit of the same function.  This tells me at least that the code did not crash within one of my custom functions.

My question is, how do I go about debugging and fixing this problem?  I am using DVSDK 3.01.00.10, Codec Engine 2.25.02.11, DMAI 2.05.00.12, XDAIS 6.25.02.11, and our custom algorithm doing VMD using VLIB is integrated into cs1omap3530 1.01.00. 

Thanks,

Dennis

 

  • Hi Dennis,

    From the trace, it looks like the VIDANALYTICS_control call is sent down to the DSP and the ARM side code is waiting for a reply using Comm_get. The code on the DSP seems to be "stuck", possibly in your algorithm's control function. Do you see the control function of your algorithm getting called on the DSP given you said you have instrumented your code? How do you normally print out your debug statements (i.e. via CCS, using the GT module, or by some other means) on the DSP? If you have Code Composer Studio, one possibility is to connect to the DSP and take a look at where the code is halted (see http://processors.wiki.ti.com/index.php/Debugging_the_DSP_side_of_a_CE_application_using_CCS). Another quick test would be to replace your control function on the DSP with a dummy function (like the one in the codec engine vidanalytics_copy codec example) so that it is more likely to return properly, and then either backtrack to see what your control function is missing, or isolate the issue to something outside of the function.

    Best regards,

    Vincent

  • Hi Vincent,

     

    You're right, it appears to be "stuck" waiting for the DSP side to respond to a message from the ARM.  When it is working, the CE_DEBUG=2 output for the VIDANALYTICS_control call looks like this:

     

     

    [+0 T:0x400204c0 S:0xbe9a96e4] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_control> Enter (handle=0x5fa38, id=1, dynParams=0xbe9a97cc (size=0x104), status=0xbe9a98d0 (size=0x158)

    [+4 T:0x400204c0 S:0xbe9a9654] CV - VISA_getMaxMsgSize(0x5fa38): returning 0x1000

    [+5 T:0x400204c0 S:0xbe9a965c] CV - VISA_allocMsg> Allocating message for messageId=0x0002ff03

    [+0 T:0x400204c0 S:0xbe9a963c] CV - VISA_call(visa=0x5fa38, msg=0x4160a880): messageId=0x0002ff03, command=0x1

    [+0 T:0x400204c0 S:0xbe9a9604] OC - Comm_put> Enter(queue=0x2, msg=0x4160a880)

    [+0 T:0x400204c0 S:0xbe9a9604] OC - Comm_put> return (0)

    [+0 T:0x400204c0 S:0xbe9a95fc] OC - Comm_get> Enter(queue=0x10001, msg=0xbe9a96f4, timeout=-1)

    [+0 T:0x400204c0 S:0xbe9a95fc] OC - Comm_get> MSGQ_get() status=0x8000, return (0)

    [+0 T:0x400204c0 S:0xbe9a958c] OC - Comm_put> Enter(queue=0x0, msg=0x41609880)

    [+0 T:0x400204c0 S:0xbe9a958c] OC - Comm_put> return (0)

    [+0 T:0x400204c0 S:0xbe9a9584] OC - Comm_get> Enter(queue=0x10000, msg=0xbe9a9624, timeout=-1)

    [+0 T:0x400204c0 S:0xbe9a9584] OC - Comm_get> MSGQ_get() status=0x8000, return (0)

    [DSP] @0,208,961tk: [+5 T:0x87f195c4 S:0x87f1b68c] CN - NODE> 0x87f190b8(vmd#0) call(algHandle=0x87f19180, msg=0x87305880); messageId=0x0002ff03

    [DSP] @0,209,053tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,224,132tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,224,173tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,239,247tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,239,289tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,254,366tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,254,408tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,269,481tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,269,523tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,284,597tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,284,640tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,299,713tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,299,755tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffffff, sizeInBytes=16776960)

    [DSP] @0,314,831tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,314,873tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,329,950tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,329,992tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=285212416)

    [DSP] @0,591,456tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,591,498tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,606,572tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,606,614tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,621,687tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,621,728tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,636,802tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,636,844tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,651,918tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,651,960tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776448)

    [DSP] @0,667,033tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,667,075tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,682,149tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,682,191tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> Enter(addr=0xffff00, sizeInBytes=16776960)

    [DSP] @0,697,265tk: [+0 T:0x87f195c4 S:0x87f1b4ac] OM - Memory_cacheInv> return

    [DSP] @0,697,309tk: [+0 T:0x87f195c4 S:0x87f1b44c] ti.sdo.ce.vidanalytics.VIDANALYTICS - VIDANALYTICS_control> Enter (handle=0x87f19180, id=1, dynParams=0x873058b4 (size=0x104), status=0x873059b8 (size=0x158)

    [DSP] @0,697,425tk: [+5 T:0x87f195c4 S:0x87f1b42c] CV - VISA_enter(visa=0x87f19180): algHandle = 0x87f191b8

    [DSP] @0,697,488tk: [+0 T:0x87f195c4 S:0x87f1b40c] ti.sdo.ce.alg.Algorithm - Algorithm_activate> Enter(alg=0x87f191b8)

    [DSP] @0,697,563tk: [+0 T:0x87f195c4 S:0x87f1b40c] ti.sdo.ce.alg.Algorithm - Algorithm_activate> Exit

    [DSP] @0,697,617tk: [+5 T:0x87f195c4 S:0x87f1b414] VMD - +VMD_TI_control() -> OK

     

     

    So as you can see, during the failure condition, it never gets to my algorithm's control function (the successful entry into that function is demonstrated by the last line, "VMD_TI_control() -> OK").

    I use the GT module to print out debug statements.  We don't use CCS and I don't think that switching to use it is an option at this time.

    Thanks for your reply,

    Dennis

     

  • Hi Dennis,

    The DSP trace (lines that start with [DSP]) do not get printed out til after the remote call has completed successfully. So if your control function gets "stuck" on the DSP, you will not see the GT output unless the call completes and a message is sent back to the ARM to unblock it. As I mentioned, it may be worth substituting in a simple control function that is certain to work to allow you to better isolate the problem.

    Best regards,

    Vincent

  • I have some new information about this issue that I hope someone will be able to use to let me know what I'm doing wrong.

    I have a situation where I can run a custom (I wrote it) video processing algorithm (video motion detection using IVIDANALYTICS) and it seems to work fine. Then I run the H.264 encoder to encode a video file and it works fine.  Then I run the same custom VMD algorithm, and it fails on the first call after create, every time. It doesn't matter if it's a IVIDANALYTICS_control() or IVIDANALYTICS_process() call, it causes the application to hang or crash with the DSP MMU fault.  The same thing happens if I attempt to run the H.264 encoder at the same time as the VMD algorithm.  It doesn't matter what the contents of my control() or process() functions are, I have even dummied them out so they just return success, and the results are the same.  It's almost as if the statically allocated function pointer table that provides the pointers to the implementation of the VMD algorithm is somehow getting corrupted when I run the H.264 encoder.

    My codec server is a modified version of cs.x64P from cs1omap3530_1_01_00 from dvsdk 3.01.00.10.  I have added the TI AAC encoder and my custom VMD algorithm.  I have all codecs using stackMemId 0, the video codecs are using groupId 0, the audio codecs are using groupId 2, and the VMD algorithm is using groupId 1.  VMD uses DMA so I have adjusted DMAN3.numPaRamGroup[] and DMAN3.numTccGroup[] accordingly.

    This problem is getting very serious and we need to solve it ASAP. Thanks in advance for any help you may be able to provide.

    Dennis

  • Dennis Estenson said:
    I have a situation where I can run a custom (I wrote it) video processing algorithm (video motion detection using IVIDANALYTICS) and it seems to work fine. Then I run the H.264 encoder to encode a video file and it works fine.  Then I run the same custom VMD algorithm, and it fails on the first call after create, every time.

    Can you describe this failure case in more detail?  Do you have 2 separate apps, one for each codec, that use the same server?  Or is there one app that does VMD create/process/delete, then does h264_create/process/delete, then does VMD_create/process <-- crash?

    I'm also interested in where Engine_open()/close() is happening in this flow.  If two separate apps, each one probably Engine_open() (which loads the Server on the DSP), VMD_create/process/delete, then Engine_close() (which unloads the DSP).  Those mountains of trace are useful to help explain the call flows like this, too... I know they can be a pain, but maybe you can post the CE_DEBUG=3 output of a recent failed run?  Best would be to link the trace enabled FC libraries into the Server as well so we can see DSKT2 and DMAN3 resource allocations, too.

    Finally, does your VMD algorithm request/use any internal scratch memory via its memTabs?  If so, it doesn't access it until it's been activated, right?  If you're using scratch resources, does it crash if you stub out your control() call _and_ the activate()/deactivate() calls?

    Chris

  • Chris,

    Chris Ring said:
    Can you describe this failure case in more detail?  Do you have 2 separate apps, one for each codec, that use the same server?  Or is there one app that does VMD create/process/delete, then does h264_create/process/delete, then does VMD_create/process <-- crash?

    In my test case, I am using a gstreamer pipeline.  There is a single app, gst-launch, that I use, and 2 separate plugins to access the codecs.  For VMD only I'm using essentially the command line: gst-launch v4l2src ! vmd ! fakesink; for H.264 I'm using: v4l2src ! dmaienc_h264 ! fakesink; and for both I use v4l2src ! vmd ! dmaienc_h264 ! fakesink.  In my test script, I run the vmd only command line, then the H.264 command line 3 times, then the vmd command line again.  It always fails on the first vmd command line after doing H.264.

    I have attached logs with CE_DEBUG=3.  I have grown used to looking at these logs, I don't mind capturing them.  There is one log of VMD working successfully (without crashing or hanging), one log of H.264 working successfully, and one log of VMD causing the DSP MMU fault:
    DSP MMU Error Fault!  MMU_IRQSTATUS = [0x1]. Virtual DSP addr reference that generated the interrupt = [0xbebebeb8].

    I do use external scratch memory, and external persistent memory via memTabs.  The scratch memory is initialized and used only from within the process function.  If I stub out the control, and activate/deactivate functions, it still crashes.


    Thanks,

    Dennis

    4034.celogs.zip

    (Per Prateek's request, I've also included the codec.cfg, server.cfg, and server.tcf files for the codec server):
    0511.cfgs.zip

  • Hi Dennis,


    What is CE version you are using? Found that the error DSP MMU Error Fault!  MMU_IRQSTATUS = [0x1]. Virtual DSP addr reference that generated the interrupt = [0xbebebeb8] could be part of some CE issues in older version with regards to supporting IUNIVERSAL

    http://tiexpressdsp.com/index.php/Codec_Engine_Known_Issues


    Also, check out thiswhy MMU Faults can happen -

    http://wiki.omap.com/index.php/DSP_MMU_Faults


    Prateek


     

  • Prateek Bansal said:
    What is CE version you are using?

    We're using Codec Engine version 2.25.02.11.

  • The trace logs show CE 2.25.02.11.  I don't think it's the IUNIVERSAL issue.

    The MMU fault does concern me.  It says someone on the DSP is trying to access the address 0xbebebeb8 - obviously not a valid address.  The thing that most interests me is that stacks are initialized to 0xbebebebe.  So that addr seems like a bad ptr dereference.

    Stacks look ok as best I can tell.  From the successful runs, the ARM-side logs report only 15% of the codec stacks were used.

    It sure would be nice to attach to the DSP with CCS, set a breakpoint on the DSP-side and step through this...

    Chris

  • It seems to me that the error:

    DSP MMU Error Fault! MMU_IRQSTATUS = [0x1]. Virtual DSP addr reference that generated the interrupt = [0xbebebeb8].

    must've occurred  when some memory containing the value 0xbebebeb8 was treated like a pointer and dereferenced.

     

  • A sanity check you might do - can you run with CE_CHECK=1?  I'm kinda wondering about some of the buffer descriptors.

    Chris

  • Chris Ring said:
    The trace logs show CE 2.25.02.11.  I don't think it's the IUNIVERSAL issue.

    I concur, though the symptoms do sound very similar.
    Chris Ring said:
    It sure would be nice to attach to the DSP with CCS, set a breakpoint on the DSP-side and step through this...

    JTAG is not working on our boards.  I just asked our hardware guys how hard it would be to get it working.  They're looking into it but it may be 5 days or more before it's ready. For now, I'm going to attempt to replicate the problem on the OMAP EVM Rev. G board.

     

    Dennis

  • Chris Ring said:
    A sanity check you might do - can you run with CE_CHECK=1?

    I don't see any differences in the logs.

    Dennis

    6560.cedebugchk.zip

     

  • Chris,

    We've successfully attached CCS (v3.3.82.13) to the Mistral OMAP EVM and I am able to set breakpoints and step through the code.  Do you have any suggestions for where to set my breakpoints? I have tried putting it in my algorithm's alg*() functions and came to the conclusion that the Algorithm_create() appears to succeed, but none of my breakpoints are hit after that and before the DSP MMU error fault.  Is there a way to have the debugger break on that error fault so I could see the call stack and other useful debugging data?

    Thanks,

    Dennis

  • Great to hear.

    Dennis Estenson said:
    Do you have any suggestions for where to set my breakpoints?

    There's some similar debugging described in this thread:  http://e2e.ti.com/support/embedded/f/353/p/70176/254585.aspx

    I'd like to see what happens on the DSP side after the ARM sends the control() msg.  The DSP-side breakpoint is likely in the NODE fxns described in the other thread.  If we receive the DSP-side MSGQ_get() but aren't getting to your alg fxns, it may be something in the framework between the DSP-side MSGQ_get() and your alg's control() call.

    Chris

  • Hi Chris,

    I was able to put a breakpoint at the beginning of NODE_EXECFXN() function, but it is unable to step from there.  It's as if that code that implements that function is in release mode.

    I was also able to fix one bug that was definitely causing some problems.  Unfortunately there must be more, because it still crashes in the same way.

    Thanks,

    Dennis

  • Cool.  At the very end of the thread I pointed you at were instructions on ensuring 'debug' libraries are being linked into the server.  Can you double check that you're configuring those into your build?
     
    And/or perhaps you can set a breakpoint in VIDANALYTICS_control() (which is also called on the DSP-side) and see if you get that far?
     
    Chris
  • I was missing

    xdc.useModule('ti.sdo.ce.global.Settings').profile = "debug";
    xdc.useModule('ti.sdo.fc.global.Settings').profile = "debug";

    in my codec.cfg file.  Now I have much more fully-featured debugging capabilities.  Unfortunately, now that debugging works so well, the crash didn't happen the first time I ran it.  I will keep trying.


    Hopefully I can get this problem nailed down before the webex conference.

    Thanks,

    Dennis

  • After 5-6 runs, it appears that the problem occurs much less often with all of the debugging enabled.  It only crashed once, and I wasn't able to gather anything useful from that time.  I am continuing to try to replicate the problem.  What does the fact that the crash happens less often with full debugging enabled tell you?

    Thanks,

    Dennis

  • Woohoo, I've got some new data.  It appears that in call() in vidanaytics_skel.c, in the _VIDANALYTICS_CCONTROL case, the crash occurs when attempting to "invalidate data buffers".  I have a breakpoint at the first for loop in that case and a breakpoint at the call to VIDANALYTICS_control, and it crashed between the two.  From my visual inspection of the structure pointed to by pStatus, it appears that it is (largely or completely) uninitialized.  It seems to be invalidating random memory.

    Dennis

  • For the archives, this was found to be a bug in the DSP-side vidanalytics_skel.c file - exactly as Dennis described.  Good hunting, Dennis!

    I've filed SDOCM00075995 to track this, and we'll plan to release a fix in an upcoming CE 2.26.02 patch release.  An initial characterization of the bug is that it affects both VIDANALYTICS_process() and VIDANALYTICS_control() calls when the in/out data buffer arrays have zero buffers (e.g. control()'s status->data.numBufs == 0).

    Thanks for bringing it to our attention.

    Chris