DM368 H264 encoder version 2.10.00.07 hangs after repeated profile add and delete

Hi,

After repeatedly adding and deleting an H264 profile (approximately 30-35 times), the application hangs with the following message.

*** glibc detected *** ./mainServer: corrupted double-linked list: 0x003d4fd8 ***

While debugging the source code, we traced the failure to the

VIDENC1_delete(m_pHandle); call, or sometimes to

Buffer_delete(m_pOutBuffer), which is called inside our H264 instance close function.

What could cause this issue?

  • Hi Sujit,

    To reproduce the issue I ran the following experiments, but I did not see any error.

    1: On the demo application (video capture --> encode --> file write)

    loop start {

    encoder create

    buftab create

    buftab delete

    encoder delete.

    }loop end

    I ran the above loop 250 times and did not see any error.
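The create/delete loop described above could be written as a small harness that also checks that every create is balanced by a delete. This is only a sketch: `FakeEncoder`, `FakeBufTab` and the four helper functions are hypothetical stand-ins (in a real test they would wrap calls such as VIDENC1_create/VIDENC1_delete and the DMAI BufTab functions), so the code runs without any TI libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for the real encoder and buffer-table objects.
// They only allocate some heap memory and count live objects.
struct FakeEncoder { void* state; };
struct FakeBufTab  { void* bufs; };

static int g_live = 0;  // objects created and not yet deleted

FakeEncoder* encoder_create() {
    FakeEncoder* e = new FakeEncoder{std::malloc(64)};
    ++g_live;
    return e;
}

void encoder_delete(FakeEncoder* e) {
    std::free(e->state);
    delete e;
    --g_live;
}

FakeBufTab* buftab_create(std::size_t count, std::size_t size) {
    FakeBufTab* t = new FakeBufTab{std::malloc(count * size)};
    ++g_live;
    return t;
}

void buftab_delete(FakeBufTab* t) {
    std::free(t->bufs);
    delete t;
    --g_live;
}

// Runs the create/delete cycle and returns how many iterations
// ended with objects still alive (0 means every cycle was balanced).
int run_stress_loop(int iterations) {
    int unbalanced = 0;
    for (int i = 0; i < iterations; ++i) {
        FakeEncoder* enc = encoder_create();
        FakeBufTab* tab = buftab_create(4, 1024);
        buftab_delete(tab);    // release in reverse order of creation
        encoder_delete(enc);
        if (g_live != 0) {
            ++unbalanced;
        }
    }
    return unbalanced;
}
```

Calling `run_stress_loop(250)` mirrors the 250-iteration run mentioned above. A balance check like this does not detect heap corruption by itself, but it quickly shows whether a teardown path skips a resource.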

    2: On a standalone test app

    encoder create -->process_frame (5 frame process calls) --> encoder delete.

    I ran the above loop 50 times and did not get any error.

    Also, there is no change in version 02.10.00.07 compared to 02.10.00.06 that would cause a 'glibc' error or any other error.

    Can you please provide more information on how you are doing the create and delete? Which components are you cleaning up?

    Which components are you using (DMAI, etc.)? Also, please provide a CE_DEBUG=3 level log using the debug library.

  • Hi Veeranna,

    I have attached the CE_DEBUG=3 log.

    I want to share the source code for create and delete. Can you please send your TI ID to my address, sujit.mahapatro@gmail.com, or send me a friend request so that I can share the source?

    I cannot post it here because of our company's confidentiality policy.

  • Hi Sujit,

    In the log I see that you are also using the face detector and audio. Can you turn them off? Can you set up a simple loop that only does create and delete?

    You said that it sometimes hangs in VIDENC1_delete, but I did not see any VIDENC1_delete call in the log. I was expecting a VIDENC1_delete printf just before the glibc error. Your close calls look simple: they delete video and audio. To simplify further, please put the create and delete calls in a loop and send the logs. Please use the codec debug library.

  • Hi Veeranna,

    Yashwant has already tested a simple case, and he clearly mentioned that this problem does not happen there.

    We are facing this problem in a stress test, and it does not always occur. Sometimes it appears after continuously adding and deleting the H264 profile 30-40 times, and sometimes after 76 times or so.

    So we cannot expect a simple process, which Yashwant has already tested, to expose this problem.

    Also, as we are already in the final stage of production, we want this to be tested in our environment.

    Also, as you suggested, I had already commented out audio and MJPEG before taking the CE_DEBUG log. The face detection instance is running, but it should have no effect because the smart codec features have not been enabled in the web browser.

  • Hi Veeranna,

    I just checked with version 2.10.00.06, where a similar problem occurs, but version 2.00.00.09 does not have this problem.

    This kind of testing was not done earlier; these tests are being conducted now that the product is at the production stage.

    *** glibc detected *** ./mainServer: corrupted double-linked list: 0x003c7978 ***

  • Hi Sujit,

    Version 02.00.00.09 has a memory leak problem, which is fixed in 02.00.00.13. It would be helpful to have more debug logs from your system.

  • Hi,

    I am a little confused by your last message.

    Do you want me to check the same thing with 2.00.00.13?

    If yes, I cannot find that version on the codecs release website.

    http://software-dl.ti.com/dsps/dsps_public_sw/codecs/DM365/index_FDS.html

  • Hi Sujit,

    I expected 02.00.00.09 to fail on multiple create and delete cycles. Version 13 was superseded by the 02.10 encoder, so it is not on the web. Please run with the 02.10.00.07 codec debug library at debug level 3 and send us the log.

  • Sujit,

    Where is the following error being reported from? Is it coming from application code or from inside FC/CE? I feel it is an application error.

    Also, what are the useCache settings in your <application>.cfg file? Are you using cached buffers for memTabs? Can you try increasing the CMEM pool sizes for the specific memTab that resulted in the error?

    sujit mahapatro said:

    *** glibc detected *** ./mainServer: corrupted double-linked list: 0x003c7978 ***

    Regards,

    Anshuman

  • Hi Anshuman,

    This message is coming from the application, while deleting the encoder buffers.

    We perform the following 3 steps when deleting the encoder profile.

      m_subEncoder[runtimeId]->close();
     
      delete m_subEncoder[runtimeId];
      // Anshuman: the glibc error always appears here, after "delete m_subEncoder" and before "m_subEncoder = NULL"
      m_subEncoder[runtimeId] = NULL;

    and inside close() we do the following:

    ---------------------------------------

    VIDENC1_delete(m_pHandle);

    Buffer_delete(m_pOutBuffer);
    m_pOutBuffer = 0;

    delete m_pOutArgs;
    m_pOutArgs = 0;

    delete m_pInArgs;
    m_pInArgs = 0;

    delete m_pStatus;
    m_pStatus = 0;

    delete m_pDynParams;
    m_pDynParams = 0;

    delete m_pParams;
    m_pParams = 0;

    m_pEngine = NULL;

    --------------------------------
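The listing above resets every pointer after releasing it except m_pHandle. Whether that matters depends on code the thread does not show (for example, whether the destructor calls close() again), so the following is only a sketch of a teardown that tolerates being called more than once. `FakeCodec`, `FakeBuffer`, `fake_codec_delete` and `fake_buffer_delete` are hypothetical stand-ins for the codec handle, the DMAI buffer and their delete functions; they just count calls.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the codec handle and output buffer.
struct FakeCodec {};
struct FakeBuffer {};

static int g_codecDeletes = 0;
static int g_bufferDeletes = 0;

void fake_codec_delete(FakeCodec* h)   { ++g_codecDeletes;  delete h; }
void fake_buffer_delete(FakeBuffer* b) { ++g_bufferDeletes; delete b; }

struct Params { int width = 0; int height = 0; };

class SubEncoder {
public:
    SubEncoder()
        : m_pHandle(new FakeCodec),
          m_pOutBuffer(new FakeBuffer),
          m_pParams(new Params) {}

    // The destructor can call close() safely even after an explicit close().
    ~SubEncoder() { close(); }

    // Copying would duplicate the raw pointers, so it is disabled.
    SubEncoder(const SubEncoder&) = delete;
    SubEncoder& operator=(const SubEncoder&) = delete;

    void close() {
        if (m_pHandle) {
            fake_codec_delete(m_pHandle);
            m_pHandle = nullptr;      // reset so a second close() is a no-op
        }
        if (m_pOutBuffer) {
            fake_buffer_delete(m_pOutBuffer);
            m_pOutBuffer = nullptr;
        }
        m_pParams.reset();            // unique_ptr frees once, then holds nullptr
    }

private:
    FakeCodec* m_pHandle;
    FakeBuffer* m_pOutBuffer;
    std::unique_ptr<Params> m_pParams;
};
```

With this shape, calling close() explicitly and then deleting the object releases each resource exactly once. This illustrates a defensive pattern only; it is not a diagnosis of the failure reported here.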

    I will let you know the application.cfg settings after I reach the office, in about 3 hours.

    Do you mean that the address 0x003c7978 mentioned in the glibc error is related to a memTab buffer in the CMEM pool?

  • Hi Anshuman,

    I have attached the application.cfg file.

    application.cfg
  • Hi Anshuman,

    For now we have implemented a workaround that reduces how often this problem occurs: we adjusted the timing of memory handling, mainly by closing the memory handling part in advance, before the codec is closed.

    We suspect that incomplete memory cleanup of the previous codec instance is causing this problem.

    Can you explain in detail how memory operations are terminated when the codec is closed?

    We are still not sure exactly which part of the codec is causing the problem.

  • Sujit,

    - The codec requests buffers using memTab[]. The actual allocation happens inside the application. We verified that all the pointers we receive at create time are given back to the application.

    - Is it possible to run the app with CE_DEBUG=3? Such memory-related issues can easily be found there.

    - Can you please try the setting below and see if it helps? I don't think this is the problem, but it is worth trying.

    ------------------------

    var OSAL_SETTINGS = xdc.useModule('ti.sdo.ce.osal.linux.Settings');

    OSAL_SETTINGS.maxCbListSize = 200;

    -------------------------

     

    regards

    Yashwant

  • Let me try your suggestion above and get back to you.

    Also, when running with CE_DEBUG=3, the camera reboots after the debug messages, so you will not be able to find the glibc error message in the log.

    I have already shared the CE_DEBUG level 3 log for this in this thread.

    If you want, I can share it again.

  • Hi Yashwant,

    The following change does not help.

    var OSAL_SETTINGS = xdc.useModule('ti.sdo.ce.osal.linux.Settings');
    OSAL_SETTINGS.maxCbListSize = 200;

    As you suggested, with the above change I first captured the error message, then ran again with CE_DEBUG=3 and captured both in the same log.

    The h264venc_ti_arm926_debug.a library is currently being used for debugging.

  • Sujit,

    I am not able to see the codec debug logs. Can you use h264venc_ti_arm926_debug.a and give us the log from before the glibc error? You should see a lot of "CODEC_DEBUG_ENABLE" entries in the log file.

    regards

    Yashwant

  • Yashwant,

    I don't see the CODEC_DEBUG_ENABLE message in the log,

    even though h264enc has been built with h264venc_ti_arm926_debug.a.

    After building h264enc inside the dvsdk_2_10_01_18\dm365_codecs_01_00_06\packages\ti\sdo\codecs\h264enc\apps\client\build\arm926 folder,

    the DVSDK was rebuilt with a complete clean and make.

    The *.ko files have also been updated inside the NFS directory.

    I don't know why I am not getting the right debug log.

    For your reference, I have attached the h264enc package that I built and am using in the DVSDK.

    h264enc.rar
  • Yashwant,

    I am waiting for your reply on solving this issue.

    As I mentioned, with CE_DEBUG=3 I am not getting a log with CODEC_DEBUG_ENABLE.

    I have also shared the codec package with you.

    How can we proceed further?

    Without a log with the correct debug output enabled, you also cannot find the actual cause.

  • Hi Sujit,

    To build the application with the codec debug library, you need to rename h264venc_ti_arm926_debug.a to h264venc_ti_arm926.a. Have you done that? The package you attached (\h264enc\lib\*) still contains the h264venc_ti_arm926_debug.a library. Please back up the h264venc_ti_arm926.a library, rename h264venc_ti_arm926_debug.a to h264venc_ti_arm926.a, and build the application.

  • Veeranna,

    How does it matter whether I rename the file in the lib folder or change the makefile directly?

    If you look into the makefile inside "h264enc\apps\client\build\arm926", I have changed the library name from h264venc_ti_arm926.a to h264venc_ti_arm926_debug.a:

    ALG_LIB1 = ../../../../lib/h264venc_ti_arm926_debug.a

    ALG_LIB2 = ../../../../lib/h264v_ti_dma_dm365.a

    So it picks up the same debug library that renaming would have selected.

  • Hi,

    Yes, you can change it in the makefile as well; I just had a doubt about it. I am not sure why you are not getting the debug prints. We do see them when we do that. It would really help to get debug prints from the codec.

  • Hi Veeranna, Yashwant,

    I also tried what Veeranna suggested. Both approaches should be equivalent, but I tried it anyway, and the result is the same.

    I have attached the log.

  • Oh, sorry Veeranna,

    CODEC_DEBUG_ENABLE does appear in the log after doing what you said.

    Very sorry for the previous wrong message; I checked the older log by mistake. I don't know why just renaming the library worked while changing the makefile did not.

    So Yashwant, the putty_4.log attached to the previous message has the DEBUG messages you asked for.

    Will it help you find the possible cause?

    Please let me know if you need any more information for the analysis.

  • Hi Yashwant,

    I have attached the DEBUG log putty_5.log, in which I captured the glibc error at run time while continuously adding and deleting the H264 profiles.

    I hope this log will lead us to the possible cause.

  • Hi Sujit,

    We went through the logs.

    1. In the previous post, you said that the glibc error appears while doing the codec delete. But in the above trace, I see it appearing just after control goes to the HDVICP. Are we missing something? Or is it because the glibc print comes from the kernel and gets displayed on screen before the scheduled codec prints?

    2. Does your app do create and delete in a different thread than the process call? It looks like it. Is it possible to have them in the same thread? We want the process call and codec create not to happen concurrently.
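If create, delete and process cannot all be moved onto one thread, one common way to keep them from overlapping is to guard all three with the same lock. The sketch below illustrates that idea only; `GuardedEncoder` is a hypothetical wrapper, and the boolean and counter stand in for the real codec create, delete and process calls.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper: every entry point takes the same mutex, so
// create/destroy can never run while a process call is in progress.
class GuardedEncoder {
public:
    void create() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_created = true;          // stand-in for the real codec create
    }

    void destroy() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_created = false;         // stand-in for the real codec delete
    }

    // Returns false instead of touching a codec that does not exist.
    bool process() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_created) {
            return false;
        }
        ++m_frames;                // stand-in for the real process call
        return true;
    }

    int frames() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_frames;
    }

private:
    mutable std::mutex m_mutex;
    bool m_created = false;
    int m_frames = 0;
};
```

A capture thread can then call process() while a control thread calls create() and destroy(); a process() call that arrives after destroy() simply returns false. The cost is that a long process call delays a pending delete, which is usually acceptable for a profile add/delete path.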

    regards

    Yashwant

     

  • Hi Yashwant,

    Both create and delete happen in the same thread, so there is no possibility of process and create happening concurrently.

    As you guessed, the kernel message is printed at that position in the log.

    However, I debugged the failure point and it occurs after the delete.

    How can I share the source code with you, so that you can see how our delete is done?

  • Sujit,

    I am not sure whether this post is related to your problem, but I thought I would bring it up:

    http://e2e.ti.com/support/embedded/f/354/p/63547/229656.aspx#229656

    This is the part of that post that I would like you to try.

    Can you try changing this line in Buffer_create():

            hBuf->physPtr = Memory_getBufferPhysicalAddress(hBuf->userPtr,
                                                            4, NULL);

    to read:

            hBuf->physPtr = Memory_getBufferPhysicalAddress(hBuf->userPtr,
                                                            size, NULL);

    and recompiling DMAI?

  • Hi Anshuman,

    Thanks for your finding. The problem does not occur with the change suggested above.

    But I am still not able to understand how the size parameter helps here.

    If possible, please explain so that I can understand.
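The thread does not record an answer to this question. One possible reading, and it is only an assumption, is that the Codec Engine keeps a list of known contiguous buffers keyed by address and size (the maxCbListSize setting suggested earlier in the thread hints at such a list). If a buffer is first registered with a size of 4 bytes, a later lookup for the full buffer would not be covered by that entry, and freeing the full-size buffer would leave the 4-byte entry behind. The toy model below illustrates that effect. It is not Codec Engine or DMAI source; `ContigList`, `touch`, `release` and `stale_after_cycles` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: a list of buffers keyed by start address and size.
struct Entry {
    std::uintptr_t virt;
    std::size_t size;
};

class ContigList {
public:
    // Look up [virt, virt + size); register it if no entry covers the range.
    void touch(std::uintptr_t virt, std::size_t size) {
        for (const Entry& e : m_entries) {
            if (virt >= e.virt && virt + size <= e.virt + e.size) {
                return;  // already covered by a known buffer
            }
        }
        m_entries.push_back({virt, size});
    }

    // Remove the first entry whose address and size match exactly.
    void release(std::uintptr_t virt, std::size_t size) {
        for (std::size_t i = 0; i < m_entries.size(); ++i) {
            if (m_entries[i].virt == virt && m_entries[i].size == size) {
                m_entries.erase(m_entries.begin() + i);
                return;
            }
        }
    }

    std::size_t count() const { return m_entries.size(); }

private:
    std::vector<Entry> m_entries;
};

const std::size_t kBufSize = 4096;  // assumed real size of each buffer

// Simulates create/use/free cycles and returns the entries left behind.
// registeredSize is the size passed when the buffer is first looked up.
std::size_t stale_after_cycles(std::size_t registeredSize, int cycles) {
    ContigList list;
    for (int i = 0; i < cycles; ++i) {
        // Assume each cycle gets a different address.
        std::uintptr_t addr = 0x1000 + static_cast<std::uintptr_t>(i) * kBufSize;
        list.touch(addr, registeredSize);  // lookup at buffer creation
        list.touch(addr, kBufSize);        // later lookup with the full size
        list.release(addr, kBufSize);      // free with the full size
    }
    return list.count();
}
```

In this model, `stale_after_cycles(4, 30)` leaves 30 entries behind, while `stale_after_cycles(kBufSize, 30)` leaves none. Whether the real library behaves this way, and whether leftover entries explain the glibc error, is not confirmed anywhere in the thread.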