[ISSUE] DVTB decode speed too slow

 

Hi, experts

1. I'm using dvsdk_3_10_00_19/dvtb_4_20_10 on a DM6467T.

2. I run

    <Target>#./dvtb-r -s h264dec2.dvs

    and find that the decode speed is only about 4 fps (I have already deleted the code that writes the YUV data to a YUV file).

3. Can anyone give some suggestions?

BTW: I'm using NFS and the LAN traffic is only about 300 KB/s, so the network isn't the bottleneck.

 

  •  

    I finally located the issue. It is in

    dvsdk_3_10_00_19/dvtb_4_20_10/Packages/Ti/Sdo/Dvtb/Dm6467/Linux/dvtbVidPlay2.c

    in the function

    DvevmStRetCode
    dvtb_vidDec2WriteOutputData(IVIDEO1_BufDesc *displayBuf, FILE *fOutFile, VIDDEC2_Status *vdec2Status)

    {

     //Init input subPic_buf with  outArgs
     lumaData = displayBuf->bufDesc[0].buf;
     cbcrData = displayBuf->bufDesc[1].buf;

    }

    I changed the code like this:

    DvevmStRetCode
    dvtb_vidDec2WriteOutputData(IVIDEO1_BufDesc *displayBuf, FILE *fOutFile, VIDDEC2_Status *vdec2Status)

    {

     char buff[1920],data[1920];

    //Init input subPic_buf with  outArgs
     lumaData = displayBuf->bufDesc[0].buf;
     cbcrData = displayBuf->bufDesc[1].buf;

    memcpy(buff,lumaData,1920);//code 1

    memcpy(buff,data,1920);//code 2

    }

    and I found that code 1 (copying from lumaData) takes about 10 times longer to execute than code 2 (copying from the local data array).

    Can anyone give a suggestion?
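
For anyone who wants to repeat this comparison with numbers, here is a minimal timing sketch. It is an assumption-laden illustration, not DVTB code: `avg_copy_seconds` and `LINE_BYTES` are hypothetical names, and it uses the standard C `clock()` function, whose resolution is coarse, so a large iteration count is needed for a meaningful average.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define LINE_BYTES 1920  /* same size as the memcpy() calls in the post above */

/*
 * Hypothetical helper: returns the average CPU time in seconds for one
 * memcpy() of `len` bytes from `src` into a local static buffer, or a
 * negative value if the arguments are invalid.
 */
double avg_copy_seconds(const void *src, size_t len, long iterations)
{
    static unsigned char dst[LINE_BYTES];
    volatile unsigned char sink = 0;
    clock_t start, end;
    long i;

    if (src == NULL || len == 0 || len > sizeof(dst) || iterations <= 0)
        return -1.0;

    start = clock();
    for (i = 0; i < iterations; i++) {
        memcpy(dst, src, len);
        /* Read one copied byte so the compiler cannot drop the copy. */
        sink ^= dst[(size_t)i % len];
    }
    end = clock();
    (void)sink;

    return ((double)(end - start) / CLOCKS_PER_SEC) / (double)iterations;
}
```

Calling it once with `lumaData` and once with the local `data` array (both with a length of 1920) inside `dvtb_vidDec2WriteOutputData` should reproduce the comparison between code 1 and code 2 as two averages that can be printed side by side.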

  • xiaoyang,

    I suspect that the lumaData buffer (in code 1) comes from the CMEM region, which may be allocated as non-cached, while the data buffer (in code 2) comes from Linux-managed memory, which I believe is cached. That would explain the large time difference.

    You might check the allocation code for displayBuf->bufDesc[0].buf to see whether a cached allocation improves the speed. I am not sure what side effects making this buffer cached might have.

    Best regards

    Velan

  • Hi, Velan

    Thanks for your reply.

    Yesterday I verified this on another processor, which has an ARM920T core:

        First, I mapped the page of the source buffer as non-cached, non-buffered.

        Second, I mapped the page of the source buffer as cached, write-through.

    The speed was the same in both cases (each time, the code section and the destination buffer were in cached, write-through pages).

    But since the DM6467 has an ARM926 core, it may behave differently. I will look deeper into the code and try it.

  • Hi, Velan

    I just tested with the cache enabled, and memcpy() sped up dramatically;

    the two code paths now run at almost the same speed. I will test whether this introduces any side effects.