[ISSUE]DVTB decode speed too slow

xiaoyang xie

Intellectual 280 points

Hi, experts

1. I'm using dvsdk_3_10_00_19/dvtb_4_20_10 on the DM6467T.

2. I run

<Target>#./dvtb-r -s h264dec2.dvs,

and found that the decode speed is only about 4 fps (I have already deleted the code that writes the YUV data to the output file).

3. Can anyone give some suggestions?

BTW: I'm using NFS and the network load is only 300 KB/s, so this isn't the bottleneck.

over 14 years ago

xiaoyang xie over 13 years ago

Intellectual 280 points

I finally located the issue, in

dvsdk_3_10_00_19/dvtb_4_20_10/Packages/Ti/Sdo/Dvtb/Dm6467/Linux/dvtbVidPlay2.c

in function

DvevmStRetCode
dvtb_vidDec2WriteOutputData(IVIDEO1_BufDesc *displayBuf, FILE *fOutFile, VIDDEC2_Status *vdec2Status)

{

//Init input subPic_buf with outArgs
lumaData = displayBuf->bufDesc[0].buf;
cbcrData = displayBuf->bufDesc[1].buf;

}

I changed the code as follows:

DvevmStRetCode
dvtb_vidDec2WriteOutputData(IVIDEO1_BufDesc *displayBuf, FILE *fOutFile, VIDDEC2_Status *vdec2Status)

{

char buff[1920],data[1920];

//Init input subPic_buf with outArgs
lumaData = displayBuf->bufDesc[0].buf;
cbcrData = displayBuf->bufDesc[1].buf;

memcpy(buff,lumaData,1920);//code 1

memcpy(buff,data,1920);//code 2

}

and I found that code 1 takes about 10 times longer to execute than code 2.

Can anyone give a suggestion?
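A comparison like the one above can be measured with a small timing helper. The following is only a minimal sketch, assuming a Linux target with clock_gettime() available; the names time_line_copies and LINE_BYTES are hypothetical and do not come from the DVTB sources. Calling it once with src pointing at lumaData and once with src pointing at a local array gives the two numbers to compare.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <string.h>
#include <time.h>

#define LINE_BYTES 1920 /* copy size used in the post (hypothetical macro name) */

/*
 * Copies LINE_BYTES from src to dst `repeat` times and returns the
 * elapsed wall-clock time in nanoseconds. Hypothetical helper, not DVTB code.
 */
static long long time_line_copies(char *dst, const char *src, int repeat)
{
    struct timespec t0, t1;
    volatile char sink = 0;
    int i;

    assert(repeat > 0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < repeat; i++) {
        memcpy(dst, src, LINE_BYTES);
        /* Read one copied byte so the compiler cannot drop the copy. */
        sink ^= dst[i % LINE_BYTES];
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    return (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (long long)(t1.tv_nsec - t0.tv_nsec);
}
```

Repeating the copy many times and dividing by the repeat count smooths out timer resolution; a large difference between the two sources points at the memory attributes of the source buffer rather than at memcpy() itself.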

Velan over 13 years ago in reply to xiaoyang xie

TI__Intellectual 2785 points

xiaoyang,

I suspect that the lumaData buffer (in code 1) comes from the CMEM region, which might be allocated as a non-cached buffer, while the data buffer (in code 2) comes from Linux-managed memory, which I think must be cached. That would explain the large time gap.

You might check the allocation code for displayBuf->bufDesc[0].buf to see whether a cached allocation improves the speed. I am not sure what side effects making this buffer cached would have, if that turns out to be required.

Best regards

Velan

xiaoyang xie over 13 years ago in reply to Velan

Intellectual 280 points

Hi, Velan

Thanks for your reply.

Yesterday I checked this on another processor, which has an ARM920T core.

First, I mapped the page of the src buffer as non-cached and non-buffered.

Second, I mapped the page of the src buffer as cached and write-through.

The speed was the same (in both cases, the code section and the dst buffer were in cached, write-through pages).

But since the DM6467 has an ARM926 core, it may behave differently. I will look deeper into the code and try.

xiaoyang xie over 13 years ago in reply to Velan

Intellectual 280 points

Hi, Velan

I just tested with the cache enabled, and memcpy() sped up dramatically;

the two code paths now run at almost the same speed. I will test whether this introduces any side effects.
