How can I accelerate the speeds of "VLIB_mixtureOfGaussiansS32"?

Dabo Guo

Hi,

I am using a DM6437 to track a moving object, for example an aeroplane, but the function "VLIB_mixtureOfGaussiansS32" takes too long: more than 40 ms. That is too slow for controlling an autonomous plane. How can I speed up "VLIB_mixtureOfGaussiansS32"?

Thanks

Dabo Guo

over 14 years ago

0 Viet Dinh over 14 years ago

TI__Genius 15310 points

Dabo,

Could you let us know your video resolution and which DM64x part you are using?

Assuming you are using a 720x480 image and a part with a 400 MHz DSP, the numbers seem about right.

If you look at the VLIB offering, there are two Gaussian modeling functions, VLIB_mixtureOfGaussiansS32 and VLIB_mixtureOfGaussiansS16. I believe the two functions are basically the same, but the 32-bit function provides higher accuracy for the mean and variance of the Gaussian model. Are you using the 16-bit function? The performance difference between the two functions is 8 cycles/pixel, which could be a significant improvement (about 7 ms) for a 720x480 image. Other potential improvements can be achieved by optimizing memory bandwidth, using DMA to move data from external to internal memory.
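As a rough illustration of the DMA idea, the sketch below processes a frame in blocks with two alternating (ping-pong) input buffers. It is only a sketch under assumptions: BLOCK_PIXELS, copy_in, process_in_blocks and threshold_kernel are hypothetical names, not VLIB or EDMA APIs, and memcpy stands in for the DMA transfer, so nothing actually overlaps here. On the DSP the two buffers would be placed in internal memory and the copy would become an asynchronous EDMA transfer that runs while the kernel works on the other buffer. For the mixture-of-Gaussians functions the per-pixel model state (means, variances, weights) would have to be moved in and out block by block in the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_PIXELS 1024 /* hypothetical block size; tune to fit internal RAM */

/* Hypothetical per-block kernel signature standing in for the real processing. */
typedef void (*block_kernel_fn)(const unsigned char *in, unsigned char *out, size_t n);

/* Stand-in for a DMA transfer (assumption): on the DSP this would be an
 * asynchronous EDMA submit, with a wait before the buffer is used. */
static void copy_in(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Illustrative kernel only: simple threshold, not the Gaussian mixture model. */
void threshold_kernel(const unsigned char *in, unsigned char *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = (unsigned char)(in[i] > 128 ? 255 : 0);
    }
}

/* Process 'total' pixels block by block using two alternating input buffers. */
void process_in_blocks(const unsigned char *ext_in, unsigned char *ext_out,
                       size_t total, block_kernel_fn kernel)
{
    /* On the DSP these would live in internal memory (assumption). */
    static unsigned char in_buf[2][BLOCK_PIXELS];
    static unsigned char out_buf[BLOCK_PIXELS];
    size_t offset = 0;
    size_t n = total < BLOCK_PIXELS ? total : BLOCK_PIXELS;
    int cur = 0;

    assert(ext_in != NULL && ext_out != NULL && kernel != NULL);
    if (total == 0) {
        return;
    }

    copy_in(in_buf[cur], ext_in, n); /* prefetch the first block */

    while (offset < total) {
        size_t next_off = offset + n;
        size_t next_n = 0;

        if (next_off < total) {
            size_t remaining = total - next_off;
            next_n = remaining < BLOCK_PIXELS ? remaining : BLOCK_PIXELS;
            /* With real DMA this fetch would run while the kernel executes. */
            copy_in(in_buf[cur ^ 1], ext_in + next_off, next_n);
        }

        kernel(in_buf[cur], out_buf, n);      /* work on the current block */
        memcpy(ext_out + offset, out_buf, n); /* write results back out */

        offset = next_off;
        n = next_n;
        cur ^= 1; /* swap ping and pong */
    }
}
```

A call such as process_in_blocks(frame, mask, 640 * 576, threshold_kernel) would walk the whole frame. The gain on real hardware comes from the kernel reading internal memory while the next block is fetched in the background.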

Regards,

Viet

0 Dabo Guo over 14 years ago in reply to Viet Dinh

Prodigy 100 points

Viet,

Thanks for your answer.

I am using a DM6437 and the image size is 640*576. By your figures, that works out to about 5 ms per frame. However, the VLIB reference guide (2.0) gives measured on-chip memory performance of 31.30 cycles/pixel for VLIB_mixtureOfGaussiansS16 and 39.13 cycles/pixel for VLIB_mixtureOfGaussiansS32. According to the guide, a frame will take 20 ms with the 16-bit function and 25 ms with the 32-bit function. Additionally, I am using the evaluation board; video is captured by the VPFE and displayed by the VPBE.
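The arithmetic behind such estimates is simply pixels times cycles per pixel divided by the clock rate. A minimal sketch follows; frame_time_ms is a hypothetical helper name, and the 600 MHz clock used in the note after it is an assumption, since the clock speed of the part is not stated in this thread.

```c
#include <assert.h>

/* Estimated time in milliseconds to process one frame, given the kernel
 * cost in cycles per pixel and the DSP clock in Hz. This ignores external
 * memory stalls and cache misses, so real figures can be higher. */
double frame_time_ms(int width, int height, double cycles_per_pixel, double clock_hz)
{
    double cycles;

    assert(width > 0 && height > 0 && clock_hz > 0.0);
    cycles = (double)width * (double)height * cycles_per_pixel;
    return cycles / clock_hz * 1000.0;
}
```

With an assumed 600 MHz clock, frame_time_ms(640, 576, 39.13, 600e6) gives about 24 ms and frame_time_ms(640, 576, 31.30, 600e6) gives about 19 ms. The 8 cycles/pixel difference on a 720x480 image at 400 MHz, frame_time_ms(720, 480, 8.0, 400e6), gives about 6.9 ms.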

The main routine is listed as follows:

     FVID_exchange(hGioVpfeCcdc, &frameBuffPtr);
     src = frameBuffPtr->frame.frameBufferPtr;
     VLIB_extractLumaFromUYUV(src, 640, 640, 576, imageData);
     test_gaussian_mixture_models(imageData, currentMeans, currentVars, currentWgts, compIndex, intBuffer, height, width, fgMask);
     test_dilate_and_erode(fgMask, height, width, imageTempData, imageOutputData);
     test_connected_components_labeling(UartHandle, imageOutputData, height, width, primaryBuff1, primaryBuff2, overFlowBuff1, overFlowBuff2, handle, src);
     FVID_exchange(hGioVpbeVid0, &frameBuffPtr);

best regards,

Dabo Guo

0 Dabo Guo over 14 years ago in reply to Dabo Guo

Prodigy 100 points

Hi, all

I am looking forward to your answers.

Video is captured by the VPFE and displayed by the VPBE, so the data is transported by EDMA. I hope VLIB 2.1 is better optimized than VLIB 2.0, so that the speed is greatly improved.

Waiting.......

Dabo Guo
