TDA4VM: OpenVX vxMapArrayRange overhead

Part Number: TDA4VM


Hi TI Team,

I am currently trying to integrate the TI OpenVX vision functions into my application to reduce CPU usage on the host Arm.

The integration itself is quite straightforward, but I found that the I/O overhead takes up a large share of the cost.

For example, if I run the OpenCV FAST or Harris function on the Arm, it uses around 3% of the Arm CPU.

If I run the same kernels on the DSP with OpenVX, they use around 6% of the DSP, but they also add about 3% of Arm load for the I/O done through the vxMapArrayRange function.

(I call vxMapArrayRange / vxUnmapArrayRange after every run of the FAST graph.)

// Number of keypoints produced by the last graph run.
vxQueryArray(vxArr_kp_corners_, VX_ARRAY_NUMITEMS, &num_items1, sizeof(num_items1));

vx_size index_vxarray_start = 0;
vx_size index_vxarray_end = num_items1; // end of the range is exclusive
vx_map_id map_id;
vx_size stride = 0;
char *point_array = 0;

if (num_items1 > 0)
{
    // Map the array written by the DSP into host-accessible memory.
    vx_status status = vxMapArrayRange(vxArr_kp_corners_, index_vxarray_start, index_vxarray_end, &map_id, &stride, (void **)&point_array, VX_READ_ONLY, VX_MEMORY_TYPE_HOST, 0);

    if (status == VX_SUCCESS)
    {
        // Copy the items from the mapped range into the std::vector.
        kps_vx_.resize(num_items1);
        for (vx_size i = 0; i < num_items1; i++)
        {
            const vx_keypoint_t *keypoint = (const vx_keypoint_t *)point_array;
            point_array += stride; // advance by the stride reported by the map call
            auto &cvpt = kps_vx_[i];
            cvpt.pt.x = keypoint->x;
            cvpt.pt.y = keypoint->y;
        }

        vxUnmapArrayRange(vxArr_kp_corners_, map_id);
    }
}
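
To make it easier to separate the cost of the copy loop from the cost of the map/unmap calls, here is a minimal host-only sketch of the same strided copy. It does not use the OpenVX headers: `KeypointStandIn` is a hypothetical stand-in that I assume mirrors the field layout of `vx_keypoint_t`, `Point2f` is a hypothetical stand-in for the x/y part of the OpenCV keypoint, and `copy_keypoints` is a name made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for vx_keypoint_t (assumed field layout), so that
// this sketch compiles without the OpenVX headers.
struct KeypointStandIn {
    std::int32_t x;
    std::int32_t y;
    float strength;
    float scale;
    float orientation;
    std::int32_t tracking_status;
    float error;
};

// Hypothetical stand-in for the x/y part of an OpenCV keypoint.
struct Point2f {
    float x;
    float y;
};

// Walk a buffer of keypoints item by item using a byte stride and copy only
// the x/y coordinates, as the loop in the snippet above does.
std::vector<Point2f> copy_keypoints(const void *base, std::size_t stride,
                                    std::size_t num_items)
{
    std::vector<Point2f> out(num_items);
    const unsigned char *cursor = static_cast<const unsigned char *>(base);
    for (std::size_t i = 0; i < num_items; ++i) {
        const KeypointStandIn *kp =
            reinterpret_cast<const KeypointStandIn *>(cursor);
        out[i].x = static_cast<float>(kp->x);
        out[i].y = static_cast<float>(kp->y);
        cursor += stride; // the stride may be larger than sizeof(KeypointStandIn)
    }
    return out;
}
```

Calling `copy_keypoints(buffer, sizeof(KeypointStandIn), count)` on an ordinary host array and timing it would show how much of the Arm load comes from the loop itself rather than from vxMapArrayRange / vxUnmapArrayRange.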

So in this situation there is no benefit to using OpenVX.

Do you have any idea or trick to remove this overhead?

Or can I just copy the detected results without calling vxMapArrayRange / vxUnmapArrayRange?