TDA4VM: OpenVX vxMapArrayRange overhead

Junhee Lee55

Part Number: TDA4VM

Hi TI Team,

I am currently trying to integrate TI OpenVX vision functions into my application to reduce CPU usage on the host Arm.

The integration itself is quite straightforward, but I found that the I/O overhead takes up a large portion of the cost.

For example, if I run the FAST or Harris OpenCV function on the Arm, it takes around 3% of the Arm CPU. If I run those on the DSP with OpenVX, it uses DSP resources (around 6%), but it also adds about 3% of Arm CPU usage for I/O with the vxMapArrayRange function.

(I call vxMapArrayRange / vxUnmapArrayRange every time after the FAST graph runs.)

vxQueryArray(vxArr_kp_corners_, VX_ARRAY_NUMITEMS, &num_items1, sizeof(num_items1));
vx_size index_vxarray_start = 0;
vx_size index_vxarray_end = num_items1;
vx_map_id map_id;
vx_size stride = 0;
char *point_array = 0;
if(num_items1 > 0)
{
    // Map the array written by the DSP into host-accessible memory.
    vx_status status = vxMapArrayRange(vxArr_kp_corners_, index_vxarray_start, index_vxarray_end, &map_id, &stride, (void**)&point_array, VX_READ_ONLY, VX_MEMORY_TYPE_HOST, 0);
    if(status == VX_SUCCESS)
    {
        // Copy the items from the mapping into the std::vector.
        kps_vx_.resize(num_items1);
        for(vx_size i = 0; i < num_items1; i++)
        {
            // Advance by the stride reported by the mapping, not by sizeof(vx_keypoint_t).
            vx_keypoint_t *keypoint = (vx_keypoint_t*)point_array;
            point_array += stride;
            auto & cvpt = kps_vx_[i];
            cvpt.pt.x = keypoint->x;
            cvpt.pt.y = keypoint->y;
        }
        vxUnmapArrayRange(vxArr_kp_corners_, map_id);
    }
}

So there is no benefit to using OpenVX in this situation.

Do you have any ideas or tricks to remove this overhead?

Can I just copy the detected results without vxMapArrayRange / vxUnmapArrayRange?

over 4 years ago

Brijesh Jadav over 4 years ago

TI Guru, 484325 points

Hi,

I doubt that mapping the array is taking a lot of time. You are mapping it read-only anyway, so it should not take that long. Could you please profile this API and see how much time it takes?
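
A minimal sketch of how such profiling might look, assuming a plain wall-clock timer around each step is good enough for a first pass. The names SectionProfile, timed and report are hypothetical helpers, not part of OpenVX or the TI SDK, and the OpenVX calls appear only as comments so that the sketch compiles without the SDK headers. Timing the map, the copy loop and the unmap separately should show which of the three actually accounts for the Arm load.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <utility>

// Hypothetical helper: accumulates wall-clock time over repeated
// executions of one code section (e.g. one call per frame).
struct SectionProfile {
    std::uint64_t calls = 0;
    double total_us = 0.0;
    double max_us = 0.0;

    // Mean time per call in microseconds; 0 if nothing was recorded.
    double mean_us() const {
        return calls ? total_us / static_cast<double>(calls) : 0.0;
    }
};

// Runs fn once and adds its elapsed wall-clock time to profile.
template <typename Fn>
void timed(SectionProfile& profile, Fn&& fn) {
    const auto t0 = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto t1 = std::chrono::steady_clock::now();
    const double us =
        std::chrono::duration<double, std::micro>(t1 - t0).count();
    profile.calls += 1;
    profile.total_us += us;
    if (us > profile.max_us) {
        profile.max_us = us;
    }
}

// Prints one summary line for a section.
void report(const char* name, const SectionProfile& p) {
    std::printf("%s: calls=%llu mean=%.2f us max=%.2f us\n", name,
                static_cast<unsigned long long>(p.calls), p.mean_us(),
                p.max_us);
}

// Intended use inside the per-frame code from the original post
// (shown as comments; the arguments are the ones used there):
//
//   static SectionProfile map_prof, copy_prof, unmap_prof;
//   timed(map_prof,   [&] { vxMapArrayRange(vxArr_kp_corners_, /* ... */); });
//   timed(copy_prof,  [&] { /* loop copying keypoints into kps_vx_ */ });
//   timed(unmap_prof, [&] { vxUnmapArrayRange(vxArr_kp_corners_, map_id); });
//
//   // Every N frames:
//   report("map", map_prof);
//   report("copy", copy_prof);
//   report("unmap", unmap_prof);
```

Note that this measures elapsed time per call, not CPU utilization, so the numbers need to be multiplied by the frame rate before comparing them with the 3% figure.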

Regards,

Brijesh
