
PROCESSOR-SDK-J721E: CaptureNode_pipeline_FPS

Part Number: PROCESSOR-SDK-J721E

Referring to the captureNode in the TI multi-camera demo, we have written a new kernel:

bydCameraTransferNode(vx_graph graph, vx_object_array output)

The Process function on the target side obtains the images of the four cameras through an interface provided by a third-party camera service:

autoCameraManager getOpenvxImage(image_type, isWait, &outType);

where isWait selects whether the call waits (blocks) for a new frame.
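
For reference, here is a minimal sketch of the two dequeue modes (the exact signature belongs to the third-party service; camera_service_get_image is a hypothetical C wrapper around getOpenvxImage):

#include <VX/vx.h>
#include <stdbool.h>

/* Hypothetical C wrapper around the third-party getOpenvxImage() call. */
extern vx_image camera_service_get_image(int image_type, bool is_wait, int *out_type);

void dequeue_example(int image_type)
{
    int out_type = 0;

    /* isWait = true: block until a NEW frame arrives. The caller is paced
     * at the camera frame rate and sleeps in between (low CPU usage). */
    vx_image fresh = camera_service_get_image(image_type, true, &out_type);

    /* isWait = false: return immediately with the latest frame, possibly a
     * repeat of the previous one. The caller must pace itself, otherwise it
     * busy-loops and burns CPU. */
    vx_image latest = camera_service_get_image(image_type, false, &out_type);

    (void)fresh;
    (void)latest;
}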

We built a graph that contains only the bydCameraTransferNode node. We record a start timestamp before enqueueing and an end timestamp after dequeueing to measure the enqueue-to-dequeue latency, and we also count the number of frames executed within one second.

In the same way, we added a start timestamp at the enqueue point and an end timestamp at the dequeue point of the TI multi-camera demo, keeping only the captureNode node, measured the enqueue-to-dequeue latency, and counted the frames executed within one second.

Comparing the two, the enqueue-to-dequeue time of the TI multi-camera demo is consistently about 33 ms, while the enqueue-to-dequeue time of our own node varies widely.
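
For concreteness, this is the measurement pattern we used (a sketch; graph parameter index 0 is arbitrary and assumed to be the graph's input parameter):

#include <VX/vx.h>
#include <VX/vx_khr_pipelining.h>
#include <sys/time.h>
#include <stdio.h>

/* Measure the enqueue-to-dequeue latency of one pipelined graph execution. */
static void measure_once(vx_graph graph, vx_reference input)
{
    struct timeval t0, t1;
    vx_reference done = NULL;
    vx_uint32 num = 0;

    gettimeofday(&t0, NULL);                                    /* start: before enqueue */
    vxGraphParameterEnqueueReadyRef(graph, 0U, &input, 1U);
    vxGraphParameterDequeueDoneRef(graph, 0U, &done, 1U, &num); /* blocks until done */
    gettimeofday(&t1, NULL);                                    /* end: after dequeue */

    long us = (t1.tv_sec - t0.tv_sec) * 1000000L + (t1.tv_usec - t0.tv_usec);
    printf("enqueue -> dequeue: %ld us\n", us);
}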

In addition, the CPU usage of our graph while running is 15%, much higher than that of the TI multi-camera demo. In terms of pipeline scheduling, what logic should be implemented inside the node to reduce the CPU usage?

We are confused about this: what mechanism does captureNode use to keep its timing uniform, and how can we make the timing of our own node equally uniform?
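
A minimal pacing sketch, under assumptions (frame_ready_event is hypothetical, posted by the camera client's receive thread): captureNode's uniform ~33 ms cadence comes from its Process callback blocking until the CSIRX driver signals frame completion, so the sensor paces the graph. Blocking on the frame source in the same way, rather than returning immediately or sleeping a fixed interval with tivxTaskWaitMsecs(), gives uniform timing and near-zero CPU while waiting:

#include <TI/tivx.h>
#include <TI/tivx_event.h>

/* Hypothetical event, posted by the camera client's receive thread each
 * time a complete set of four frames is available. */
static tivx_event frame_ready_event;

/* Called from the Process callback: sleep (no CPU) until the source signals
 * new data. At 30 fps this wakes every ~33 ms, so each graph execution
 * takes the same time. */
static vx_status wait_for_frames(void)
{
    return tivxEventWait(frame_ready_event, TIVX_EVENT_TIMEOUT_WAIT_FOREVER);
}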

byd_camera_transfer_target.c
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/time.h>
#include "VX/vx.h"
#include "TI/tivx.h"
#include "TI/tivx_target_kernel.h"
#include <TI/tivx_task.h>
#include "tivx_kernels_target_utils.h"
#include "byd/byd_camera_transfer.h"
#include "byd_camera_transfer_kernels_priv.h"
#include "byd_camera_transfer_host.h"
#include <utils/mem/include/app_mem.h>
#include "camera_client.h"
#include <common_log.h>
// #include <TI/tivx_mutex.h>
// #include <inttypes.h>

#define NUM_CHANNELS (4U)

#define TIVX_FILEIO_FILE_PATH_LENGTH    (512U)
#define MAX_FNAME                     (256u)

static tivx_target_kernel byd_camera_transfer_target_kernel = NULL;
// static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;


/* Wall-clock timestamps: tv3/tv4 measure per-call latency, tv0 the 1 s FPS window. */
struct timeval now_tv3 = {0};
struct timeval pre_tv3 = {0};
uint32_t tv_ms3;

struct timeval now_tv4 = {0};
struct timeval pre_tv4 = {0};
uint32_t tv_ms4;

struct timeval now_tv0 = {0};
struct timeval pre_tv0 = {0};
uint32_t tv_ms0;
static int kernel_process_cnt = 0;

static uint64_t kernel_process_id = 0;


#if 0
static int file_index = 0;
struct timeval now_tv8 = {0};
struct timeval pre_tv8 = {0};
uint32_t tv_ms6;

static char *app_get_test_file_path() {
    char *tivxPlatformGetEnv(char *env_var);

    #if defined(SYSBIOS)
    return tivxPlatformGetEnv("VX_TEST_DATA_PATH");
    #else
    return getenv("VX_TEST_DATA_PATH");
    #endif
}

#endif

/* Returns the absolute difference between two timestamps in MICROSECONDS
 * (callers divide by 1000 where milliseconds are needed). */
int get_time_diff2(struct timeval *pT1, struct timeval *pT2) {
    int t_us;
    if (pT1->tv_sec >= pT2->tv_sec) {
        t_us = (int)((pT1->tv_sec * 1000000 + pT1->tv_usec) - (pT2->tv_sec * 1000000 + pT2->tv_usec));
    } else {
        t_us = (int)((pT2->tv_sec * 1000000 + pT2->tv_usec) - (pT1->tv_sec * 1000000 + pT1->tv_usec));
    }

    return t_us;
}

vx_status ptkdemo_load_vximage_from_yuvfile(vx_image image, char *filename)
{
    vx_status  vxStatus = (vx_status)VX_SUCCESS;

    vx_rectangle_t             rect;
    vx_imagepatch_addressing_t image_addr;
    vx_map_id                  map_id;
    void                     * data_ptr;
    vx_uint32                  img_width;
    vx_uint32                  img_height;
    vx_df_image                img_format;
    vx_int32                   j;

    FILE *fp= fopen(filename, "rb");
    if(fp==NULL)
    {
        VX_PRINT(VX_ZONE_ERROR, "# ERROR: Unable to open input file [%s]\n", filename);
        return(VX_FAILURE);
    }

    vxQueryImage(image, VX_IMAGE_WIDTH, &img_width, sizeof(vx_uint32));
    vxQueryImage(image, VX_IMAGE_HEIGHT, &img_height, sizeof(vx_uint32));
    vxQueryImage(image, VX_IMAGE_FORMAT, &img_format, sizeof(vx_df_image));

    rect.start_x = 0;
    rect.start_y = 0;
    rect.end_x = img_width;
    rect.end_y = img_height;

    // Copy Luma or Luma+Chroma
    vxStatus = vxMapImagePatch(image,
                               &rect,
                               0,
                               &map_id,
                               &image_addr,
                               &data_ptr,
                               VX_WRITE_ONLY,
                               VX_MEMORY_TYPE_HOST,
                               VX_NOGAP_X);

    {
        /* Walk the rows with a byte pointer; arithmetic on void* is a GCC
         * extension and not portable C. */
        uint8_t *row = (uint8_t *)data_ptr;
        for (j = 0; j < (vx_int32)image_addr.dim_y; j++)
        {
            fread(row, 1, image_addr.dim_x * image_addr.stride_x, fp);
            row += image_addr.stride_y;
        }
    }
    vxUnmapImagePatch(image, map_id);


    // Copy Chroma for NV12
    if (img_format == VX_DF_IMAGE_NV12)
    {
        vxStatus = vxMapImagePatch(image,
                                   &rect,
                                   1,
                                   &map_id,
                                   &image_addr,
                                   &data_ptr,
                                   VX_WRITE_ONLY,
                                   VX_MEMORY_TYPE_HOST,
                                   VX_NOGAP_X);

        {
            /* NV12 chroma plane is half height; one interleaved UV row per read. */
            uint8_t *row = (uint8_t *)data_ptr;
            for (j = 0; j < (vx_int32)(img_height / 2); j++)
            {
                fread(row, 1, image_addr.dim_x, fp);
                row += image_addr.stride_y;
            }
        }

        vxUnmapImagePatch(image, map_id);
    }

    fclose(fp);
    return vxStatus;
}

static vx_status VX_CALLBACK bydCameraTransferProcess(
       tivx_target_kernel_instance kernel,
       tivx_obj_desc_t *obj_desc[],
       uint16_t num_params, void *priv_arg);

static vx_status VX_CALLBACK bydCameraTransferCreate(
       tivx_target_kernel_instance kernel,
       tivx_obj_desc_t *obj_desc[],
       uint16_t num_params, void *priv_arg);

static vx_status VX_CALLBACK bydCameraTransferDelete(
       tivx_target_kernel_instance kernel,
       tivx_obj_desc_t *obj_desc[],
       uint16_t num_params, void *priv_arg);

static vx_status VX_CALLBACK bydCameraTransferProcess(
       tivx_target_kernel_instance kernel,
       tivx_obj_desc_t *obj_desc[],
       uint16_t num_params, void *priv_arg) {

    gettimeofday(&now_tv3, NULL);
    pre_tv3.tv_sec = now_tv3.tv_sec;
    pre_tv3.tv_usec = now_tv3.tv_usec;
    LOGD("%s -------------------pre_tv3.tv_sec %ld, pre_tv3.tv_usec %ld \n", __FUNCTION__, pre_tv3.tv_sec, pre_tv3.tv_usec);

    kernel_process_id++;

    gettimeofday(&now_tv0, NULL);
    if (pre_tv0.tv_sec == 0 && pre_tv0.tv_usec == 0) {
        pre_tv0.tv_sec = now_tv0.tv_sec;
        pre_tv0.tv_usec = now_tv0.tv_usec;
        // LOGD("pre_tv0.tv_sec %ld \n", pre_tv0.tv_sec);
        // LOGD("pre_tv0.tv_usec %ld \n", pre_tv0.tv_usec);
    }

    tv_ms0 = get_time_diff2(&now_tv0, &pre_tv0) / 1000;
    // LOGD("tv_ms0 %d \n", tv_ms0);
    if (tv_ms0 > 1000) {
        LOGD("==================1s kernel process cnt: %d \n", kernel_process_cnt);
        pre_tv0.tv_sec = now_tv0.tv_sec;
        pre_tv0.tv_usec = now_tv0.tv_usec;
        kernel_process_cnt = 0;
    } else {
        if (kernel_process_id <= 3000) { /* unsigned, so a >= 0 check is always true */
            LOGD("==================current kernel_process_id: %lu \n", kernel_process_id);
        }
    }
    kernel_process_cnt++;


    vx_status status = (vx_status)VX_SUCCESS;
    tivx_obj_desc_object_array_t *out_object_array_desc;
    tivx_obj_desc_image_t *out_image_desc[4];
    tivx_obj_desc_image_t *image_out_object_array_desc[TIVX_OBJECT_ARRAY_MAX_ITEMS];
    uint32_t i;
    vx_enum state;

    if ( (num_params != BYD_KERNEL_CAMERA_TRANSFER_MAX_PARAMS)
        || (NULL == obj_desc[BYD_KERNEL_CAMERA_TRANSFER_OUTPUT_IDX]) ) {
        status = (vx_status)VX_FAILURE;
    }

    if (VX_SUCCESS == status) {
        out_object_array_desc = (tivx_obj_desc_object_array_t *)obj_desc[BYD_KERNEL_CAMERA_TRANSFER_OUTPUT_IDX];
    }

    if (VX_SUCCESS == status) {
        tivxGetObjDescList(out_object_array_desc->obj_desc_id, (tivx_obj_desc_t**)image_out_object_array_desc, out_object_array_desc->num_items);

        for (i = 0; i < NUM_CHANNELS; i++) {
            out_image_desc[i] = image_out_object_array_desc[i];
        }

        if (VX_SUCCESS == status) {
            status = tivxGetTargetKernelInstanceState(kernel, &state);

            if (VX_SUCCESS == status) {
                if (VX_NODE_STATE_STEADY == state) {
                    gettimeofday(&now_tv4, NULL);
                    pre_tv4.tv_sec = now_tv4.tv_sec;
                    pre_tv4.tv_usec = now_tv4.tv_usec;
                    LOGD("%s -------------------pre_tv4.tv_sec %ld, pre_tv4.tv_usec %ld \n", __FUNCTION__, pre_tv4.tv_sec, pre_tv4.tv_usec);

                    // camera_image_dequeue(out_image_desc, NUM_CHANNELS);

                    // pthread_mutex_lock(&mutex);
                    // pthread_t actual_tid = pthread_self();
                    // uint64_t current_id = 0;
                    // memcpy(&current_id, &actual_tid, sizeof(actual_tid));
                    // LOGD("%s syncronized camera_image_dequeue ================================ Thread id = ", __FUNCTION__);
                    // LOGD("%" PRIu64 "\n", current_id);
                    camera_image_dequeue(out_image_desc, NUM_CHANNELS);
                    // pthread_mutex_unlock(&mutex);

                    for (i = 0; i < NUM_CHANNELS; i++) {
                       LOGD("%s ------------------------out_image_desc[%d]->base.timestamp %ld \n", __FUNCTION__, i, out_image_desc[i]->base.timestamp);
                    }

#if 0
        gettimeofday(&now_tv8, NULL);
        if (pre_tv8.tv_sec == 0 && pre_tv8.tv_usec == 0) {
            pre_tv8.tv_sec = now_tv8.tv_sec;
            pre_tv8.tv_usec = now_tv8.tv_usec;
            // LOGD("pre_tv8.tv_sec %ld \n", pre_tv8.tv_sec);
            // LOGD("pre_tv8.tv_usec %ld \n", pre_tv8.tv_usec);
        }

        tv_ms6 = get_time_diff2(&now_tv8, &pre_tv8) / 1000;
        // LOGD("tv_ms6 %d \n", tv_ms6);
        if (tv_ms6 > 2000) {
            char file_name[TIVX_FILEIO_FILE_PATH_LENGTH * 2];
            char failsafe_test_data_path[3] = "./";
            char * test_data_path = app_get_test_file_path();
            struct stat s;

            if (NULL == test_data_path) {
                LOGD("Test data path is NULL. Defaulting to current folder \n");
                test_data_path = failsafe_test_data_path;
            }

            if (stat(test_data_path, &s)) {
                LOGD("Test data path %s does not exist. Defaulting to current folder \n", test_data_path);
                test_data_path = failsafe_test_data_path;
            }

            /* Pick the extension first; the original strcat() wrote into an
             * uninitialized buffer and was then overwritten by snprintf(). */
            const char *ext;
            if (out_image_desc[0]->format == VX_DF_IMAGE_NV12) {
                ext = ".yuv";
            } else if (out_image_desc[0]->format == VX_DF_IMAGE_RGB) {
                ext = ".rgb";
            } else {
                ext = ".bin";
            }

            snprintf(file_name, MAX_FNAME, "%s/%s_%04d%s", test_data_path, "cap_img_uyvy", file_index, ext);

            LOGD("Writing %s ..\n", file_name);

            void* in_img_target_ptr[2];
            in_img_target_ptr[0]  = tivxMemShared2TargetPtr(&out_image_desc[0]->mem_ptr[0]);
            tivxMemBufferMap(in_img_target_ptr[0], out_image_desc[0]->mem_size[0], VX_MEMORY_TYPE_HOST, VX_READ_ONLY);
            in_img_target_ptr[1]  = NULL;
            if (out_image_desc[0]->mem_ptr[1].shared_ptr != 0) {
                in_img_target_ptr[1]  = tivxMemShared2TargetPtr(&out_image_desc[0]->mem_ptr[1]);
                tivxMemBufferMap(in_img_target_ptr[1], out_image_desc[0]->mem_size[1], VX_MEMORY_TYPE_HOST, VX_READ_ONLY);
            }

            FILE *fp = fopen(file_name, "wb");
            if (fp == NULL) {
                LOGD("Unable to write file %s\n", file_name);
            } else {
                uint32_t width  = out_image_desc[0]->imagepatch_addr[0].dim_x;
                uint32_t height = out_image_desc[0]->imagepatch_addr[0].dim_y;
                uint32_t stride = out_image_desc[0]->imagepatch_addr[0].stride_y;
                uint8_t *pData  = in_img_target_ptr[0];
                int32_t i;

                for (i = 0; i < height; i++) {
                    fwrite(pData, 1, width, fp);
                    pData += stride;
                }

                if (in_img_target_ptr[1] != NULL) {
                    pData = in_img_target_ptr[1];
                    height = height / 2;
                    for (i = 0; i < height; i++) {
                        fwrite(pData, 1, width, fp);
                        pData += stride;
                    }
                }
                fflush(fp);
                fclose(fp);
            }
            LOGD("Done!\n");

            tivxMemBufferUnmap(in_img_target_ptr[0], out_image_desc[0]->mem_size[0], VX_MEMORY_TYPE_HOST, VX_READ_ONLY);
            if (in_img_target_ptr[1] != NULL){
              tivxMemBufferUnmap(in_img_target_ptr[1], out_image_desc[0]->mem_size[1], VX_MEMORY_TYPE_HOST, VX_READ_ONLY);
            }

            pre_tv8.tv_sec = now_tv8.tv_sec;
            pre_tv8.tv_usec = now_tv8.tv_usec;
            file_index++;

            if (file_index > 9999) {
                file_index = 0;
            }
        }
#endif

                    gettimeofday(&now_tv4, NULL);

                    LOGD("%s =======================now_tv4.tv_sec %ld, now_tv4.tv_usec %ld \n", __FUNCTION__, now_tv4.tv_sec, now_tv4.tv_usec);
                    // tv_ms4 = get_time_diff2(&now_tv4, &pre_tv4) / 1000;
                    tv_ms4 = get_time_diff2(&now_tv4, &pre_tv4); /* microseconds, despite the tv_ms4 name */
                    LOGD("%s **************************** tv_us4 %d \n", __FUNCTION__, tv_ms4);
                    // LOGD("sleep 5ms state: %d \n", state);
                    // tivxTaskWaitMsecs(5);
                } else {
                    // LOGD("sleep 10ms state: %d \n", state);
                    // tivxTaskWaitMsecs(10);
                }
            }
        }
    }

    gettimeofday(&now_tv3, NULL);
    LOGD("%s =======================now_tv3.tv_sec %ld, now_tv3.tv_usec %ld \n", __FUNCTION__, now_tv3.tv_sec, now_tv3.tv_usec);
    // tv_ms3 = get_time_diff2(&now_tv3, &pre_tv3) / 1000;
    tv_ms3 = get_time_diff2(&now_tv3, &pre_tv3); /* microseconds, despite the tv_ms3 name */
    LOGD("%s **************************** tv_us3 %d \n", __FUNCTION__, tv_ms3);

    return status;
}

static vx_status VX_CALLBACK bydCameraTransferCreate(
       tivx_target_kernel_instance kernel,
       tivx_obj_desc_t *obj_desc[],
       uint16_t num_params, void *priv_arg) {

    vx_status status = (vx_status)VX_SUCCESS;
    if ( (num_params != BYD_KERNEL_CAMERA_TRANSFER_MAX_PARAMS)
        || (NULL == obj_desc[BYD_KERNEL_CAMERA_TRANSFER_OUTPUT_IDX]) ) {
        status = (vx_status)VX_FAILURE;
    } else {
        camera_client_start();
    }

    return status;
}

static vx_status VX_CALLBACK bydCameraTransferDelete(
       tivx_target_kernel_instance kernel,
       tivx_obj_desc_t *obj_desc[],
       uint16_t num_params, void *priv_arg) {

    vx_status status = (vx_status)VX_SUCCESS;
    if ( (num_params != BYD_KERNEL_CAMERA_TRANSFER_MAX_PARAMS)
        || (NULL == obj_desc[BYD_KERNEL_CAMERA_TRANSFER_OUTPUT_IDX]) ) {
        status = (vx_status)VX_FAILURE;
    }
    camera_client_stop();

    return status;
}

void bydAddTargetKernelCameraTransfer(void) {
    vx_status status = VX_FAILURE;
    char target_name[TIVX_TARGET_MAX_NAME];
    vx_enum self_cpu;

    self_cpu = tivxGetSelfCpuId();

    if (self_cpu == TIVX_CPU_ID_A72_0) {
        strncpy(target_name, TIVX_TARGET_A72_0, TIVX_TARGET_MAX_NAME);
        status = VX_SUCCESS;
    } else {
        status = VX_FAILURE;
    }

    if (status == VX_SUCCESS) {
        byd_camera_transfer_target_kernel = tivxAddTargetKernelByName(
                            BYD_CAMERA_TRANSFER,
                            target_name,
                            bydCameraTransferProcess,
                            bydCameraTransferCreate,
                            bydCameraTransferDelete,
                            NULL,
                            NULL);
    }
}

void bydRemoveTargetKernelCameraTransfer(void) {
    vx_status status = VX_SUCCESS;
    status = tivxRemoveTargetKernel(byd_camera_transfer_target_kernel);
    if (status == VX_SUCCESS) {
        byd_camera_transfer_target_kernel = NULL;
    }
}


camera_client.cpp

  • Hi,

    A few questions to clarify the issue:

    1. Are you running this kernel on A72? The capture node runs on R5F.

    2. I see that a memcpy is being used in the dequeue function. I suspect this could be the reason you are seeing the issue.
        In the capture node, the data from the sensor is directly written into the shared memory region and is accessed by the CSIRX driver from there. (A zero-copy sketch follows this reply.)

    3. May I know why the kernel is on A72? Could you elaborate on the use case and why the capture node cannot be used?

    Regards,
    Nikhil
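
    To illustrate the zero-copy idea (a sketch under assumptions, not the demo's implementation): on the host side, TIOVX can import an existing buffer into a vx_image via tivxReferenceImportHandle(), assuming the SDK release provides it and the frame memory comes from the shared-memory heap, so the service's frame can back the graph's image directly instead of being memcpy'd:

    #include <TI/tivx.h>
    #include <stdint.h>

    /* Hedged sketch: point the graph's vx_image at the service's frame buffer
     * instead of copying it. Assumes tivxReferenceImportHandle() is available
     * in this SDK and that the buffer was allocated from shared memory so all
     * cores can access it. */
    vx_status import_frame_zero_copy(vx_image graph_img,
                                     void *luma_ptr, uint32_t luma_size,
                                     void *chroma_ptr, uint32_t chroma_size)
    {
        /* NV12 has two planes: Y and interleaved UV. */
        const void *addr[2] = { luma_ptr, chroma_ptr };
        uint32_t    size[2] = { luma_size, chroma_size };

        /* The image now references the buffer directly; no memcpy. */
        return tivxReferenceImportHandle((vx_reference)graph_img, addr, size, 2U);
    }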

  • Hi,

    1. Yes.

    3. We are building our graph application on top of a third-party system.
    The third-party system is developed on the TI platform. It uses the TI captureNode to obtain camera images and runs a camera service that is mainly responsible for distributing the images obtained from captureNode.
    Using the camera service client provided by the third party, we establish binder communication with the camera service to obtain the images it gets from captureNode. The image we receive, a vx_image, already has its address space allocated (with actual image data).
    Following the pipeline graph in the TI multi-camera demo, we did not know how to use this vx_image as the initial input of the graph efficiently, so the initial goal was simply to get the graph running.
    To simulate captureNode, we wrote bydCameraTransferNode (running on A72), which accesses the vx_image address and copies the data into the vx_object_array output passed to bydCameraTransferNode.
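
    One possible copy-free alternative (a sketch, assuming the received vx_image remains valid while the graph uses it and the graph is set up for pipelining): enqueue the received image directly as a graph parameter, so no transfer node is needed.

    #include <VX/vx.h>
    #include <VX/vx_khr_pipelining.h>

    /* Hedged sketch: feed the service's vx_image straight into a pipelined
     * graph as graph parameter 0, instead of copying it in a transfer node.
     * Assumes the graph was configured with vxSetGraphScheduleConfig() and
     * that parameter 0 is the graph's input image. */
    vx_status submit_received_frame(vx_graph graph, vx_image received)
    {
        vx_reference ref = (vx_reference)received;

        /* Hand the buffer to the graph... */
        vx_status status = vxGraphParameterEnqueueReadyRef(graph, 0U, &ref, 1U);

        if (status == (vx_status)VX_SUCCESS)
        {
            vx_reference done = NULL;
            vx_uint32 num_done = 0;

            /* ...and take it back once the graph has consumed it. */
            status = vxGraphParameterDequeueDoneRef(graph, 0U, &done, 1U, &num_done);
        }

        return status;
    }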

  • Hi,

    Sorry for the delay in response.

    the CPU usage of our graph during operation is 15%, much higher than that of TI multi-camera demo

    I suspect this is due to the memcpy being performed in your code.

    Right now, my understanding is that the third party obtains the data from the capture node and distributes it.

    Is the third-party node also running on A72?

    To simulate captureNode, we wrote bydCameraTransferNode (running on A72), which accesses the vx_image address and copies the data into the vx_object_array output passed to bydCameraTransferNode

    Could you please elaborate further on what you are trying to simulate here?

    Also, again sorry for the delay, but could you also update your current status regarding this issue?

    Regards,

    Nikhil