TDA4VEN-Q1: TDA4VEN, DSP UDMA

Part Number: TDA4VEN-Q1


Can UDMA still be used on the C7x core?

I need to copy images on the DSP, and UDMA is faster.

But when I turn on this switch, I get an error saying that the channel failed to open. It doesn't seem to be supported.

How should I use it?

For example, the appUdmaCopy2D interface.

SDK version 11.0

 

  • Hi,

    UDMA can be used on C7x cores. Which channel are you using? Which example are you trying to build?

    Regards,
    Sivadeep

  • Hi,
    I just enabled the #define ENABLE_UDMA_COPY macro and got an error.
    I want to resolve this error.
  • Hi,

    Are you building any example from vision apps?

    Regards,

    Sivadeep

  • Let me clarify the issue.
    My application currently runs normally, and I want to add a UDMA copy, so I enabled this macro. After enabling it, an error is reported during the C7x core initialization phase. At this point I have not called any UDMA interface.
    Perhaps your design does not require enabling this macro. Can I use the UDMA interfaces without enabling it?
    Can you provide an example of UDMA copy on the C7x core for SDK version 11.0? I want to create a node that uses UDMA for image copying.
    Regards,
    zhiqing
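
For reference, the operation a 2D DMA copy performs on an image can be written as a plain CPU loop: copy `width` bytes from each of `height` lines, stepping the source and destination by their own pitch. The sketch below is only a software model that could be used to check a DMA result. It is not the vision_apps `appUdmaCopy2D` implementation, and the names `copy_2d_params_t` and `copy_2d_reference` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parameter block: all sizes are in bytes,
 * pitches are the byte offsets from one line to the next. */
typedef struct {
    uint32_t width;      /* bytes copied per line */
    uint32_t height;     /* number of lines */
    uint32_t src_pitch;
    uint32_t dest_pitch;
} copy_2d_params_t;

/* CPU model of a 2D copy. Padding bytes beyond 'width' in each
 * destination line are left untouched. */
void copy_2d_reference(uint8_t *dest, const uint8_t *src,
                       const copy_2d_params_t *p)
{
    assert(p->width <= p->src_pitch && p->width <= p->dest_pitch);

    for (uint32_t line = 0U; line < p->height; line++) {
        memcpy(dest + (size_t)line * p->dest_pitch,
               src + (size_t)line * p->src_pitch,
               p->width);
    }
}
```

An NV12 frame needs two such copies: one for the Y plane (width equal to the image width, height equal to the image height) and one for the interleaved UV plane (same width in bytes, half the height). A BGRX frame needs one copy with a width of four bytes per pixel.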
  • Hi,

    Let me check this and get back to you in a day.

    Regards,
    Sivadeep

  • Hi,

    For UDMA examples, you can refer to: mcu_plus_sdk_j722s_11_01_00_15\examples\drivers\udma

    Please check whether the channel you are using for UDMA is available on the device. In the default vision apps code, DMA is disabled for j722s.

    Regards,
    Sivadeep

  • Hi,

    I got the UDMA copy working by following the reference example.
    But it takes too long: copying a 1920 x 960 image took 70 milliseconds.
    This is my code. Is there a configuration issue?

  • Hi,

    Could you please share the code, if possible?

    Regards,
    Sivadeep

  • To confirm again: copying YUV takes 10 milliseconds and copying BGRX takes 25 milliseconds. The time I reported earlier was incorrect. Is this timing normal?
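
To judge timings like these, it can help to convert them into an effective bandwidth. The sketch below is not from the thread. It assumes the image is 1920 x 960 (the size mentioned earlier), that "YUV" means NV12 at 1.5 bytes per pixel, that BGRX uses 4 bytes per pixel, and that stride padding can be ignored. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in a tightly packed NV12 frame: a full-size Y plane plus a
 * half-height interleaved UV plane (assumption: no stride padding). */
uint64_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height * 3U / 2U;
}

/* Bytes in a tightly packed BGRX frame: 4 bytes per pixel. */
uint64_t bgrx_frame_bytes(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height * 4U;
}

/* Effective copy bandwidth in MB/s (1 MB = 10^6 bytes) for a
 * measured copy time given in milliseconds. */
double effective_mb_per_s(uint64_t bytes, double millis)
{
    assert(millis > 0.0);
    return ((double)bytes / 1.0e6) / (millis / 1.0e3);
}
```

With the numbers above, `effective_mb_per_s(nv12_frame_bytes(1920, 960), 10.0)` comes to about 276 MB/s and `effective_mb_per_s(bgrx_frame_bytes(1920, 960), 25.0)` to about 295 MB/s. Under these assumptions both formats move data at roughly the same rate, so the BGRX copy takes longer mainly because it moves more bytes.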

    #include "TI/tivx.h"
    #include "TI/tivx_longhorn_img_select.h"
    #include "VX/vx.h"
    #include "tivx_longhorn_img_select_kernels_priv.h"
    #include "tivx_kernel_img_select.h"
    #include "TI/tivx_target_kernel.h"
    #include "tivx_kernels_target_utils.h"
    
    #include <utils/udma/include/app_udma.h>
    #include <utils/mem/include/app_mem.h>
    #include <kernel/dpl/CacheP.h>
    #include <kernel/dpl/DebugP.h>
    #include <kernel/dpl/SemaphoreP.h>
    #include "ti_drivers_config.h"
    #include "ti_drivers_open_close.h"
    #include "ti_board_open_close.h"
    
    // #define FAKE_IMG
    /* UDMA TR packet descriptor memory size - with one TR */
    #define UDMA_TEST_TRPD_SIZE (UDMA_GET_TRPD_TR15_SIZE(1U))
    
    /* 2D copy parameters structure */
    typedef struct {
        uint16_t width;
        uint16_t height;
        uint64_t dest_addr;
        uint32_t dest_pitch;
        uint64_t src_addr;
        uint32_t src_pitch;
    } img_select_2d_copy_params_t;
    
    typedef struct
    {
        LhDmaObj dmaObj;
        
        uint32_t img_width;
        uint32_t img_height;
        uint32_t img_stride;
        uint32_t select_idx;
        uint32_t format; // nv12:0; bgrx:1
        
        void *tmp_target_ptr[TIVX_IMAGE_MAX_PLANES];
        
        /* UDMA related members */
        Udma_ChHandle udmaChHandle;
        uint8_t *udmaTrpdMem;
        SemaphoreP_Object udmaDoneSem;
    } tivxImgSelectParams;
    
    static tivx_target_kernel vx_img_select_target_kernel = NULL;
    
    /* UDMA driver instance object */
    Udma_DrvObject gUdmaDrvObj[CONFIG_UDMA_NUM_INSTANCES];
    /* UDMA driver instance init params */
    static Udma_InitPrms gUdmaInitPrms[CONFIG_UDMA_NUM_INSTANCES] =
    {
        {
        .instId = UDMA_INST_ID_BCDMA_0,
        .skipGlobalEventReg = FALSE,
        
        .virtToPhyFxn = Udma_defaultVirtToPhyFxn,
        .phyToVirtFxn = Udma_defaultPhyToVirtFxn,
        },
    };
    
    /* UDMA BCDMA_0 Blockcopy Channel Objects */
    static Udma_ChObject gConfigUdma0BlkCopyChObj[CONFIG_UDMA0_NUM_BLKCOPY_CH];
    /* UDMA CONFIG_UDMA0 Blockcopy Channel Handle */
    Udma_ChHandle gConfigUdma0BlkCopyChHandle[CONFIG_UDMA0_NUM_BLKCOPY_CH];
    
    /* UDMA BCDMA_0 Blockcopy Channel Ring Mem Size */
    #define UDMA_CONFIG_UDMA0_BLK_COPY_CH_0_RING_MEM_SIZE (((1U * 8U) + UDMA_CACHELINE_ALIGNMENT) & ~(UDMA_CACHELINE_ALIGNMENT - 1U))
    
    /* UDMA BCDMA_0 Blockcopy Channel Ring Mem */
    static uint8_t gConfigUdma0BlkCopyCh0RingMem[UDMA_CONFIG_UDMA0_BLK_COPY_CH_0_RING_MEM_SIZE] __attribute__((aligned(UDMA_CACHELINE_ALIGNMENT)));
    
    /* UDMA BCDMA_0 Blockcopy Channel Ring Memory Pointers - for all channels */
    static uint8_t *gConfigUdma0BlkCopyChRingMem[CONFIG_UDMA0_NUM_BLKCOPY_CH] = {
        &gConfigUdma0BlkCopyCh0RingMem[0U],
    };
    
    /* UDMA BCDMA_0 Blockcopy Channel Ring Elem Count */
    static uint32_t gConfigUdma0BlkCopyChRingElemCnt[CONFIG_UDMA0_NUM_BLKCOPY_CH] = {
        1U,
    };
    
    /* UDMA BCDMA_0 Blockcopy Channel Ring Memory Size */
    static uint32_t gConfigUdma0BlkCopyChRingMemSize[CONFIG_UDMA0_NUM_BLKCOPY_CH] = {
        UDMA_CONFIG_UDMA0_BLK_COPY_CH_0_RING_MEM_SIZE,
    };
    
    /* UDMA BCDMA_0 Blockcopy Channel Event Object */
    static Udma_EventObject gConfigUdma0BlkCopyCqEventObj[CONFIG_UDMA0_NUM_BLKCOPY_CH];
    /* UDMA BCDMA_0 Blockcopy Channel Event Callback */
    
    static Udma_EventCallback gConfigUdma0BlkCopyCqEventCb[CONFIG_UDMA0_NUM_BLKCOPY_CH] = {
        NULL, // Will be set dynamically in tivxImgSelectCreate
    };
    
    /* UDMA CONFIG_UDMA0 Blockcopy Event Handle */
    Udma_EventHandle gConfigUdma0BlkCopyCqEventHandle[CONFIG_UDMA0_NUM_BLKCOPY_CH];
    
    static int32_t Drivers_udmaConfigUdma0BlkCopyOpen(void)
    {
        int32_t retVal = UDMA_SOK;
        uint32_t chType, chCnt;
        Udma_ChHandle chHandle;
        Udma_ChPrms chPrms;
        
        Udma_ChTxPrms txPrms;
        Udma_ChRxPrms rxPrms;
        Udma_DrvHandle drvHandle = &gUdmaDrvObj[CONFIG_UDMA0];
        Udma_EventPrms cqEventPrms;
        Udma_EventHandle cqEventHandle;
    
    
        for(chCnt = 0U; chCnt < CONFIG_UDMA0_NUM_BLKCOPY_CH; chCnt++)
        {
            chHandle = &gConfigUdma0BlkCopyChObj[chCnt];
            gConfigUdma0BlkCopyChHandle[chCnt] = chHandle;
            
            /* Init channel parameters */
            chType = UDMA_CH_TYPE_TR_BLK_COPY;
            UdmaChPrms_init(&chPrms, chType);
            chPrms.fqRingPrms.ringMem = gConfigUdma0BlkCopyChRingMem[chCnt];
            
            chPrms.fqRingPrms.ringMemSize = gConfigUdma0BlkCopyChRingMemSize[chCnt];
            
            chPrms.fqRingPrms.elemCnt = gConfigUdma0BlkCopyChRingElemCnt[chCnt];
            
            /* Open channel for block copy */
            retVal = Udma_chOpen(drvHandle, chHandle, chType, &chPrms);
            DebugP_assert(UDMA_SOK == retVal);
            
            /* Config TX channel */
            UdmaChTxPrms_init(&txPrms, chType);
            retVal = Udma_chConfigTx(chHandle, &txPrms);
            DebugP_assert(UDMA_SOK == retVal);
            
            /* Config RX channel - which is implicitly paired to TX channel in
            * block copy mode */
            
            UdmaChRxPrms_init(&rxPrms, chType);
            retVal = Udma_chConfigRx(chHandle, &rxPrms);
            DebugP_assert(UDMA_SOK == retVal);
            
            /* Register completion event */
            if(NULL != gConfigUdma0BlkCopyCqEventCb[chCnt])
            {
                cqEventHandle = &gConfigUdma0BlkCopyCqEventObj[chCnt];
                gConfigUdma0BlkCopyCqEventHandle[chCnt] = cqEventHandle;
                
                UdmaEventPrms_init(&cqEventPrms);
                cqEventPrms.eventType = UDMA_EVENT_TYPE_DMA_COMPLETION;
                cqEventPrms.eventMode = UDMA_EVENT_MODE_SHARED;
                
                cqEventPrms.chHandle = chHandle;
                cqEventPrms.masterEventHandle = Udma_eventGetGlobalHandle(drvHandle);
                cqEventPrms.eventCb = gConfigUdma0BlkCopyCqEventCb[chCnt];
                
                retVal = Udma_eventRegister(drvHandle, cqEventHandle, &cqEventPrms);
                DebugP_assert(UDMA_SOK == retVal);
            }
        }
    
        return retVal;
    }
    
    static vx_status VX_CALLBACK tivxImgSelectProcess(
    tivx_target_kernel_instance kernel,
    tivx_obj_desc_t *obj_desc[],
    uint16_t num_params, void *priv_arg);
    
    static vx_status VX_CALLBACK tivxImgSelectCreate(
    tivx_target_kernel_instance kernel,
    tivx_obj_desc_t *obj_desc[],
    uint16_t num_params, void *priv_arg);
    
    static vx_status VX_CALLBACK tivxImgSelectDelete(
    tivx_target_kernel_instance kernel,
    tivx_obj_desc_t *obj_desc[],
    uint16_t num_params, void *priv_arg);
    
    static vx_status VX_CALLBACK tivxImgSelectControl(
    tivx_target_kernel_instance kernel,
    uint32_t node_cmd_id, tivx_obj_desc_t *obj_desc[],
    uint16_t num_params, void *priv_arg);
    
    /* UDMA callback function */
    void App_udmaEventCb(Udma_EventHandle eventHandle, uint32_t eventType, void *appData);
    
    /* UDMA 2D buffer comparison function */
    static void App_udmaCompareBuf2D(const void *srcBuf, const void *destBuf, uint32_t width, uint32_t height,
    uint32_t srcPitch, uint32_t destPitch)
    {
    
        uint32_t i, j;
        uint8_t *src = (uint8_t *)srcBuf;
        uint8_t *dest = (uint8_t *)destBuf;
        uint32_t errorCount = 0;
        
        /* Invalidate destination buffer to get latest data from memory */
        CacheP_inv(dest, destPitch * height, CacheP_TYPE_ALLD);
    
        /* Compare source and destination buffers line by line */
        for(j = 0U; j < height; j++)
        {
            uint8_t *srcLine = src + (j * srcPitch);
            uint8_t *destLine = dest + (j * destPitch);
            
            for(i = 0U; i < width; i++)
            {
                if(srcLine[i] != destLine[i])
                {
    
                    errorCount++;
                    if (errorCount <= 10) /* Print first 10 errors only, but keep counting */
                    {
                        VX_PRINT(VX_ZONE_ERROR,"2D Data mismatch at line %d, index %d: src=0x%02X, dest=0x%02X\r\n",
                        j, i, srcLine[i], destLine[i]);
                    }
                }
            }
        }
    
        if (errorCount > 0)
        {
            VX_PRINT(VX_ZONE_ERROR,"UDMA 2D copy verification failed: %d mismatches out of %d pixels\r\n",
            errorCount, width * height);

            DebugP_assert(FALSE);
        }
        else
        {
            //VX_PRINT(VX_ZONE_ERROR,"UDMA 2D copy verification successful: All %d pixels match\r\n", width * height);
        }
    
    
        return;
    }
    
    /* UDMA memcpy function */
    
    static int32_t App_udmaMemcpy2D(tivxImgSelectParams *prms, const img_select_2d_copy_params_t *params);
    
    static vx_status VX_CALLBACK tivxImgSelectProcess(
    tivx_target_kernel_instance kernel,
    tivx_obj_desc_t *obj_desc[],
    uint16_t num_params, void *priv_arg)
    
    {
        vx_status status = (vx_status)VX_SUCCESS;
        tivxImgSelectParams *prms = NULL;
        
        tivx_obj_desc_image_t *in_img0_desc;
        tivx_obj_desc_image_t *in_img1_desc;
        tivx_obj_desc_image_t *out_img_desc;
        
        if ((num_params != TIVX_KERNEL_IMG_SELECT_MAX_PARAMS)
        || (NULL == obj_desc[TIVX_KERNEL_IMG_SELECT_IN_IMG0_IDX])
        || (NULL == obj_desc[TIVX_KERNEL_IMG_SELECT_OUT_IMG_IDX])
        )
        {
            status = (vx_status)VX_FAILURE;
        }
    
        if ((vx_status)VX_SUCCESS == status)
        {
            uint32_t size;
            
            in_img0_desc = (tivx_obj_desc_image_t *)obj_desc[TIVX_KERNEL_IMG_SELECT_IN_IMG0_IDX];
            in_img1_desc = (tivx_obj_desc_image_t *)obj_desc[TIVX_KERNEL_IMG_SELECT_IN_IMG1_IDX];
            out_img_desc = (tivx_obj_desc_image_t *)obj_desc[TIVX_KERNEL_IMG_SELECT_OUT_IMG_IDX];
            
            status = tivxGetTargetKernelInstanceContext(kernel, (void **)&prms, &size);
            if (((vx_status)VX_SUCCESS != status) || (NULL == prms) ||
            (sizeof(tivxImgSelectParams) != size))
            {
                status = (vx_status)VX_FAILURE;
            }
        }
    
        if ((vx_status)VX_SUCCESS == status)
        {
            uint32_t i;
            void *in_img0_target_ptr[TIVX_IMAGE_MAX_PLANES] = {NULL};
            
            void *in_img1_target_ptr[TIVX_IMAGE_MAX_PLANES] = {NULL};
            void *out_img_target_ptr[TIVX_IMAGE_MAX_PLANES] = {NULL};
    
            for (i = 0; i < in_img0_desc->planes; i++)
            {
                in_img0_target_ptr[i] = tivxMemShared2TargetPtr(&in_img0_desc->mem_ptr[i]);
                
                tivxCheckStatus(&status, tivxMemBufferMap(in_img0_target_ptr[i],
                in_img0_desc->mem_size[i],
        
                (vx_enum)TIVX_MEMORY_TYPE_DMA,
                (vx_enum)VX_READ_ONLY));
                
                out_img_target_ptr[i] = tivxMemShared2TargetPtr(&out_img_desc->mem_ptr[i]);
                tivxCheckStatus(&status, tivxMemBufferMap(out_img_target_ptr[i],
                
                out_img_desc->mem_size[i],
                (vx_enum)TIVX_MEMORY_TYPE_DMA,
                (vx_enum)VX_WRITE_ONLY));
                
                if (in_img1_desc != NULL)
                {
                    in_img1_target_ptr[i] = tivxMemShared2TargetPtr(&in_img1_desc->mem_ptr[i]);
                    
                    tivxCheckStatus(&status, tivxMemBufferMap(in_img1_target_ptr[i],
                    in_img1_desc->mem_size[i],
                    
                    (vx_enum)TIVX_MEMORY_TYPE_DMA,
                    (vx_enum)VX_READ_ONLY));
                
                }
            }
            
            /* call kernel processing function */
            if (0 == prms->format)
            {
                if (0 == prms->select_idx)
                {
                    /* Use UDMA 2D memcpy for NV12 format */
                    img_select_2d_copy_params_t params_2d;
                    
                    /* NV12 Y plane copy */
                    params_2d.width = prms->img_width;
                    params_2d.height = prms->img_height;
                    
                    params_2d.src_addr = (uint64_t)Udma_defaultVirtToPhyFxn(in_img0_target_ptr[0], 0U, NULL);
                    params_2d.src_pitch = in_img0_desc->imagepatch_addr[0].stride_y;
                    
                    params_2d.dest_addr = (uint64_t)Udma_defaultVirtToPhyFxn(out_img_target_ptr[0], 0U, NULL);
                    params_2d.dest_pitch = out_img_desc->imagepatch_addr[0].stride_y;
                    
                    App_udmaMemcpy2D(prms, &params_2d);
                    
                    /* Verify Y plane copy */
                   // App_udmaCompareBuf2D(in_img0_target_ptr[0], out_img_target_ptr[0],
                   // prms->img_width, prms->img_height,
                    
                   // in_img0_desc->imagepatch_addr[0].stride_y,
                   // out_img_desc->imagepatch_addr[0].stride_y);
                    
                    
                    if (in_img0_desc->planes > 1)
                    {
                        /* NV12 UV plane copy */
                        params_2d.width = prms->img_width;
                        
                        params_2d.height = prms->img_height / 2;
                        params_2d.src_addr = (uint64_t)Udma_defaultVirtToPhyFxn(in_img0_target_ptr[1], 0U, NULL);
                        
                        params_2d.src_pitch = in_img0_desc->imagepatch_addr[1].stride_y;
                        params_2d.dest_addr = (uint64_t)Udma_defaultVirtToPhyFxn(out_img_target_ptr[1], 0U, NULL);
                        
                        params_2d.dest_pitch = out_img_desc->imagepatch_addr[1].stride_y;
                        App_udmaMemcpy2D(prms, &params_2d);
                        
                        /* Verify UV plane copy */
                       // App_udmaCompareBuf2D(in_img0_target_ptr[1], out_img_target_ptr[1],
                       // prms->img_width, prms->img_height / 2,
                        
                       // in_img0_desc->imagepatch_addr[1].stride_y,
                       // out_img_desc->imagepatch_addr[1].stride_y);
                    }
                }
                else if (in_img1_desc != NULL)
                {
                    /* Use UDMA 2D memcpy for NV12 format, second input image */
                    img_select_2d_copy_params_t params_2d;

                    /* NV12 Y plane copy */
                    params_2d.width = prms->img_width;
                    params_2d.height = prms->img_height;
                    params_2d.src_addr = (uint64_t)Udma_defaultVirtToPhyFxn(in_img1_target_ptr[0], 0U, NULL);
                    params_2d.src_pitch = in_img1_desc->imagepatch_addr[0].stride_y;
                    params_2d.dest_addr = (uint64_t)Udma_defaultVirtToPhyFxn(out_img_target_ptr[0], 0U, NULL);
                    params_2d.dest_pitch = out_img_desc->imagepatch_addr[0].stride_y;

                    App_udmaMemcpy2D(prms, &params_2d);

                    /* Verify Y plane copy */
                    // App_udmaCompareBuf2D(in_img1_target_ptr[0], out_img_target_ptr[0],
                    // prms->img_width, prms->img_height,
                    // in_img1_desc->imagepatch_addr[0].stride_y,
                    // out_img_desc->imagepatch_addr[0].stride_y);

                    if (in_img1_desc->planes > 1)
                    {
                        /* NV12 UV plane copy */
                        params_2d.width = prms->img_width;
                        params_2d.height = prms->img_height / 2;
                        params_2d.src_addr = (uint64_t)Udma_defaultVirtToPhyFxn(in_img1_target_ptr[1], 0U, NULL);
                        params_2d.src_pitch = in_img1_desc->imagepatch_addr[1].stride_y;
                        params_2d.dest_addr = (uint64_t)Udma_defaultVirtToPhyFxn(out_img_target_ptr[1], 0U, NULL);
                        params_2d.dest_pitch = out_img_desc->imagepatch_addr[1].stride_y;

                        App_udmaMemcpy2D(prms, &params_2d);

                        /* Verify UV plane copy */
                        // App_udmaCompareBuf2D(in_img1_target_ptr[1], out_img_target_ptr[1],
                        // prms->img_width, prms->img_height / 2,
                        // in_img1_desc->imagepatch_addr[1].stride_y,
                        // out_img_desc->imagepatch_addr[1].stride_y);
                    }
                }
            }
            else
            {
                if (0 == prms->select_idx)
                {
                    /* Use UDMA 2D memcpy for BGRX format */
                    img_select_2d_copy_params_t params_2d;

                    params_2d.width = prms->img_width * 4; /* BGRX is 4 bytes per pixel */
                    params_2d.height = prms->img_height;
                    params_2d.src_addr = (uint64_t)Udma_defaultVirtToPhyFxn(in_img0_target_ptr[0], 0U, NULL);
                    params_2d.src_pitch = in_img0_desc->imagepatch_addr[0].stride_y;
                    params_2d.dest_addr = (uint64_t)Udma_defaultVirtToPhyFxn(out_img_target_ptr[0], 0U, NULL);
                    params_2d.dest_pitch = out_img_desc->imagepatch_addr[0].stride_y;

                    App_udmaMemcpy2D(prms, &params_2d);

                    /* Verify BGRX copy */
                    // App_udmaCompareBuf2D(in_img0_target_ptr[0], out_img_target_ptr[0],
                    // prms->img_width * 4, prms->img_height,
                    // in_img0_desc->imagepatch_addr[0].stride_y,
                    // out_img_desc->imagepatch_addr[0].stride_y);

                    VX_PRINT(VX_ZONE_ERROR, "in_img0_desc->mem_size[0] %d \n",in_img0_desc->mem_size[0]);
                }
                else if (in_img1_desc != NULL)
                {
                    /* Use UDMA 2D memcpy for BGRX format, second input image */
                    img_select_2d_copy_params_t params_2d;

                    params_2d.width = prms->img_width * 4; /* BGRX is 4 bytes per pixel */
                    params_2d.height = prms->img_height;
                    params_2d.src_addr = (uint64_t)Udma_defaultVirtToPhyFxn(in_img1_target_ptr[0], 0U, NULL);
                    params_2d.src_pitch = in_img1_desc->imagepatch_addr[0].stride_y;
                    params_2d.dest_addr = (uint64_t)Udma_defaultVirtToPhyFxn(out_img_target_ptr[0], 0U, NULL);
                    params_2d.dest_pitch = out_img_desc->imagepatch_addr[0].stride_y;

                    App_udmaMemcpy2D(prms, &params_2d);

                    /* Verify BGRX copy */
                    // App_udmaCompareBuf2D(in_img1_target_ptr[0], out_img_target_ptr[0],
                    // prms->img_width * 4, prms->img_height,
                    // in_img1_desc->imagepatch_addr[0].stride_y,
                    // out_img_desc->imagepatch_addr[0].stride_y);
                }
            }

            /* kernel processing function complete */
            for (i = 0; i < in_img0_desc->planes; i++)
            {
                tivxCheckStatus(&status, tivxMemBufferUnmap(in_img0_target_ptr[i],
                    in_img0_desc->mem_size[i],
                    (vx_enum)TIVX_MEMORY_TYPE_DMA,
                    (vx_enum)VX_READ_ONLY));

                tivxCheckStatus(&status, tivxMemBufferUnmap(out_img_target_ptr[i],
                    out_img_desc->mem_size[i],
                    (vx_enum)TIVX_MEMORY_TYPE_DMA,
                    (vx_enum)VX_WRITE_ONLY));

                if (in_img1_desc != NULL)
                {
                    tivxCheckStatus(&status, tivxMemBufferUnmap(in_img1_target_ptr[i],
                        in_img1_desc->mem_size[i],
                        (vx_enum)TIVX_MEMORY_TYPE_DMA,
                        (vx_enum)VX_READ_ONLY));
                }
            }
        }

        return status;
    }
    
    static vx_status VX_CALLBACK tivxImgSelectCreate(
    tivx_target_kernel_instance kernel,
    
    tivx_obj_desc_t *obj_desc[],
    uint16_t num_params, void *priv_arg)
    {
    
        vx_status status = (vx_status)VX_SUCCESS;
        tivxImgSelectParams *prms = NULL;
        
        tivx_obj_desc_image_t *in_img0_desc;
        vx_imagepatch_addressing_t *pIn;
        
        
        if ((num_params != TIVX_KERNEL_IMG_SELECT_MAX_PARAMS)
        || (NULL == obj_desc[TIVX_KERNEL_IMG_SELECT_IN_IMG0_IDX])
        || (NULL == obj_desc[TIVX_KERNEL_IMG_SELECT_OUT_IMG_IDX])
        )
    
        {
            status = (vx_status)VX_FAILURE;
        }
        else
    
        {
            prms = tivxMemAlloc(sizeof(tivxImgSelectParams), (vx_enum)TIVX_MEM_EXTERNAL);
            if (NULL == prms)
            {
                status = (vx_status)VX_ERROR_NO_MEMORY;
                VX_PRINT(VX_ZONE_ERROR, "Unable to allocate local memory\n");
            }
            VX_PRINT(VX_ZONE_ERROR, "tmp_target_ptr \n");
            
            if ((vx_status)VX_SUCCESS == status)
            {
                in_img0_desc = (tivx_obj_desc_image_t *)obj_desc[TIVX_KERNEL_IMG_SELECT_IN_IMG0_IDX];
                pIn = (vx_imagepatch_addressing_t *)&in_img0_desc->imagepatch_addr[0];
                
                prms->img_width = pIn->dim_x;
                prms->img_height = pIn->dim_y;
                prms->img_stride = pIn->stride_y;
                prms->select_idx = 0;
        
                prms->tmp_target_ptr[0] = tivxMemAlloc(in_img0_desc->mem_size[0], (vx_enum)TIVX_MEM_EXTERNAL);
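                if (prms->tmp_target_ptr[0] == NULL)
                {
                    /* Report the failed allocation to the caller instead of only logging it */
                    status = (vx_status)VX_ERROR_NO_MEMORY;
                    VX_PRINT(VX_ZONE_ERROR, "Failed to alloc prms->tmp_target_ptr[0] \n");
                }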
    
                if (TIVX_DF_IMAGE_BGRX == in_img0_desc->format)
                {
                    prms->format = 1;
                }
                else if (VX_DF_IMAGE_NV12 == in_img0_desc->format)
                {
                    prms->format = 0;
                    prms->tmp_target_ptr[1] = tivxMemAlloc(in_img0_desc->mem_size[1], (vx_enum)TIVX_MEM_EXTERNAL);
                    
                    if (prms->tmp_target_ptr[1] == NULL)
                    {
                        VX_PRINT(VX_ZONE_ERROR, "Failed to alloc prms->tmp_target_ptr[1] \n");
                    }
                }
                else
                {
                    status = (vx_status)VX_ERROR_INVALID_PARAMETERS;
                    VX_PRINT(VX_ZONE_ERROR, "'in_img0' should be an image of type:\n VX_DF_IMAGE_NV12 TIVX_DF_IMAGE_BGRX \n");
                }
    
                /* Initialize UDMA driver */
                {
                    uint32_t instId;
                    int32_t retVal = UDMA_SOK;
                    
                    for(instId = 0U; instId < CONFIG_UDMA_NUM_INSTANCES; instId++)
                    {
                    retVal += Udma_init(&gUdmaDrvObj[instId], &gUdmaInitPrms[instId]);
                    
                    DebugP_assert(UDMA_SOK == retVal);
                    }
                }
    
                /* Initialize UDMA */
                Drivers_udmaConfigUdma0BlkCopyOpen();
                prms->udmaChHandle = gConfigUdma0BlkCopyChHandle[0]; /* Use UDMA0 block copy channel */
                
                /* Set up UDMA event callback with app data */
                gConfigUdma0BlkCopyCqEventCb[0] = &App_udmaEventCb;
                Udma_EventPrms cqEventPrms;
                
                UdmaEventPrms_init(&cqEventPrms);
                cqEventPrms.eventType = UDMA_EVENT_TYPE_DMA_COMPLETION;
                cqEventPrms.eventMode = UDMA_EVENT_MODE_SHARED;
                
                cqEventPrms.chHandle = prms->udmaChHandle;
                cqEventPrms.masterEventHandle = Udma_eventGetGlobalHandle(&gUdmaDrvObj[CONFIG_UDMA0]);
                cqEventPrms.eventCb = gConfigUdma0BlkCopyCqEventCb[0];
                
                cqEventPrms.appData = prms; // Pass prms as app data to callback
                
                int32_t retVal = Udma_eventRegister(&gUdmaDrvObj[CONFIG_UDMA0],
                
                &gConfigUdma0BlkCopyCqEventObj[0],
                &cqEventPrms);
                if (retVal != UDMA_SOK)
                {
                    VX_PRINT(VX_ZONE_ERROR, "Failed to register UDMA event\n");
                    status = (vx_status)VX_ERROR_NO_RESOURCES;
                }
    
                gConfigUdma0BlkCopyCqEventHandle[0] = &gConfigUdma0BlkCopyCqEventObj[0];
                
                /* Create semaphore for UDMA completion */
                
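                /* Keep the semaphore result separate so an earlier vx_status error is not overwritten */
                int32_t semStatus = SemaphoreP_constructBinary(&prms->udmaDoneSem, 0);
                if (semStatus != SystemP_SUCCESS)
                {
                    VX_PRINT(VX_ZONE_ERROR, "Failed to create UDMA semaphore\n");
                    status = (vx_status)VX_ERROR_NO_MEMORY;
                }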
    
                /* Allocate TRPD memory */
                prms->udmaTrpdMem = tivxMemAlloc(UDMA_TEST_TRPD_SIZE, (vx_enum)TIVX_MEM_EXTERNAL);
                if (prms->udmaTrpdMem == NULL)
                
                {
                    VX_PRINT(VX_ZONE_ERROR, "Failed to allocate UDMA TRPD memory\n");
                    status = (vx_status)VX_ERROR_NO_MEMORY;
                }
    
                /* Enable UDMA channel */
                if (status == (vx_status)VX_SUCCESS)
                {
                    int32_t retVal = Udma_chEnable(prms->udmaChHandle);
                    if (retVal != UDMA_SOK)
                    
                    {
                        VX_PRINT(VX_ZONE_ERROR, "Failed to enable UDMA channel\n");
                        status = (vx_status)VX_ERROR_NO_RESOURCES;
                    }
                }
                
                tivxSetTargetKernelInstanceContext(kernel, prms,
                sizeof(tivxImgSelectParams));
            }
            else
    
            {
                status = (vx_status)VX_ERROR_NO_MEMORY;
                VX_PRINT(VX_ZONE_ERROR, "Unable to allocate local memory\n");
            }
        }
    
    
        return status;
    }
    
    void tivxAddTargetKernelImgSelect(void)
    {
        vx_status status = (vx_status)VX_FAILURE;
        char target_name[TIVX_TARGET_MAX_NAME];
        vx_enum self_cpu;
        
        self_cpu = tivxGetSelfCpuId();
    
        if (self_cpu == (vx_enum)TIVX_CPU_ID_MCU2_0)
        {
            strncpy(target_name, TIVX_TARGET_MCU2_0, TIVX_TARGET_MAX_NAME);
        
            status = (vx_status)VX_SUCCESS;
        }
        else if (self_cpu == (vx_enum)TIVX_CPU_ID_DSP1)
        
        {
            strncpy(target_name, TIVX_TARGET_DSP1, TIVX_TARGET_MAX_NAME);
            status = (vx_status)VX_SUCCESS;
        }
        
        else if (self_cpu == (vx_enum)TIVX_CPU_ID_DSP2)
        {
            strncpy(target_name, TIVX_TARGET_DSP2, TIVX_TARGET_MAX_NAME);
            status = (vx_status)VX_SUCCESS;
        }
    
        else
        {
            status = (vx_status)VX_FAILURE;
        }
    
        if (status == (vx_status)VX_SUCCESS)
        {
            vx_img_select_target_kernel = tivxAddTargetKernelByName(
            TIVX_KERNEL_IMG_SELECT_NAME,
        
            target_name,
            tivxImgSelectProcess,
            tivxImgSelectCreate,
            NULL,
            NULL,
            NULL);
        
        }
    }
    
    void tivxRemoveTargetKernelImgSelect(void)
    {
        vx_status status = (vx_status)VX_SUCCESS;
        
        status = tivxRemoveTargetKernel(vx_img_select_target_kernel);
        if (status == (vx_status)VX_SUCCESS)
        {
            vx_img_select_target_kernel = NULL;
        }
    
    }
    
    /* UDMA callback function */
    
    void App_udmaEventCb(Udma_EventHandle eventHandle, uint32_t eventType, void *appData)
    {
        tivxImgSelectParams *prms = (tivxImgSelectParams *)appData;
    
        if(UDMA_EVENT_TYPE_DMA_COMPLETION == eventType)
        {
            SemaphoreP_post(&prms->udmaDoneSem);
        }
    }
    
    /* UDMA TRPD initialization function for 2D copy */
    static void App_udmaTrpdInit2D(Udma_ChHandle chHandle,
    uint8_t *trpdMem, const img_select_2d_copy_params_t *params)
    {
        CSL_UdmapTR15 *pTr;
        uint32_t cqRingNum = Udma_chGetCqRingNum(chHandle);
        
        /* Make TRPD with TR15 TR type */
        UdmaUtils_makeTrpdTr15(trpdMem, 1U, cqRingNum);
    
        /* Setup TR for 2D copy */
        pTr = UdmaUtils_getTrpdTr15Pointer(trpdMem, 0U);
        pTr->flags = CSL_FMK(UDMAP_TR_FLAGS_TYPE, CSL_UDMAP_TR_FLAGS_TYPE_4D_BLOCK_MOVE_REPACKING_INDIRECTION);
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_STATIC, 0U);
        
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_EOL, CSL_UDMAP_TR_FLAGS_EOL_MATCH_SOL_EOL);
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_EVENT_SIZE, CSL_UDMAP_TR_FLAGS_EVENT_SIZE_COMPLETION);
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_TRIGGER0, CSL_UDMAP_TR_FLAGS_TRIGGER_NONE);
        
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_TRIGGER0_TYPE, CSL_UDMAP_TR_FLAGS_TRIGGER_TYPE_ALL);
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_TRIGGER1, CSL_UDMAP_TR_FLAGS_TRIGGER_NONE);
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_TRIGGER1_TYPE, CSL_UDMAP_TR_FLAGS_TRIGGER_TYPE_ALL);
        
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_CMD_ID, 0x25U); /* This will come back in TR response */
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_SA_INDIRECT, 0U);
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_DA_INDIRECT, 0U);
        
        pTr->flags |= CSL_FMK(UDMAP_TR_FLAGS_EOP, 1U);
        
        /* Set source parameters for 2D copy */
        
        pTr->icnt0 = params->width; /* Width in bytes */
        pTr->icnt1 = params->height; /* Height in lines */
        pTr->icnt2 = 1U; /* Number of blocks */
        
        pTr->icnt3 = 1U; /* Number of sets */
        pTr->dim1 = params->src_pitch; /* Source pitch (stride) */
        pTr->dim2 = pTr->icnt0 * pTr->icnt1;
        
        pTr->dim3 = pTr->icnt0 * pTr->icnt1 * pTr->icnt2;
        pTr->addr = params->src_addr;
        pTr->fmtflags = 0x00000000U; /* Linear addressing, 1 byte per elem */
        
        /* Set destination parameters for 2D copy */
        pTr->dicnt0 = params->width; /* Destination width in bytes */
        pTr->dicnt1 = params->height; /* Destination height in lines */
        
        pTr->dicnt2 = 1U; /* Number of blocks */
        pTr->dicnt3 = 1U; /* Number of sets */
        pTr->ddim1 = params->dest_pitch; /* Destination pitch (stride) */
        
        pTr->ddim2 = pTr->dicnt0 * pTr->dicnt1;
        pTr->ddim3 = pTr->dicnt0 * pTr->dicnt1 * pTr->dicnt2;
        pTr->daddr = params->dest_addr;
        
        /* Perform cache writeback */
        CacheP_wb(trpdMem, UDMA_TEST_TRPD_SIZE, CacheP_TYPE_ALLD);
        
        return;
    
    }
    
    
    /* UDMA 2D memcpy function */
    static int32_t App_udmaMemcpy2D(tivxImgSelectParams *prms, const img_select_2d_copy_params_t *params)
    
    {
        int32_t retVal = UDMA_SOK;
        uint64_t pDesc;
        uint32_t trRespStatus;
        uint64_t trpdMemPhy;
        
        /* Initialize TRPD for 2D copy */
        App_udmaTrpdInit2D(prms->udmaChHandle, prms->udmaTrpdMem, params);
        
        trpdMemPhy = (uint64_t) Udma_defaultVirtToPhyFxn(prms->udmaTrpdMem, 0U, NULL);
        
        /* Submit TRPD to channel */
        retVal = Udma_ringQueueRaw(Udma_chGetFqRingHandle(prms->udmaChHandle), trpdMemPhy);
        
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to submit TRPD to UDMA channel\r\n");
            return retVal;
        }
    
        /* Wait for completion */
        SemaphoreP_pend(&prms->udmaDoneSem, SystemP_WAIT_FOREVER);
        
        retVal = Udma_ringDequeueRaw(Udma_chGetCqRingHandle(prms->udmaChHandle), &pDesc);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to dequeue from completion ring\r\n");
            return retVal;
        }
    
        /* Check TR response status */
        CacheP_inv(prms->udmaTrpdMem, UDMA_TEST_TRPD_SIZE, CacheP_TYPE_ALLD);
        trRespStatus = UdmaUtils_getTrpdTr15Response(prms->udmaTrpdMem, 1U, 0U);
        if (CSL_UDMAP_TR_RESPONSE_STATUS_COMPLETE != trRespStatus)
        
        {
            VX_PRINT(VX_ZONE_ERROR,"UDMA 2D transfer failed with status: %d\r\n", trRespStatus);
            retVal = UDMA_EFAIL;
            return retVal;
        }
    
        return retVal;
    }
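One configuration point worth checking in a TR like the one above is that the row width in bytes and the line count fit the TR count fields, and that each pitch covers a full row. The following is a minimal sketch, assuming the TR15 `icnt`/`dicnt` fields are unsigned 16-bit values (please confirm against the `CSL_UdmapTR15` definition in your SDK). `TR_ICNT_MAX`, `copy2d_params_fit` and `to_icnt` are hypothetical names, not SDK APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: TR count fields (icnt0, icnt1, ...) are unsigned 16-bit. */
#define TR_ICNT_MAX 65535U

/* Hypothetical check: do the row width (in bytes) and the line count fit the
 * count fields, and does each pitch cover at least one full row? */
static bool copy2d_params_fit(uint32_t width_bytes, uint32_t height_lines,
                              uint32_t src_pitch, uint32_t dest_pitch)
{
    return (width_bytes >= 1U) && (width_bytes <= TR_ICNT_MAX) &&
           (height_lines >= 1U) && (height_lines <= TR_ICNT_MAX) &&
           (src_pitch >= width_bytes) && (dest_pitch >= width_bytes);
}

/* Narrow a validated value to a 16-bit count; asserts instead of silently
 * truncating an out-of-range value. */
static uint16_t to_icnt(uint32_t value)
{
    assert(value <= TR_ICNT_MAX);
    return (uint16_t)value;
}
```

For a 1920-pixel BGRX row, the width is 1920 * 4 = 7680 bytes, which fits and matches the 7680-byte pitch reported later in this thread.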

  • Hi,

    The time for the UDMA copy can be estimated as follows:

    Image size = 1920 * 960 = 1,843,200 pixels (1800 KB at 1 byte per pixel)

    Size (KB) = image size (pixels) * bytes per pixel / 1024 // for YUV & RGBA

    Time (ms) = [size / bandwidth] * 1000, with size and bandwidth in matching units (for example GB and GB/s)

    From the code, it seems that the image is being copied from DDR to DDR, and that Vision Apps is being used.
    Could you please confirm whether the data involved in this transfer is aligned or unaligned?

    Regards,
    Ben Eapen Thomas

  • Hi,

    How to confirm alignment?

  • Hi,
    You can verify the alignment by checking the source and destination addresses, along with the pitch (the jump between lines) in the configuration settings.
    The alignment requirement is 128 bytes.

    Regards,
    Ben Eapen Thomas

  •   VX_PRINT(VX_ZONE_ERROR, "BGRX - Source addr: 0x%llx, Dest addr: 0x%llx\n", 
                             params_2d.src_addr, params_2d.dest_addr);
      VX_PRINT(VX_ZONE_ERROR, "BGRX - Source pitch: %d, Dest pitch: %d\n", 
                             params_2d.src_pitch, params_2d.dest_pitch);
      VX_PRINT(VX_ZONE_ERROR, "BGRX - Source addr 64-byte aligned: %s, Dest addr 64-byte aligned: %s\n", 
                             (params_2d.src_addr & 0x3F) ? "NO" : "YES",
                             (params_2d.dest_addr & 0x3F) ? "NO" : "YES");
      VX_PRINT(VX_ZONE_ERROR, "BGRX - Source pitch 128-byte aligned: %s, Dest pitch 128-byte aligned: %s\n", 
                         (params_2d.src_pitch & 0x7F) ? "NO" : "YES",
                         (params_2d.dest_pitch & 0x7F) ? "NO" : "YES");

    Hi,
    [C7x_1 ] [107.147538][VX_ZONE_ERROR][tivxImgSelectProcess:547] BGRX - Source addr: 0xc7dea000, Dest addr: 0xcb71f000
    [C7x_1 ] [107.147580][VX_ZONE_ERROR][tivxImgSelectProcess:549] BGRX - Source pitch: 7680, Dest pitch: 7680
    [C7x_1 ] [107.147615][VX_ZONE_ERROR][tivxImgSelectProcess:552] BGRX - Source addr 64-byte aligned: YES, Dest addr 64-byte aligned: YES
    [C7x_1 ] [107.147662][VX_ZONE_ERROR][tivxImgSelectProcess:555] BGRX - Source pitch 128-byte aligned: YES, Dest pitch 128-byte aligned: YES

    It looks aligned
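The printout above tests the addresses against 64 bytes, while the requirement quoted earlier in the thread is 128 bytes. A minimal sketch that checks both addresses and both pitches against a single alignment is shown here; `is_aligned` and `copy2d_is_aligned` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when value is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t value, uint64_t align)
{
    assert((align != 0U) && ((align & (align - 1U)) == 0U));
    return (value & (align - 1U)) == 0U;
}

/* Hypothetical helper: checks both addresses and both pitches of a 2D copy
 * against one alignment (128 bytes is the figure quoted in this thread). */
static bool copy2d_is_aligned(uint64_t src_addr, uint64_t dest_addr,
                              uint32_t src_pitch, uint32_t dest_pitch,
                              uint64_t align)
{
    return is_aligned(src_addr, align) && is_aligned(dest_addr, align) &&
           is_aligned(src_pitch, align) && is_aligned(dest_pitch, align);
}
```

With the logged values (0xc7dea000, 0xcb71f000, pitch 7680), all four checks also pass at 128 bytes, so the conclusion does not change.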

  • Hi,
    Yes, the buffers are aligned.

    Image size = 1920 * 960 = 1,843,200 pixels (1800 KB at 1 byte per pixel)
    Size (KB) = image size (pixels) * bytes per pixel / 1024 // for YUV & RGBA
    Time (ms) = [size / bandwidth] * 1000, with size and bandwidth in matching units (for example GB and GB/s)

    Could you please provide the time value based on this formula?
    Regards,
    Ben Eapen Thomas

  • Sorry,
    I don't know what the bandwidth is (GB/s).
    For the C7x core, is it bandwidth = 1 GHz x 128 bits x 2 / 8 = 32 GB/s?
    Image size = 1920 × 960 × 4 = 7,372,800 bytes = 7.37 MB (RGBA)
    Time = 7.37 / 1024 / 32 * 1000 = 0.23 ms?
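The estimate above can be sketched in C as follows. This is an illustration of the formula only: the 32 GB/s figure is the poster's guess rather than a confirmed number, the result is an ideal lower bound that ignores DDR and interconnect overhead, and `frame_bytes` and `transfer_time_ms` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one frame, assuming a packed format with a whole number of
 * bytes per pixel (for example 4 for BGRX/RGBA). */
static uint64_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Ideal transfer time in milliseconds; bandwidth is in bytes per second. */
static double transfer_time_ms(uint64_t bytes, double bandwidth_bytes_per_s)
{
    assert(bandwidth_bytes_per_s > 0.0);
    return ((double)bytes / bandwidth_bytes_per_s) * 1000.0;
}
```

For example, `transfer_time_ms(frame_bytes(1920U, 960U, 4U), 32.0e9)` evaluates the 1920 × 960 RGBA case at the assumed 32 GB/s.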

  • Hi,

    copying YUV takes 10 milliseconds and copying BGRX takes 25 milliseconds

    How are you calculating this time?  

    Regards,
    Sivadeep

  • GRAPH:       tidl_graph (#nodes =  10, #executions =    182)
     NODE:      VPAC_MSC1:               ScalerNode: avg =  15906 usecs, min/max =  13226 /  18849 usecs, #executions =        182
     NODE:       DSP_C7-2:           img_merge_node: avg =   9425 usecs, min/max =   5827 /  14750 usecs, #executions =        182
     NODE:       DSP_C7-2:            ODPreProcNode: avg =  16530 usecs, min/max =  12044 /  24060 usecs, #executions =        182
     NODE:       DSP_C7-2:               ODTIDLNode: avg =  33146 usecs, min/max =  31029 /  35150 usecs, #executions =        182
     NODE:       DSP_C7-2:    DrawBoxDetectionsNode: avg =   7075 usecs, min/max =   3942 /  13326 usecs, #executions =        182
     NODE:      VPAC_MSC2:            mosaic_node_1: avg =   9862 usecs, min/max =   3085 /  15171 usecs, #executions =        182
     NODE:          MPU-2:         test_libyuv_node: avg =   1938 usecs, min/max =   1506 /   7780 usecs, #executions =        182
     NODE:       DSP_C7-1:            ImSelectNode: avg =  25303 usecs, min/max =  16373 /  31918 usecs, #executions =        182
     NODE:          MPU-0:           srv_write_node: avg =   2597 usecs, min/max =     21 / 459651 usecs, #executions =        182
     NODE:          MPU-3:           AVM Putod Node: avg =    377 usecs, min/max =     83 /   1649 usecs, #executions =        186

    From the node performance printout: ImSelectNode: avg = 25303 usecs (RGBA)

  • Hi,

    Since it’s transferring data to DDR, there might be some overhead. Let me check with the available values.
    Also, I need to verify if this is the correct method for calculating the DMA transfer time. I'll check these and get back to you.

    Regards,
    Sivadeep

  • Hi,

    From the node performance printout: ImSelectNode: avg = 25303 usecs (RGBA)

    Does this node contain only the DMA transfer? Looking through your code, that does not seem to be the case.

    Regards,
    Sivadeep

  • Hi,

    Yes, only the UDMA copy is used.

    In the tivxImgSelectProcess function, it copies either YUV or BGRX; only one of the two is selected.

    What else did you see?

    Regards

  • Hi,

    In the tivxImgSelectProcess function

    In this TRPD initialization function, the checks are in place, right? To measure the transfer time, you should only consider the time from DMA submission to completion.

    Regards,
    Sivadeep

  • Hi,

    This is the timing code I added; it covers only the DMA copy.

     

    int32_t App_udmaMemcpy2D( const img_select_2d_copy_params_t *params)
    {
        int32_t         retVal = UDMA_SOK;
        uint64_t        pDesc;
        uint32_t        trRespStatus;
        uint64_t        trpdMemPhy;
        uint64_t current_time = 0;
        current_time = tivxPlatformGetTimeInUsecs();
    
    
        /* Initialize TRPD for 2D copy */
        App_udmaTrpdInit2D(g_udmaChHandle, g_udmaTrpdMem, params);
    
        trpdMemPhy = (uint64_t) Udma_defaultVirtToPhyFxn(g_udmaTrpdMem, 0U, NULL);
        /* Submit TRPD to channel */
        retVal = Udma_ringQueueRaw(Udma_chGetFqRingHandle(g_udmaChHandle), trpdMemPhy);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to submit TRPD to UDMA channel\r\n");
            return retVal;
        }
     /* Wait for completion */
        SemaphoreP_pend(&g_udmaDoneSem, SystemP_WAIT_FOREVER);
    
        retVal = Udma_ringDequeueRaw(Udma_chGetCqRingHandle(g_udmaChHandle), &pDesc);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to dequeue from completion ring\r\n");
            return retVal;
        }
     /* Check TR response status */
        CacheP_inv(g_udmaTrpdMem, UDMA_TEST_TRPD_SIZE, CacheP_TYPE_ALLD);
        trRespStatus = UdmaUtils_getTrpdTr15Response(g_udmaTrpdMem, 1U, 0U);
        if (CSL_UDMAP_TR_RESPONSE_STATUS_COMPLETE != trRespStatus)
        {
            VX_PRINT(VX_ZONE_ERROR,"UDMA 2D transfer failed with status: %d\r\n", trRespStatus);
            retVal = UDMA_EFAIL;
            return retVal;
        }
    
        VX_PRINT(VX_ZONE_ERROR,"App_udmaMemcpy2D time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
    
        return retVal;
    }

    VX_PRINT(VX_ZONE_ERROR,"App_udmaMemcpy2D time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );

    Regards

  • /mcu_plus_sdk_j722s_11_00_00_12/examples/drivers/udma/udma_memcpy_interrupt/udma_memcpy_interrupt.c

    This is the source code of the example you referred to. It performs a 1D copy.

    I changed the configuration to 2D; everything else is the same.

    1D is slower than 2D.

  • Hi,

    Can you check with this: measure the transfer time between submission and wait.

    int32_t App_udmaMemcpy2D( const img_select_2d_copy_params_t *params)
    {
        int32_t         retVal = UDMA_SOK;
        uint64_t        pDesc;
        uint32_t        trRespStatus;
        uint64_t        trpdMemPhy;
        uint64_t current_time = 0;
       
    
    
        /* Initialize TRPD for 2D copy */
        App_udmaTrpdInit2D(g_udmaChHandle, g_udmaTrpdMem, params);
    
        trpdMemPhy = (uint64_t) Udma_defaultVirtToPhyFxn(g_udmaTrpdMem, 0U, NULL);
        /* Submit TRPD to channel */
         current_time = tivxPlatformGetTimeInUsecs();
        retVal = Udma_ringQueueRaw(Udma_chGetFqRingHandle(g_udmaChHandle), trpdMemPhy);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to submit TRPD to UDMA channel\r\n");
            return retVal;
        }
     /* Wait for completion */
        SemaphoreP_pend(&g_udmaDoneSem, SystemP_WAIT_FOREVER);
        
        VX_PRINT(VX_ZONE_ERROR,"Transfer time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
        
    
        retVal = Udma_ringDequeueRaw(Udma_chGetCqRingHandle(g_udmaChHandle), &pDesc);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to dequeue from completion ring\r\n");
            return retVal;
        }
     /* Check TR response status */
        CacheP_inv(g_udmaTrpdMem, UDMA_TEST_TRPD_SIZE, CacheP_TYPE_ALLD);
        trRespStatus = UdmaUtils_getTrpdTr15Response(g_udmaTrpdMem, 1U, 0U);
        if (CSL_UDMAP_TR_RESPONSE_STATUS_COMPLETE != trRespStatus)
        {
            VX_PRINT(VX_ZONE_ERROR,"UDMA 2D transfer failed with status: %d\r\n", trRespStatus);
            retVal = UDMA_EFAIL;
            return retVal;
        }
    
        VX_PRINT(VX_ZONE_ERROR,"App_udmaMemcpy2D time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
    
        return retVal;
    }


    Regards,
    Sivadeep

  • Hi,

    int32_t App_udmaMemcpy2D( const img_select_2d_copy_params_t *params)
    {
        int32_t         retVal = UDMA_SOK;
        uint64_t        pDesc;
        uint32_t        trRespStatus;
        uint64_t        trpdMemPhy;
        uint64_t current_time = 0;
       
    
         current_time = tivxPlatformGetTimeInUsecs();
    
        /* Initialize TRPD for 2D copy */
        App_udmaTrpdInit2D(g_udmaChHandle, g_udmaTrpdMem, params);
        VX_PRINT(VX_ZONE_ERROR,"App_udmaTrpdInit2D time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
    
        trpdMemPhy = (uint64_t) Udma_defaultVirtToPhyFxn(g_udmaTrpdMem, 0U, NULL);
        /* Submit TRPD to channel */
        retVal = Udma_ringQueueRaw(Udma_chGetFqRingHandle(g_udmaChHandle), trpdMemPhy);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to submit TRPD to UDMA channel\r\n");
            return retVal;
        }
        VX_PRINT(VX_ZONE_ERROR,"Udma_ringQueueRaw time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
    
     /* Wait for completion */
        SemaphoreP_pend(&g_udmaDoneSem, SystemP_WAIT_FOREVER);
        
        VX_PRINT(VX_ZONE_ERROR,"Transfer time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
        
    
        retVal = Udma_ringDequeueRaw(Udma_chGetCqRingHandle(g_udmaChHandle), &pDesc);
        if (retVal != UDMA_SOK)
        {
            VX_PRINT(VX_ZONE_ERROR,"Failed to dequeue from completion ring\r\n");
            return retVal;
        }
     /* Check TR response status */
        CacheP_inv(g_udmaTrpdMem, UDMA_TEST_TRPD_SIZE, CacheP_TYPE_ALLD);
        trRespStatus = UdmaUtils_getTrpdTr15Response(g_udmaTrpdMem, 1U, 0U);
        if (CSL_UDMAP_TR_RESPONSE_STATUS_COMPLETE != trRespStatus)
        {
            VX_PRINT(VX_ZONE_ERROR,"UDMA 2D transfer failed with status: %d\r\n", trRespStatus);
            retVal = UDMA_EFAIL;
            return retVal;
        }
    
        VX_PRINT(VX_ZONE_ERROR,"App_udmaMemcpy2D time : %d\r\n", (int)(tivxPlatformGetTimeInUsecs() - current_time)/1000 );
    
        return retVal;
    }
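In the instrumented code above, every print subtracts the same `current_time` and divides by 1000, so each value is the cumulative time since the start, truncated to whole milliseconds. A minimal sketch of per-stage timing in microseconds follows. `lap_timer_t`, `lap_start` and `lap_us` are hypothetical names, and the timestamps are passed in so the helper does not depend on any SDK call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lap timer: remembers the previous timestamp so that each call
 * returns the time spent in the stage that just finished. */
typedef struct
{
    uint64_t last_us;
} lap_timer_t;

/* Record the starting timestamp (in microseconds). */
static void lap_start(lap_timer_t *timer, uint64_t now_us)
{
    assert(timer != NULL);
    timer->last_us = now_us;
}

/* Return microseconds elapsed since the previous lap and start a new lap. */
static uint64_t lap_us(lap_timer_t *timer, uint64_t now_us)
{
    uint64_t delta;

    assert(timer != NULL);
    delta = now_us - timer->last_us;
    timer->last_us = now_us;
    return delta;
}
```

In the kernel, `now_us` would be the value returned by `tivxPlatformGetTimeInUsecs()`, sampled once after the TRPD init, once after the ring queue call, and once after the semaphore pend. Keeping the values in microseconds avoids reporting short stages as 0.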



  • Hi,

    Which PDK example are you using for this BCDMA?

    These are the throughput numbers I got from the PDK documentation.

    Regards,

    Sivadeep

  • Hi,

    Could you please check the time between submission and wait for transfer completion like the below.

    Regards,
    Sivadeep

  • Hi,

    I don't know which PDK example mine corresponds to.

    My SDK version is TDA4VEN_VISION_V11_00.
    Reference example: ./mcu_plus_sdk_j722s_11_00_00_12/examples/drivers/udma/udma_memcpy_interrupt/j722s-evm/c75ss0-0_freertos

  • Hi,

    Are you running anything else in parallel?

    Could you share the steps you followed to integrate it? I will try it on my side as well.

    Regards,

    Sivadeep

  • Hi,

    I have AVM and 3 TIDL models running.

    My project is too big to share. But you can register the code I sent above as a kernel; a simple example would also work.

  • Hi,

    On which cores are you running TIDL, and on which core does this DMA transfer run?

    Could you please check the execution time without using TIDL if possible?

    Regards,
    Sivadeep

  • Hi,

    The TIDL model runs on C7_2; the DMA transfer runs on C7_1.

    I need some time to test without TIDL; please wait a moment.

  • Summary of CPU load,
    ====================
    
    CPU: mpu1_0: TOTAL LOAD =  30.65 % ( HWI =   1.97 %, SWI =   0.29 % )
    CPU: mcu2_0: TOTAL LOAD =   1.25 % ( HWI =   0. 0 %, SWI =   0. 0 % )
    CPU:  c7x_1: TOTAL LOAD =   0.46 % ( HWI =   0. 0 %, SWI =   0. 0 % )
    CPU:  c7x_2: TOTAL LOAD =   0.13 % ( HWI =   0. 0 %, SWI =   0. 0 % )
    
    HWA performance statistics,
    ===========================
    
    HWA:   LDC : LOAD =   7.35 % ( 39 MP/s )
    HWA:   GPU : LOAD =  72. 0 % ( 92 MP/s )
    
    DDR performance statistics,
    ===========================
    
    DDR: READ  BW: AVG =    595 MB/s, PEAK =   4999 MB/s
    DDR: WRITE BW: AVG =    667 MB/s, PEAK =   6091 MB/s
    DDR: TOTAL BW: AVG =   1262 MB/s, PEAK =  11090 MB/s
    
    GRAPH:        avm_graph (#nodes =   3, #executions =    347)
     NODE:       CAPTURE1:             capture_node: avg =  33042 usecs, min/max =     47 /  44608 usecs, #executions =        347
     NODE:      VPAC_LDC1:                 ldc_node: avg =   9184 usecs, min/max =   8676 /  14326 usecs, #executions =        347
     NODE:          MPU-0:      OpenGL_APS_SRV_Node: avg =  26002 usecs, min/max =  23403 /  90819 usecs, #executions =        347
     NODE:       DSP_C7-1:            ImgSelectNode: avg =  20351 usecs, min/max =  15204 /  24047 usecs, #executions =        347
     NODE:          CSITX:                CsitxNode: avg =  16657 usecs, min/max =  16623 /  16750 usecs, #executions =        347
    

    This is all the node information, along with some resource statistics.

    With TIDL removed, the time has decreased to around 19-20 ms.

  • Hi,

    With TIDL removed, the time has decreased to around 19-20 ms.

    Thanks for the update. I'm checking this timing internally. Will update my findings in this thread.

    Regards,
    Sivadeep

  • Hi,

    Can you check the execution time without running the other components (capture, VPAC, etc.)?

    Regards,

    Sivadeep

  • Hi,

    I disabled all other functions, leaving only the DMA copy. The time is 15 milliseconds

  • Hi,

    The time is 15 milliseconds

    The current time aligns with the test numbers. The earlier time was inflated by other masters accessing DDR at the same time.

    Regards,
    Sivadeep

  • Hi,

    How is this time calculated? Can you teach me? Thank you.

    7.2 MB / 0.015 s = 480 MB/s.

    This does not match the numbers in the image.

  • Hi,


    The PDK data represents read + write throughput.

    Calculation:

    Total Data = Read Size + Write Size = 7.2 MB + 7.2 MB = 14.4 MB
    Time Taken = Total Data / Throughput
               = 14.4 MB / 949 MB/sec
               ≈ 15 ms


    Regards,
    Sivadeep
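The calculation in the post above can be sketched as a pair of helpers. This is a rough illustration, assuming 1 MB = 10^6 bytes and a memory-to-memory copy in which every byte is read once and written once. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A memory-to-memory copy reads every byte once and writes it once. */
static uint64_t total_traffic_bytes(uint64_t copy_bytes)
{
    return 2U * copy_bytes;
}

/* Expected copy time in ms for a combined read + write throughput in MB/s
 * (assumption: 1 MB = 10^6 bytes). */
static double expected_copy_ms(uint64_t copy_bytes, double rw_throughput_mb_per_s)
{
    double total_mb;

    assert(rw_throughput_mb_per_s > 0.0);
    total_mb = (double)total_traffic_bytes(copy_bytes) / 1.0e6;
    return (total_mb / rw_throughput_mb_per_s) * 1000.0;
}

/* Combined read + write throughput in MB/s implied by a measured copy time. */
static double measured_rw_throughput(uint64_t copy_bytes, double measured_ms)
{
    double total_mb;

    assert(measured_ms > 0.0);
    total_mb = (double)total_traffic_bytes(copy_bytes) / 1.0e6;
    return total_mb / (measured_ms / 1000.0);
}
```

Read this way, the 480 MB/s computed from 7.2 MB in 15 ms is the one-way copy rate. Doubling it for read plus write gives about 960 MB/s, which is close to the 949 MB/s figure used in the calculation.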

  • Hi,

    Can we support 4CH?

  • Hi,

    Yes, the parameters can be changed to support that. Since the original question has been answered, I think we can close this thread and start a new one for the new discussion.

    Regards,
    Sivadeep

  • Thanks. Closing this thread since the original question has been answered