RTOS: Using Vlib apis

Other Parts Discussed in Thread: SYSBIOS

Tool/software: TI-RTOS

Hello,

I want to use APIs available in VLIB on TDA2x, such as:

Morphological Dilation
Morphological Erosion

I went through the document at ti_components/algorithms/vlib_c66x_3_3_0_3/docs/VLIB_Users_Manual.chm

I also read this post, e2e.ti.com/.../559631, but that didn't solve my issue.

I have used the "Morphological Dilation" library as follows:

1. Included "#include <ti/vlib/vlib.h>" in the plugin C file.

2. Called this function:
     VLIB_dilate_bin_square((uint8_t *)pSysVideoFrameBufferInput->bufAddr[0],
                                              (uint8_t *)pSysVideoFrameBufferOutput->bufAddr[0],
                                              pInputChInfo->width,
                                              pInputChInfo->pitch[0]
                                          );

3. The path to this library is already added in vision_sdk/build/rtos/makerules/env.mk at lines 195 and 196.

In the current situation, I don't get any errors at build time or run time, but the output screen is green, i.e. no output is displayed.

Is this the correct way to use the library functions, or am I missing something else?

Regards,

Kajal

  • Are there any updates on this issue?

    Regards,
    Kajal
  • Kajal,

    I have a few questions:

    Q1. What is the data format of the pSysVideoFrameBufferInput->bufAddr[0] and pSysVideoFrameBufferOutput->bufAddr[0] buffers? This API expects packed binary data, meaning each pixel is only 1 bit. In other words, if "cols" is 32, the input and output data are 4 bytes each.

    If your input and output are not packed like this, you have 2 options:
    1. Use the following functions to pack before and unpack after: VLIB_packMask32, VLIB_unpackMask32
    2. Consider whether VXLIB_dilate_3x3_i8u_o8u is what you want.
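
To make the packed layout concrete, here is a minimal plain-C sketch of turning one byte per pixel into one bit per pixel. `packed_bytes` and `pack_mask32_ref` are hypothetical helper names, not VLIB functions, and placing pixel 0 in the most significant bit of each 32-bit word is an assumption made only for illustration; the VLIB_packMask32 header is the place to confirm the layout the library really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for `cols` pixels at 1 bit per pixel. */
size_t packed_bytes(size_t cols)
{
    assert(cols % 8 == 0);
    return cols / 8;
}

/* Hypothetical reference packer, NOT the VLIB implementation.
 * Any non-zero input byte becomes a 1 bit.
 * Assumption: pixel 0 lands in the most significant bit of word 0. */
void pack_mask32_ref(const uint8_t *pixels, uint32_t *packed, size_t cols)
{
    assert(cols % 32 == 0);
    for (size_t w = 0; w < cols / 32; w++) {
        uint32_t word = 0;
        for (size_t b = 0; b < 32; b++) {
            word = (word << 1) | (pixels[w * 32 + b] != 0 ? 1u : 0u);
        }
        packed[w] = word;
    }
}
```

With this layout, 32 pixels occupy exactly one 32-bit word (4 bytes), which matches the "cols is 32, 4 bytes" example above.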

    It looks like you passed the width as "cols". This means you will only operate on one line of the image. To process the whole image, you need to do one of the following:
    1. If the width is equal to the stride, then set cols = width*height.
    2. If the width is less than the stride, then you need to call the VLIB function in a loop, one iteration per line, updating the pointers to each line accordingly. ("pitch" is just used to determine the start of each line so that the 3x3 mask can align appropriately from one line to the next.)
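
The two cases above can be sketched as a calling pattern. `process_rows` and `stub_op` are hypothetical names: `stub_op` stands in for a packed-binary call with the (in, out, cols, pitch) argument order used in this thread and only records what it was asked to do. The sketch assumes width and stride are given in pixels, are multiples of 8, and that the packed buffers keep the same stride, so a line starts every stride/8 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Recorded by the stand-in so the calling pattern can be inspected. */
int32_t g_calls = 0;
int32_t g_last_cols = 0;

/* Hypothetical stand-in for a packed-binary operation; it touches no memory. */
void stub_op(const uint8_t *in, uint8_t *out, int32_t cols, int32_t pitch)
{
    (void)in; (void)out; (void)pitch;
    g_calls++;
    g_last_cols = cols;
}

typedef void (*packed_op_fn)(const uint8_t *in, uint8_t *out,
                             int32_t cols, int32_t pitch);

/* Returns the number of calls made to `op`. */
int32_t process_rows(packed_op_fn op, const uint8_t *in, uint8_t *out,
                     int32_t width, int32_t rows, int32_t stride)
{
    assert(width % 8 == 0 && stride % 8 == 0 && width <= stride);

    if (width == stride) {
        /* Case 1: lines are contiguous, so one call covers all rows. */
        op(in, out, width * rows, stride);
        return 1;
    }
    /* Case 2: padded lines, so call once per line and step the pointers. */
    for (int32_t y = 0; y < rows; y++) {
        int32_t offset = (y * stride) / 8;   /* byte offset in packed data */
        op(in + offset, out + offset, width, stride);
    }
    return rows;
}
```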

    Jesse
  • Hello Jesse,

    " Q1. What is the data format of pSysVideoFrameBufferInput->bufAddr[0] and pSysVideoFrameBufferOutput->bufAddr[0] buffers? "

    >> SYSTEM_DF_YUV420SP_UV

    I understand that this API expects packed binary data, so I have used VLIB_packMask32 and VLIB_unpackMask32 as follows:

    in_pack_size   =  (pInputChInfo->width*pInputChInfo->height) / 8 + 4; /* bit packed binary input with zero padding */
    out_pack_size = (pInputChInfo->width*pInputChInfo->height) / 8;     /* bit packed binary output */

    In_mask32packed         =  (uint32_t *) VLIB_memalign(8, in_pack_size);
    Out_mask32packed      =  (uint32_t *) VLIB_memalign(8, out_pack_size);

    VLIB_packMask32((uint8_t *)pSysVideoFrameBufferInput->bufAddr[0],
                               In_mask32packed,
                               (pInputChInfo->width*pInputChInfo->height)
                            );

    VLIB_dilate_bin_square((uint8_t *)In_mask32packed,
                                    (uint8_t *)Out_mask32packed,
                                    (pInputChInfo->width*pInputChInfo->height),
                                    pInputChInfo->pitch[0]
                                   );

    VLIB_unpackMask32(Out_mask32packed,
                                (uint8_t *)pSysVideoFrameBufferOutput->bufAddr[0],
                                (pInputChInfo->width*pInputChInfo->height)
                                  );

    Is this the correct way?

    Regards,

    Kajal

  • This looks mostly correct, assuming your pitch is the same as the width.

    I suggest double checking the following files:

    1. vlib_c66x_3_3_0_3\packages\ti\vlib\src\VLIB_dilate_bin_square\VLIB_dilate_bin_square_d.c

    - This is an example of the code that needs to be called. As you can see, 'cols' is the number of output pixels to be produced, and since this is a 3x3 filter, the height needs to be reduced by 2 so that we don't access pixels beyond the end of the buffer. Your code doesn't seem to be doing this, so you may be reading data past the end of the allocated buffer and taking more cycles than necessary.

    2. vlib_c66x_3_3_0_3\packages\ti\vlib\src\VLIB_dilate_bin_square\c66\VLIB_dilate_bin_square.h

    - This file has all the documentation of the API. I advise going through the "Assumptions" section to ensure you are meeting all the assumptions of the function. The same is true of the pack and unpack APIs; please refer to their header files.

    By the way, these header files are used as input to the doxygen documentation generation. From the package release notes, you can navigate to "User Manual"->"API Reference", and then choose the function category and the functions you are using. All of the documentation, explanations, and assumptions are listed there.
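
As a rough illustration of the sizing point in item 1, the helpers below compute the `cols` value and the packed output size for a full-frame call when the pitch equals the width. Both function names are hypothetical, and the `height - 2` rule is taken from the description in this reply rather than from the VLIB source, so it should be checked against the example file and the "Assumptions" section of the header.

```c
#include <assert.h>
#include <stdint.h>

/* Number of output pixels for a 3x3 filter over a width x height frame,
 * assuming pitch == width. Two rows are lost to the filter border. */
int32_t dilate_cols_full_frame(int32_t width, int32_t height)
{
    assert(width > 0 && height >= 3);
    return width * (height - 2);
}

/* Bytes of bit-packed output produced for that many pixels. */
int32_t dilate_out_bytes(int32_t width, int32_t height)
{
    return dilate_cols_full_frame(width, height) / 8;
}
```

For a 1920x1080 frame this gives 1920 * 1078 = 2069760 output pixels, or 258720 bytes of packed output, rather than the full 1920 * 1080 passed in the posted code.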

    Jesse

  • Also, this function is typically used on a binary image (pixels are either 0 or non-zero). Usually, a threshold type operation happens before this to suppress some values to zero. The output will be a binary image for sure (values of 0 and 1). So if you want to see the results on a display, you may want to saturate non-zero values to 255 so you can see the difference.
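
A minimal sketch of the two steps mentioned above, thresholding before the binary operation and stretching the 0/1 result for display. `threshold_to_binary` and `binary_to_display` are hypothetical helpers operating on unpacked 8-bit pixels, and the threshold value is up to the application; neither is part of VLIB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pixels strictly above `thresh` become 1, everything else becomes 0. */
void threshold_to_binary(const uint8_t *src, uint8_t *dst, size_t n, uint8_t thresh)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = (src[i] > thresh) ? 1 : 0;
    }
}

/* Map a 0/non-zero image to 0/255 in place so the result is visible on a display. */
void binary_to_display(uint8_t *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        buf[i] = buf[i] ? 255 : 0;
    }
}
```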
  • Hello Jesse,

    Pitch and width are the same, i.e. 1920, taking input from HDMI.

    During the aligned allocation,

    In_mask32packed = (uint32_t *) VLIB_memalign(8, in_pack_size);
    Out_mask32packed = (uint32_t *) VLIB_memalign(8, out_pack_size);

    I am getting the error below on the console:

    [DSP1 ] 30.880650 s: ### XDC ASSERT - ERROR CALLBACK START ###
    [DSP1 ] 30.880680 s:
    [DSP1 ] 30.880772 s: out of memory: handle=0x87733708, size=259208
    [DSP1 ] 30.880802 s:
    [DSP1 ] 30.880802 s: ### XDC ASSERT - ERROR CALLBACK END ###
    [DSP1 ] 30.880833 s:
    [DSP1 ] 30.881046 s: ti.sysbios.heaps.HeapMem: line 221: ti.sysbios.heaps.HeapMem: line 221: out of memory: handle=0x87733708, size=259208
    [DSP1 ] 30.902580 s:
    [DSP1 ] 30.902641 s: ### XDC ASSERT - ERROR CALLBACK START ###
    [DSP1 ] 30.902641 s:
    [DSP1 ] 30.902763 s: out of memory: handle=0x87733708, size=259212

    Do you have any idea what causes this?

    Regards,
    Kajal
  • It looks like you have run out of heap memory. May I suggest allocating only one line's worth of memory for each buffer? Then you can call all 3 functions in a loop, once per line. This saves memory and at the same time improves performance through better cache locality.
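
The suggestion above can be sketched as follows: the two scratch buffers are sized for a single line (1920/8 = 240 bytes of packed data for the width in this thread, instead of the roughly 259 KB requests shown in the log), allocated once, reused for every line, and freed afterwards. `process_by_line`, `line_fn` and `copy_line` are hypothetical names; plain `malloc` stands in for the allocator used on the target, and `copy_line` is a trivial stand-in for the pack, dilate and unpack sequence.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-line callback: processes one line using the scratch buffers. */
typedef void (*line_fn)(const uint8_t *in_line, uint8_t *out_line,
                        uint32_t *in_packed, uint32_t *out_packed,
                        int32_t width);

/* Stand-in callback that just copies the line; a real one would wrap the
 * three library calls discussed in this thread. */
void copy_line(const uint8_t *in_line, uint8_t *out_line,
               uint32_t *in_packed, uint32_t *out_packed, int32_t width)
{
    (void)in_packed; (void)out_packed;
    memcpy(out_line, in_line, (size_t)width);
}

/* Returns 0 on success, -1 if the scratch buffers cannot be allocated. */
int process_by_line(const uint8_t *in, uint8_t *out,
                    int32_t width, int32_t height, int32_t pitch, line_fn fn)
{
    assert(width > 0 && width % 32 == 0 && pitch >= width);

    size_t in_bytes  = (size_t)width / 8 + 4;  /* same padding as the posted code */
    size_t out_bytes = (size_t)width / 8;
    uint32_t *in_packed  = malloc(in_bytes);
    uint32_t *out_packed = malloc(out_bytes);
    if (in_packed == NULL || out_packed == NULL) {
        free(in_packed);
        free(out_packed);
        return -1;
    }

    for (int32_t y = 0; y < height; y++) {
        fn(in + (size_t)y * pitch, out + (size_t)y * pitch,
           in_packed, out_packed, width);
    }

    free(in_packed);   /* released once per frame, so nothing accumulates */
    free(out_packed);
    return 0;
}
```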

    Jesse
  • Hello Jesse,

    I have done what you suggested, and the "out of memory" issue is resolved.

    But there is still no output on the screen, and there are no runtime errors either.

    I am attaching the code for your reference.

    dilationLib.c
    #include "dilationLibLink_priv.h"
    #include <include/link_api/system_common.h>
    #include <src/rtos/utils_common/include/utils_mem.h>
    
    #include <ti/vlib/vlib.h>
    
    /**
     *******************************************************************************
     *
     * \brief Implementation of function to init plugins()
     *
     *        This function will be called by AlgorithmLink_initAlgPlugins, so as
     *        register plugins of frame copy algorithm
     *
     * \return  SYSTEM_LINK_STATUS_SOK on success
     *
     *******************************************************************************
     */
    Int32 AlgorithmLink_DilationLib_initPlugin()
    {
        AlgorithmLink_FuncTable pluginFunctions;
        UInt32 algId = (UInt32)-1;
    
        pluginFunctions.AlgorithmLink_AlgPluginCreate = AlgorithmLink_dilationlibCreate;
        pluginFunctions.AlgorithmLink_AlgPluginProcess = AlgorithmLink_dilationlibProcess;
        pluginFunctions.AlgorithmLink_AlgPluginControl = AlgorithmLink_dilationlibControl;
        pluginFunctions.AlgorithmLink_AlgPluginStop = AlgorithmLink_dilationlibStop;
        pluginFunctions.AlgorithmLink_AlgPluginDelete = AlgorithmLink_dilationlibDelete;
    
    #ifdef BUILD_DSP
        algId =  ALGORITHM_LINK_DSP_ALG_DILATIONLIB;
    #endif
    
    
        AlgorithmLink_registerPlugin(algId, &pluginFunctions);
    
        return SYSTEM_LINK_STATUS_SOK;
    }
    
    
    Int32 AlgorithmLink_dilationlibCreate(void * pObj, void * pCreateParams)
    {
        //Alg_Dilation_Obj            * algHandle;
        Int32                        frameIdx;
        Int32                        status    = SYSTEM_LINK_STATUS_SOK;
        UInt32                       maxHeight;
        UInt32                       maxWidth;
        System_Buffer              * pSystemBuffer;
        System_VideoFrameBuffer    * pSystemVideoFrameBuffer;
        System_LinkInfo              prevLinkInfo;
        Int32                        outputQId;
        Int32                        channelId;
        Int32                        numChannelsUsed;
        Int32                        numInputQUsed;
        Int32                        numOutputQUsed;
        UInt32                       prevLinkQueId;
        UInt32                       dataFormat;
        System_LinkChInfo          * pOutChInfo;
        System_LinkChInfo          * pPrevChInfo;
        UInt32                       prevChInfoFlags;
    
        AlgorithmLink_DilationLibObj          * pDilationLibObj;
        AlgorithmLink_DilationLibCreateParams * pDilationLibCreateParams;
        AlgorithmLink_OutputQueueInfo       * pOutputQInfo;
        AlgorithmLink_InputQueueInfo        * pInputQInfo;
    
    
        pDilationLibCreateParams = (AlgorithmLink_DilationLibCreateParams *)pCreateParams;
    
        /*
         * Space for Algorithm specific object gets allocated here.
         * Pointer gets recorded in algorithmParams
         */
        pDilationLibObj = (AlgorithmLink_DilationLibObj *) Utils_memAlloc(UTILS_HEAPID_DDR_CACHED_LOCAL, sizeof(AlgorithmLink_DilationLibObj), 32);
    
        UTILS_assert(pDilationLibObj!=NULL);
    
        pOutputQInfo = &pDilationLibObj->outputQInfo;
        pInputQInfo  = &pDilationLibObj->inputQInfo;
    
        AlgorithmLink_setAlgorithmParamsObj(pObj, pDilationLibObj);
    
        /*
         * Taking copy of needed create time parameters in local object for future
         * reference.
         */
        pDilationLibObj->algLinkCreateParams.maxHeight = pDilationLibCreateParams->maxHeight;
        pDilationLibObj->algLinkCreateParams.maxWidth  = pDilationLibCreateParams->maxWidth;
        pDilationLibObj->algLinkCreateParams.numOutputFrames = pDilationLibCreateParams->numOutputFrames;
    
        memcpy((void*)(&pDilationLibObj->outQueParams),
               (void*)(&pDilationLibCreateParams->outQueParams),
               sizeof(System_LinkOutQueParams));
        memcpy((void*)(&pDilationLibObj->inQueParams),
               (void*)(&pDilationLibCreateParams->inQueParams),
               sizeof(System_LinkInQueParams));
    
        /*
         * Populating parameters corresponding to Q usage of frame copy
         * algorithm link
         */
        numInputQUsed     = 1;
        numOutputQUsed    = 1;
        numChannelsUsed   = 1;
        pInputQInfo->qMode  = ALGORITHM_LINK_QUEUEMODE_NOTINPLACE;
        pOutputQInfo->qMode = ALGORITHM_LINK_QUEUEMODE_NOTINPLACE;
    
        outputQId                 = 0;
        pOutputQInfo->queInfo.numCh = numChannelsUsed;
    
        /*
         * Channel info of current link will be obtained from previous link.
         * If any of the properties get changed in the current link, then those
         * values need to be updated accordingly in
         * pOutputQInfo->queInfo.chInfo[channelId]
         * In frame copy example, only pitch changes. Hence only it is
         * updated. Other parameters are copied from prev link.
         */
        status = System_linkGetInfo(pDilationLibCreateParams->inQueParams.prevLinkId, &prevLinkInfo);
    
        prevLinkQueId = pDilationLibCreateParams->inQueParams.prevLinkQueId;
    
        pDilationLibObj->numInputChannels = prevLinkInfo.queInfo[prevLinkQueId].numCh;
    
        maxHeight = pDilationLibObj->algLinkCreateParams.maxHeight;
        maxWidth  = pDilationLibObj->algLinkCreateParams.maxWidth;
    
        /*
         * Make pitch a multiple of ALGORITHMLINK_FRAME_ALIGN, so that if the frame
         * origin is aligned, then individual lines are also aligned
         * Also note that the pitch is kept same independent of width of
         * individual channels
         */
        pDilationLibObj->pitch = maxWidth;
        if(maxWidth % ALGORITHMLINK_FRAME_ALIGN)
        {
            pDilationLibObj->pitch += (ALGORITHMLINK_FRAME_ALIGN -
                                    (maxWidth % ALGORITHMLINK_FRAME_ALIGN));
        }
    
        /*
         * Channel Info Population
         */
        for(channelId =0 ; channelId < pDilationLibObj->numInputChannels; channelId++)
        {
    
          pOutChInfo      = &(pOutputQInfo->queInfo.chInfo[channelId]);
          pPrevChInfo     = &(prevLinkInfo.queInfo[prevLinkQueId].chInfo[channelId]);
          prevChInfoFlags = pPrevChInfo->flags;
    
          /*
           * Certain channel info parameters simply get defined by previous link
           * channel info. Hence copying them to output channel info
           */
          pOutChInfo->startX = pPrevChInfo->startX;
          pOutChInfo->startY = pPrevChInfo->startY;
          pOutChInfo->width  = pPrevChInfo->width;
          pOutChInfo->height = pPrevChInfo->height;
          pOutChInfo->flags  = prevChInfoFlags;
    
          dataFormat = System_Link_Ch_Info_Get_Flag_Data_Format(prevChInfoFlags);
    
          if((dataFormat != SYSTEM_DF_YUV422I_YUYV)
             &&
             (dataFormat != SYSTEM_DF_YUV420SP_UV)
            )
          {
            return SYSTEM_LINK_STATUS_EFAIL;
          }
    
          if(pPrevChInfo->width > maxWidth || pPrevChInfo->height > maxHeight)
          {
            return SYSTEM_LINK_STATUS_EFAIL;
          }
    
          /*
           * Certain channel info parameters are properties of the current link,
           * They are set here.
           */
          pOutChInfo->pitch[0] = pDilationLibObj->pitch;
          pOutChInfo->pitch[1] = pDilationLibObj->pitch;
          pOutChInfo->pitch[2] = pDilationLibObj->pitch;
    
          if(dataFormat == SYSTEM_DF_YUV422I_YUYV)
          {
            pOutChInfo->pitch[0] = pDilationLibObj->pitch * 2;
          }
    
          /*
           * Taking a copy of input channel info in the link object for any future
           * use
           */
          memcpy((void *)&(pDilationLibObj->inputChInfo[channelId]),
                 (void *)&(prevLinkInfo.queInfo[prevLinkQueId].chInfo[channelId]),
                 sizeof(System_LinkChInfo)
                );
        }
    
        /*
         * If any output buffer Q gets used in INPLACE manner, then
         * pOutputQInfo->inQueParams and
         * pOutputQInfo->inputQId need to be populated appropriately.
         */
    
        /*
         * Initializations needed for book keeping of buffer handling.
         * Note that this needs to be called only after setting inputQMode and
         * outputQMode.
         */
        AlgorithmLink_queueInfoInit(pObj,
                                    numInputQUsed,
                                    pInputQInfo,
                                    numOutputQUsed,
                                    pOutputQInfo
                                    );
    
        /*
         * Algorithm creation happens here
         * - Population of create time parameters
         * - Create call for algorithm
         * - Algorithm handle gets recorded inside link object
         */
    
        //pDilationLibObj->createParams.maxHeight    = maxHeight;
        //pDilationLibObj->createParams.maxWidth     = maxWidth;
        pDilationLibObj->frameDropCounter          = 0;
    
        //algHandle = Alg_DilationCreate(&pDilationLibObj->createParams);
        //UTILS_assert(algHandle != NULL);
    
        //pDilationLibObj->algHandle = algHandle;
    
        /*
         * Creation of output buffers for output buffer Q = 0 (Used)
         *  - Connecting video frame buffer to system buffer payload
         *  - Memory allocation for Luma and Chroma buffers (Assume 420 format)
         *  - Put the buffer into empty queue
         */
        outputQId = 0;
    
        for(channelId =0 ; channelId < pDilationLibObj->numInputChannels; channelId++)
        {
          for(frameIdx = 0; frameIdx < pDilationLibObj->algLinkCreateParams.numOutputFrames; frameIdx++)
          {
            pSystemBuffer           = &(pDilationLibObj->buffers[channelId][frameIdx]);
            pSystemVideoFrameBuffer = &(pDilationLibObj->videoFrames[channelId][frameIdx]);
    
            /*
             * Properties of pSystemBuffer, which do not get altered during
             * run time (frame exchanges) are initialized here
             */
            pSystemBuffer->payload     = pSystemVideoFrameBuffer;
            pSystemBuffer->payloadSize = sizeof(System_VideoFrameBuffer);
            pSystemBuffer->bufType     = SYSTEM_BUFFER_TYPE_VIDEO_FRAME;
            pSystemBuffer->chNum       = channelId;
    
            memcpy((void *)&pSystemVideoFrameBuffer->chInfo,
                   (void *)&pOutputQInfo->queInfo.chInfo[channelId],
                   sizeof(System_LinkChInfo));
    
            /*
             * Buffer allocation done for maxHeight, maxWidth and also assuming
             * worst case num planes = 2, for data Format SYSTEM_DF_YUV422I_YUYV
             * run time (frame exchanges) are initialized here
             */
            pSystemVideoFrameBuffer->bufAddr[0] = Utils_memAlloc(
                                               UTILS_HEAPID_DDR_CACHED_SR,
                                               (maxHeight*(pDilationLibObj->pitch)*2),
                                               ALGORITHMLINK_FRAME_ALIGN);
    
            /*
             * Carving out memory pointer for chroma which will get used in case of
             * SYSTEM_DF_YUV422SP_UV
             */
            pSystemVideoFrameBuffer->bufAddr[1] = (void*)(
                (UInt32) pSystemVideoFrameBuffer->bufAddr[0] +
                (UInt32)(maxHeight*(pDilationLibObj->pitch))
                );
    
            UTILS_assert(pSystemVideoFrameBuffer->bufAddr[0] != NULL);
    
            AlgorithmLink_putEmptyOutputBuffer(pObj, outputQId, pSystemBuffer);
          }
        }
    
        pDilationLibObj->linkStatsInfo = Utils_linkStatsCollectorAllocInst(
            AlgorithmLink_getLinkId(pObj), "ALG_DILATIONLIB");
        UTILS_assert(NULL != pDilationLibObj->linkStatsInfo);
    
        pDilationLibObj->isFirstFrameRecv = FALSE;
    
    
        printf("dilation lib create done\n");
        return status;
    }
    
    
    Int32 AlgorithmLink_dilationlibProcess(void * pObj)
    {
        AlgorithmLink_DilationLibObj   * pDilationLibObj;
        //Alg_Dilation_Obj            * algHandle;
        Int32                        inputQId;
        Int32                        outputQId;
        UInt32                       channelId;
        Int32                        status    = SYSTEM_LINK_STATUS_SOK;
        UInt32                       bufId;
        System_BufferList            inputBufList;
        System_BufferList            inputBufListReturn;
        System_BufferList            outputBufListReturn;
        System_Buffer              * pSysBufferInput;
        System_VideoFrameBuffer    * pSysVideoFrameBufferInput;
        System_Buffer              * pSysBufferOutput;
        System_VideoFrameBuffer    * pSysVideoFrameBufferOutput;
        UInt32                       dataFormat;
        UInt32                       outPitch[SYSTEM_MAX_PLANES];
        UInt32                       bufSize[SYSTEM_MAX_PLANES];
        UInt32                       bufCntr;
        UInt32                       numBuffs;
        System_LinkChInfo          * pInputChInfo;
        System_LinkChInfo          * pOutputChInfo;
        Bool                         bufDropFlag;
        System_LinkStatistics      * linkStatsInfo;
    
        //for bit masking
    
        uint32_t  in_pack_size, out_pack_size;
        uint32_t  *restrict In_mask32packed ;
        uint32_t  *restrict Out_mask32packed ;
    
        uint8_t *restrict inBuff;
        uint8_t *restrict outBuff;
        int i;
    
        pDilationLibObj = (AlgorithmLink_DilationLibObj *) AlgorithmLink_getAlgorithmParamsObj(pObj);
    
        linkStatsInfo = pDilationLibObj->linkStatsInfo;
        UTILS_assert(NULL != linkStatsInfo);
    
        //algHandle     = pDilationLibObj->algHandle;
    
        Utils_linkStatsCollectorProcessCmd(linkStatsInfo);
    
        linkStatsInfo->linkStats.newDataCmdCount++;
    
        /*
         * Getting input buffers from previous link
         */
        System_getLinksFullBuffers(pDilationLibObj->inQueParams.prevLinkId,pDilationLibObj->inQueParams.prevLinkQueId, &inputBufList);
    
        if(inputBufList.numBuf)
        {
            if(pDilationLibObj->isFirstFrameRecv==FALSE)
            {
                pDilationLibObj->isFirstFrameRecv = TRUE;
    
                Utils_resetLinkStatistics(
                        &linkStatsInfo->linkStats,
                        pDilationLibObj->numInputChannels,
                        1);
    
                Utils_resetLatency(&linkStatsInfo->linkLatency);
                Utils_resetLatency(&linkStatsInfo->srcToLinkLatency);
            }
    
            for (bufId = 0; bufId < inputBufList.numBuf; bufId++)
            {
    
              pSysBufferInput           = inputBufList.buffers[bufId];
              pSysVideoFrameBufferInput = pSysBufferInput->payload;
    
              channelId = pSysBufferInput->chNum;
    
              if(channelId < pDilationLibObj->numInputChannels)
              {
                linkStatsInfo->linkStats.chStats[channelId].inBufRecvCount++;
              }
    
              /*
               * Error checks can be done on the input buffer and only later,
               * it can be picked for processing
               */
              if((pSysBufferInput->bufType != SYSTEM_BUFFER_TYPE_VIDEO_FRAME) || (channelId >= pDilationLibObj->numInputChannels) )
              {
                bufDropFlag = TRUE;
                linkStatsInfo->linkStats.inBufErrorCount++;
              }
              else
              {
    
              /*
               * Getting free (empty) buffers from pool of output buffers
               */
              outputQId = 0;
    
    
              status = AlgorithmLink_getEmptyOutputBuffer(pObj,
                                                          outputQId,
                                                          channelId,
                                                          &pSysBufferOutput);
    
              /*
               * Get into algorithm processing only if an output frame is available.
               * Else input buffer will be returned back to sender and its a case
               * of frame drop.
               */
              if(status == SYSTEM_LINK_STATUS_SOK)
              {
    
              pSysVideoFrameBufferOutput = pSysBufferOutput->payload;
              pOutputChInfo = &(pDilationLibObj->outputQInfo.queInfo.chInfo[channelId]);
              pInputChInfo  = &(pDilationLibObj->inputChInfo[channelId]);
    
              /*
               * If there is any parameter change on the input channel,
               * then, channel info needs to be read from pSysVideoFrameBufferInput.
               * And then,
               *  - Update the local copies present in OutputQInfo and inputChInfo
               *  - Also update channel info in pSysVideoFrameBufferOutput to
               *    pass on new parameters to next link
               */
    
              if(System_Link_Ch_Info_Get_Flag_Is_Rt_Prm_Update(pSysVideoFrameBufferInput->chInfo.flags))
              {
                pInputChInfo = &(pSysVideoFrameBufferInput->chInfo);
    
                memcpy(&(pDilationLibObj->inputChInfo[channelId]), pInputChInfo, sizeof(System_LinkChInfo));
    
                memcpy(pOutputChInfo, pInputChInfo, sizeof(System_LinkChInfo));
    
                dataFormat = System_Link_Ch_Info_Get_Flag_Data_Format(pInputChInfo->flags);
    
                /*
                 * Upon dataFormat change pitch for plane 0 needs to be updated
                 * Plane 1 is used only for 420 SP case and it need not be altered
                 */
                pOutputChInfo->pitch[0] = pDilationLibObj->pitch;
                if(dataFormat == SYSTEM_DF_YUV422I_YUYV)
                {
                    pOutputChInfo->pitch[0] = pDilationLibObj->pitch * 2;
                }
    
                /*
                 * Also update the Channel info in Output System Buffer to pass it
                 * on to next link
                 */
                memcpy(&(pSysVideoFrameBufferOutput->chInfo), pOutputChInfo, sizeof(System_LinkChInfo));
              }
              else
              {
                /*
                 * Indicating to next link that there has been no parameter update
                 */
                pSysVideoFrameBufferOutput->chInfo.flags = System_Link_Ch_Info_Set_Flag_Is_Rt_Prm_Update(pSysVideoFrameBufferOutput->chInfo.flags,0);
              }
    
              /*
               * Call to the algorithm
               */
              outPitch[0] = pOutputChInfo->pitch[0];
              outPitch[1] = pOutputChInfo->pitch[1];
    
              dataFormat = System_Link_Ch_Info_Get_Flag_Data_Format(pOutputChInfo->flags);
    
              switch (dataFormat)
              {
                  case SYSTEM_DF_YUV422I_YUYV:
                      numBuffs = 1;
                      break;
                  case SYSTEM_DF_YUV420SP_UV:
                      numBuffs = 2;
                      break;
                  default:
                      numBuffs = 1;
                      UTILS_assert (0);
                      break;
              }
    
              pSysBufferOutput->srcTimestamp = pSysBufferInput->srcTimestamp;
              pSysBufferOutput->frameId = pSysBufferInput->frameId;
              pSysBufferOutput->linkLocalTimestamp = Utils_getCurGlobalTimeInUsec();
    
              /*
               * Cache Invalidate of input buffer
               */
              
                bufSize[0]  = ((pInputChInfo->height)*(pInputChInfo->pitch[0]));
                bufSize[1]  = ((pInputChInfo->height)*(pInputChInfo->pitch[1]));
    
                for(bufCntr = 0; bufCntr < numBuffs; bufCntr++)
                {
                    Cache_inv(pSysVideoFrameBufferInput->bufAddr[bufCntr],
                              bufSize[bufCntr],
                              Cache_Type_ALLD,
                              TRUE
                             );
                }
    
              in_pack_size  = (pInputChInfo->width) / 8 + 4; /* bit packed binary input with zero padding */
              out_pack_size = (pInputChInfo->width) / 8;     /* bit packed binary output */
    
              In_mask32packed         =  (uint32_t *) VLIB_memalign(8, in_pack_size);
              Out_mask32packed        =  (uint32_t *) VLIB_memalign(8, out_pack_size);
    
    
    
              inBuff = pSysVideoFrameBufferInput->bufAddr[0];
              outBuff = pSysVideoFrameBufferOutput->bufAddr[0];
    
              for (i = 0; i < pInputChInfo->height; i++)
              {
                VLIB_packMask32(inBuff, In_mask32packed, (pInputChInfo->width));

                VLIB_dilate_bin_square((uint8_t *)In_mask32packed,
                                       (uint8_t *)Out_mask32packed,
                                       (pInputChInfo->width),
                                       pInputChInfo->pitch[0]
                                      );

                VLIB_unpackMask32(Out_mask32packed, outBuff, (pInputChInfo->width));

                inBuff += pInputChInfo->pitch[0];
                outBuff += pInputChInfo->pitch[0];
              }
     
              /*
               * Cache Write back of output buffer to DDR
               */
              
                bufSize[0]  = ((pOutputChInfo->height)*(outPitch[0]));
                bufSize[1]  = ((pOutputChInfo->height)*(outPitch[1]));
                for(bufCntr = 0; bufCntr < numBuffs; bufCntr++)
                {
                  Cache_wb(pSysVideoFrameBufferOutput->bufAddr[bufCntr],bufSize[bufCntr], Cache_Type_ALLD,TRUE);
                }
              
    
              Utils_updateLatency(&linkStatsInfo->linkLatency,pSysBufferOutput->linkLocalTimestamp);
              Utils_updateLatency(&linkStatsInfo->srcToLinkLatency,pSysBufferOutput->srcTimestamp);
    
              linkStatsInfo->linkStats.chStats[pSysBufferInput->chNum].inBufProcessCount++;
              linkStatsInfo->linkStats.chStats[pSysBufferInput->chNum].outBufCount[0]++;
    
              /*
               * Putting filled buffer into output full buffer Q
               * Note that this does not mean algorithm has freed the output buffer
               */
              status = AlgorithmLink_putFullOutputBuffer(pObj,outputQId,pSysBufferOutput);
    
              UTILS_assert(status == SYSTEM_LINK_STATUS_SOK);
    
              /*
               * Informing next link that new data has been put for its
               * processing
               */
              System_sendLinkCmd(pDilationLibObj->outQueParams.nextLink,SYSTEM_CMD_NEW_DATA,NULL);
    
              /*
               * Releasing (Free'ing) output buffer, since algorithm does not need
               * it for any future usage.
               * In case of INPLACE computation, there is no need to free output
               * buffer, since it will be freed as input buffer.
               */
              outputQId                      = 0;
              outputBufListReturn.numBuf     = 1;
              outputBufListReturn.buffers[0] = pSysBufferOutput;
    
              AlgorithmLink_releaseOutputBuffer(pObj,outputQId,&outputBufListReturn);
    
              bufDropFlag = FALSE;
    
              }
              else
              {
                bufDropFlag = TRUE;
    
                linkStatsInfo->linkStats.outBufErrorCount++;
                linkStatsInfo->linkStats.chStats[pSysBufferInput->chNum].inBufDropCount++;
                linkStatsInfo->linkStats.chStats[pSysBufferInput->chNum].outBufDropCount[0]++;
    
              } /* Output Buffer availability */
    
              } /* Input Buffer validity */
    
              /*
               * Releasing (Free'ing) input buffer, since algorithm does not need
               * it for any future usage.
               */
              inputQId                      = 0;
              inputBufListReturn.numBuf     = 1;
              inputBufListReturn.buffers[0] = pSysBufferInput;
              AlgorithmLink_releaseInputBuffer(pObj,inputQId,pDilationLibObj->inQueParams.prevLinkId,pDilationLibObj->inQueParams.prevLinkQueId,&inputBufListReturn,
                                          &bufDropFlag);
            }
    
        }
    
        return status;
    }
    
    Int32 AlgorithmLink_dilationlibControl(void * pObj, void * pControlParams)
    {
        AlgorithmLink_DilationLibObj       * pDilationLibObj;
        AlgorithmLink_ControlParams    * pAlgLinkControlPrm;
        //Alg_Dilation_Obj                * algHandle;
        Int32                        status    = SYSTEM_LINK_STATUS_SOK;
    
        pDilationLibObj = (AlgorithmLink_DilationLibObj *)AlgorithmLink_getAlgorithmParamsObj(pObj);
        //algHandle     = pDilationLibObj->algHandle;
    
        pAlgLinkControlPrm = (AlgorithmLink_ControlParams *)pControlParams;
    
        /*
         * There can be other commands to alter the properties of the alg link
         * or properties of the core algorithm.
         * In this simple example, there is just a control command to print
         * statistics and a default call to algorithm control.
         */
        switch(pAlgLinkControlPrm->controlCmd)
        {
    
            case SYSTEM_CMD_PRINT_STATISTICS:
                AlgorithmLink_dilationlibPrintStatistics(pObj, pDilationLibObj);
                break;
    
            /*
             * The algorithm-level control call is disabled in this example:
             *   status = Alg_DilationControl(algHandle,
             *                                &(pDilationLibObj->controlParams));
             */
            default:
                break;
        }
    
        return status;
    }
    
    Int32 AlgorithmLink_dilationlibStop(void * pObj)
    {
        return SYSTEM_LINK_STATUS_SOK;
    }
    
    Int32 AlgorithmLink_dilationlibDelete(void * pObj)
    {
        //Alg_Dilation_Obj            * algHandle;
        Int32                        frameIdx;
        Int32                        status    = SYSTEM_LINK_STATUS_SOK;
        UInt32                       maxHeight;
        UInt32                       channelId;
    
        System_VideoFrameBuffer    * pSystemVideoFrameBuffer;
        AlgorithmLink_DilationLibObj   * pDilationLibObj;
    
        pDilationLibObj = (AlgorithmLink_DilationLibObj *)AlgorithmLink_getAlgorithmParamsObj(pObj);
    
        status = Utils_linkStatsCollectorDeAllocInst(pDilationLibObj->linkStatsInfo);
        UTILS_assert(status == SYSTEM_LINK_STATUS_SOK);
    
        maxHeight = pDilationLibObj->algLinkCreateParams.maxHeight;
    
        for(channelId =0 ; channelId < pDilationLibObj->numInputChannels; channelId++)
        {
          for(frameIdx = 0; frameIdx < (pDilationLibObj->algLinkCreateParams.numOutputFrames); frameIdx++)
          {
            pSystemVideoFrameBuffer = &(pDilationLibObj->videoFrames[channelId][frameIdx]);
    
            /*
             * Free'ing up of allocated buffers
             */
            status = Utils_memFree(UTILS_HEAPID_DDR_CACHED_SR,pSystemVideoFrameBuffer->bufAddr[0],(maxHeight*pDilationLibObj->pitch*2));
            UTILS_assert(status == SYSTEM_LINK_STATUS_SOK);
          }
        }
    
        status = Utils_memFree(UTILS_HEAPID_DDR_CACHED_LOCAL,pDilationLibObj, sizeof(AlgorithmLink_DilationLibObj));
        UTILS_assert(status == SYSTEM_LINK_STATUS_SOK);
    
        return SYSTEM_LINK_STATUS_SOK;
    }
    
    Int32 AlgorithmLink_dilationlibPrintStatistics(void *pObj,AlgorithmLink_DilationLibObj *pDilationLibObj)
    {
    
        UTILS_assert(NULL != pDilationLibObj->linkStatsInfo);
    
        Utils_printLinkStatistics(&pDilationLibObj->linkStatsInfo->linkStats, "ALG_DILATIONLIB", TRUE);
    
        Utils_printLatency("ALG_DILATIONLIB",&pDilationLibObj->linkStatsInfo->linkLatency,&pDilationLibObj->linkStatsInfo->srcToLinkLatency,TRUE);
    
        return SYSTEM_LINK_STATUS_SOK;
    }
    
    /* Nothing beyond this point */
    

    Regards,

    Kajal

  • A few comments:

    1. VLIB_memalign is for VLIB testing purposes. It is probably better to use Utils_memAlloc for memory allocation.
    2. In my previous post, I said to allocate and run 1 line at a time; however, the dilation needs 3 input lines to run properly, so you need at least 3 lines in the buffer before calling it. Given this, producing just 1 output line per call is fairly inefficient, because 3 lines have to be recopied for every output line. It is probably better to allocate something like 10 lines: out of 10 lines read into the buffer, dilation outputs 8 lines, and only 2 lines need to be "recopied" for each slice of 8 output lines.
    3. Also, the output will only have values of 0 or 1. If you are trying to see something on the display, I think it would "look" like zeros unless you convert the 1 values to something larger (like 255).
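
    The slicing arithmetic in point 2 and the display scaling in point 3 can be sketched as follows. This is a minimal illustration in plain C, not VLIB or Vision SDK code; `SlicePlan`, `plan_slices`, and `scale_binary_for_display` are hypothetical names, and the scaling helper assumes one byte per pixel (unpacked data).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a sliced 3x3 dilation pass (illustrative names). */
typedef struct {
    int32_t slices;      /* number of dilation calls needed            */
    int32_t lines_out;   /* output lines produced over the whole image */
    int32_t lines_read;  /* input lines copied into the working window */
} SlicePlan;

/*
 * A window of N input lines yields N - 2 output lines, so consecutive
 * windows overlap by 2 lines that have to be copied again.
 */
static SlicePlan plan_slices(int32_t image_lines, int32_t window_lines)
{
    SlicePlan plan;
    int32_t out_per_slice;

    assert(window_lines >= 3);
    assert(image_lines >= window_lines);

    out_per_slice   = window_lines - 2;
    plan.lines_out  = image_lines - 2;
    /* Round up: the last slice may produce fewer output lines. */
    plan.slices     = (plan.lines_out + out_per_slice - 1) / out_per_slice;
    /* Every slice reads its output lines plus the 2 overlap lines. */
    plan.lines_read = plan.lines_out + (2 * plan.slices);
    return plan;
}

/* Map binary 0/1 pixels (one byte each) to 0/255 so they are visible. */
static void scale_binary_for_display(uint8_t *pixels, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        pixels[i] = (pixels[i] != 0u) ? 255u : 0u;
    }
}
```

    For a 1080-line image, `plan_slices(1080, 10)` gives 135 calls and 1348 copied lines, while `plan_slices(1080, 3)` gives 1078 calls and 3234 copied lines.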

    Jesse
  • Hello Jesse,

    We have now changed the input so that we can call dilation directly (i.e., the input is binary, containing only 0 or 1). Please find the code snippet below:

        inBuff = (uint8_t *)pSysVideoFrameBufferInput->bufAddr[0];
        outBuff = (uint8_t *)pSysVideoFrameBufferOutput->bufAddr[0];

        VLIB_dilate_bin_square((uint8_t *)inBuff,
                                   (uint8_t *)outBuff,
                                   ((pInputChInfo->width * (pInputChInfo->height -2))),
                                   pInputChInfo->pitch[0]
                                   );

    where,

    pInputChInfo->width = 1920

    pInputChInfo->height =1080

    pInputChInfo->pitch[0] = 1920

    After this we just multiply by 255 for display purposes.

    But the output looks as shown below:

    7360.output.zip

    Either the whole image is not being processed, or it is not being displayed properly.

    Do you have any comments on this?

    Regards,

    Kajal

  • Are you unpacking the output before multiplying by 255?
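
    To illustrate what packing and unpacking mean here, the following is a plain C reference sketch, not the VLIB implementation. The function names are hypothetical, and the bit order (pixel 0 in the most significant bit of each byte) is an assumption; the actual order used by VLIB_packMask32 and VLIB_unpackMask32 should be checked in the VLIB manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack one-byte-per-pixel data (0 or non-zero) into 1 bit per pixel.
 * ASSUMPTION: pixel 0 goes to the most significant bit of byte 0.
 */
static void pack_bits_ref(const uint8_t *pixels, uint8_t *packed,
                          size_t num_pixels)
{
    size_t i;

    assert(num_pixels % 8u == 0u);

    for (i = 0; i < num_pixels / 8u; i++) {
        packed[i] = 0u;
    }
    for (i = 0; i < num_pixels; i++) {
        if (pixels[i] != 0u) {
            packed[i / 8u] |= (uint8_t)(0x80u >> (i % 8u));
        }
    }
}

/*
 * Unpack 1-bit-per-pixel data into one byte per pixel, writing 255 for
 * a set bit so the result is directly visible on a display.
 */
static void unpack_bits_to_display_ref(const uint8_t *packed, uint8_t *pixels,
                                       size_t num_pixels)
{
    size_t i;

    for (i = 0; i < num_pixels; i++) {
        uint8_t bit = (uint8_t)((packed[i / 8u] >> (7u - (i % 8u))) & 1u);
        pixels[i] = (bit != 0u) ? 255u : 0u;
    }
}
```

    Multiplying a packed byte by 255 does not give per-pixel values, because each packed byte holds 8 pixels; the unpack step has to come first.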

  • Hello Jesse,

    Thanks for your suggestions.

    Regards,
    Kajal .
  • Hello Jesse,

    We have used the VLIB API for dilation, processing 10 lines at a time (8 lines of output), and the output is as expected.

    However, even though we are using the optimized library function and processing 10 lines at a time, we are not getting the expected performance, i.e., the output on the display stalls.

    For your reference, I am attaching the logs and output video.

    Please let me know your comments on this.

    Log:

    dilationLib.log
      
    [IPU1-0]     85.677514 s:  CPU [IPU1-0 ] Statistics, 
    [IPU1-0]     85.677575 s:  ************************* 
    [IPU1-0]     85.677636 s:  
    [IPU1-0]     85.677697 s:  LOAD: CPU: 7.1% HWI: 1.7%, SWI:0.4%, Low Power: 87.2% 
    [IPU1-0]     85.677819 s:  
    [IPU1-0]     85.677880 s:  LOAD: TSK: SYSTEM_MSGQ         : 0.1% 
    [IPU1-0]     85.677972 s:  LOAD: TSK: SYSTEM_IPU1_127     : 0.1% 
    [IPU1-0]     85.678063 s:  LOAD: TSK: IPC_OUT_0           : 0.1% 
    [IPU1-0]     85.678185 s:  LOAD: TSK: DISPLAY0            : 0.1% 
    [IPU1-0]     85.678307 s:  LOAD: TSK: CAPTURE             : 0.2% 
    [IPU1-0]     85.678429 s:  LOAD: TSK: STAT_COLL           : 3.1% 
    [IPU1-0]     85.678521 s:  LOAD: TSK: MISC                : 1.3% 
    [IPU1-0]     85.678612 s:  
    [IPU1-0]     85.678643 s:  SYSTEM: SW Message Box Msg Pool, Free Msg Count = 1022 
    [IPU1-0]     85.678734 s:  
    [IPU1-0]     85.678765 s:  SYSTEM: Sempahores Objects,  159 of 1000 free 
    [IPU1-0]     85.678856 s:  SYSTEM: Task Objects      ,   20 of  100 free 
    [IPU1-0]     85.678948 s:  SYSTEM: Clock Objects     ,   97 of  100 free 
    [IPU1-0]     85.679039 s:  SYSTEM: Hwi Objects       ,   89 of  100 free 
    [IPU1-0]     85.679131 s:  
    [IPU1-0]     85.679161 s:  SYSTEM: Heap = LOCAL_DDR            @ 0x00000000, Total size = 262144 B (256 KB), Free size = 250800 B (244 KB)
    [IPU1-0]     85.679344 s:  SYSTEM: Heap = SR_OCMC              @ 0x00000000, Total size = 0 B (0 KB), Free size = 0 B (0 KB)
    [IPU1-0]     85.679466 s:  SYSTEM: Heap = SR_DDR_CACHED        @ 0x89d03000, Total size = 368037888 B (350 MB), Free size = 192611328 B (183 MB)
    [IPU1-0]     85.679649 s:  SYSTEM: Heap = SR_DDR_NON_CACHED    @ 0xbfc00000, Total size = 129152 B (0 MB), Free size = 117632 B (0 MB)
    [IPU1-0]     85.679802 s:  
    [IPU1-0]     85.679832 s:  
    [IPU1-0]     85.679893 s:  CPU [IPU1-1 ] Statistics, 
    [IPU1-0]     85.679954 s:  ************************* 
    [IPU1-0]     85.679985 s:  
    [IPU1-0]     85.680046 s:  LOAD: CPU: 2.3% HWI: 0.5%, SWI:0.4%, Low Power: 88.6% 
    [IPU1-0]     85.680168 s:  
    [IPU1-0]     85.680290 s:  LOAD: TSK: MISC                : 1.4% 
    [IPU1-0]     85.680381 s:  
    [IPU1-0]     85.680412 s:  SYSTEM: SW Message Box Msg Pool, Free Msg Count = 1022 
    [IPU1-0]     85.680503 s:  
    [IPU1-0]     85.680534 s:  SYSTEM: Sempahores Objects,  407 of 1000 free 
    [IPU1-0]     85.680625 s:  SYSTEM: Task Objects      ,   39 of  100 free 
    [IPU1-0]     85.680717 s:  SYSTEM: Clock Objects     ,   99 of  100 free 
    [IPU1-0]     85.680808 s:  SYSTEM: Hwi Objects       ,   99 of  100 free 
    [IPU1-0]     85.680900 s:  
    [IPU1-0]     85.680930 s:  SYSTEM: Heap = LOCAL_DDR            @ 0x00000000, Total size = 655360 B (640 KB), Free size = 624024 B (609 KB)
    [IPU1-0]     85.681083 s:  
    [IPU1-0]     85.681113 s:  
    [IPU1-0]     85.681174 s:  CPU [HOST   ] Statistics, 
    [IPU1-0]     85.681205 s:  ************************* 
    [IPU1-0]     85.681296 s:  
    [IPU1-0]     85.681327 s:  LOAD: CPU: 0.3% HWI: 0.1%, SWI:0.0%, Low Power: 98.5% 
    [IPU1-0]     85.681479 s:  
    [IPU1-0]     85.681540 s:  LOAD: TSK: MISC                : 0.2% 
    [IPU1-0]     85.681632 s:  
    [IPU1-0]     85.681662 s:  SYSTEM: SW Message Box Msg Pool, Free Msg Count = 1022 
    [IPU1-0]     85.681754 s:  
    [IPU1-0]     85.681784 s:  SYSTEM: Sempahores Objects,  408 of 1000 free 
    [IPU1-0]     85.681876 s:  SYSTEM: Task Objects      ,   41 of  100 free 
    [IPU1-0]     85.681967 s:  SYSTEM: Clock Objects     ,   99 of  100 free 
    [IPU1-0]     85.682059 s:  SYSTEM: Hwi Objects       ,   99 of  100 free 
    [IPU1-0]     85.682150 s:  
    [IPU1-0]     85.682181 s:  SYSTEM: Heap = LOCAL_DDR            @ 0x00000000, Total size = 6291456 B (6144 KB), Free size = 6279992 B (6132 KB)
    [IPU1-0]     85.682516 s:  
    [IPU1-0]     85.682547 s:  
    [IPU1-0]     85.682608 s:  CPU [DSP1   ] Statistics, 
    [IPU1-0]     85.682638 s:  ************************* 
    [IPU1-0]     85.682699 s:  
    [IPU1-0]     85.682760 s:  LOAD: CPU: 99.6% HWI: 0.7%, SWI:0.3%, Low Power: 0.3% 
    [IPU1-0]     85.682882 s:  
    [IPU1-0]     85.682943 s:  LOAD: TSK: SYSTEM_TSK_MULTI_MBX: 98.4% 
    [IPU1-0]     85.683035 s:  LOAD: TSK: MISC                : 0.2% 
    [IPU1-0]     85.683126 s:  
    [IPU1-0]     85.683157 s:  SYSTEM: SW Message Box Msg Pool, Free Msg Count = 1022 
    [IPU1-0]     85.683248 s:  
    [IPU1-0]     85.683309 s:  SYSTEM: Sempahores Objects,  410 of 1000 free 
    [IPU1-0]     85.683401 s:  SYSTEM: Task Objects      ,   92 of  100 free 
    [IPU1-0]     85.683492 s:  SYSTEM: Clock Objects     ,   98 of  100 free 
    [IPU1-0]     85.683584 s:  SYSTEM: Hwi Objects       ,  100 of  100 free 
    [IPU1-0]     85.683645 s:  
    [IPU1-0]     85.683706 s:  SYSTEM: Heap = LOCAL_L2             @ 0x00800000, Total size = 227264 B (221 KB), Free size = 227264 B (221 KB)
    [IPU1-0]     85.683858 s:  SYSTEM: Heap = LOCAL_DDR            @ 0x00000000, Total size = 524288 B (512 KB), Free size = 488848 B (477 KB)
    [IPU1-0]     85.684011 s:  
    [IPU1-0]     85.684041 s:  
    [IPU1-0]     85.684072 s:  CPU [DSP2   ] Statistics, 
    [IPU1-0]     85.684133 s:  ************************* 
    [IPU1-0]     85.684194 s:  
    [IPU1-0]     85.684255 s:  LOAD: CPU: 0.2% HWI: 0.1%, SWI:0.0%, Low Power: 99.0% 
    [IPU1-0]     85.684407 s:  
    [IPU1-0]     85.684468 s:  LOAD: TSK: MISC                : 0.1% 
    [IPU1-0]     85.684529 s:  
    [IPU1-0]     85.684590 s:  SYSTEM: SW Message Box Msg Pool, Free Msg Count = 1022 
    [IPU1-0]     85.684651 s:  
    [IPU1-0]     85.684712 s:  SYSTEM: Sempahores Objects,  411 of 1000 free 
    [IPU1-0]     85.684956 s:  SYSTEM: Task Objects      ,   92 of  100 free 
    [IPU1-0]     85.685048 s:  SYSTEM: Clock Objects     ,   99 of  100 free 
    [IPU1-0]     85.685139 s:  SYSTEM: Hwi Objects       ,  100 of  100 free 
    [IPU1-0]     85.685292 s:  
    [IPU1-0]     85.685353 s:  SYSTEM: Heap = LOCAL_L2             @ 0x00800000, Total size = 227264 B (221 KB), Free size = 227264 B (221 KB)
    [IPU1-0]     85.685536 s:  SYSTEM: Heap = LOCAL_DDR            @ 0x00000000, Total size = 524288 B (512 KB), Free size = 518328 B (506 KB)
    [IPU1-0]     85.685689 s:  
    [IPU1-0]     85.685719 s:  
    [IPU1-0]     85.685750 s:  CPU [EVE1   ] Statistics, 
    [IPU1-0]     85.685811 s:  ************************* 
    [IPU1-0]     85.685872 s:  
    [IPU1-0]     85.685933 s:  LOAD: CPU: 0.5% HWI: 0.2%, SWI:0.1%, Low Power: 93.7% 
    [IPU1-0]     85.686055 s:  
    [IPU1-0]     85.686116 s:  LOAD: TSK: MISC                : 0.2% 
    [IPU1-0]     85.686177 s:  
    [IPU1-0]     85.686238 s:  SYSTEM: SW Message Box Msg Pool, Free Msg Count = 1022 
    [IPU1-0]     85.686329 s:  
    [IPU1-0]     85.686390 s:  SYSTEM: Sempahores Objects,  412 of 1000 free 
    [IPU1-0]     85.686482 s:  SYSTEM: Task Objects      ,   93 of  100 free 
    [IPU1-0]     85.686573 s:  SYSTEM: Clock Objects     ,   99 of  100 free 
    [IPU1-0]     85.686634 s:  SYSTEM: Hwi Objects       ,   99 of  100 free 
    [IPU1-0]     85.686726 s:  
    [IPU1-0]     85.686756 s:  SYSTEM: Heap = LOCAL_L2             @ 0x40020000, Total size = 22528 B (22 KB), Free size = 22528 B (22 KB)
    [IPU1-0]     85.686909 s:  SYSTEM: Heap = LOCAL_DDR            @ 0x00000000, Total size = 262144 B (256 KB), Free size = 256200 B (250 KB)
    [IPU1-0]     85.687092 s:  
    [IPU1-0]     85.687824 s:  
    [IPU1-0]     85.687885 s:  UTILS_PRCM_STATS:  Current Temperature,
    [IPU1-0]     85.687946 s:  
    [IPU1-0]     85.687976 s:   Voltage Rail         ||   Curr Temp Min - Max   
    [IPU1-0]     85.688068 s:  --------------------------------------------------------- 
    [IPU1-0]     85.688159 s:      PMHAL_PRCM_VD_MPU ||     [34.800 , 35.200]    
    [IPU1-0]     85.688251 s:     PMHAL_PRCM_VD_CORE ||     [34.800 , 35.200]    
    [IPU1-0]     85.688403 s:    PMHAL_PRCM_VD_IVAHD ||     [34.800 , 35.200]    
    [IPU1-0]     85.688495 s:   PMHAL_PRCM_VD_DSPEVE ||     [33.200 , 33.600]    
    [IPU1-0]     85.688617 s:      PMHAL_PRCM_VD_GPU ||     [34.  0 , 34.400]    
    [IPU1-0]     85.688708 s: 
    [IPU1-0]     85.688739 s: ============================================================================
    [IPU1-0]     85.688861 s: Name      | Bus (mV)	| Res (mOhm) | Shunt (uV)  | Current (mA) | Power (mW)
    [IPU1-0]     85.688952 s: ----------------------------------------------------------------------------
    [IPU1-0]     85.694442 s:  UTILS_PRCM_STATS: Reading the regulator data failed
    [IPU1-0]     85.694534 s:  UTILS_PRCM_STATS: PM INA226 Power Read Failed !!
    [IPU1-0]     85.694747 s:  
    [IPU1-0]     85.694808 s:  Statistics Collector,
    [IPU1-0]     85.694869 s:  
    [IPU1-0]     85.694900 s:        STATISTIC          Avg Data        Peak Data 
    [IPU1-0]     85.694961 s:        COLLECTOR          MB/s            MB/s      
    [IPU1-0]     85.695052 s:  -------------------------------------------------- 
    [IPU1-0]     85.695144 s:  SCI_EMIF1 RD+WR      |    421.010458    917.503922
    [IPU1-0]     85.695235 s:  SCI_EMIF2 RD+WR      |      0.000000      0.000000
    [IPU1-0]     85.695388 s:  SCI_EMIF1 RD ONLY    |    322.202706    546.783588
    [IPU1-0]     85.695510 s:  SCI_EMIF1 WR ONLY    |     99.016824    424.756234
    [IPU1-0]     85.695601 s:  SCI_EMIF2 RD ONLY    |      0.000000      0.000000
    [IPU1-0]     85.695693 s:  SCI_EMIF2 WR ONLY    |      0.000000      0.000000
    [IPU1-0]     85.695815 s:  SCI_MA_MPU_P1        |      0.002915      0.063974
    [IPU1-0]     85.695906 s:  SCI_MA_MPU_P2        |      0.000000      0.000000
    [IPU1-0]     85.696028 s:  SCI_DSS              |    186.399637    195.257292
    [IPU1-0]     85.696120 s:  SCI_IPU1             |      6.831629     22.104411
    [IPU1-0]     85.696242 s:  SCI_VIP1_P1          |      8.603715     65.106999
    [IPU1-0]     85.696364 s:  SCI_VIP1_P2          |     17.212971    130.097936
    [IPU1-0]     85.696455 s:  SCI_VPE_P1           |      0.000000      0.000000
    [IPU1-0]     85.696577 s:  SCI_VPE_P2           |      0.000000      0.000000
    [IPU1-0]     85.696669 s:  SCI_DSP1_MDMA        |    194.198251    522.950714
    [IPU1-0]     85.696760 s:  SCI_DSP1_EDMA        |      0.000000      0.000000
    [IPU1-0]     85.696882 s:  SCI_DSP2_MDMA        |      0.052709      0.347616
    [IPU1-0]     85.696974 s:  SCI_DSP2_EDMA        |      0.000000      0.000000
    [IPU1-0]     85.697065 s:  SCI_EVE1_TC0         |      2.710242      4.828992
    [IPU1-0]     85.697187 s:  SCI_EVE1_TC1         |      0.000000      0.000000
    [IPU1-0]     85.697462 s:  SCI_EVE2_TC0         |      0.000000      0.000000
    [IPU1-0]     85.697584 s:  SCI_EVE2_TC1         |      0.000000      0.000000
    [IPU1-0]     85.697675 s:  SCI_EDMA_TC0_RD      |      0.000000      0.000000
    [IPU1-0]     85.697797 s:  SCI_EDMA_TC0_WR      |      0.000000      0.000000
    [IPU1-0]     85.697889 s:  SCI_EDMA_TC1_RD      |      0.000000      0.000000
    [IPU1-0]     85.697980 s:  SCI_EDMA_TC1_WR      |      0.000000      0.000000
    [IPU1-0]     85.698072 s:  SCI_VIP2_P1          |      0.000000      0.000000
    [IPU1-0]     85.698194 s:  SCI_VIP2_P2          |      0.000000      0.000000
    [IPU1-0]     85.698316 s:  SCI_VIP3_P1          |      0.000000      0.000000
    [IPU1-0]     85.698407 s:  SCI_VIP3_P2          |      0.000000      0.000000
    [IPU1-0]     85.698529 s:  SCI_EVE3_TC0         |      0.000000      0.000000
    [IPU1-0]     85.698621 s:  SCI_EVE3_TC1         |      0.000000      0.000000
    [IPU1-0]     85.698712 s:  SCI_EVE4_TC0         |      0.000000      0.000000
    [IPU1-0]     85.698804 s:  SCI_EVE4_TC1         |      0.000000      0.000000
    [IPU1-0]     85.698926 s:  SCI_IVA              |      0.000000      0.000000
    [IPU1-0]     85.699017 s:  SCI_GPU_P1           |      0.000000      0.000000
    [IPU1-0]     85.699109 s:  SCI_GPU_P2           |      0.000000      0.000000
    [IPU1-0]     85.699200 s:  SCI_GMAC_SW          |      0.000000      0.000000
    [IPU1-0]     85.699322 s:  SCI_OCMC_RAM1        |      0.000000      0.000000
    [IPU1-0]     85.699444 s:  SCI_OCMC_RAM2        |      0.000000      0.000000
    [IPU1-0]     85.699536 s:  SCI_OCMC_RAM3        |      0.000000      0.000000
    [IPU1-0]     85.799365 s:  
    [IPU1-0]     85.799457 s:  ### CPU [IPU1-0], LinkID [ 64],
    [IPU1-0]     85.799518 s:  
    [IPU1-0]     85.799579 s:  [ CAPTURE ] Link Statistics,
    [IPU1-0]     85.799640 s:  ******************************
    [IPU1-0]     85.799701 s:  
    [IPU1-0]     85.799731 s:  Elapsed time       = 28378 msec
    [IPU1-0]     85.799792 s:  
    [IPU1-0]     85.799823 s:  New data Recv      =  60.8 fps
    [IPU1-0]     85.799914 s:  Get Full Buf Cb    =   8.21 fps
    [IPU1-0]     85.799975 s:  Put Empty Buf Cb   =   8.6 fps
    [IPU1-0]     85.800067 s:  Driver/Notify Cb   =  60.8 fps
    [IPU1-0]     85.800128 s:  
    [IPU1-0]     85.800158 s:  Input Statistics,
    [IPU1-0]     85.800219 s:  
    [IPU1-0]     85.800250 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     85.800372 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     85.800463 s:  -------------------------------------------------- 
    [IPU1-0]     85.800524 s:   0 |   8. 6      0. 0      0. 0           8. 6 
    [IPU1-0]     85.800646 s:  
    [IPU1-0]     85.800707 s:  Output Statistics,
    [IPU1-0]     85.800738 s:  
    [IPU1-0]     85.800799 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     85.800860 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     85.800921 s:  --------------------------------------------- 
    [IPU1-0]     85.801012 s:   0 |  0      8.21     0. 0      0. 0 
    [IPU1-0]     85.801134 s:  
    [IPU1-0]     85.801165 s:  [ CAPTURE ] LATENCY,
    [IPU1-0]     85.801226 s:  ********************
    [IPU1-0]     85.801287 s:  
    [IPU1-0]     85.801622 s:  
    [IPU1-0]     85.801683 s:  ### CPU [IPU1-0], LinkID [  0],
    [IPU1-0]     85.801775 s:  
    [IPU1-0]     85.801805 s:  [ IPC_OUT_0 ] Link Statistics,
    [IPU1-0]     85.802049 s:  ******************************
    [IPU1-0]     85.802110 s:  
    [IPU1-0]     85.802171 s:  Elapsed time       = 28380 msec
    [IPU1-0]     85.802232 s:  
    [IPU1-0]     85.802263 s:  New data Recv      =   8.21 fps
    [IPU1-0]     85.802568 s:  Release data Recv  =   8.6 fps
    [IPU1-0]     85.802659 s:  Driver/Notify Cb   = 108.6 fps
    [IPU1-0]     85.802751 s:  
    [IPU1-0]     85.802781 s:  Input Statistics,
    [IPU1-0]     85.802842 s:  
    [IPU1-0]     85.802873 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     85.802934 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     85.803086 s:  -------------------------------------------------- 
    [IPU1-0]     85.803178 s:   0 |   8.21      0. 0      0. 0           8.21 
    [IPU1-0]     85.803330 s:  
    [IPU1-0]     85.803361 s:  Output Statistics,
    [IPU1-0]     85.803422 s:  
    [IPU1-0]     85.803452 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     85.803544 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     85.803605 s:  --------------------------------------------- 
    [IPU1-0]     85.803696 s:   0 |  0      8.21     0. 0      0. 0 
    [IPU1-0]     85.803818 s:  
    [IPU1-0]     85.803849 s:  [ IPC_OUT_0 ] LATENCY,
    [IPU1-0]     85.803910 s:  ********************
    [IPU1-0]     85.803971 s:  Local Link Latency     : Avg =      8 us, Min =      0 us, Max =     61 us, 
    [IPU1-0]     85.804062 s:  Source to Link Latency : Avg =     81 us, Min =     61 us, Max =    152 us, 
    [IPU1-0]     85.804184 s:  
    [IPU1-0]     86.303391 s:  
    [IPU1-0]     86.303483 s:  ### CPU [  DSP1], LinkID [ 10],
    [IPU1-0]     86.303544 s:  
    [IPU1-0]     86.303605 s:  [ IPC_IN_0 ] Link Statistics,
    [IPU1-0]     86.303666 s:  ******************************
    [IPU1-0]     86.303727 s:  
    [IPU1-0]     86.303757 s:  Elapsed time       = 28881 msec
    [IPU1-0]     86.303818 s:  
    [IPU1-0]     86.303879 s:  Get Full Buf Cb    =   2.7 fps
    [IPU1-0]     86.303940 s:  Put Empty Buf Cb   =   8.10 fps
    [IPU1-0]     86.304032 s:  Driver/Notify Cb   =   8.17 fps
    [IPU1-0]     86.304093 s:  
    [IPU1-0]     86.304123 s:  Input Statistics,
    [IPU1-0]     86.304184 s:  
    [IPU1-0]     86.304215 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     86.304337 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     86.304398 s:  -------------------------------------------------- 
    [IPU1-0]     86.304489 s:   0 |   8.20      0. 0      0. 0           8.20 
    [IPU1-0]     86.304672 s:  
    [IPU1-0]     86.304733 s:  Output Statistics,
    [IPU1-0]     86.304794 s:  
    [IPU1-0]     86.304825 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     86.304886 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     86.304977 s:  --------------------------------------------- 
    [IPU1-0]     86.305038 s:   0 |  0      8.20     0. 0      0. 0 
    [IPU1-0]     86.305160 s:  
    [IPU1-0]     86.305191 s:  [ IPC_IN_0 ] LATENCY,
    [IPU1-0]     86.305252 s:  ********************
    [IPU1-0]     86.305343 s:  Local Link Latency     : Avg =      7 us, Min =      0 us, Max =     31 us, 
    [IPU1-0]     86.305465 s:  Source to Link Latency : Avg = 304130 us, Min =    213 us, Max = 1347679 us, 
    [IPU1-0]     86.305587 s:  
    [IPU1-0]     86.305709 s:  
    [IPU1-0]     86.305740 s:  ### CPU [  DSP1], LinkID [ 50],
    [IPU1-0]     86.305801 s:  
    [IPU1-0]     86.305862 s:  [ ALG_FRAMEDIFFERENCE ] Link Statistics,
    [IPU1-0]     86.305923 s:  ******************************
    [IPU1-0]     86.305984 s:  
    [IPU1-0]     86.306045 s:  Elapsed time       = 28883 msec
    [IPU1-0]     86.306106 s:  
    [IPU1-0]     86.306136 s:  New data Recv      =   2.4 fps
    [IPU1-0]     86.306228 s:  
    [IPU1-0]     86.306258 s:  Input Statistics,
    [IPU1-0]     86.306319 s:  
    [IPU1-0]     86.306380 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     86.306441 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     86.306533 s:  -------------------------------------------------- 
    [IPU1-0]     86.306594 s:   0 |   8.13      0. 0      0. 0           8.10 
    [IPU1-0]     86.306746 s:  
    [IPU1-0]     86.306777 s:  Output Statistics,
    [IPU1-0]     86.306838 s:  
    [IPU1-0]     86.306868 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     86.306929 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     86.307021 s:  --------------------------------------------- 
    [IPU1-0]     86.307082 s:   0 |  0      8.10     0. 0      0. 0 
    [IPU1-0]     86.307204 s:  
    [IPU1-0]     86.307234 s:  [ ALG_FRAMEDIFFERENCE ] LATENCY,
    [IPU1-0]     86.307509 s:  ********************
    [IPU1-0]     86.307570 s:  Local Link Latency     : Avg =  47283 us, Min =  46300 us, Max =  58714 us, 
    [IPU1-0]     86.307692 s:  Source to Link Latency : Avg = 422626 us, Min =  58988 us, Max = 1409718 us, 
    [IPU1-0]     86.307814 s:  
    [IPU1-0]     86.307905 s:  
    [IPU1-0]     86.307966 s:  ### CPU [  DSP1], LinkID [ 49],
    [IPU1-0]     86.308027 s:  
    [IPU1-0]     86.308088 s:  [ ALG_DILATIONLIB ] Link Statistics,
    [IPU1-0]     86.308149 s:  ******************************
    [IPU1-0]     86.308210 s:  
    [IPU1-0]     86.308241 s:  Elapsed time       = 28828 msec
    [IPU1-0]     86.308332 s:  
    [IPU1-0]     86.308393 s:  New data Recv      =   2.1 fps
    [IPU1-0]     86.308485 s:  
    [IPU1-0]     86.308515 s:  Input Statistics,
    [IPU1-0]     86.308576 s:  
    [IPU1-0]     86.308607 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     86.308668 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     86.308759 s:  -------------------------------------------------- 
    [IPU1-0]     86.308851 s:   0 |   8. 8      0. 0      0. 0           8. 8 
    [IPU1-0]     86.308973 s:  
    [IPU1-0]     86.309003 s:  Output Statistics,
    [IPU1-0]     86.309064 s:  
    [IPU1-0]     86.309095 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     86.309156 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     86.309247 s:  --------------------------------------------- 
    [IPU1-0]     86.309339 s:   0 |  0      8. 8     0. 0      0. 0 
    [IPU1-0]     86.309461 s:  
    [IPU1-0]     86.309491 s:  [ ALG_DILATIONLIB ] LATENCY,
    [IPU1-0]     86.309552 s:  ********************
    [IPU1-0]     86.309613 s:  Local Link Latency     : Avg =  76166 us, Min =  71585 us, Max = 1063320 us, 
    [IPU1-0]     86.309735 s:  Source to Link Latency : Avg = 676953 us, Min = 130787 us, Max = 1691454 us, 
    [IPU1-0]     86.309857 s:  
    [IPU1-0]     86.309979 s:  
    [IPU1-0]     86.310010 s:  ### CPU [  DSP1], LinkID [  0],
    [IPU1-0]     86.310101 s:  
    [IPU1-0]     86.310132 s:  [ IPC_OUT_0 ] Link Statistics,
    [IPU1-0]     86.310193 s:  ******************************
    [IPU1-0]     86.310254 s:  
    [IPU1-0]     86.310315 s:  Elapsed time       = 28758 msec
    [IPU1-0]     86.310406 s:  
    [IPU1-0]     86.310437 s:  New data Recv      =   7.99 fps
    [IPU1-0]     86.310498 s:  Release data Recv  =   7.92 fps
    [IPU1-0]     86.310589 s:  Driver/Notify Cb   = 108.3 fps
    [IPU1-0]     86.310650 s:  
    [IPU1-0]     86.310681 s:  Input Statistics,
    [IPU1-0]     86.310742 s:  
    [IPU1-0]     86.310772 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     86.310864 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     86.310925 s:  -------------------------------------------------- 
    [IPU1-0]     86.311016 s:   0 |   8.10      0. 0      0. 0           8.10 
    [IPU1-0]     86.311138 s:  
    [IPU1-0]     86.311169 s:  Output Statistics,
    [IPU1-0]     86.311230 s:  
    [IPU1-0]     86.311260 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     86.311352 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     86.311443 s:  --------------------------------------------- 
    [IPU1-0]     86.311504 s:   0 |  0      8.10     0. 0      0. 0 
    [IPU1-0]     86.311626 s:  
    [IPU1-0]     86.311657 s:  [ IPC_OUT_0 ] LATENCY,
    [IPU1-0]     86.311718 s:  ********************
    [IPU1-0]     86.311779 s:  Local Link Latency     : Avg =      3 us, Min =      0 us, Max =     31 us, 
    [IPU1-0]     86.311901 s:  Source to Link Latency : Avg = 797273 us, Min = 130970 us, Max = 1847404 us, 
    [IPU1-0]     86.312023 s:  
    [IPU1-0]     86.811413 s:  
    [IPU1-0]     86.811474 s:  ### CPU [IPU1-0], LinkID [ 10],
    [IPU1-0]     86.811565 s:  
    [IPU1-0]     86.811596 s:  [ IPC_IN_0 ] Link Statistics,
    [IPU1-0]     86.811657 s:  ******************************
    [IPU1-0]     86.811718 s:  
    [IPU1-0]     86.811748 s:  Elapsed time       = 29259 msec
    [IPU1-0]     86.811840 s:  
    [IPU1-0]     86.811870 s:  Get Full Buf Cb    =   2.5 fps
    [IPU1-0]     86.811931 s:  Put Empty Buf Cb   =   8.6 fps
    [IPU1-0]     86.812023 s:  Driver/Notify Cb   =   2.1 fps
    [IPU1-0]     86.812084 s:  
    [IPU1-0]     86.812114 s:  Input Statistics,
    [IPU1-0]     86.812175 s:  
    [IPU1-0]     86.812206 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     86.812297 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     86.812572 s:  -------------------------------------------------- 
    [IPU1-0]     86.812663 s:   0 |   8.10      0. 0      0. 0           8.10 
    [IPU1-0]     86.812785 s:  
    [IPU1-0]     86.812846 s:  Output Statistics,
    [IPU1-0]     86.812877 s:  
    [IPU1-0]     86.812938 s:  CH | Out | Out     | Out Drop | Out User Drop 
    [IPU1-0]     86.812999 s:     | ID  | FPS     | FPS      | FPS           
    [IPU1-0]     86.813060 s:  --------------------------------------------- 
    [IPU1-0]     86.813151 s:   0 |  0      8.10     0. 0      0. 0 
    [IPU1-0]     86.813273 s:  
    [IPU1-0]     86.813334 s:  [ IPC_IN_0 ] LATENCY,
    [IPU1-0]     86.813395 s:  ********************
    [IPU1-0]     86.813456 s:  Local Link Latency     : Avg =     16 us, Min =      0 us, Max =     61 us, 
    [IPU1-0]     86.813578 s:  Source to Link Latency : Avg = 797043 us, Min = 131153 us, Max = 1847618 us, 
    [IPU1-0]     86.813700 s:  
    [IPU1-0]     86.813822 s:  
    [IPU1-0]     86.813853 s:  ### CPU [IPU1-0], LinkID [ 67],
    [IPU1-0]     86.813944 s:  
    [IPU1-0]     86.813975 s:  [ DISPLAY ] Link Statistics,
    [IPU1-0]     86.814036 s:  ******************************
    [IPU1-0]     86.814097 s:  
    [IPU1-0]     86.814158 s:  Elapsed time       = 29260 msec
    [IPU1-0]     86.814219 s:  
    [IPU1-0]     86.814249 s:  New data Recv      =   2.1 fps
    [IPU1-0]     86.814341 s:  Driver/Notify Cb   =  59.97 fps
    [IPU1-0]     86.814493 s:  
    [IPU1-0]     86.814554 s:  Input Statistics,
    [IPU1-0]     86.814585 s:  
    [IPU1-0]     86.814646 s:  CH | In Recv | In Drop | In User Drop | In Process 
    [IPU1-0]     86.814707 s:     | FPS     | FPS     | FPS          | FPS        
    [IPU1-0]     86.814798 s:  -------------------------------------------------- 
    [IPU1-0]     86.814860 s:   0 |   8. 9      0. 0      0. 0           8. 9 
    [IPU1-0]     86.814982 s:  
    [IPU1-0]     86.815043 s:  [ DISPLAY ] LATENCY,
    [IPU1-0]     86.815104 s:  ********************
    [IPU1-0]     86.815134 s:  Local Link Latency     : Avg =     22 us, Min =      0 us, Max =     61 us, 
    [IPU1-0]     86.815256 s:  Source to Link Latency : Avg = 797285 us, Min = 131275 us, Max = 1847862 us, 
    [IPU1-0]     86.815409 s:  
    [IPU1-0]     87.315348 s: 
    [IPU1-0]  
    [IPU1-0]  ====================
    [IPU1-0]  Chains Run-time Menu
    [IPU1-0]  ====================
    [IPU1-0]  
    [IPU1-0]  0: Stop Chain
    [IPU1-0]  
    [IPU1-0]  2: Pause Capture
    [IPU1-0]  3: Resume Capture
    [IPU1-0]  
    [IPU1-0]  p: Print Performance Statistics 
    [IPU1-0]  
    

    Regards,

    Kajal

  • You can reduce the DDR latency and improve performance if you allocate the intermediate 10 line buffers from L2RAM (UTILS_HEAPID_L2_LOCAL) instead of cached DDR.

    Jesse

  • Hello Jesse,

    We tried your suggestion, but unfortunately the output is the same as before.
    The input resolution is 1080p, so could it be that this high resolution simply takes more time to process?

    Regards,
    Kajal
  • Kajal,

    It is always the case that if you do too much processing, you may not meet your real-time deadlines.  The question is where the bottleneck is, and what should be targeted for additional optimization.  Have you profiled individual function calls at various levels to see if the processing time of the different functions is in line with expectations?  It could also simply be that not enough buffers are allocated from the camera or display to sufficiently pipeline the processing across the different cores.  One would need to analyze the overall use case graph: how many buffers are given to the camera, display, and intermediate links, what the algLink policy is for dropping and processing frames, and the individual algorithm cycles required.

    From my analysis, I have the following expected numbers for these functions:

    pack = 1.61 cycles/pix (cached DDR based)
    dilate = 0.18 cycles/pix (L2 SRAM -> L2 SRAM)
    unpack = 1.31 cycles/pix (cached DDR based)

    SUM= ~3.09 cycles/pix

    Therefore, 1080p (1920 x 1080 = 2,073,600 pixels) should take about 6.4 MCycles per frame.  I'm not sure what your frame rate is, but I would think that if this is the only processing you are doing, this should fit in real time.  When you profile the sum of the functions above and divide by the size of the image, are you getting something close to 3 cycles per pixel?
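
    The budget arithmetic above can be written out as a small helper. This is only a sketch: the function names are made up for illustration, and the DSP clock is a value you pass in yourself (it is not stated in this thread).

```c
#include <assert.h>
#include <stdint.h>

/* Cycles needed for one frame at a given per-pixel cost (cycles/pixel). */
double cycles_per_frame(uint32_t width, uint32_t height, double cycles_per_pixel)
{
    return (double)width * (double)height * cycles_per_pixel;
}

/* Upper bound on the frame rate if this were the only load on the core.
 * dsp_hz is an ASSUMPTION supplied by the caller, not a number from the thread. */
double max_fps(double frame_cycles, double dsp_hz)
{
    assert(frame_cycles > 0.0);
    return dsp_hz / frame_cycles;
}
```

    For example, cycles_per_frame(1920, 1080, 3.09) gives roughly 6.4 million cycles, and max_fps() of that result with a hypothetical 600 MHz clock comes out above 90 fps. A measured rate far below such a bound points at something other than these three kernels.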

    Just for your information, if you need to get a further lift out of your code, you can do so with the additional work of adding DMA copies of the blocks from DDR into L2 SRAM buffers in a ping-pong fashion before doing the processing.  In this case, I would expect you to see the following performance numbers:

    pack = 0.58 cycles/pix (L2 SRAM -> L2 SRAM)
    dilate = 0.18 cycles/pix (L2 SRAM -> L2 SRAM)
    unpack = 0.46 cycles/pix (L2 SRAM -> L2 SRAM)

    SUM= ~1.22 cycles/pix
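
    The ping-pong idea can be sketched in portable C as follows. Everything here is illustrative: memcpy stands in for the DMA transfers (so nothing actually overlaps on a host), BLOCK_BYTES and the function names are hypothetical, and invert_block is a placeholder kernel rather than a VLIB call. A real dilate would also need overlapping border lines between blocks, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 64u   /* hypothetical block size; real code would size this to fit L2 */

typedef void (*BlockFn)(const uint8_t *in, uint8_t *out, size_t n);

/* Placeholder kernel: inverts every byte. Stands in for pack/dilate/unpack. */
void invert_block(const uint8_t *in, uint8_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = (uint8_t)~in[i];
    }
}

/* Process src in blocks, alternating between two local buffer pairs. */
void process_ping_pong(const uint8_t *src, uint8_t *dst, size_t total, BlockFn fn)
{
    static uint8_t in_buf[2][BLOCK_BYTES];   /* would live in L2 SRAM on the target */
    static uint8_t out_buf[2][BLOCK_BYTES];
    size_t offset = 0;
    size_t cur_len = (total < BLOCK_BYTES) ? total : BLOCK_BYTES;
    int cur = 0;

    assert(fn != NULL);
    if (cur_len > 0) {
        memcpy(in_buf[cur], src, cur_len);   /* fetch the first block */
    }

    while (cur_len > 0) {
        size_t next_off = offset + cur_len;
        size_t remaining = total - next_off;
        size_t next_len = (remaining < BLOCK_BYTES) ? remaining : BLOCK_BYTES;

        /* With real DMA this transfer would be started here and run
         * in the background while fn() works on the current block. */
        if (next_len > 0) {
            memcpy(in_buf[cur ^ 1], src + next_off, next_len);
        }

        fn(in_buf[cur], out_buf[cur], cur_len);
        memcpy(dst + offset, out_buf[cur], cur_len);   /* write-back, also DMA on target */

        offset = next_off;
        cur_len = next_len;
        cur ^= 1;   /* swap ping and pong */
    }
}
```

    The point of the structure is that the kernel only ever touches the small local buffers, which is what the L2 SRAM -> L2 SRAM numbers above assume.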

    However, before working on the DMA, I suggest you profile the existing code to see if it is performing as expected.  I also don't know what other processing you are doing on the DSP which may be taking up cycles, so profiling that would be good to see what is taking the bulk of the time.
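
    A minimal way to collect such per-function numbers is sketched below. The ProfileSlot type and the profile_* names are hypothetical, and clock() is only a host stand-in; on the DSP you would presumably replace profile_now() with a read of a cycle counter so the result comes out in cycles per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Accumulated timing for one profiled function. */
typedef struct {
    const char *name;
    uint64_t total_ticks;
    uint32_t calls;
} ProfileSlot;

/* Host stand-in for a timestamp source (ASSUMPTION: swap for a cycle counter on target). */
uint64_t profile_now(void)
{
    return (uint64_t)clock();
}

/* Add one measured interval to a slot. */
void profile_add(ProfileSlot *slot, uint64_t start, uint64_t end)
{
    assert(slot != NULL && end >= start);
    slot->total_ticks += end - start;
    slot->calls += 1u;
}

/* Average ticks per call, divided by the number of pixels in the image. */
double profile_ticks_per_pixel(const ProfileSlot *slot, uint32_t width, uint32_t height)
{
    assert(slot != NULL && slot->calls > 0u && width > 0u && height > 0u);
    return ((double)slot->total_ticks / (double)slot->calls)
           / ((double)width * (double)height);
}
```

    In use, you would take profile_now() before and after each of the pack, dilate, and unpack calls, pass the two values to profile_add() on a slot per function, and compare the per-pixel results with the expected figures listed above.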

    Jesse