
FileRead->Encode->FileWrite Usecase

Hi,

I created a fileread->encode->filewrite usecase. I use the following steps to feed YUV data to the encoder input:

0. VdecVdis_ipcFramesFillBufInfo()

1. read yuv data from input file 

2. Vdis_putFullVideoFrames()

3. Vdis_getEmptyVideoFrames()

4. VdecVdis_ipcFramesFreeFrameBuf()

I use the following steps to get H.264 data from the encoder output:

0. Venc_getBitstreamBuffer()

1. write h264 data to file

2. Venc_releaseBitstreamBuffer()

The program now runs successfully. But whenever I feed one frame per channel to the encoder input, I get the new-data-available callback twice, and if there are more than 2 encode channels, every channel with an index larger than 1 gets each frame twice. The following figure shows the situation.

I don't know why this happens. Could someone give me some advice?

  • Attach the usecase file where you are creating and connecting links, and the application file which calls steps 0 - 4 you mentioned above.

  • Hi Badri,

    Thanks for your reply! I have attached the files at the end. I referred to the file demo_vdec_vdis_frames_send.c to feed YUV data to the encoder. I don't quite understand the following code in the VdecVdis_ipcFramesFillBufInfo() function:

    bufList->frames[bufList->numFrames].addr[0][0] =frameObj[i].bufVirt;

    bufList->frames[bufList->numFrames].phyAddr[0][0] = (Ptr)frameObj[i].bufPhy;

    Since each frame in bufList corresponds to one encode channel, why is the addr of every frame set to the same value? I modified this in my usecase: I use multiple frameObj entries, each holding one frame.

    4846.Desktop.zip

  • Hi Badri,

    Did you find any issues in my code?

  • Your application code is wrong.

    IpcFramesFillBufInfo will populate only those channels which have frameObj[i].refCnt == 0.

    But when reading from file you are doing

    for(i=0; i<ENC_CH; i++)
    			{
    				fread((unsigned char *)(bufList.frames[i].addr[0][0]), 1, gReadYUVConfig.width * gReadYUVConfig.height, gReadYUVConfig.fin[i]);
    				fread((unsigned char *)(bufList.frames[i].addr[0][1]), 1, gReadYUVConfig.width * gReadYUVConfig.height * 0.5, gReadYUVConfig.fin[i]);
    				if(feof(gReadYUVConfig.fin[i]))
    				{
    					fseek(gReadYUVConfig.fin[i], 0, SEEK_SET);
    				}
    			}

     

    Correct code should be

    		if(bufList.numFrames)
    		{
    		    //printf("0rfcnt is %d\n", frmObj[1].refCnt);
    			for(i=0; i<bufList.numFrames; i++)
    			{
    				fread((unsigned char *)(bufList.frames[i].addr[0][0]), 1, gReadYUVConfig.width * gReadYUVConfig.height, gReadYUVConfig.fin[bufList.frames[i].channelNum]);
    				fread((unsigned char *)(bufList.frames[i].addr[0][1]), 1, gReadYUVConfig.width * gReadYUVConfig.height * 0.5, gReadYUVConfig.fin[bufList.frames[i].channelNum]);
    				if(feof(gReadYUVConfig.fin[bufList.frames[i].channelNum]))
    				{
    					fseek(gReadYUVConfig.fin[bufList.frames[i].channelNum], 0, SEEK_SET);
    				}
    			}
    			printf("feed buflist to encode\n");
    			status = Venc_putFullVideoFrames(&bufList);
          		OSA_assert(0 == status);
    		}
    

  • Hi Badri,

    Thanks for your reply!

    I applied the changes as you suggested, but the result is just the same: the channels with an index larger than 1 still get each frame twice. In my code I assign one frameObj to each channel, so bufList.frames[i] corresponds to channel i and the two code sections above are equivalent.

    I wonder if something is wrong with the IpcFramesFreeFrameBuf() function. It seems that the channels that get each frame twice do not release their input buffers properly.

  • The application logic you are using currently is highly prone to errors.

    It is better to have a separate queue of free buffers per channel.

    1. Allocate an empty buffer from the channel-specific free queue.

    2. Fill it with data.

    3. Put the full frames.

    4. Get back the empty frame and free it into the channel-specific free queue.

    This way there is no chance of reading content wrongly.
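The four steps above can be sketched roughly as follows. This is only an illustration in plain C, not the RDK's own queue API: `FreeQueue`, `freeq_init`, `freeq_get`, `freeq_put`, `NUM_CH` and `BUFS_PER_CH` are hypothetical names, the queue stores buffer indices rather than real frame pointers, and it is not thread-safe (a real application would add locking or use the queue utilities shipped with the SDK).

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CH       4   /* hypothetical number of encode channels */
#define BUFS_PER_CH  3   /* hypothetical number of buffers per channel */

/* Fixed-size FIFO of free buffer indices for one channel. */
typedef struct {
    int items[BUFS_PER_CH];
    int head;    /* position of the next index to hand out */
    int count;   /* number of free indices currently queued */
} FreeQueue;

static FreeQueue gFreeQ[NUM_CH];

/* Mark every buffer of every channel as free. */
static void freeq_init(void)
{
    for (int ch = 0; ch < NUM_CH; ch++) {
        gFreeQ[ch].head = 0;
        gFreeQ[ch].count = 0;
        for (int b = 0; b < BUFS_PER_CH; b++) {
            gFreeQ[ch].items[gFreeQ[ch].count++] = b;
        }
    }
}

/* Step 1: take a free buffer index for channel ch, or -1 if none is free. */
static int freeq_get(int ch)
{
    assert(ch >= 0 && ch < NUM_CH);
    FreeQueue *q = &gFreeQ[ch];
    if (q->count == 0) {
        return -1;
    }
    int idx = q->items[q->head];
    q->head = (q->head + 1) % BUFS_PER_CH;
    q->count--;
    return idx;
}

/* Step 4: give a buffer index back to channel ch; 0 on success, -1 if full. */
static int freeq_put(int ch, int idx)
{
    assert(ch >= 0 && ch < NUM_CH);
    FreeQueue *q = &gFreeQ[ch];
    if (q->count == BUFS_PER_CH) {
        return -1;
    }
    int tail = (q->head + q->count) % BUFS_PER_CH;
    q->items[tail] = idx;
    q->count++;
    return 0;
}
```

In use, the feeding loop would call `freeq_get(ch)` before reading a frame from that channel's file (steps 1 and 2), submit the frame (step 3), and call `freeq_put(ch, idx)` with the channel number of each empty frame that comes back (step 4). A channel with no free buffer is simply skipped for that iteration.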

  • Hi Badri,

    Thanks so much for your reply! What do you mean by having a separate queue of free buffers per channel? Should I use OSA_que? Could you give an example of how to use OSA_que?

  • Hi Badri,

    I found the issue. In the FileFD_ipcBitsProcessFullBufs() function, after getting the encode output bufList I should use fullBufList.numBufs as the loop bound. I had used ENC_CH by mistake.
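A minimal sketch of how this kind of loop-bound mistake could produce duplicated frames. It assumes the output list is reused between callbacks so that entries beyond `numBufs` still hold stale data from an earlier call; `BufList` and `process` here are hypothetical stand-ins, not the RDK types or functions.

```c
#include <assert.h>

#define ENC_CH 4   /* hypothetical number of encode channels */

/* Hypothetical stand-in for a bitstream buffer list. */
typedef struct {
    int numBufs;          /* number of valid entries in this callback */
    int chnId[ENC_CH];    /* channel of each entry */
    int frameNo[ENC_CH];  /* frame counter of each entry */
} BufList;

/* "Write" the first `limit` entries and count the writes per channel.
 * Returns the number of entries processed. */
static int process(const BufList *list, int limit, int written[ENC_CH])
{
    int n = 0;
    for (int i = 0; i < limit; i++) {
        written[list->chnId[i]]++;
        n++;
    }
    return n;
}
```

If one callback delivers 4 valid buffers and the next delivers only 2, calling `process(&list, ENC_CH, ...)` both times touches the stale entries for channels 2 and 3 a second time, while `process(&list, list.numBufs, ...)` handles each valid buffer exactly once.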

    Thanks so much for your kind help!

    Another question: we now want to do the video mixing on the ARM side instead of using the swMs link. Since we need a lot of memcpy operations, we would like to know whether we could use assembly code.

  • You can use memcpy_neon, which is part of DVR RDK 4.0, but that too is not suitable for copying video frames. You should use DMA if you intend to copy video frames. Doing video frame processing on ARM will result in very poor performance. Why can't you use SwMs?

  • Hi Badri,

    Thanks for your reply!

    When we use swMs to do the video mixing, the delay of the whole data path is about 200 ms. I measured the decode and encode delays and found that both are very small, about 2 ms each. So most of the delay comes from dupLink and swMsLink, is that right? My boss thinks 200 ms is a bit long and wants to do the video mixing on the ARM side. The data flow is as follows:

    RTPRecv->decode->ARM side video mix->encode->RTPSend

    What do you mean by "doing video frame processing on ARM will result in very poor performance"? Is it because the ARM is too slow, or for some other reason?

    You mentioned that we could use DMA to copy video frames. Could you give an example of how to use DMA on the ARM side?

  • Hi Badri,

    Since the memcpy size in my app is 320 bytes, I tested memcpy_neon and memcpy with a data size of 320 bytes. The result shows that memcpy_neon is almost exactly as fast as memcpy.

    memcpy_neon :  (320 bytes copy) = 1751.0 MB/s
    memcpy_arm  :  (320 bytes copy) = 1746.4 MB/s 

    I read the post at http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka13544.html , but I don't quite understand under what conditions memcpy_neon achieves higher throughput. Could you give some tips?

  • We have seen that memcpy_neon gives 2x better performance if the memcpy block size is greater than 10K bytes. For a small size like 320 bytes it will not give a noticeable performance improvement, although you are seeing a 5MB/s improvement, which is significant.
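To compare copy routines at different block sizes, a small timing harness along these lines may help. It is a sketch under stated assumptions: `copy_fn` and `copy_mbps` are hypothetical names, only the standard `memcpy` is known to fit the function-pointer type (whether `memcpy_neon` has the same signature is an assumption to check against the RDK headers), and the same two buffers are reused on every iteration, so the figures reflect cache-hot copies rather than streaming video frames.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Any routine with memcpy's signature can be measured. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Copy `block` bytes `iters` times with `fn` and return the throughput
 * in MB/s (1 MB = 1e6 bytes). Returns 0.0 if the measurement failed. */
static double copy_mbps(copy_fn fn, size_t block, size_t iters)
{
    unsigned char *src = malloc(block);
    unsigned char *dst = malloc(block);
    if (block == 0 || src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return 0.0;
    }
    memset(src, 0xA5, block);

    /* Calling through a volatile pointer stops the compiler from
     * removing the repeated copies as dead code. */
    copy_fn volatile f = fn;

    clock_t t0 = clock();
    for (size_t i = 0; i < iters; i++) {
        f(dst, src, block);
    }
    clock_t t1 = clock();

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    int ok = (memcmp(dst, src, block) == 0);   /* sanity-check the copy */
    free(src);
    free(dst);
    if (!ok || secs <= 0.0) {
        return 0.0;
    }
    return ((double)block * (double)iters) / secs / 1e6;
}
```

Calling `copy_mbps(memcpy, 320, 2000000)` and `copy_mbps(memcpy, 65536, 20000)` gives one figure for the 320-byte case and one for a block well above 10K bytes. Passing a different routine as the first argument allows a side-by-side comparison at each size.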