Linux/AM5728: VIDDEC3 support for decoding two 1080i60 streams

Part Number: AM5728

Tool/software: Linux

Hi.

I understand that IVA-HD is a single decoder (it decodes only one frame at a time), so while IVA-HD is decoding a frame from one stream, the frame from the other stream is blocked until that decode completes.

I understand that the viddec3test provided by TI can decode two different 1080p30 streams at once. viddec3test does not implement outBufsInUseFlag handling, so it requires specific 1080p30 content.

The content we currently serve sets outBufsInUseFlag frequently, so it accesses IVA-HD many times (60~62 times per second).

IVA-HD takes 7~12 ms to decode one frame. Because each decode takes this long, I would like the application to use VIDDEC3_processAsync. Can you support it?

Alternatively, I would like a VIDDEC3_process call that accepts multiple H.264 frames (one from each stream). Can you support that?

The DSS (Display Subsystem) misses 2~3 frames when decoding two different streams, because each IVA-HD decode takes 7~12 ms.

I would appreciate your proposed solution.

Thanks a lot.

  • Hi,

    The software team have been notified. They will respond here.
  • Hi Lee,

    If you are looking for outBufsInUseFlag support in viddec3test, the attached patch can be used. With it, the decoder decodes one field at a time, and for the second field the same output buffer is fed again.

    diff --git a/viddec3test.c b/viddec3test.c
    index 8c57ed9..2ef94dd 100644
    --- a/viddec3test.c
    +++ b/viddec3test.c
    @@ -58,6 +58,9 @@ struct decoder {
     	size_t *outBuf_fd;
     	suseconds_t tdisp;
     	int id;
    +	struct buffer *lastOutBuf;
    +	int need_out_buf;
    +
     };
     
     
    @@ -305,6 +308,7 @@ decoder_open(int argc, char **argv)
     	decoder->outArgs = dce_alloc(sizeof(IVIDDEC3_OutArgs));
     	decoder->outArgs->size = sizeof(IVIDDEC3_OutArgs);
     
    +	decoder->need_out_buf = XDAS_TRUE;
     	decoder->tdisp = mark(NULL);
     
     	return decoder;
    @@ -334,12 +338,19 @@ decoder_process(struct decoder *decoder)
     	/* demux; in loop mode, we can do two tries at the end of the stream. */
     	for (i = 0; i < 2; i++) {
     		n = demux_read(decoder->demux, decoder->input, decoder->input_sz);
    -		if (n) {
    +		if(decoder->need_out_buf == XDAS_TRUE){
     			buf = disp_get_vid_buffer(decoder->disp);
     			if (!buf) {
     				ERROR("%p: fail: out of buffers", decoder);
     				return -1;
     			}
    +			decoder->lastOutBuf = buf;
    +			}
    +		else{
    +			buf = decoder->lastOutBuf;
    +		}
    +
    +		if(n) {
     			inBufs->descs[0].bufSize.bytes = n;
     			inArgs->numBytes = n;
     			DBG("%p: push: %d bytes (%p)", decoder, n, buf);
    @@ -374,7 +385,7 @@ decoder_process(struct decoder *decoder)
     		outBufs->descs[0].buf = buf->fd[0];
     		outBufs->descs[1].buf = (buf->multiplanar) ?buf->fd[1]:(XDAS_Int8 *)((outBufs->descs[0].buf));
     
    -
    +	if(decoder->need_out_buf == XDAS_TRUE){
     		if(buf->multiplanar){
     			decoder->outBuf_fd[0] = buf->fd[0];
     			decoder->outBuf_fd[1] = buf->fd[1];
    @@ -384,6 +395,7 @@ decoder_process(struct decoder *decoder)
     			decoder->outBuf_fd[0] = buf->fd[0];
     			dce_buf_lock(1,decoder->outBuf_fd);
     		}
    +	}
     		decoder->outBufs->descs[0].bufSize.bytes =decoder->padded_width*decoder->padded_height;
     		decoder->outBufs->descs[1].bufSize.bytes = decoder->padded_width* (decoder->padded_height/2);
     	}
    @@ -443,12 +455,16 @@ decoder_process(struct decoder *decoder)
     		}
     
     		if(freeBufCount){
    -            if(!eof)dce_buf_unlock(freeBufCount,decoder->outBuf_fd);
    +            dce_buf_unlock(freeBufCount,decoder->outBuf_fd);
     			freeBufCount =0;
     		}
     		if (outArgs->outBufsInUseFlag) {
    -			MSG("%p: TODO... outBufsInUseFlag", decoder); // XXX
    +			decoder->need_out_buf = XDAS_FALSE;
     		}
    +		else{
    +			decoder->need_out_buf = XDAS_TRUE;
    +		}
    +
     	} while ((err == 0) && eof && !no_process);
     
     	return (inBufs->numBufs > 0) ? 0 : -1;
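
The buffer-reuse rule that the patch adds can be read in isolation as a small state machine. The following is a minimal sketch of that rule only, not the actual viddec3test code: the struct and function names are hypothetical, buffers are represented by plain integer ids, and no DCE or codec calls are made.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two fields the patch adds to struct decoder. */
struct out_buf_state {
    int  last_buf;      /* id of the buffer passed to the codec last time */
    bool need_out_buf;  /* true: a fresh buffer is needed for the next call */
};

void out_buf_state_init(struct out_buf_state *s)
{
    s->last_buf = -1;
    s->need_out_buf = true;   /* the first call always needs a fresh buffer */
}

/* Pick the buffer for the next process call. fresh_buf is the id of a newly
 * acquired buffer; a real caller would only acquire one when need_out_buf
 * is true (as the patch does with disp_get_vid_buffer). */
int out_buf_select(struct out_buf_state *s, int fresh_buf)
{
    if (s->need_out_buf)
        s->last_buf = fresh_buf;
    return s->last_buf;       /* otherwise the previous buffer is fed again */
}

/* Record the flag reported by the codec after a process call. */
void out_buf_update(struct out_buf_state *s, bool out_bufs_in_use_flag)
{
    /* Flag set: the codec still holds the buffer (second field pending). */
    s->need_out_buf = !out_bufs_in_use_flag;
}
```

For an interlaced stream, the expected pattern is: select a fresh buffer for the first field, see the flag set, feed the same buffer for the second field, see the flag cleared, then select a fresh buffer for the next frame.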
    

  • JOONHO,

    It's not clear what you are asking. By design, the IVA-HD IP cannot process multiple frames in parallel; it can only handle one frame at a time.
  • Hi Ramprasad and Manisha.
    Thank you for the viddec3test patch; we tested our content (1080i60) with it.
    Even with only viddec3test running, decoding two different streams exceeds the available IVA-HD performance.

    Our log is shown below.
    It shows the accumulated VIDDEC3_process time of THREAD_A and of THREAD_B over 1 second.
    g_codecdceCnt is the number of times THREAD_x accessed IVA-HD through VIDDEC3_process in 1 second.
    g_codecoutCnt is the number of outArgs->outputID[i] entries THREAD_x received in 1 second.
    g_codecdceAccmulTime is the accumulated VIDDEC3_process time of THREAD_x over 1 second.

    In short, g_codecoutCnt must be 30 for 1080i60, but sometimes g_codecoutCnt (YUV frames) is below 30 and the DSS misses 2~6 frames.
    We therefore hope that TI can support VIDDEC3_processAsync or offer another solution.

    <viddec3test log>
    .....
    DBG : THREAD_A : g_codecdceCnt(62) g_codecoutCnt(26) g_codecdceAccmulTime(673)ms at decoder_process
    DBG : THREAD_B: g_codecdceCnt(63) g_codecoutCnt(27) g_codecdceAccmulTime(650)ms at decoder_process
    ....
    DBG : THREAD_A : g_codecdceCnt(64) g_codecoutCnt(32) g_codecdceAccmulTime(584)ms at decoder_process
    DBG : THREAD_B: g_codecdceCnt(67) g_codecoutCnt(33) g_codecdceAccmulTime(595)ms at decoder_process
    ...
    DBG : THREAD_A : g_codecdceCnt(57) g_codecoutCnt(29) g_codecdceAccmulTime(489)ms at decoder_process
    DBG : THREAD_B: g_codecdceCnt(72) g_codecoutCnt(36) g_codecdceAccmulTime(476)ms at decoder_process
    ..
    DBG : THREAD_A : g_codecdceCnt(56) g_codecoutCnt(28) g_codecdceAccmulTime(511)ms at decoder_process
    DBG : THREAD_B: g_codecdceCnt(71) g_codecoutCnt(36) g_codecdceAccmulTime(485)ms at decoder_process
  • Part Number: AM5728

    Dear Champs,

    My customer wants to implement a 2-channel 1080i60 display with H.264 content, but sees 1~2 frames dropped when the ARM Cortex-A15 CPU load is high.

    Is there any way to improve performance so that the display robustly shows 30 frames (60 fields) per second without frame drops?

    The customer found that a lot of time is lost in IPC communication between the ARM Cortex-A15 and IVA-HD. Testing with viddec3test from the SDK, they measured an execution time of 10~12 ms per call, broken down as follows:

    IVA-HD decoding: 7 ms

    IPC + context switch: 3~5 ms
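
The figures above can be turned into a rough per-second time budget. This is a back-of-the-envelope sketch, assuming two streams, 60 process calls per stream per second (one call per field of 1080i60; the logs earlier in the thread show 56~72), and a fixed cost per call. The function name is hypothetical.

```c
/* Total milliseconds of work requested per second of wall-clock time,
 * assuming every call costs the same and nothing overlaps. */
double busy_ms_per_second(int streams, int calls_per_stream, double ms_per_call)
{
    return streams * calls_per_stream * ms_per_call;
}
```

Under these assumptions, IVA-HD decoding alone comes to busy_ms_per_second(2, 60, 7.0) = 840 ms out of every 1000 ms. If the IPC and context-switch time were fully serialized with decoding, the total would be 1200~1440 ms (10~12 ms per call), which does not fit in one second. In practice the IPC overhead of one stream may partly overlap with the other stream's decode, so the real load presumably lies somewhere between these two bounds, leaving little margin when the A15 is busy.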

    I think it would help if there were a way to implement something like the tunneling mode of OpenMAX IL, or to have all control related to H.264 decoding and display handled on the M4 (Ducati) core.

    Is there any method to implement it in our SDK?

    Is it possible to configure SDK SW to implement this like tunneling mode of OpenMAX IL?

    Our customer has already posted this issue (see below), but I would like to discuss it from a different angle.

    Thanks and Best Regards,

    SI.

  • The PLSDK driver architecture cannot support tunneling mode, because the display driver is Linux-based and runs on the A15 while the IVA-HD is controlled by the M4.

    The customer may have to experiment with thread priority (and any other resource identified as a bottleneck) to tune performance.