Linux/DRA746: Frame Drop during H.264 Encode

Part Number: DRA746

Tool/software: Linux

Dear Community,

I'm looking for support with the following issue:

We have a use case in which we want to capture the output of one of the CRTCs and feed it to an IVA-HD encoder.

We've tried the following pipeline:

gst-launch-1.0 -e v4l2src device=/dev/video11 io-mode=dmabuf ! video/x-raw,format=NV12 ! ducatih264enc level=51 ! queue ! h264parse ! mp4mux ! filesink location=capture.mp4

Although this pipeline works, there's an unwanted memory copy which causes a significant frame drop (~23fps when it should be ~60fps).

We believe the memory copy happens because the buffers dequeued from the omapwb-cap driver, when the pixel format is NV12, are not contiguous in memory. However, as we understand it, the ducatih264enc GStreamer element expects them to be contiguous in memory (that is, a single dmabuf fd for both the Y and UV planes).
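
The single-fd expectation can be illustrated with a minimal Python sketch. It is not code from ducatih264enc or omapwb-cap: the `Plane` type and the `is_single_contiguous` helper are hypothetical names introduced here, the fd numbers and frame size are made up, and "contiguous" is assumed to mean that the UV plane starts exactly where the Y plane ends inside the same dmabuf.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Plane:
    """Hypothetical description of one video plane backed by a dmabuf."""
    fd: int      # dmabuf file descriptor backing this plane
    offset: int  # byte offset of the plane inside that dmabuf
    size: int    # plane size in bytes


def is_single_contiguous(planes: Sequence[Plane]) -> bool:
    """Return True if all planes share one fd and are laid out back to back."""
    if not planes:
        return False
    expected_offset = planes[0].offset
    for plane in planes:
        if plane.fd != planes[0].fd or plane.offset != expected_offset:
            return False
        expected_offset += plane.size
    return True


# Illustrative NV12 layouts for a 1920x1080 frame (fd numbers are made up).
Y_SIZE = 1920 * 1080
UV_SIZE = Y_SIZE // 2
one_fd = [Plane(fd=7, offset=0, size=Y_SIZE),
          Plane(fd=7, offset=Y_SIZE, size=UV_SIZE)]
two_fds = [Plane(fd=7, offset=0, size=Y_SIZE),
           Plane(fd=8, offset=0, size=UV_SIZE)]

print(is_single_contiguous(one_fd))   # True
print(is_single_contiguous(two_fds))  # False
```

The second layout corresponds to what we believe the driver currently hands back: two separate memories, one per plane.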

In addition, we think that the correct pixel format for NV12 exposed by the omapwb-cap driver should be V4L2_PIX_FMT_NV12M instead of V4L2_PIX_FMT_NV12.

Has TI ever managed to encode at 60 fps using GStreamer from the writeback V4L2 device? Is there any work in progress to address this?

For reference, our kernel version is 4.4.0-3 and we are using GStreamer 1.11.90. 

  • Hello,

    What is the software release that you are using?

    BR
    Margarita

    NOTE : Please click "Verify Answer" if this post has answered your question!

  • In reply to Thomas Brown:

    Hello,

    I meant: PROCESSOR-SDK-ANDROID, Linux, or RTOS?


    BR
    Margarita

  • In reply to Margarita Gashova:

    The application is a Debian image with firmware and Ducati GStreamer blobs from the Processor SDK 3.02.00.03.

    I'm posting this to check whether we have ever encoded 1920x1080 video at 30 fps using the capture device created by the omapwb-cap driver.

    I suspect the issue is due to the memory copy degrading performance. Here's what we see when we run:

    GST_DEBUG=v4l2*:LOG gst-launch-1.0 -e v4l2src device=/dev/video11 io-mode=dmabuf ! video/x-raw,format=NV12 ! ducatih264enc level=51 ! queue ! h264parse ! mp4mux ! filesink location=capture.mp4

    0:00:02.339602868 20868 0x80dcb920 LOG v4l2bufferpool gstv4l2bufferpool.c:1027:gst_v4l2_buffer_pool_poll:<v4l2src0:pool:src> polling device
    0:00:02.339936679 20868 0x80dcb920 LOG v4l2bufferpool gstv4l2bufferpool.c:1160:gst_v4l2_buffer_pool_dqbuf:<v4l2src0:pool:src> dequeueing a buffer
    0:00:02.340338652 20868 0x80dcb920 LOG v4l2allocator gstv4l2allocator.c:1311:gst_v4l2_allocator_dqbuf:<v4l2src0:pool:src:allocator> dequeued buffer 3 (flags 0x2005)
    0:00:02.340797399 20868 0x80dcb920 DEBUG v4l2allocator gstv4l2allocator.c:1315:gst_v4l2_allocator_dqbuf:<v4l2src0:pool:src:allocator> driver pretends buffer is queued even if dequeue succeeded
    0:00:02.341188310 20868 0x80dcb920 LOG v4l2bufferpool gstv4l2bufferpool.c:1192:gst_v4l2_buffer_pool_dqbuf:<v4l2src0:pool:src> dequeued buffer 0xb510e220 seq:21 (ix=3), mem 0xb5101868 used 2073600, plane=0, flags 00002001, ts 23:56:15.158207000, pool-queued=8, buffer=0xb510e220
    0:00:02.341552705 20868 0x80dcb920 LOG v4l2bufferpool gstv4l2bufferpool.c:1192:gst_v4l2_buffer_pool_dqbuf:<v4l2src0:pool:src> dequeued buffer 0xb510e220 seq:21 (ix=3), mem 0xb51018c0 used 1036800, plane=1, flags 00002001, ts 23:56:15.158207000, pool-queued=8, buffer=0xb510e220
    0:00:02.341938573 20868 0x80dcb920 LOG v4l2bufferpool gstv4l2bufferpool.c:116:gst_v4l2_buffer_pool_copy_buffer:<v4l2src0:pool:src> copying buffer
    0:00:02.342350957 20868 0x80dcb920 DEBUG v4l2bufferpool gstv4l2bufferpool.c:122:gst_v4l2_buffer_pool_copy_buffer:<v4l2src0:pool:src> copy video frame
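
The two `used` sizes in the log are consistent with an 8-bit NV12 frame at 1920x1080 delivered as two separate planes. A small Python check, assuming no row padding (stride equal to width); `nv12_plane_sizes` is a helper name introduced here for illustration only:

```python
from typing import Tuple


def nv12_plane_sizes(width: int, height: int) -> Tuple[int, int]:
    """Return (Y bytes, interleaved UV bytes) for 8-bit NV12 without padding."""
    y_size = width * height        # one luma byte per pixel
    uv_size = width * height // 2  # one Cb/Cr byte pair per 2x2 pixel block
    return y_size, uv_size


y_size, uv_size = nv12_plane_sizes(1920, 1080)
print(y_size, uv_size, y_size + uv_size)  # 2073600 1036800 3110400
```

The first two values match `used 2073600` for plane 0 and `used 1036800` for plane 1, which are reported with different `mem` pointers just before the `copying buffer` message.
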
  • In reply to Thomas Brown:

    Hi Thomas,

    I have forwarded your question to Linux multimedia expert.

    Regards,
    Yordan

  • In reply to Thomas Brown:

    To answer the main question: no, we haven't achieved 1080p at 30 fps with WB capture. This is not a tested pipeline, and there is no plan as of now to support the writeback pipeline.

    Yes, the Ducati encoder expects a single fd. What puzzles me is that camera capture works at 30 fps but WB capture does not. From the GStreamer perspective, buffer management shouldn't differ between these cases. Is there a difference from the driver perspective?

  • In reply to Pooja Prajod:

    Hi Pooja,

    Though VIP and writeback are both V4L2 drivers, WB is completely different in that it uses the display pipeline's DMA to transfer the data.

    From the description of the issue, it is clear that the problem lies in the single-fd versus two-fd requirements of the two different drivers.
    It's the memcpy that reduces the FPS; the drop is not happening in the driver.
    It makes sense to expose NV12M instead of NV12.
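
As a rough illustration of why a per-frame copy matters, the arithmetic below combines the plane sizes from the v4l2 log earlier in the thread with the frame rates reported in the original post. It is only a back-of-the-envelope sketch in Python and does not measure the actual memcpy cost on the DRA746.

```python
FRAME_BYTES = 2073600 + 1036800  # Y + UV sizes reported in the v4l2 log
TARGET_FPS = 60                  # expected rate from the original post
OBSERVED_FPS = 23                # approximate rate seen with the copy

# Time available per frame at each rate, in milliseconds.
target_budget_ms = 1000 / TARGET_FPS
observed_period_ms = 1000 / OBSERVED_FPS

# Bytes that would have to be copied every second to sustain the target rate.
copy_bytes_per_s = FRAME_BYTES * TARGET_FPS

print(f"{target_budget_ms:.1f} ms per frame at {TARGET_FPS} fps")
print(f"{observed_period_ms:.1f} ms per frame at {OBSERVED_FPS} fps")
print(f"{copy_bytes_per_s / 1e6:.1f} MB/s copied at {TARGET_FPS} fps")
```

At 60 fps the whole capture-copy-encode chain has under 17 ms per frame, so any copy of a roughly 3 MB frame eats directly into that budget.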

    This problem can be solved if the buffers are allocated contiguously,
    or if the buffers are allocated using the DRM allocator and given to the capture driver (as in the case of the VIP driver).

    Regards,
    Nikhil D
  • In reply to Nikhil Devshatwar:

    Hi Nikhil,

    Thanks for the clarification and explanation.

    However, this leads to the next question: can you help us get a modified 4.4 omapwb-cap kernel driver that allocates the
    chroma and luminance planes in contiguous memory? (Can this be done as a drop-in replacement for the existing module?)

    Thanks
    Chris Lande
  • In reply to Chris Lande:

    Hi,

    The answer I suggested has two parts. Check if the write back pipeline can work with externally allocated buffers. If so, then your use case can be realized without any frame copy.

    Nikhil D
  • In reply to Nikhil Devshatwar:

    Thanks Nikhil. We will await your confirmation on this point:
    Check if the write back pipeline can work with externally allocated buffers. If so, then your usecase can be realized without any frame copy