fps dropping with increased resizer usage with dmai resizer using sdma

Other Parts Discussed in Thread: OMAP3530

 

 

Hi all, 

We have a product due for release very soon and are desperate to overcome the following issues with the OMAP3530 (720 MHz).

1) FPS around 10-15 for 720P

2) Coupled with increased CPU usage of 40-50% on the OMAP3530 board (ARM clocked at 720 MHz) + SDK.

We use the following pipeline to perform video playback.

gst-launch --gst-debug=TI*:2 filesrc location=/home/121view/media/capitalcardB_30s_31dec08.mp4 ! typefind ! qtdemux name=demux demux.audio_00 ! queue max-size-buffers=8000 max-size-time=0 max-size-bytes=0 ! typefind ! TIAuddec1 ! audioconvert ! audio/x-raw-int, width=16, depth=16 ! alsasink demux.video_00 ! typefind ! TIViddec2 codecName=h264dec ! TIDmaiVideoSink displayStd=fbdev displayDevice=/dev/fb1 videoStd=720P_60 videoOutput=DVI resizer=TRUE accelFrameCopy=TRUE rotation=0 contiguousInputFrame=TRUE

We are using the omap3530 with the following h/w and s/w configurations

ARM clocking rate -720Mhz, ram 512MB

kernel - 2.6.33.2, dmai- brijesh dmai dev 2xx branch, dsplink - 1.64, codec engine - 2.25.01.06, linux utils -2.25.01.06

 

We initially started development with the following SDK and hardware configuration, and did not face any of the above issues.

ARM clocking rate -  500Mhz, Ram - 256MB, 

Kernel -2.6.29-rc3-omap1 (tipsp release), Dvsdk - 3.00.02.44, Dmai - 2.00.02.04, Dsplink - 1.61.03, Codec_engine - 2.24, Linuxutils - 2.24.04 (cmem install dir), Bios - 5.33.06

 We used the same gstreamer pipeline as before except that we used v4l2 driver instead.

 gst-launch --gst-debug=TI*:2 filesrc location=/home/121view/media/capitalcardB_30s_31dec08.mp4 ! typefind ! qtdemux name=demux  ..........displayStd=v4l2 displayDevice=/dev/video1......

 The cpu usage was between 10-20% and the fps was good at 720P resolutions in the sdk.

On probing I suspect the following to be the issue.

Recent DMAI versions (the brijesh dmai dev branch) use SDMA for the resizer functionality. This change was made to overcome the resizer restriction that height and width can only be resized in multiples of 32.

But this change causes performance problems (dropping fps and increased CPU usage).

As we are not concerned about restricting height/width to multiples of 32, I tried merging DMAI 2.00.02.04 with the recent brijesh dmai dev 2xx branch, and found that it did not work for fbdev. I also found that the resize implementation is meant to work only with v4l2, so I tried v4l2, and that still failed.

Kindly provide your valuable advice to resolve this issue ASAP.

Thanks and Regards,

Hari

 

 

  • Hari,

    I need some information which is not very clear in the post above.

    1. DVSDK version that you are using?

    2. Is the DMAI version that you are using consistent with what is being packaged in DVSDK? Let me know the DMAI version number?

    3. Is the PSP SDK version that you are using consistent with the one used for validating the DVSDK version that you are using? Could you clearly state the version number of the PSP SDK and the Linux kernel version?

    4. The post mentions a 720P codec. This is not something that has been validated on either the DVSDK or gstreamer for OMAP3530. I believe you got this 720P codec from a third party, or you might be using your own. Kindly clarify. In that case, did you check that you have accounted for the memory and bandwidth requirements?

    Based on the statement in the post, "kernel - 2.6.33.2, dmai- brijesh dmai dev 2xx branch, dsplink - 1.64, codec engine - 2.25.01.06, linux utils -2.25.01.06", and the fact that you are setting the ARM MPU rate to 720MHz, I'm assuming that you are using DVSDK v3.01.00.10. Though there are questions about the kernel and the brijesh DMAI dev 2xx branch, I'm ignoring them for now until I hear back clearly on the versions of the components that you are using. Here are some responses to your query.

    1. If you are using DVSDK version 3.01.00.10 and the DMAI version that comes with it, the framecopy module in DMAI uses SDMA for doing the frame copy. This differs from the DMAI in DVSDK v3.00.02.44, which uses the ISS resizer IP for the frame copy. Having said that, we do not see any bandwidth or performance issues with this switch when validating with either gstreamer or the DVSDK demos for SD resolution with 30fps as well as 25fps clips. This statement always assumes V4L2 for video rendering, for both the gstreamer and DVSDK demos.

    2. Now, I see that you are trying the framebuffer driver for video rendering, and I also see that the resizer is set to TRUE. These, along with 720P decode, are things that differ from what we validated. Given the estimates for 720P on OMAP3530, what fps do you expect at 720P?

    Kindly let me know the information requested above; only then can I clearly understand your problem. Thanks

    - Karthik

  • Hi,

    Thanks for your reply, Karthik. I hope the following issues can be resolved ASAP with your help.

    I have aligned the versions to your requirements.

    Here are the recent updates I made to the packages.

    1) updated to dvsdk_3_01_00_10
    2) used the kernel from here.

    3) updated the above mentioned kernel with the ti's v4l2 (http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=history;f=drivers/media/video/omap;hb=master), so that I can use the /dev/video1

    4) used gstreamer-ti-svnr612-r59+r840

    5) AM35x-OMAP35x-PSP-SDK-03.00.01.06 (but replaced the kernel by the one shown in point 3)

    6) cmem load info

    insmod cmemk.ko phys_start=0x86300000 phys_end=0x87300000 pools=1x5250000,6x829440,1x345600,1x1 allowOverlap=1

    7) cat /proc/cmdline

    mem=88M@0x80000000 mem=384M@0x88000000 console=ttyS2,115200n8 vram=32M omapfb.vram=0:8M,1:16M consoleblank=0 omapfb.mode=dvi:1280x720MR-16@60 omapfb.rotate=0 omapfb.vrfb=y omapfb.debug=1 omapdss.debug=1 root=/dev/nfs nfsroot=10.1.1.20:/srv/nfs/angstrom/121ds/igepnfs
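As a rough sanity check on items 6 and 7 above, the two mem= regions on the kernel command line leave a gap in the physical address map, and the CMEM window passed to insmod has to sit inside that gap. The sketch below works out the gap, the window size, and the sum of the pools= entries using shell arithmetic. The helper names region_end and cmem_pool_total are made up for illustration and are not part of CMEM or the DVSDK, and the sum ignores any per-buffer alignment padding CMEM may add.

```shell
# Hypothetical helpers; names are illustrative, not part of CMEM or the DVSDK.

# End address (exclusive) of a region, given its base address and size in MiB.
region_end() {
    echo $(( $1 + $2 * 1024 * 1024 ))
}

# Sum a CMEM "pools=" spec of the form COUNTxSIZE,COUNTxSIZE,...
# Assumption: ignores any alignment padding CMEM may add per buffer.
cmem_pool_total() {
    total=0
    old_ifs=$IFS
    IFS=,
    for entry in $1; do
        count=${entry%%x*}   # text before the "x"
        size=${entry#*x}     # text after the "x"
        total=$(( $total + $count * $size ))
    done
    IFS=$old_ifs
    echo "$total"
}

gap_start=$(region_end 0x80000000 88)   # end of mem=88M@0x80000000
gap_end=$(( 0x88000000 ))               # start of mem=384M@0x88000000
window=$(( 0x87300000 - 0x86300000 ))   # phys_end - phys_start from insmod
pools=$(cmem_pool_total 1x5250000,6x829440,1x345600,1x1)

printf 'gap:    0x%x-0x%x (%d MiB)\n' "$gap_start" "$gap_end" $(( ($gap_end - $gap_start) / 1048576 ))
printf 'window: %d bytes, pools: %d bytes\n' "$window" "$pools"
```

With the values quoted above this gives a 40 MiB gap from 0x85800000 to 0x88000000, a 16777216-byte (16 MiB) CMEM window, and 10572241 bytes requested by the pools. The phys_start and phys_end values both fall inside the gap.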

    Note that my video is encoded at D1 (720x480 resolution, 30fps h.264 bp + aac audio), not 720P. I had mentioned only the screen resolution as 1280x720 (720P). Sorry if I was not clear in my previous post.

    Following is the encode details of the file which i used for testing d1_30fps_underdog.mp4 file.
    ffmpeg -i d1_30fbs_underdog.mp4  
    *
     Duration: 00:02:09.09, start: 0.000000, bitrate: 6884 kb/s
        Stream #0.0(eng): Video: h264, yuv420p, 720x480, 6848 kb/s, 23.97 fps, 29.97 tbr, 2997 tbn, 59.94 tbc
        Stream #0.1(eng): Audio: aac, 24000 Hz, stereo, s16, 32 kb/s


    Here is my profiling data with TI v4l2 + gstreamer.

    Video FPS issue:
    1) gst-launch -v filesrc location=/opt/dvsdk/omap3530/data/videos/davincieffect_ntsc_1.264 ! TIViddec2 codecName=h264dec engineName=codecServer ! dmaiperf engine-name=codecServer print-arm-load=true !TIDmaiVideoSink videoStd=720P_60 videoOutput=DVI sync=false
    - With the demo video file from TI, the pipeline succeeded with the following profile data. It consistently maintained 30fps.

    INFO:
    Timestamp: 0:35:20.802430016; bps: 20736000; fps: 30; CPU: 17; DSP: 52; mem_seg: DDR2; base: 0x87f16d80; size: 0x20000; maxblocklen: 0x151c8; used: 0xae38; mem_seg: DDRALGHEAP; base: 0x87400000; size: 0x900000; maxblocklen: 0x2e0868; used: 0x61eea0; mem_seg: L1DSRAM; base: 0x10f04000; size: 0x10000; maxblocklen: 0x0; used: 0x10000;

    2) gst-launch -v filesrc location= d1_30fps_underdog.mp4  ! qtdemux name=demux demux.video_00 !  TIViddec2 codecName=h264dec engineName=codecServer ! queue ! dmaiperf engine-name=codecServer print-arm-load=true  ! TIDmaiVideoSink videoStd=720P_60 videoOutput=DVI resizer=FALSE accelFrameCopy=TRUE contiguousInputFrame=TRUE
    - Without a queue element after TIViddec2 the fps was dropping/fluctuating (attached gst log: case2-gstlogs-withoutqueue.txt).

    - With the queue the fps improved (case2-gstlogs-withqueue.txt).
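The bps figure in the dmaiperf line of case 1 can be cross-checked against the frame size. The sketch below assumes that bps means bytes per second and that frames are D1 NTSC (720x480) at 2 bytes per pixel, as in a packed 4:2:2 format such as UYVY. Both are assumptions, not something the log states, and the function names are illustrative only.

```shell
# Illustrative helpers (hypothetical names, not part of any TI tool).

# Bytes in one frame, given width, height and bytes per pixel.
frame_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# Bytes per second, given bytes per frame and a whole-number frame rate.
bytes_per_second() {
    echo $(( $1 * $2 ))
}

# Assumption: 720x480 frames at 2 bytes per pixel.
d1_frame=$(frame_bytes 720 480 2)
echo "bytes per frame:  $d1_frame"
echo "bytes per second: $(bytes_per_second "$d1_frame" 30)"
```

Under those assumptions one frame is 691200 bytes, and 691200 x 30 = 20736000, which matches the bps value reported next to fps: 30.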

    Comments 1 & 2:

    - With the above two pipelines I was able to see playback without audio. But in case (2) the fps was not held constant without the help of the queue element in the above pipeline.

    The video fps was reported as 23.97, the same value that ffmpeg -i showed for the file, even though I set the fps to 29.97 in the ffmpeg encode. Maybe I have to tune my video encoding parameters? Kindly suggest the video encode parameters to use so as to make optimum use of the DSP while still achieving the required quality.

    Resizer Issue:
    3) gst-launch -v filesrc location=/opt/dvsdk/omap3530/data/videos/davincieffect_ntsc_1.264 ! TIViddec2 codecName=h264dec engineName=codecServer ! dmaiperf engine-name=codecServer print-arm-load=true !TIDmaiVideoSink videoStd=720P_60 videoOutput=DVI sync=false resizer=TRUE accelFrameCopy=TRUE contiguousInputFrame=TRUE
    - Pipeline failed with the following error
    0:00:00.504913384  1713    0x987d0 ERROR        TIDmaiVideoSink gsttidmaivideosink.c:1257:gst_tidmaivideosink_init_display: Failed to create resizer

    4) gst-launch videotestsrc ! 'video/x-raw-yuv,width=352,height=288' ! TIVidResize ! 'video/x-raw-yuv,width=640,height=480' ! TIDmaiVideoSink videoStd=720P_60 videoOutput=DVI sync=false
    - failed saying
    "ERROR: from element /GstPipeline:pipeline0/GstTIVidResize:tividresize0: failed to create resize handle"

    Comments 3 & 4:
    There may be a mismatch between the DMAI resizer implementation and the Linux kernel resizer driver. I don't know how to figure out whether the resizer driver is properly aligned with the DVSDK version. Kindly help me find the appropriate patch for this (I pointed to the Linux kernel source which I am currently using above).

    AUDIO+VIDEO:
    5) gst-launch -v filesrc location=/opt/videos/d1_30fbs_underdog.mp4 ! qtdemux name=demux demux.audio_00 ! queue max-size-buffers=8000 max-size-time=0 max-size-bytes=0 ! TIAuddec1 ! alsasink demux.video_00 ! queue ! TIViddec2 codecName=h264dec engineName=codecServer ! dmaiperf engine-name=codecServer print-arm-load=true  ! TIDmaiVideoSink videoStd=720P_60 videoOutput=DVI
    - failed with the following error
    CMEM Error: getPool: Failed to get a pool fitting a size 691200
    Failed to allocate memory.
    gst-launch-0.10: Buffer.c:205: Buffer_freeUseMask: Assertion `hBuf' failed.
    ERROR: from element /GstPipeline:pipeline0/GstTIViddec2:tividdec20: failed to re-partition decode buffers after processingfirst frame

    6) So I tweaked the pipeline as below

    gst-launch --gst-debug=TI*:0 filesrc location= /opt/videos/d1_30fbs_underdog.mp4   ! typefind ! qtdemux name=demux demux.audio_00 ! queue max-size-buffers=8000 max-size-time=0 max-size-bytes=0 ! typefind ! TIAuddec1 ! audioconvert ! audio/x-raw-int, width=16, depth=16 ! alsasink demux.video_00 ! queue ! TIViddec2 codecName=h264dec numOutputBufs=2 ! dmaiperf engine-name=codecServer print-arm-load=true ! queue ! TIDmaiVideoSink displayStd=v4l2 displayDevice=/dev/video1 videoStd=720P_60 videoOutput=DVI resizer=FALSE accelFrameCopy=TRUE contiguousInputFrame=TRUE numBufs=3

    - After tuning numOutputBufs of TIViddec2 and numBufs of TIDmaiVideoSink, and adding the queue element before TIViddec2, I was able to see playback with audio for the first time. But it had the following issues:
        - ARM CPU usage increased to around 30%. I am worried that using the resizer could increase it further.
        - audio and video played out of sync

    Attached the logs for the above pipeline (audio_video_outofsync_withqueue.txt).

    comments 5& 6:

    How do I overcome the audio/video sync issue? Also, I have 512MB of memory, so I have the option of increasing the CMEM memory allocated for the DSP.

    Will this help at all, and if so, could you guide me in increasing the CMEM allocation?
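On the CMEM question, one thing that can be checked from the numbers already posted is how many buffers in the pools= line are large enough for the 691200-byte request that failed in case 5. The sketch below counts them. The helper name is hypothetical, and it assumes CMEM can satisfy a request from any pool whose buffer size is at least the requested size.

```shell
# Hypothetical helper: count the buffers in a CMEM "pools=" spec
# (COUNTxSIZE,COUNTxSIZE,...) that are large enough for a request.
# Assumption: any pool with size >= request can serve the request.
cmem_fitting_buffers() {
    request=$1
    fit=0
    old_ifs=$IFS
    IFS=,
    for entry in $2; do
        count=${entry%%x*}   # text before the "x"
        size=${entry#*x}     # text after the "x"
        if [ "$size" -ge "$request" ]; then
            fit=$(( $fit + $count ))
        fi
    done
    IFS=$old_ifs
    echo "$fit"
}

# Pools from the insmod line in item 6, request size from the case 5 error.
cmem_fitting_buffers 691200 1x5250000,6x829440,1x345600,1x1
```

For the current pools= line this counts 7 buffers (1 of 5250000 bytes and 6 of 829440 bytes). Raising the count on the 829440-byte pool, for example to 8, would raise the result to 9, provided the total still fits inside the phys_start to phys_end window. Whether the pool is actually being exhausted in case 5 cannot be confirmed from the error message alone.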


    With Rotation:
    7) With TIDmaiVideoSink's numBufs=2 (<3) and without a queue element after TIViddec2, I find the fps drops badly.
    With the queue, rotating the video (rotation=90) caused flicker in the video and the following errors were thrown
    7] omapdss DISPC error: GO bit not down for channel 0g VID1
    [ 4132.046447] omapdss DISPC error: VID1_FIFO_UNDERFLOW, disabling VID1
    [ 4132.134185] omapdss DISPC error: VID1_FIFO_UNDERFLOW, disabling VID1
    [ 4132.140625] omapdss DISPC error: GO bit not down for channel 0g VID1
    [ 4132.259643] omapdss DISPC error: VID1_FIFO_UNDERFLOW, disabling VID1
    [ 4132.359130] omapdss DISPC error: VID1_FIFO_UNDERFLOW, disabling VID1
    [ 4132.365661] omapdss DISPC error: GO bit not down for channel 0g VID1
    [ 4132.485595] omapdss DISPC error: VID1_FIFO_UNDERFLOW,

    Comments 7:
    Without the queue I could not achieve the required fps (with the sync issue as well, of course), but with the queue, video rotation causes flickering.
    It seems I have to increase the CMEM allocated for the DSP so as to avoid restricting the number of buffers used, but I am not sure about that.

    We have to resolve the FPS, resizer, audio/video sync and rotation issues to release our product. Kindly provide us with a solution.

    Thanks and Regards,

     

    Hari

    gst-logs.zip