DM368 performance issue encoding std definition H.264

I recently got my H.264 encode application running on our DM368 board. I'm encoding D1 resolution from the composite input and streaming RTP to the network, and I'm having performance problems. At D1 resolution I can only achieve 28.5 fps. At 1/2 D1 I can get the full 29.97 fps. We've got our board clocked at 432 MHz for the ARM and 340 MHz for the DDR. The DM365EVM with the same application only gets about 22 fps at D1 resolution, so I know that our board's performance is not the issue.
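
For scale, the shortfall can be expressed as a per-frame time budget. Below is a small C sketch of that arithmetic, using only the two frame rates quoted above. The function names `frame_period_ms` and `frame_overrun_ms` are made up for illustration and are not part of any TI library.

```c
#include <assert.h>

/* Time available (or spent) per frame, in milliseconds, at a given rate. */
static double frame_period_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* How far each frame overruns the target budget, in milliseconds.
 * A positive result means the pipeline is too slow for the target rate. */
static double frame_overrun_ms(double target_fps, double achieved_fps)
{
    return frame_period_ms(achieved_fps) - frame_period_ms(target_fps);
}
```

Calling `frame_overrun_ms(29.97, 28.5)` shows that the budget is roughly 33.4 ms per frame while the pipeline is taking roughly 35.1 ms, so each frame overruns by a little under 2 ms.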

I'm using the encode application in the dvsdk-demos directory (dvsdk_4_00_00_17 beta) as the basis for my application, so I'm using the DMAI interface. I wrote this app many months ago on DVSDK_2 and don't recall having this problem. I was using the DM365EVM at the time. However, I didn't do any extensive testing back then, and it's possible that I never even ran at D1 resolution. But I believe I did run D1 and didn't have a performance problem.

Does anyone have any idea what could be the slowdown as this is far below the stated performance of the chip?

  • Is the DMAI interface inefficient?
  • Is the DVSDK 4 beta known to be slow?
  • Could it be a lack of available buffers causing the pipeline to choke?

I ran "top" and my app is using 80-90% CPU.  Funny thing is that it says MEM is 107%.  But the used and available at the top are about 30000K used and 46000K free.  Also the VSZ next to %MEM shows over the 80Meg that I have for Linux.  Not even sure what VSZ means.
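
On the VSZ question: VSZ is the virtual memory size of the process, meaning all address space it has mapped, whether or not that space is backed by RAM that Linux manages. Mapped device or driver buffers count toward it, which is one plausible reason it can exceed the memory given to Linux. Some versions of BusyBox top also appear to compute the %MEM column from VSZ, which would explain a value above 100%; treat that as an assumption to check against the top on your board. The sketch below reads both figures for the current process from `/proc/self/status` on Linux. The helper name `status_field_kb` is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB of a field such as "VmSize" (virtual size, what
 * top calls VSZ) or "VmRSS" (resident pages) from /proc/self/status.
 * Returns -1 if the file or the field is not available. */
static long status_field_kb(const char *key)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    long value = -1;
    size_t keylen = strlen(key);

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Lines look like "VmSize:     1234 kB". */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            if (sscanf(line + keylen + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(fp);
    return value;
}
```

Comparing `status_field_kb("VmSize")` with `status_field_kb("VmRSS")` inside the application shows how much of the reported size is actually resident in RAM.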

Thanks,

John A

  • Hi John,

    As a sanity test, have you tried to run DVSDK 4.0's encode demo straight out-of-box to see if it runs at 30 fps? (It should, even on the DM365EVM.) Also, do you make a copy of the encoded content on the filesystem before sending it over the network? What I have noticed is that performance on NFS is poorer when compared with recording on an SD card for example, so you may want to keep that in mind.

    Best regards,

    Vincent

  • Vincent,

    I'm streaming H.264 RTP multicast instead of writing to the NFS file system.  No recording.  I'll have to try running the demo and see how it works.

    John A

  • Running the encodedecode demo with H.264 at 3 Mbps on the DM365EVM shows a CPU load of 20% or less. The next step is to run my app and throw away the encoded data instead of streaming it to the network, and see if that makes a big difference.

    John A

  • Thanks for the update. Good to know that you can get the basic demo running at full frame rate. Let us know how this turns out once you remove the streaming code.

    Best regards,

    Vincent

  • It's not the streaming to the network.  I've systematically gone through each part of the application and have determined that the offending routine is Framecopy_execute.

    I'm thinking that there is a default parameter that changed between dvsdk 2 and 4 that causes the frame copy to use a much slower method.  Hopefully I will figure this out tomorrow.

    John A.

  • Turns out that I didn't have the Framecopy_Attrs "accel" parameter set to true. After setting accel to true I'm at around 10% CPU usage.

    I'm wondering why frame copy needs to be done.  I'm capturing std def on the composite input and encoding at std def resolution.  What's the theory behind requiring a frame copy?

    John A.

  • On the DM365, the line length (pitch) of capture and display buffers needs to be a multiple of 32, so the line length is 736 for D1 resolution (width 720). One can adjust the codec parameters so that the codec accepts a line length different from the image width, which makes it possible to feed the captured buffer to the codec as is. But these settings differ from codec to codec.

    Also, on the display path, if you want to display smaller-resolution frames on a higher-resolution display (e.g. D1 on 720p), a copy is needed.

    So in the interest of keeping the code generic across codecs and resolutions, a copy is performed on both the encode path and the display path at all times for D1 resolution.

    In brief, a version of your app without the frame copy is possible, but we did not do that in the demo, to keep the code simpler, and at D1 resolution the frame copy was judged to be low overhead on the DM365 (when accel is set).
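
To make the line-length point concrete, here is a minimal plain-C sketch of the two ideas: rounding a width up to a multiple of 32, and copying an image row by row between buffers whose line lengths differ. It is a CPU illustration only and not how DMAI implements `Framecopy_execute`. The names `align_up` and `copy_plane` are hypothetical, and the sketch assumes a single plane of 8-bit samples.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round value up to the next multiple of align.
 * align must be a power of two (32 in the case discussed here). */
static size_t align_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Copy `rows` lines of `row_bytes` each from src to dst. Each buffer has
 * its own pitch (bytes from the start of one line to the start of the
 * next), so padding at the end of a source line is simply skipped. */
static void copy_plane(unsigned char *dst, size_t dst_pitch,
                       const unsigned char *src, size_t src_pitch,
                       size_t row_bytes, size_t rows)
{
    assert(row_bytes <= dst_pitch && row_bytes <= src_pitch);
    for (size_t y = 0; y < rows; y++)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, row_bytes);
}
```

Here `align_up(720, 32)` yields the 736 mentioned above, and `copy_plane` with a source pitch of 736 and a destination pitch of 720 would repack a captured plane into a tightly packed one.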

    Best regards,
    Vincent
     

  • Hi All,

    Please let me know whether the demo encoder application available in DVSDK 4.2 for the DM368 SoC (/dvsdk-demos_4_02_00_01/dm365/encode) is a useful basis for writing my own application that captures video from the composite input and encodes the raw data in MPEG-2 format.

    Regards,

    Anil Verma

  • Hi Anil,

    I have replied to your other thread: http://e2e.ti.com/support/embedded/linux/f/354/t/154890.aspx

    Best regards,

    Vincent