DM36x Queries (IPNC)

Hi,

1. Does Appro RDK's SW OSD algorithm support foreign languages (Chinese)?

[Feroz] OSD can display graphics. The font should just be a library. We will need to check with our Chinese team. Since it's such a large market for us, it is quite likely that customers have already done this.

[Customer] - Did you get a chance to confirm this? What additional changes are needed to support Chinese?

Comment> The OSD guide gives details on how to create fonts. Once the fonts are created, the OSD lib will just overlay them.


2. Do the platinum encoders (H.264 and MJPEG) support the privacy masking feature?

[Feroz] Our IPNCs support privacy masking. This is more of a feature of the application/system and display rather than the video Codecs. Decompression is independent of this feature as I understand. Will confirm.

[Customer] - Ok. Let me check the detailed requirement here. The customer has specifically asked whether the codecs support this feature or not. Currently, we are doing this on captured raw data. Will get back with more details.

Comment> In DM36x, the H.264 codec supports privacy masking inside the code, but MJPEG/MPEG4 do not support it.


3. We are using the platinum H.264 encoder. When we use EDMA for a memory copy of the YUV data (every captured frame), we get only 24 fps instead of the expected 30 fps. If I don't perform the EDMA memory copy, I get 30 fps.
Since the memory copy is completely offloaded to EDMA instead of the CPU, don't you think a loss of 6 fps is a big number?
I noticed that the memory copy itself is not taking much time, but using EDMA for the copy lowers encoder performance.
Query - Is it possible that we are making the encoder suffer by occupying EDMA with a huge memory copy, or should that not be an issue? Any idea what the issue could be?

[Feroz] Which version of DM36x are you using? A DM365 at 300 MHz can do 720p at 30 fps, and 720p60 in some configs.

 

[Customer] - DaVinci DM365, Variant 0x8. The ARM clock rate is 432 MHz. 1080P_30 is expected, and we get it without privacy masking (which is ultimately a YUV memcpy of each frame).
As soon as we enable the privacy masking feature (which performs an EDMA-based memcpy to fill a color region before encoding), we lose 6 fps, although profiling shows that the copy does not take much time with EDMA compared to a CPU-based copy.

Comment> Arbitrary EDMA use will upset the load balancing done across the TCs. Can you tell us which TC you used, and can you use TC=3 for this? Performance is very sensitive to DDR access and load, because the transfers on each TC have been carefully balanced. Also, which encoder preset are you using? Can you use the XDM_HIGH_SPEED preset?

So EDMA enabling reduced the fps??? Strange! Perhaps EDMA overload occurs here. Will need to get more inputs here…

 

[Customer] - Yes. We enable EDMA for memory copy operation. When we don't do memory copy, we get 1080P_30. With EDMA based memory copy, we get 1080P_24, which seems to be a concern, when using EDMA.

  • Hi Feroz,

    Regarding load balancing of EDMA, if I am not wrong, I can configure TC=3 by passing EVENTQ_3 in davinci_request_dma() function of dm365mm/module/dm365mmap.c file.

    Currently EVENTQ_1 is passed in this function.

    Please correct me if I have misunderstood this.


    Regards,
    Sweta

  • From Raghu: Yes, that's correct, but TC3 is used for audio too, so check that this is fine.

    Best Regards

    Feroz

  • Hi Feroz,

    By default, it was using TC1. When I tried using TC3, I got a further loss of 3 fps (compared with the number I get with TC1).

    The behavior is the same with TC2 and TC3. TC1 is the only TC where I get 25 fps, against the expected 30 fps.

    Any suggestion on what can be done next? Is it possible that I have missed something?

    Regards,
    Sweta

  • Hi Feroz,

    I would like to share one more observation about the reduction in encoder fps when using EDMA from the application...

    The 1080P encode process call takes 32924 usec without the EDMA logic in my application, giving 29.58 fps.

    The 1080P encode process call takes 39800 usec with the EDMA logic in my application, giving 24.78 fps.

    These statistics show that as we increase EDMA usage in our application, the encode call takes more time, which reduces fps drastically.

    Is there any setting for the H264 encoder that can fine-tune performance so that the encode call finishes in < 33 msec? If needed, we can also consider providing more EDMA channels to the encoder.

    Any other inputs?


    Regards,
    Sweta
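
A quick sanity check on the two encode times above can be sketched as follows. It assumes the reported figures (32924 and 39800) are in microseconds, which is the only reading consistent with the 33 msec per-frame target, and it treats the encode call as the only per-frame cost. The function names are illustrative and not part of any TI API.

```python
# Assumption: the reported encode-call times are in microseconds.
US_PER_SECOND = 1_000_000


def max_fps(encode_time_us: float) -> float:
    """Upper bound on frame rate if the encode call were the only per-frame cost."""
    return US_PER_SECOND / encode_time_us


def budget_overrun_us(encode_time_us: float, target_fps: float = 30.0) -> float:
    """Positive when the encode call alone exceeds the per-frame time budget."""
    return encode_time_us - US_PER_SECOND / target_fps


for label, t_us in (("without EDMA copy", 32924), ("with EDMA copy", 39800)):
    print(f"{label}: bound {max_fps(t_us):.2f} fps, "
          f"overrun {budget_overrun_us(t_us):+.0f} us")
```

Under these assumptions the encode call fits inside the 30 fps budget only in the first case. The measured 29.58 and 24.78 fps sit slightly below the computed bounds, which would be expected if other per-frame work adds to the encode time.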

  • Hi Sweta,

    Some inputs: 

    Did they use the HIGH_SPEED preset, and did they use the codec params as we set them in the IPNC RDK?

    Please make sure the enableDDRbugg variable is set to zero.

     

    Also, can you please tell us how much data they are trying to overlay? We assume 5% OSD, such as date/time or simple text.

     

    If they have more data to overlay, they have to use the HIGH_SPEED preset and the DM368-486 part. Can they try this too?

    Best Regards

    Feroz

  • Hi Feroz,

    We tried using HIGH_SPEED, but didn't see any change in performance. Currently we are using a user-defined profile.

    Let me compare it with the IPNC RDK codec params and get back to you asap.

    The most important point -

    We are overlaying 1080P YUV420 data (1920x1080x1.5) on every captured 1080P frame, which means that at 30 fps we overlay 1920x1080x1.5x30 = 93 MB per second.

    Do you think EDMA performing a copy operation of 93 MB/s can affect the encoder's processing time? Or should EDMA be capable of handling this memory copy as well as the encoder's requirement for 1080P_30?

    We are getting 1080P_30 with this encoder without this overlay.

    Regards,
    Sweta
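
The 93 MB per second figure above can be reproduced with a short calculation. This is only a sketch of the arithmetic in the post, and the helper names are illustrative.

```python
def frame_bytes_yuv420(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV420 frame: 1 luma byte per pixel plus 0.5 chroma bytes."""
    return width * height * 3 // 2


def copy_rate_bytes_per_s(width: int, height: int, fps: int) -> int:
    """Bytes per second moved when one full frame is copied for every captured frame."""
    return frame_bytes_yuv420(width, height) * fps


rate = copy_rate_bytes_per_s(1920, 1080, 30)
print(f"{frame_bytes_yuv420(1920, 1080)} bytes/frame, {rate / 1e6:.1f} MB/s")
```

Note that this counts each byte once. If the copy reads a full source frame from DDR and writes a full destination frame, the DDR sees roughly twice this traffic.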

  • Sweta,

    What is the overlay data you need? Overlaying the entire frame is not an efficient way to do this.

    The maximum OSD you can do is 10% of the data, which covers the normal case. Beyond this you will see a drop in performance, because a full frame of extra data traffic on the DMA will bring it down.

    Can you overlay only the regions where you need data such as date/time or some text? You can configure the number of windows required and overlay only those windows.

    Regards,

    Raghu
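
To compare a set of windows against the 10% figure mentioned above, the covered fraction of the frame can be estimated as below. The window sizes are hypothetical examples, the windows are assumed not to overlap, and the 10% limit is taken from the reply rather than from any datasheet.

```python
OSD_LIMIT = 0.10  # fraction of the frame quoted in the reply above


def covered_fraction(windows, frame_w: int, frame_h: int) -> float:
    """Fraction of the frame covered by (width, height) windows, assuming no overlap."""
    covered = sum(w * h for w, h in windows)
    return covered / (frame_w * frame_h)


def within_limit(windows, frame_w: int, frame_h: int) -> bool:
    return covered_fraction(windows, frame_w, frame_h) <= OSD_LIMIT


# Hypothetical date/time and text windows on a 1080P frame.
small = [(640, 64), (320, 64)]
full = [(1920, 1080)]
for name, wins in (("date/time + text", small), ("full frame", full)):
    frac = covered_fraction(wins, 1920, 1080)
    print(f"{name}: {frac:.1%}, within limit: {within_limit(wins, 1920, 1080)}")
```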

  • Hi Raghu,

    The use case is not overlay.

    I am performing privacy masking of the YUV420 data before submitting the frame to the encoder and to other image processing algorithms.

    The worst case is masking the full 1080P frame, and the requirement is still to achieve 30 fps.

    Do you think that masking each 1080P frame using EDMA becomes a bottleneck for the encoder, which also uses EDMA as a resource? Or is there any scope for fine tuning or optimization to achieve both 1080P masking and 30 fps encoding of 1080P?

    Regards,

    Sweta
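
For reference, masking a rectangular region of a frame before encoding can be sketched as below. This is a plain CPU illustration, not the EDMA-based path discussed in the thread, and it assumes a semi-planar YUV420 (NV12-style) layout with a full-resolution luma plane followed by interleaved U/V at half resolution. The function name, default fill values, and the even-alignment rule are assumptions made for the example.

```python
def mask_region_nv12(frame: bytearray, width: int, height: int,
                     x: int, y: int, w: int, h: int,
                     y_val: int = 16, u_val: int = 128, v_val: int = 128) -> int:
    """Fill a rectangle with a solid color in place; returns bytes written.

    Assumed layout: width*height luma bytes, then height/2 rows of
    interleaved U,V bytes (width bytes per chroma row).
    """
    if any(v % 2 for v in (x, y, w, h)):
        raise ValueError("x, y, w, h must be even so chroma samples line up")
    if x < 0 or y < 0 or x + w > width or y + h > height:
        raise ValueError("mask rectangle lies outside the frame")

    # Luma plane: one byte per pixel.
    luma_row = bytes([y_val]) * w
    for row in range(y, y + h):
        start = row * width + x
        frame[start:start + w] = luma_row

    # Chroma plane: one U,V pair per 2x2 pixel block.
    chroma_base = width * height
    chroma_row = bytes([u_val, v_val]) * (w // 2)
    for row in range(y // 2, (y + h) // 2):
        start = chroma_base + row * width + x
        frame[start:start + w] = chroma_row

    return w * h * 3 // 2


frame = bytearray(1920 * 1080 * 3 // 2)
written = mask_region_nv12(frame, 1920, 1080, x=640, y=360, w=320, h=240)
print(f"masked {written} of {len(frame)} bytes")
```

The bytes written scale with the masked area, so the full-frame worst case described above touches every byte of the frame, while a smaller mask touches proportionally less.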