
Video delay on dm6446 evm with TVP5146

Other Parts Discussed in Thread: TVP5146

 Hi everybody,

We are using the DM6446 EVM with the TVP5146 and testing LCD output with the encodedecode demo example. We have examined davinci_vpfe.c and ccdc_davinci.c, and found that only the ccdc_config_ycbcr() function is used for the TVP5146. I know this method corresponds to the Video Capture Mode data path (SPRUE38E, page 119), but we want to use the Preview/Movie Capture Mode (SPRUE38E, page 109). We tried to enable Preview/Movie Capture Mode in the device driver source, but we always failed. Is it possible to use Preview/Movie Capture Mode with the TVP5146? (I noticed somewhere on this forum that it is possible to use Preview/Movie Capture Mode with the MT9T031.)

If we use this mode, can it reduce the video delay, which is currently over 200 ms?

Thank you and BRs.

 

  • You can certainly use Preview/Movie Capture Mode with TVP5146 input, or any other video input source for that matter.

    In the DM6446, the VPSS blocks are laid out sequentially, and you can choose to bypass any of them if you do not need the function a particular block performs.  In the MT9T031 case, the data is captured in Bayer-pattern pixel format, so we need to pass it through the Preview engine block to convert it to YCbCr pixel format (which the subsequent blocks use).  The Preview block can feed into the Resizer block, so if you need any resizing of the captured image you can use the Resizer; otherwise, you can bypass it, saving time.  Eventually, you will need to place the data in DDR2 so that the VPBE can read the video data from DDR2 and display it (assuming you want to display it).

    That said, if you are capturing video data with the TVP5146, the data is already in YCbCr pixel format, so you do not need to pass it through the Preview block.  If, furthermore, you are not resizing the video, you can bypass the Resizer as well... I think you get the point.

  • inchul lee said:
    Is it possible to use Preview/Movie Capture Mode with the TVP5146? (I noticed somewhere on this forum that it is possible to use Preview/Movie Capture Mode with the MT9T031.)

    Not really; the previewer is there to convert a Bayer pattern (i.e., raw RGB image-sensor output) to YUV, which is more suitable for display (preview) or compression with a video codec (movie). When you are using the TVP5146, the output will generally already be in the proper YUV format, so the previewer is not necessary.

    inchul lee said:
    If we use this mode, can it reduce the video delay, which is currently over 200 ms?

    As suggested above, the previewer will not help lower the latency of your system when taking in video from a TVP5146. I think the problem is that you are using the encodedecode demo example: it encodes (compresses) the video with a codec such as H.264 and then decodes (decompresses) the same stream before displaying it. This keeps the CPU very busy, and there is a lot of extra buffering going on, which adds the latency you are seeing. If you do not actually need to encode and decode the video stream, you can get lower latency, though there will always be some latency for buffering in DDR between the capture and display drivers. To this end you may want to try some of the examples from the PSP itself which do not do any codec work, such as dvsdk_2_00_00_22\PSP_02_00_00_140\examples\dm644x\v4l2\v4l2_mmap_loopback.c; something like this should show much lower latency.

  • Thank you for your answers.

    First of all, we tested v4l2_mmap_loopback.c, but the result is almost the same. So we did a firmware-level test with CCS. We made a buffer at 0x81000000 for capture and display; capture and display share the same buffer region. The result is very good: latency is about 0.08 s. Therefore, we think the root of the latency is buffering. We know that the capture device driver has a minimum of 2 buffers, and the display device driver has a minimum of 3 buffers. Is it possible to decrease the number of buffers for capture and display?

    Thank you and BRs.

  • Double buffering is very popular in video applications as a way of reducing tearing effects, which can occur if the capture driver writes a new video frame into the same buffer being displayed while the display driver is still busy displaying the last frame.

    Triple buffering can become necessary when doing video decoding, as B-frames can sometimes cause MPEG decoders to keep a frame that will be needed to decode the next frame... hence adding the need for one more frame to our double-buffering scenario.

    That said, if you are not doing any decoding, you can get rid of triple buffering... and if you do not mind a little tearing, or can control your environment such that capture and display never touch the same buffer at the same time, then you can even get rid of double buffering.  In short, yes, you can get rid of double or triple buffering, but you should be aware of the purpose they serve and make the call that works best for you.

  • Thank you for your reply.

    You are right. Our development environment is a military spec without audio. We made a sample device driver using a single buffer for this condition and tested it; the result meets our spec. But we must verify that it still meets the spec after adding other applications. And I have some questions.

    1. Is a buffer for display or capture reserved? We tested it as follows. We assumed that the mem variable gets a buffer for capture or display, reserved from the OS (mem is 0xc73000000 in our environment). But it does not work correctly (though there is no system panic or lock state). So, as shown below, we used 0x81000000 for VPFE_SDR_ADDR and OSD_VIDWIN1ADR, and that works correctly.

     // define dimensions
     width = 720;
     height = 480;
     // reserved buffer size
     fbuf_size = VPFE_TVP5146_MAX_FBUF_SIZE;

     mem = (void *)__get_free_pages(GFP_KERNEL | GFP_DMA, get_order(fbuf_size));

     if (mem) {
             printk("===> start reserving pages\n");
             adr = (unsigned long)mem;
             printk("===> mem(0x%lx)\n", (unsigned long)mem);
             size = PAGE_SIZE << get_order(fbuf_size);
             while (size > 0) {
                     /* make sure the frame buffers
                        are never swapped out of memory */
                     SetPageReserved(virt_to_page(adr));
                     adr += PAGE_SIZE;
                     size -= PAGE_SIZE;
             }
             fbuffer_test = (u8 *)mem;
     } else {
             /* allocation failed: nothing was reserved, so nothing to free */
             printk("===> __get_free_pages failed\n");
     }

    ....

    VPFE_SDR_ADDR   = 0x81000000; // (u32)mem;

    ...

    OSD_VIDWIN1ADR  = 0x81000000; // (u32)mem;

    2. Is data moved automatically from capture to display (is data streamed automatically by hardware)? We just set the VPFE_SDR_ADDR and OSD_VIDWIN1ADR registers, without an infinite loop or creating a thread to carry data from capture to display.

    3. Where can we get a detailed system memory map for the DaVinci EVM?

    I am sorry to trouble you with so many questions... Thank you and BRs.

     

  • These are all very good questions and we are happy to help.

    1) This is a tricky question; it depends on your software stack.  From a hardware perspective, the memory map is shown in Table 2-3 of the main datasheet, available at http://focus.ti.com/docs/prod/folders/print/tms320dm6446.html.  From a Linux software-stack perspective, if you are doing only video (no OSD graphics) capture and display, you can use the V4L2 driver and use the same buffer for capture and display.  The buffer will lie in SDRAM and be managed by the Linux virtual memory manager (VMM), hence you will have little control over where it is located.  If you need to display OSD graphics as well, then you throw the FBDev driver into the mix, and the graphics buffer is not shared with the capture driver.

    2) Both registers you mention take a memory address in SDRAM; if you program both registers with the same memory address, then yes, whatever is captured through the input video interface goes directly into that address and can be read by the VID1 display window directly (the most efficient way possible).

    3) Again, from a hardware perspective, the memory map is defined in the main datasheet; from a DVSDK software perspective, the memory map is defined on the left side of Figure 1 in this wiki article: http://wiki.davincidsp.com/index.php/Changing_the_DVEVM_memory_map

     

     

     

  • Juan Gonzales said:

    2) Both registers you mention take a memory address in SDRAM; if you program both registers with the same memory address, then yes, whatever is captured through the input video interface goes directly into that address and can be read by the VID1 display window directly (the most efficient way possible).

    Thank you for your kind answers.

    I understand what you mean, but I have some questions. Which piece of hardware carries the data? A dedicated DMA for this function, or only the ARM...?

    And how can the sync (timing) between writes by capture and reads by VID1 be adjusted? Is it possible for the programmer to adjust it? If not, should we use a software technique such as double buffering? We have to consider tearing after adding other applications.

    BRs. 

  • All data buffers captured, displayed, or processed by the VPSS are transferred between SDRAM and the VPSS processing blocks via dedicated video DMA.  Normally, the ARM comes into play to manage the rate of capture and display.

    Capture: the capture VSYNC interrupt signals to the ARM that a new frame has been captured; the ARM can program a different capture SDRAM address (if double buffering) and proceed to capture a new frame. While the new frame is being captured, the ARM can send the current frame to be displayed.

    Display: the display VSYNC interrupt signals to the ARM that it has finished displaying the current frame; the ARM can program a different display SDRAM address (if double buffering) and proceed displaying.

    Once enabled, both capture and display work in a continuous fashion; for example, they process 720x480 of valid video data @ 30 fps. Fortunately, the video timing frame is actually bigger than 720x480, as you always have some blanking data (this varies with the video standard: NTSC, 720p, ...).  Therefore, it is important to note that programming of new capture or display SDRAM addresses (done via ioctl() calls from the ARM to the drivers) normally takes place quickly, during the vertical blanking period.  After the vertical blanking period, the hardware assumes valid data is available once again and starts capturing/displaying whatever data is present to/from SDRAM.

    In your case, let us assume you do not want any double buffering. In that case, there are two scenarios I can think of:

    1) If your capture and display VSYNCs occur close together in time (a good chance, depending on the hardware clocking scheme chosen), you can simply enable both capture and display and leave the ARM out of the equation.  Depending on how far apart the capture and display VSYNCs are, capture may finish just in time for display to start displaying (ideally you want the display VSYNC to occur right after the capture VSYNC, which is also likely).  If the VSYNCs are so far apart that they exceed the vertical blanking period, then capture will start dumping new data before displaying is done, and you will see tearing of the video.  It is worth a try... but this approach, although really efficient, will be very susceptible to any clock drift, so your clocking scheme becomes very important.  Also, you may have to enable capture and display as close to the same time as possible (this init phase will have to be done by the ARM); given the nature of the Linux scheduler, you cannot always guarantee this, as your thread may lose its time slice before you get a chance to enable both services (assuming you are using Linux).

    2) You can have the ARM involved (taking action on VSYNCs) to make sure no video tearing happens, opting to drop a newly captured frame if the last frame has not finished displaying.  This addresses the Linux scheduler problem, but by dropping frames you introduce another video problem: slower-changing video, which can at times look as if the image freezes or runs in slow motion....

    From the above, you can probably appreciate why people just use double buffering and do not worry so much about the common video problems it solves; but if efficiency is your goal, you do have options to try....

  • Thank you, that's very useful information. BRs.