DM6437 video port driver questions

I have some questions about DM6437 video port drivers and their documentation:

1. Where can I find a detailed explanation of all the API functions? The only document I've found is SPRA918A, which is for DM642 and doesn't cover all functions.

2. What are the functionalities of FVID_queue() and FVID_dequeue()? I see them used in TI examples but haven't found any documents about them.

3. What's the difference between FVID_alloc() and FVID_allocBuffer()? The former is explained in SPRA918A; the latter is not.

4. I tested two examples from TI on the DM6437 EVM. One is video_encdec; the other is vpbe_vpfe. Both use the video port drivers and both function correctly, but their performances differ significantly. I wonder what is causing such a difference.

Here is how I tested them: both examples call the function FVID_exchange() for video capture and display. FVID_exchange() seems to be a blocking call, so I can measure cycles around it. I first ran the video_encdec example, measured the capture cycles and the display cycles. The total is the video performance. Then I repeated the same test for the vpbe_vpfe example.

Here are my results: for video_encdec, the video performance is ~20,000 cycles per frame in most cases; the worst case scenario is less than 6 million cycles per frame. For vpbe_vpfe, it's ~20 million cycles per frame on average. That's one thousand times slower. 

Why is there such a big difference in performance?

 

  • RobbySun said:
    1. Where can I find a detailed explanation of all the API functions? The only document I've found is SPRA918A, which is for DM642 and doesn't cover all functions.

    I believe the documents you are looking for are in your DVSDK install: C:\dvsdk_1_11_00_00\pspdrivers_1_10_00\packages\ti\sdo\pspdrivers\drivers\vpfe\docs for the capture end and C:\dvsdk_1_11_00_00\pspdrivers_1_10_00\packages\ti\sdo\pspdrivers\drivers\vpbe\docs for the display end. The DM642 document you are looking at should give you some idea of the APIs, but it is not really targeted at the DM6437. You could say that the DM6437 FVID APIs are descendants of the DM642 FVID APIs.

    RobbySun said:
    2. What are the functionalities of FVID_queue() and FVID_dequeue()? I see them used in TI examples but haven't found any documents about them.

    These are discussed in the documents mentioned above. Essentially, they let you give a frame to the driver (FVID_queue) or take a frame from the driver (FVID_dequeue); it is a way of passing ownership of a frame to and from capture or display.

    RobbySun said:
    3. What's the difference between FVID_alloc() and FVID_allocBuffer()? The former is explained in SPRA918A; the latter is not.

    FVID_allocBuffer actually allocates the space for frame buffers, so it serves a somewhat different purpose than FVID_alloc on the DM642. The DM642 FVID_alloc was more like today's FVID_dequeue, in that it got a frame from the driver, and its opposite, FVID_free, was essentially replaced by FVID_queue. You can see this in the FVID_exchange call, which used to be serial calls to FVID_free and FVID_alloc but today is serial calls to FVID_queue and FVID_dequeue.
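
The ownership hand-off described above can be illustrated with a small toy model. This is only a sketch: `ToyDriver`, `toy_queue`, `toy_dequeue`, and `toy_exchange` are hypothetical names invented for illustration and are not the FVID API or its signatures. It also assumes a simple first-in, first-out buffer ring, and unlike a real driver it never blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, NOT the FVID API. */
#define TOY_MAX_FRAMES 4

typedef struct {
    int id;                          /* stand-in for real frame data */
} ToyFrame;

typedef struct {
    ToyFrame *slots[TOY_MAX_FRAMES]; /* frames currently owned by the driver */
    size_t head;                     /* index of the oldest frame */
    size_t count;                    /* number of frames the driver owns */
} ToyDriver;

/* Hand a frame to the driver: the application gives up ownership. */
static int toy_queue(ToyDriver *drv, ToyFrame *frame)
{
    assert(drv != NULL && frame != NULL);
    if (drv->count == TOY_MAX_FRAMES) {
        return -1;                   /* driver cannot hold more frames */
    }
    drv->slots[(drv->head + drv->count) % TOY_MAX_FRAMES] = frame;
    drv->count++;
    return 0;
}

/* Take the oldest frame back: the application gains ownership. */
static ToyFrame *toy_dequeue(ToyDriver *drv)
{
    ToyFrame *frame;

    assert(drv != NULL);
    if (drv->count == 0) {
        return NULL;                 /* a real driver would block here */
    }
    frame = drv->slots[drv->head];
    drv->head = (drv->head + 1) % TOY_MAX_FRAMES;
    drv->count--;
    return frame;
}

/* Exchange: queue the caller's frame, then dequeue the oldest one. */
static int toy_exchange(ToyDriver *drv, ToyFrame **frame)
{
    assert(frame != NULL);
    if (toy_queue(drv, *frame) != 0) {
        return -1;
    }
    *frame = toy_dequeue(drv);
    return 0;
}
```

In this model, an application that has already queued two frames and then exchanges a third gets the first queued frame back, while the driver still owns two frames.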

    RobbySun said:

    4. I tested two examples from TI on the DM6437 EVM. One is video_encdec; the other is vpbe_vpfe. Both use the video port drivers and both function correctly, but their performances differ significantly. I wonder what is causing such a difference.

    Here is how I tested them: both examples call the function FVID_exchange() for video capture and display. FVID_exchange() seems to be a blocking call, so I can measure cycles around it. I first ran the video_encdec example, measured the capture cycles and the display cycles. The total is the video performance. Then I repeated the same test for the vpbe_vpfe example.

    Here are my results: for video_encdec, the video performance is ~20,000 cycles per frame in most cases; the worst case scenario is less than 6 million cycles per frame. For vpbe_vpfe, it's ~20 million cycles per frame on average. That's one thousand times slower. 

    Why is there such a big difference in performance?

    To understand this better: are you measuring the time from when one FVID_exchange call completes until it is called again? Since it is a blocking call that waits for a new frame every time through, it would not make sense to do the opposite and measure the time the FVID_exchange call itself takes to complete. If you want to measure the performance of the driver, I would suggest looking at the video_preview example instead: it is just a simple loop around the FVID_exchange call with no other processing going on in the system, whereas most of the other examples, such as encdec, run actual video codecs that put a significant load on the DSP.
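
One plausible reading of the numbers, sketched below under stated assumptions: if the blocking call waits for the next frame, then the time spent inside it is roughly the frame period minus the time the loop spends on other work. The 594 MHz CPU clock and the 30000/1001 (NTSC) frame rate used in the usage note are assumptions, not values taken from this thread, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles in one frame period, for a frame rate of fps_num/fps_den. */
static uint32_t frame_period_cycles(uint32_t cpu_hz,
                                    uint32_t fps_num,
                                    uint32_t fps_den)
{
    assert(fps_num != 0);
    /* Widen to 64 bits so cpu_hz * fps_den cannot overflow. */
    return (uint32_t)(((uint64_t)cpu_hz * fps_den) / fps_num);
}

/*
 * Simplified model: cycles spent blocked waiting for the next frame when
 * the loop does work_cycles of other processing per frame. If the work
 * takes a full frame period or longer, a frame is already waiting and the
 * call returns without waiting in this model.
 */
static uint32_t blocked_cycles(uint32_t period_cycles, uint32_t work_cycles)
{
    return (work_cycles >= period_cycles) ? 0u : period_cycles - work_cycles;
}
```

With the assumed values, `frame_period_cycles(594000000u, 30000u, 1001u)` gives 19819800 cycles, which is close to the ~20 million cycles per frame reported for vpbe_vpfe. Under this model, a loop with almost no other work spends nearly the whole frame period blocked, while a loop that runs a codec for most of the period sees the blocking call return quickly.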

  • Hi Bernie,

    Thanks for your timely response. I have more questions:  ;-)

    Bernie Thompson said:

    FVID_allocBuffer actually allocates the space for frame buffers, so it serves a somewhat different purpose than FVID_alloc on the DM642. The DM642 FVID_alloc was more like today's FVID_dequeue, in that it got a frame from the driver, and its opposite, FVID_free, was essentially replaced by FVID_queue. You can see this in the FVID_exchange call, which used to be serial calls to FVID_free and FVID_alloc but today is serial calls to FVID_queue and FVID_dequeue.

    If the function FVID_alloc() doesn't actually allocate frame buffers, how are the frame buffers allocated in the example vpbe_vpfe? I don't see FVID_allocBuffer() called in the example.

    Bernie Thompson said:
    To understand this better: are you measuring the time from when one FVID_exchange call completes until it is called again? Since it is a blocking call that waits for a new frame every time through, it would not make sense to do the opposite and measure the time the FVID_exchange call itself takes to complete. If you want to measure the performance of the driver, I would suggest looking at the video_preview example instead: it is just a simple loop around the FVID_exchange call with no other processing going on in the system, whereas most of the other examples, such as encdec, run actual video codecs that put a significant load on the DSP.

    Here is the pseudo-code showing how I measure the video performance:
    ---------------------------------------------------------------------------------------------------------------------------------
        while (1) {
            /* grab a fresh video input frame */
            tl0 = TSCL;
            FVID_exchange(hGioVpfeCcdc, &frameBuffPtr);
            tl1 = TSCL;
            LOG_printf(&trace, "video capture: %u", (tl1-tl0-1));    // (1)
            /* other stuff, not relevant here */
            /* display the video frame */
            tl0 = TSCL;
            FVID_exchange(hGioVpbeVid0, &frameBuffPtr);
            tl1 = TSCL;
            LOG_printf(&trace, "video display: %u", (tl1-tl0-1));    // (2)
        }
    -------------------------------------------------------------------------------------------------------------------------------
    The sum of (1) and (2) within an iteration will be the video performance per frame.
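
A minimal sketch of how the same timestamps could be split into time spent waiting inside a blocking call and time spent on other work between calls. It assumes the timestamps come from a free-running 32-bit cycle counter such as TSCL, that at most one wrap occurs between two samples, and that the calling loop supplies the three timestamps; `cycles_between`, `LoopSplit`, and `split_iteration` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Elapsed cycles between two samples of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so the result stays correct
 * across a single counter wrap.
 */
static uint32_t cycles_between(uint32_t start, uint32_t end)
{
    return end - start;
}

typedef struct {
    uint32_t wait; /* cycles spent inside the blocking call */
    uint32_t work; /* cycles spent between the call's return and the next call */
} LoopSplit;

/*
 * before_call:      counter sampled just before the blocking call
 * after_call:       counter sampled just after it returns
 * before_next_call: counter sampled just before the next blocking call
 */
static LoopSplit split_iteration(uint32_t before_call,
                                 uint32_t after_call,
                                 uint32_t before_next_call)
{
    LoopSplit s;

    s.wait = cycles_between(before_call, after_call);
    s.work = cycles_between(after_call, before_next_call);
    return s;
}
```

Logging `work` alongside `wait` makes it visible whether a small number around the blocking call simply reflects a loop that is busy with other processing for most of each frame.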

     

  • I do notice a big difference between the two examples. The DSP/BIOS config for video_encdec allocates a 64KB heap in L1DSRAM. If I reduce the heap size, the code stops working. But when I check the value of frameBuffPtr->frame.frameBufferPtr, it points to DDR2. 

    Now I'd like to verify that the video capture/display actually uses this heap in L1DSRAM. If I can force the video drivers to use a heap in DDR2 instead of the one in L1DSRAM and see the performance go down, I can be quite sure that the big heap in L1DSRAM is the key factor. But how do I tell FVID_allocBuffer() to use a heap in DDR2?

     

  • A related question: before CCS 3.3, I could configure everything with the CDB file. Now the TCF file doesn't seem to provide as much control over how my application uses the HW resources. If I want fine-grained control, such as over where the frame buffers are allocated, how do I get it?

    Thanks again,