C6748 memory access optimization

Hi All,

I'm using the VPIF on the OMAP-L138 LCDK to capture raw images at 12 bits per pixel. My goal is to capture up to 15 frames and average them. The code runs on the DSP.

As far as I understand, the best way to do this (performance-wise) is to allocate two line buffers in the L2 cache. Each buffer will hold about 10 pixel lines from the current frame (due to memory limitations).

A full-frame buffer will be allocated in external DDR.

When a line interrupt is received from the VPIF, I'll point the VPIF at the other line buffer and add the newly captured lines from the filled line buffer into the full-frame buffer.

The division will be done only after all frames have been captured.

My pixel clock is 50 MHz and I have 1280 pixels per line. The frame buffer will hold 16 bits per pixel to allow summing up to 15 frames without overflow.
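As a quick sanity check on the numbers above, the worst-case sum and the active line time can be worked out directly. This is only a sketch: the function names are illustrative, and the line time ignores blanking, which is an assumption not stated in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value a pixel of the given bit depth can hold. */
static uint32_t max_pixel_value(unsigned bits_per_pixel)
{
    assert(bits_per_pixel > 0 && bits_per_pixel < 32);
    return (1u << bits_per_pixel) - 1u;
}

/* Worst-case accumulated value when every frame holds the maximum pixel. */
static uint32_t worst_case_sum(unsigned bits_per_pixel, unsigned frames)
{
    return max_pixel_value(bits_per_pixel) * frames;
}

/* Largest frame count whose worst-case sum still fits in 16 bits. */
static unsigned max_frames_in_u16(unsigned bits_per_pixel)
{
    return (unsigned)(UINT16_MAX / max_pixel_value(bits_per_pixel));
}

/* Active line time in nanoseconds; blanking is ignored (assumption). */
static uint32_t active_line_time_ns(uint32_t pixels_per_line,
                                    uint32_t pixel_clock_hz)
{
    assert(pixel_clock_hz > 0);
    return (uint32_t)(((uint64_t)pixels_per_line * 1000000000u)
                      / pixel_clock_hz);
}
```

With 12-bit pixels and 15 frames the worst-case sum is 15 x 4095 = 61425, which fits in a 16-bit accumulator. At 50 MHz, 1280 active pixels take 25.6 us, so a block of 10 lines leaves at least 256 us to finish the summing before the next buffer fills.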

I would like to know whether this is the right memory configuration for the task, how I should configure the cache before summing the pixels into the frame buffer, and how the sum should be implemented for best performance.

Thanks!

  • Eyal,

    You do not want to transfer the video data into L2 cache. Memory configured as cache is managed by the DSP itself and is not meant to be accessed directly by the user. Instead, use some of the remaining L2 SRAM as the target for your line buffers, which means configuring L2 as anything other than 100% cache. If you need L2 cache for other things, then 50% cache will leave 50% for use as SRAM, or you can use 0% L2 cache and have all of L2 available as SRAM. Please see the available cache settings to decide how much of the 256KB L2 memory you want to use as cache and how much as SRAM.

    Regards,
    RandyP
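To make the cache/SRAM split above concrete, here is a rough budget check. It is a sketch under stated assumptions: each 12-bit pixel is assumed to occupy a 16-bit word in the line buffer, nothing else is assumed to live in L2 SRAM, and the function names are illustrative. The 256KB total, the 10 lines per buffer and the 1280 pixels per line come from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define L2_TOTAL_BYTES (256u * 1024u)  /* total L2 quoted in the reply */

/* L2 bytes left as SRAM after reserving cache_bytes as cache. */
static size_t l2_sram_bytes(size_t cache_bytes)
{
    assert(cache_bytes <= L2_TOTAL_BYTES);
    return L2_TOTAL_BYTES - cache_bytes;
}

/* Bytes needed for two ping-pong line buffers. */
static size_t line_buffers_bytes(size_t lines, size_t pixels_per_line,
                                 size_t bytes_per_pixel)
{
    return 2u * lines * pixels_per_line * bytes_per_pixel;
}

/* Returns 1 if both line buffers fit in the SRAM part of L2, else 0. */
static int line_buffers_fit(size_t cache_bytes, size_t lines,
                            size_t pixels_per_line, size_t bytes_per_pixel)
{
    return line_buffers_bytes(lines, pixels_per_line, bytes_per_pixel)
           <= l2_sram_bytes(cache_bytes);
}
```

Two buffers of 10 lines come to 51200 bytes, so they fit with a 50% split (128KB of SRAM) and obviously cannot fit when L2 is 100% cache. Stack, code or other data placed in L2 SRAM would reduce the margin.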

  • Thanks RandyP.

    What is the recommended documentation for understanding how to configure and use the cache?

    This is the loop I'm using to sum the lines buffer into the average buffer. Is it the correct optimized way to do this?

    for (i = 0; i < BYTES_TO_HANDLE/4; i += 4)
    {
        _amemd8(&AverageBuff[i]) += _amemd8_const(&LinesBuff[i]);
        _amemd8(&AverageBuff[i+2]) += _amemd8_const(&LinesBuff[i+2]);
    }

    Thanks,

    Eyal.
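One point worth checking in the loop above: if the compiler guide is read correctly, `_amemd8` yields a `double` lvalue, so `+=` on it would be a floating-point add rather than an integer add on packed pixels. The portable sketch below shows the integer idea the loop seems to aim for, adding four 16-bit lanes with one 64-bit add. It assumes 16-bit pixel words and 16-bit sums, and it is only valid while no lane overflows. The function names are illustrative, and the mapping to C674x intrinsics is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: one 16-bit add per pixel. */
static void accumulate_scalar(uint16_t *sum, const uint16_t *line, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        sum[i] = (uint16_t)(sum[i] + line[i]);
    }
}

/* Packed variant: four 16-bit lanes per 64-bit integer add.
 * Matches the scalar result only while no lane exceeds 65535,
 * because a carry would spill into the neighbouring lane.
 * n must be a multiple of 4. */
static void accumulate_packed(uint16_t *sum, const uint16_t *line, size_t n)
{
    size_t i;
    assert(n % 4 == 0);
    for (i = 0; i < n; i += 4) {
        uint64_t a, b;
        memcpy(&a, &sum[i], sizeof a);   /* load four running sums */
        memcpy(&b, &line[i], sizeof b);  /* load four new pixels */
        a += b;
        memcpy(&sum[i], &a, sizeof a);   /* store four updated sums */
    }
}

/* Usage example: accumulate the same 12-bit test line `frames` times
 * with both variants; returns 1 if the results are identical. */
static int packed_matches_scalar(unsigned frames)
{
    enum { N = 16 };
    uint16_t line[N], s1[N] = {0}, s2[N] = {0};
    unsigned f;
    size_t i;
    for (i = 0; i < N; i++) {
        line[i] = (uint16_t)(4095u - i * 37u);
    }
    for (f = 0; f < frames; f++) {
        accumulate_scalar(s1, line, N);
        accumulate_packed(s2, line, N);
    }
    return memcmp(s1, s2, sizeof s1) == 0;
}
```

With 12-bit input and at most 15 frames, no lane can exceed 61425, so the no-overflow condition holds for the use case in this thread.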

  • Eyal,

    Eyal Lahat said:
    What is the recommended documentation for understanding how to configure and use the cache?

    Start with the TRM for information on the cache.

    Eyal Lahat said:
    Is it the correct optimized way to do this?

    Start with no attempts at optimization so you can get the algorithm functioning correctly, then start applying optimization techniques that you can learn from the Wiki and from training material.

    The unoptimized code, once validated to operate correctly for your algorithm, gives you a known-good baseline that you can compare against the output of your optimized version to confirm that it produces the right results.

    Regards,
    RandyP
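The "unoptimized first, then compare" advice could look roughly like the sketch below: a plain reference accumulator, a final divide, and a helper that reports the first pixel where two buffers differ. The names are illustrative, pixels are assumed to be 12-bit values stored in 16-bit words, and rounding to nearest in the divide is a choice made here, not something the thread specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add one frame (or block of lines) to the running 16-bit sums. */
static void ref_accumulate(uint16_t *sum, const uint16_t *frame, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        sum[i] = (uint16_t)(sum[i] + frame[i]);
    }
}

/* Turn sums into averages in place, rounding to nearest (assumption). */
static void ref_finalize(uint16_t *sum, size_t n, unsigned frames)
{
    size_t i;
    assert(frames > 0);
    for (i = 0; i < n; i++) {
        sum[i] = (uint16_t)((sum[i] + frames / 2u) / frames);
    }
}

/* Index of the first differing pixel, or -1 if the buffers are identical. */
static long first_mismatch(const uint16_t *ref, const uint16_t *cand, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (ref[i] != cand[i]) {
            return (long)i;
        }
    }
    return -1;
}

/* Usage example: averaging identical frames should reproduce the frame.
 * Returns -1 on success, otherwise the first bad pixel index. */
static long check_constant_frames(uint16_t value, unsigned frames)
{
    enum { N = 8 };
    uint16_t frame[N], sum[N] = {0};
    unsigned f;
    size_t i;
    for (i = 0; i < N; i++) {
        frame[i] = value;
    }
    for (f = 0; f < frames; f++) {
        ref_accumulate(sum, frame, N);
    }
    ref_finalize(sum, N, frames);
    return first_mismatch(frame, sum, N);
}
```

The same `first_mismatch` helper can be pointed at the output of an optimized loop. Note that the check fails for 17 full-scale 12-bit frames, because the 16-bit sum wraps, which is the kind of error a reference comparison is meant to catch.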