TDA4VM: BAM plugins

Part Number: TDA4VM
Other Parts Discussed in Thread: TDA2

Hello guys,

I'd like to know more about the BAM plugins in VXLIB: http://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/latest/exports/docs/vxlib_c66x_1_1_4_0/docs/doxygen/html/bamplugins.html

What does "BAM block based processing framework" mean? Is it somehow related or similar to the User Kernel Tiling extension in OpenVX (https://www.khronos.org/registry/OpenVX/specs/1.1/html/d0/d84/page_design.html#sec_known_extensions)?

Thanks for your help,

Fernando A. Endo

  • Fernando,

    Fernando Endo said:
    What does "BAM block based processing framework" mean?

    This is a TI custom framework that uses DMA libraries to copy blocks (tiles) of an image into DSP L2 RAM in a ping-pong fashion and performs tiled processing across a graph of supported kernels. If the graph has 3 sequential kernels, for example, then one block of input is DMA'd to L2 RAM and processed through all 3 kernels using only L2 SRAM scratch memory for intermediate outputs, and the final output is then DMA'd back to DDR. In parallel with the kernel processing, the DMA retrieves the next block. This continues until the full image has been processed through all 3 kernels.
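The ping-pong flow described above can be sketched in plain C. This is only a host-side illustration, not the BAM or DMAUTILS API: `dma_copy`, the three toy kernels, `BLOCK_PIXELS`, and `blockwise_matches_fullframe` are hypothetical names invented for the example, and `memcpy` stands in for a DMA transfer that would run asynchronously on real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_PIXELS 64u   /* hypothetical block size; a real one is chosen to fit L2 */
#define MAX_PIXELS   1024u /* upper bound used only by the self-check below */

/* Stand-in for a DMA transfer. Real DMA runs asynchronously; memcpy does not. */
static void dma_copy(uint8_t *dst, const uint8_t *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Three toy point-wise kernels standing in for a 3-node graph. */
static void kernel_add1(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = (uint8_t)(in[i] + 1u);
}

static void kernel_double(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = (uint8_t)(in[i] * 2u);
}

static void kernel_invert(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) out[i] = (uint8_t)(255u - in[i]);
}

/* Block-based path: only small "L2" buffers hold input and intermediates. */
static void process_blockwise(const uint8_t *ddr_in, uint8_t *ddr_out, size_t n)
{
    uint8_t l2_in[2][BLOCK_PIXELS];  /* ping and pong input buffers */
    uint8_t l2_tmp_a[BLOCK_PIXELS];  /* scratch: output of kernel 1 */
    uint8_t l2_tmp_b[BLOCK_PIXELS];  /* scratch: output of kernel 2 */
    uint8_t l2_out[BLOCK_PIXELS];    /* output of kernel 3 */
    size_t num_blocks = n / BLOCK_PIXELS;

    assert(n % BLOCK_PIXELS == 0);
    if (num_blocks == 0) return;

    /* Fetch the first block before the loop starts. */
    dma_copy(l2_in[0], ddr_in, BLOCK_PIXELS);

    for (size_t b = 0; b < num_blocks; ++b) {
        size_t cur = b & 1u;

        /* Start fetching the next block into the other buffer.
           On hardware this would overlap with the kernels below. */
        if (b + 1 < num_blocks) {
            dma_copy(l2_in[cur ^ 1u], ddr_in + (b + 1) * BLOCK_PIXELS, BLOCK_PIXELS);
        }

        kernel_add1(l2_in[cur], l2_tmp_a, BLOCK_PIXELS);
        kernel_double(l2_tmp_a, l2_tmp_b, BLOCK_PIXELS);
        kernel_invert(l2_tmp_b, l2_out, BLOCK_PIXELS);

        /* Only the final result goes back to "DDR". */
        dma_copy(ddr_out + b * BLOCK_PIXELS, l2_out, BLOCK_PIXELS);
    }
}

/* Full-frame path: every intermediate image is a full-size buffer. */
static void process_fullframe(const uint8_t *in, uint8_t *tmp_a, uint8_t *tmp_b,
                              uint8_t *out, size_t n)
{
    kernel_add1(in, tmp_a, n);
    kernel_double(tmp_a, tmp_b, n);
    kernel_invert(tmp_b, out, n);
}

/* Returns 1 when both paths give identical output for an n-pixel test image. */
int blockwise_matches_fullframe(size_t n)
{
    static uint8_t in[MAX_PIXELS], a[MAX_PIXELS], b[MAX_PIXELS];
    static uint8_t out_full[MAX_PIXELS], out_block[MAX_PIXELS];

    assert(n <= MAX_PIXELS);
    for (size_t i = 0; i < n; ++i) in[i] = (uint8_t)(i * 7u);

    process_fullframe(in, a, b, out_full, n);
    process_blockwise(in, out_block, n);
    return memcmp(out_full, out_block, n) == 0;
}
```

Calling `blockwise_matches_fullframe(256)` checks that the tiled path produces the same pixels as the full-frame path. The sketch uses point-wise kernels on purpose: kernels with a neighborhood (filters, for instance) would need overlapping blocks, which is not shown here.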

    The benefit is faster processing, because the data is already in L2 SRAM and the kernels avoid DDR access latency, as well as lower overall DDR bandwidth compared to running all 3 kernels as full-frame processing to and from DDR.
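To make the bandwidth argument concrete, here is a small counting sketch. It assumes (this is not stated in the thread) that every kernel has exactly one input image and one output image of the same size, and that nothing is cached between kernels in the full-frame case. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Full-frame processing: each kernel reads one frame from DDR and writes one back. */
uint64_t ddr_bytes_fullframe(uint64_t frame_bytes, unsigned num_kernels)
{
    assert(num_kernels > 0);
    return 2u * (uint64_t)num_kernels * frame_bytes;
}

/* Block-based processing: the frame is read once and the final result written once;
   intermediates stay in L2 scratch memory. */
uint64_t ddr_bytes_blockbased(uint64_t frame_bytes)
{
    return 2u * frame_bytes;
}
```

For a hypothetical 1280x720 8-bit image (921,600 bytes) and 3 kernels, the full-frame count is 5,529,600 bytes against 1,843,200 bytes for the block-based flow, a 3x reduction in DDR traffic under these assumptions.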

    The link you have given is from the VXLIB documentation, which indicates which kernels already have BAM plugins available to be submitted in callbacks to the BAM framework.

    In the TIOVX documentation, you can see the TIOVX target kernel functions that are used to create BAM graphs within a target OpenVX kernel on C66.

    Fernando Endo said:
    Is it somehow related or similar to the User Kernel Tiling extension in OpenVX

    It is achieving a similar goal, but it is not using the Khronos Tiling extension.

    NOTE: As of the PSDRA 6.02 release last week, the TIOVX and VXLIB components support BAM; however, the DMAUTILS component, which BAM depends on, has not been fully ported to TDA4x (it works on TDA2/3x if you want to prototype). The port is planned, but since this functionality is considered an optimization, it has been given lower priority than other functional features. Please let me know if you have further interest and, if you would like to use this feature, when you would need it from a scheduling point of view.

  • Hello Jesse,

    Thanks for your help and insightful explanations.

    Jesse Villarreal said:
    Please let me know if you have further interest, and if you would like to use this feature, at what point will it be needed from a scheduling point of view.

    It depends on how fast computation can be with BAM. Do you have any pointers to speedups achieved on TDA2/3 compared to the non-BAM kernels?

    Kind regards,

    Fernando

  • In general, it depends on the kernel. If the kernel is heavily bottlenecked by memory I/O latency, I have seen more than a 2x improvement in performance. If the kernel is heavily compute-limited, then you may NOT see a big speedup.
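One way to reason about this dependence is a deliberately crude timing model. It is an assumption for illustration only, not a TI formula and not a measurement: without BAM the kernel time is taken as compute time plus DDR stall time, and with BAM the DMA is assumed to overlap perfectly with compute, so the time is whichever of the two is larger. All function names and numbers are hypothetical.

```c
#include <assert.h>

/* Toy model: without BAM, DDR stalls add directly to compute time. */
double time_without_bam(double compute_ms, double ddr_stall_ms)
{
    return compute_ms + ddr_stall_ms;
}

/* Toy model: with BAM, DMA overlaps compute, so the slower of the two dominates. */
double time_with_bam(double compute_ms, double dma_ms)
{
    return compute_ms > dma_ms ? compute_ms : dma_ms;
}

/* Ratio of the two modeled times; values above 1.0 mean BAM is faster in this model. */
double estimated_speedup(double compute_ms, double ddr_stall_ms, double dma_ms)
{
    double t_bam = time_with_bam(compute_ms, dma_ms);
    assert(t_bam > 0.0);
    return time_without_bam(compute_ms, ddr_stall_ms) / t_bam;
}
```

With made-up inputs, a memory-bound kernel (4 ms compute, 6 ms DDR stalls, 3 ms DMA) gives `estimated_speedup(4.0, 6.0, 3.0)` of 2.5, while a compute-bound kernel (9 ms compute, 1 ms stalls, 0.5 ms DMA) gives about 1.11. Real numbers depend on the kernel, block size, and DMA setup and have to be measured.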

     

    Regards,

    Jesse