A8 ARM OMX vlpb component in c6xtest doesn't get handle

Hello TI experts,

As suggested in http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/717/t/154371.aspx#560197, I tried to modify the c6xtest example to use VLPB on the A8. c6xtest doesn't get a handle. It doesn't return an error either; instead it hangs here:

  eError = OMX_GetHandle (&pAppData->pVlpbHandle,
                          (OMX_STRING) "OMX.TI.A8.VLPB",
                          pAppData->vlpbILComp,
                          &pAppData->pCb);

Am I missing something?

Thanks,
Gabi 
  • This may sound contradictory to what was posted on the thread you cited, but in the present EZSDK it's not possible to support OMX components running on the A8. The reason is that the OMX 'base' class library is not released in source form in the EZSDK, and you would need the source code to port it to Linux.

    We can look at supporting A8-based OMX components in the future. Since you have access to the data buffer pointers (via the EmptyThisBuffer and FillThisBuffer calls) that the other OMX components are manipulating, you can use them directly for your purposes. Is there any specific reason for looking at OMX components on the A8?

    Archith

  • Hi Archith,

    Thanks for your answer. We are building a video enhancement product, implemented mostly on the DSP, so the DSP is overloaded. The TI8168 also has the A8, which has considerable computational power and is busy only with OMX management, so we could use it to take some of the computational load off the DSP. We are sure that other TI8168 users have similar needs.

    Regarding the workaround you suggested in place of an A8 OMX component: suppose we have the following OMX chain, capture->DSP->display, and we need the chain capture->DSP->A8->display. Is your suggestion to break the first chain, leave two components with loose edges (DSP and display), and write a thread that handles FillThisBuffer and EmptyThisBuffer for the DSP, performs the computation on the A8, and handles EmptyThisBuffer for the display?

    Thanks,
    Gabi 

  • Gabi Gvili said:
    Regarding the workaround you suggested in place of an A8 OMX component: suppose we have the following OMX chain, capture->DSP->display, and we need the chain capture->DSP->A8->display. Is your suggestion to break the first chain, leave two components with loose edges (DSP and display), and write a thread that handles FillThisBuffer and EmptyThisBuffer for the DSP, performs the computation on the A8, and handles EmptyThisBuffer for the display?

    That's correct.
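
A rough sketch of the bridging step described above, assuming the application keeps two queues: one filled from the DSP component's output and one drained toward the display. Everything here is hypothetical and host-runnable rather than EZSDK code: `Frame` stands in for an OMX buffer header, `a8_process` for the real A8 algorithm, and the comments mark where the OMX calls would presumably go.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_DEPTH 8

/* Hypothetical stand-in for an OMX buffer header: data pointer plus filled length. */
typedef struct {
    uint8_t *data;
    size_t   len;
} Frame;

/* Fixed-size FIFO of frame pointers. Not thread-safe: real code would need a
 * lock around it, because OMX callbacks typically run on other threads. */
typedef struct {
    Frame *slots[QUEUE_DEPTH];
    size_t head;   /* index of the oldest entry */
    size_t count;  /* number of queued entries */
} FrameQueue;

/* Append a frame; returns 0 on success, -1 if the queue is full. */
int queue_push(FrameQueue *q, Frame *f)
{
    assert(q != NULL && f != NULL);
    if (q->count == QUEUE_DEPTH)
        return -1;
    q->slots[(q->head + q->count) % QUEUE_DEPTH] = f;
    q->count++;
    return 0;
}

/* Remove and return the oldest frame, or NULL if the queue is empty. */
Frame *queue_pop(FrameQueue *q)
{
    assert(q != NULL);
    if (q->count == 0)
        return NULL;
    Frame *f = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return f;
}

/* Placeholder for the A8-side algorithm: inverts every byte in place. */
void a8_process(Frame *f)
{
    for (size_t i = 0; i < f->len; i++)
        f->data[i] = (uint8_t)(255 - f->data[i]);
}

/* One iteration of the bridging thread. Returns 1 if a frame was moved,
 * 0 if there was nothing to do or the display side had no room. */
int bridge_one_frame(FrameQueue *from_dsp, FrameQueue *to_display)
{
    if (to_display->count == QUEUE_DEPTH)
        return 0;                       /* keep the frame queued until there is room */
    Frame *f = queue_pop(from_dsp);     /* buffer the DSP component has filled */
    if (f == NULL)
        return 0;
    a8_process(f);                      /* computation on the A8 */
    (void)queue_push(to_display, f);    /* real code: OMX_EmptyThisBuffer on the display here */
    return 1;
}
```

A real bridging thread would call `bridge_one_frame` in a loop, block while it returns 0, and hand each buffer back to the DSP component with FillThisBuffer once the display has released it.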

    One issue I see with the current EZSDK is that Shared Region #2, from which data buffers are allocated, is not cached on the A8. So if you do buffer manipulations on the A8, performance will not be good. The next EZSDK release should expose this as a configuration option. Also, once the cache is enabled, you would have to do cache invalidation operations on the A8 in your application.

    Archith

  • Hi Archith,

    Thanks for your answer. Is Shared Region #2 cached on the DSP? I see very poor performance on the DSP (even after using -O3) and I suspect that might be the reason. Can TI please also add an A8 OMX component in the next release? When will the next release be available?

    Thanks,
    Gabi 

  • By default, SR2 is not cached on the DSP. That would be a reason for the poor performance you are seeing. You can enable caching by adding this line of code to your main() function on the DSP:

    Cache_setMar (MEMCFG_SRBASE2, MEMCFG_SRSIZE2, Cache_Mar_ENABLE);

    where MEMCFG_SRBASE2 is the start address of SR2 as seen by the DSP and MEMCFG_SRSIZE2 is the size of SR2.
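
As a side note on granularity: Cache_setMar programs the DSP's MAR registers, and the sketch below assumes each MAR register controls a 16 MB window, as on C64x+/C674x devices. The helper names are hypothetical, and the base and size in the usage note are example values only (the frame-buffer segment quoted later in this thread); substitute the SR2 base and size from your own memory map.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one MAR register per 2^24 bytes (16 MB) of address space. */
#define MAR_WINDOW_SHIFT 24u

/* Index of the MAR register covering the first byte of the region. */
uint32_t mar_first(uint32_t base)
{
    return base >> MAR_WINDOW_SHIFT;
}

/* Index of the MAR register covering the last byte of the region. */
uint32_t mar_last(uint32_t base, uint32_t size)
{
    assert(size > 0u);
    assert(size - 1u <= UINT32_MAX - base);   /* region must not wrap around */
    return (base + (size - 1u)) >> MAR_WINDOW_SHIFT;
}
```

With a base of 0xB3D00000 and a size of 0x0BC00000, `mar_first` gives 179 and `mar_last` gives 191. Because that base is not 16 MB aligned, the first window would also make 0xB3000000 to 0xB3CFFFFF cacheable, so it is worth checking what else lives in that range.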

    Archith

  • Also, A8-based OMX components are not planned for EZSDK. Since data pointers for all video/audio buffers are accessible on the A8, users can write their own modules for manipulating these data buffers.

    If you have a specific use case in mind, we can talk in more detail.

    Archith

  • Hi Archith,

    First, I want to thank you for the help with making Shared Region #2 cached on the DSP; it really made a huge difference in performance.

    Second, the use case I had in mind is as follows. Suppose you want to run a computationally heavy video enhancement algorithm on the TI8168. The DSP is already near its limits, and you have a powerful ARM at your disposal that I believe is not too busy, since it runs only the Linux kernel and the OMX chain. You might want to put the A8's considerable computational power to use in your OMX chain. You say, and it is true, that all of the buffers in the OMX chain are visible to the A8, and that you just need to cut the OMX chain and create a thread that takes care of the Empty/Fill buffer calls at the loose edges and performs the computations you need. But the question is: why do it the ugly way? Why not make it compatible with the beautiful concept of OMX? All you need is an OMX component, and all components obey the same rules no matter where they physically run. You create an OMX chain that includes, among other components, one located on the A8, and the chain remains intact. Please consider this.

    Thanks,
    Gabi 

  • Hi Archith,

    We're on the new EZSDK now. How can the shared region 2 be marked as cacheable by the DSP? How about marking it as cacheable by the ARM?

    Also, I see in the VLPB source code some interesting #ifdefs relating to DSP and ARM (search for _LOCAL_CORE_a8host_); does this mean that the VLPB component can now be built for ARM?

    Thanks,
    Ralph

  • Ralph,

    As I answered in another post, SR2 can be made cacheable by changing the firmware loader (though cache coherency is not taken care of in DOMX on the A8). VLPB should be buildable on the A8.

    Regards

    Vimal

  • Okay thanks. I'll have a look at the other post you've just replied to.

    Ralph

  • Ralph,

    Please look at board-support/media-controller-utils_2_05_00_17/src/firmware_loader/memsegdef_default.c 

     

    /* Segment 3 */
      {
       1,                           /* valid */
       "IPC_SR_FRAME_BUFFERS",      /* name */
       0x0BC00000,                  /* size */
       LDR_SEGMENT_TYPE_DYNAMIC_SHARED_HEAP,        /* seg_type */
       0,                           /* flags */
       0xB3D00000,                  /* system_addr */
       0xB3D00000,                  /* slave_virtual_addr */
       1,                           /* master_core_id */
       (1 << LDR_CORE_ID_VM3) | (1 << LDR_CORE_ID_DM3) | (1 << LDR_CORE_ID_A8),     /* core_id_mask */
       (1 << LDR_CORE_ID_VM3) | (1 << LDR_CORE_ID_DM3),     /* cache_enable_mask */

     

    You can change the cache enable mask to also enable caching on the A8: (1 << LDR_CORE_ID_VM3) | (1 << LDR_CORE_ID_DM3) | (1 << LDR_CORE_ID_A8),
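
To make the mask change concrete, here is a small host-runnable sketch of how such a per-core bitmask behaves. The `CORE_ID_*` values and the helper names are made up for illustration; the real LDR_CORE_ID_* values come from the firmware loader headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical core IDs, for illustration only. */
enum { CORE_ID_DSP = 0, CORE_ID_VM3 = 1, CORE_ID_DM3 = 2, CORE_ID_A8 = 3 };

/* Bit that represents one core inside a 32-bit mask. */
uint32_t core_bit(unsigned core_id)
{
    assert(core_id < 32u);
    return (uint32_t)1u << core_id;
}

/* Nonzero if the mask enables caching for the given core. */
int cache_enabled_for(uint32_t mask, unsigned core_id)
{
    return (mask & core_bit(core_id)) != 0u;
}

/* Returns the mask with the given core's bit set; other bits are unchanged. */
uint32_t enable_cache_for(uint32_t mask, unsigned core_id)
{
    return mask | core_bit(core_id);
}
```

Starting from a mask with only the VM3 and DM3 bits set, `enable_cache_for(mask, CORE_ID_A8)` yields the equivalent of the three-way OR shown above.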

    After this you need to rebuild firmware_loader (make media-controller-utils). Install the rebuilt firmware loader in the /usr/bin folder of the filesystem. This should enable the cache for buffers on the A8 side.

    Please note that cache invalidations are not done on the A8 side in DOMX, so if the cache gets enabled the data might not look right.

    Regards

    Vimal

  • Hi Vimal,

    Vimal Jain said:

     VLPB should be buildable on A8.

    Does that mean VLPB on the A8 will get a handle now?

    Thanks,
    Gabi 

  • Well, what do you know: I checked the OMX example c6xtest with VLPB on the A8, and it does get a handle.

  • Hi Vimal,

    Now that there is a VLPB OMX component on the A8, I want to create the OMX chain capture->DEI->VLPB(A8)->Display, but I get an insufficient resources error. My guess is that the VLPB on the A8 is not allocating its memory in Shared Region 2. How can I make the A8 VLPB OMX component allocate its buffers in Shared Region 2?

    Thanks,
    Gabi 

  • How can I check if I have set SR2 cached on the DSP successfully?

  • Hi Thomas,

    I wouldn't recommend making SR2 cached on the DSP. Based on my experience, using EDMA for copying from SR2 to DSP memory works much better.

    Gabi

  • Hi Gabi,

    Thanks for the information. Is there any example code on how to perform EDMA copy?

    Thomas

  • You can use either the EDMA3 low-level driver, http://processors.wiki.ti.com/index.php/Programming_the_EDMA3_using_the_Low-Level_Driver_(LLD)
    or ECPY (part of framework_component).