How to distinguish between codec instances in callbacks

Hi,

I am using the H.264 HP encoder on the C6678 platform. I need to create multiple instances of the encoder on a single core. To do so, I need to provide software barrier and shared memory management callbacks in the IVIDMC_t structure.

The problem is that I cannot tell which encoder instance is calling the callbacks:

XDAS_Void (*swbarr) (XDAS_Int32 coreID, IVIDMC_SWBARR swbarr_id,
      XDAS_UInt32 swbarr_cnt);

XDAS_Int32 *(*shmmap) (XDAS_Int32 coreID, IVIDMC_SHMEMKEY shmem_key,
         XDAS_Int32 shmem_size, IVIDMC_SHMEM_ATTRS attr, XDAS_Int32 alignment);

XDAS_Int32 (*shmmunmap)  (XDAS_Int32 coreID,
      IVIDMC_SHMEMKEY shmem_key, XDAS_Int32 *shmem_base);

XDAS_Int32 (*shmmap_sync)(XDAS_Int32 coreID,
      IVIDMC_SHMEMKEY shmem_key, XDAS_Int32 *shmem_base);

I need some kind of encoder instance ID to be passed to the callbacks to distinguish between them. I cannot use coreID for that because of a limitation of the HP encoder: coreID is always 0 for master codecs.

Is there any good way to distinguish between codecs in callbacks?

Regards,

Andrey Lisnevich

  • Hi Andrey,

    These multicore APIs are standard APIs which are used across the codecs. 

    For the barrier API, since only one core is involved, the application does not really need an instance ID.

    The shmmap APIs are used only while creating an instance. Since instance creation is sequential, the application can maintain a mapping array of base addresses and instance IDs.

    The instance ID for the "shmmap_sync" API inside the process call can then be looked up in this mapping array.
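
    A minimal sketch of such a mapping array, assuming the application sets a "currently creating" instance ID before each sequential create call. The names g_creating_instance, shm_record, shm_find_instance and the table size are hypothetical, and the XDAS typedef is only a stand-in so the sketch compiles off-target; on the DSP the codec headers would provide it.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Stand-in for the XDAIS type (assumption, for off-target builds only). */
    typedef int32_t XDAS_Int32;

    #define MAX_SHM_REGIONS 16          /* hypothetical table size */

    typedef struct {
        XDAS_Int32 *base;               /* base address handed out by shmmap */
        XDAS_Int32  instance_id;        /* instance being created at that time */
    } ShmMapEntry;

    static ShmMapEntry g_shm_map[MAX_SHM_REGIONS];
    static size_t      g_shm_count = 0;

    /* Set by the application before each (sequential) instance creation. */
    static XDAS_Int32 g_creating_instance = -1;

    /* Call from inside the shmmap callback once the region is allocated.
       Returns 0 on success, -1 if the table is full. */
    static int shm_record(XDAS_Int32 *base)
    {
        assert(base != NULL);
        if (g_shm_count >= MAX_SHM_REGIONS) {
            return -1;
        }
        g_shm_map[g_shm_count].base = base;
        g_shm_map[g_shm_count].instance_id = g_creating_instance;
        g_shm_count++;
        return 0;
    }

    /* Call from shmmap_sync or shmmunmap: look up the instance by base address.
       Returns -1 if the address was never recorded. */
    static XDAS_Int32 shm_find_instance(const XDAS_Int32 *base)
    {
        size_t i;
        for (i = 0; i < g_shm_count; i++) {
            if (g_shm_map[i].base == base) {
                return g_shm_map[i].instance_id;
            }
        }
        return -1;
    }
    ```

    This only covers the shared memory callbacks, which receive shmem_base; the barrier callback has no such address to look up.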

    Regards

    Rama 

  • Hi Rama,

    I got your point. But it does not resolve the following situation for me:

    Imagine a situation where you need to run two multicore master HP encoders on core #0 (their slaves work on other cores). How do you distinguish between them in the swbarr callback?

    Regards,

    Andriy Lysnevych

  • Hi Andrey,

    This is interesting. We will discuss this with the team and let you know whether we plan to update the APIs with an application handle (as a parameter) containing instance-related data, so that a codec instance can call the APIs with its particular instance handle.

    As a workaround, I think you can implement wrapper functions for the ividmc APIs for each instance and assign different function pointers, so that the application can distinguish instances based on which wrapper function is called.

    Regards,

    Rama

  • Hi Rama,

    Yes, I know about this workaround. But it leads to code generation using defines (i.e. a bigger code segment), which is not easy to develop, support and debug (at least in C).

    Ideally, the meaning of coreID would change to codecID, and it would be possible to use an arbitrary value as codecID even for master codecs. That way the current structures and functions can stay unchanged.
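
    To illustrate what this define-based generation could look like, here is a rough sketch using an X-macro. The names swbarr_impl, swbarr_instN, INSTANCE_LIST and g_swbarr_table are hypothetical, and the typedefs are stand-ins for the codec headers so the sketch compiles off-target.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Stand-ins for the XDAIS / IVIDMC types (assumption, off-target only). */
    typedef int32_t  XDAS_Int32;
    typedef uint32_t XDAS_UInt32;
    typedef void     XDAS_Void;
    typedef int      IVIDMC_SWBARR;

    /* Same shape as the swbarr callback pointer in IVIDMC_t. */
    typedef XDAS_Void (*SwbarrFxn)(XDAS_Int32, IVIDMC_SWBARR, XDAS_UInt32);

    /* Only here so the sketch has an observable effect. */
    static XDAS_Int32 g_last_inst = -1;

    /* Common implementation that receives the instance number explicitly. */
    static XDAS_Void swbarr_impl(XDAS_Int32 inst, XDAS_Int32 coreID,
                                 IVIDMC_SWBARR id, XDAS_UInt32 cnt)
    {
        (void)coreID; (void)id; (void)cnt;
        assert(inst >= 0);
        g_last_inst = inst;  /* real code would wait on this instance's barrier */
    }

    /* One entry per supported instance; each entry costs one extra function. */
    #define INSTANCE_LIST(X) X(0) X(1) X(2) X(3)

    /* Generate swbarr_inst0 .. swbarr_inst3, each hard-wiring its own number. */
    #define DEFINE_SWBARR(n)                                              \
        static XDAS_Void swbarr_inst##n(XDAS_Int32 coreID,                \
                                        IVIDMC_SWBARR id, XDAS_UInt32 cnt) \
        { swbarr_impl(n, coreID, id, cnt); }
    INSTANCE_LIST(DEFINE_SWBARR)

    /* Table indexed by instance number, used when filling in IVIDMC_t. */
    #define SWBARR_ENTRY(n) swbarr_inst##n,
    static const SwbarrFxn g_swbarr_table[] = { INSTANCE_LIST(SWBARR_ENTRY) };
    ```

    Instance n would register g_swbarr_table[n] as its swbarr pointer. The maximum number of instances is fixed at compile time, and the same pattern has to be repeated for each of the four callbacks.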

    Regards,

    Andriy Lysnevych

  • Hi Andriy,

    CoreID is needed in the current H264HP encoder design because it is based on data partitioning, so this would involve a few changes in the codec.

    Can we know the real use case for multiple instances in multicore mode? You could use single-core mode for this, right?

    Regards

    Rama

  • Hi Rama,

    Imagine a situation where you have only 3 cores: #0, #1 and #2 (the other cores are fully loaded with different tasks).

    You need to run 2 encoders on these 3 cores in real time. Each encoder needs 130% of the cycles of one core to do the job in real time. In total that is 260% of the cycles of a single core, which means we can run these encoders on the 3 available cores (assuming that the multicore encoding overhead is not more than 300% - 260% = 40%).
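
    The budget above can be written as a small check. This is only a sketch of the arithmetic; the function name headroom_percent is hypothetical.

    ```c
    #include <assert.h>

    /* Headroom, in percent of one core, that remains for multicore overhead.
       A negative result means the encoders do not fit on the given cores. */
    static int headroom_percent(int cores, int encoders, int load_per_encoder)
    {
        assert(cores > 0 && encoders > 0 && load_per_encoder >= 0);
        return cores * 100 - encoders * load_per_encoder;
    }
    ```

    With 3 cores and 2 encoders at 130% each, the headroom is 3 * 100 - 2 * 130 = 40.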

    We can run:

    1st encoder on cores #0 and #1
    2nd encoder on cores #1 and #2

    Or

    1st encoder on cores #0, #1, #2

    2nd encoder on cores #0, #1, #2

    Both situations involve running multi-core encoders (master or slave) on the same cores.

    Regards,

    Andrey Lisnevich

  • Andrey,

    Thanks for the details about the use case. It is a valid use case and you can resolve the issue by just making framework changes. This approach doesn't need any changes in the codec or in the ividmc APIs.

    You can grep the MCSDK Video 2.1 code base for current_working_chnum in the SIU folder. The idea is to record the active channel number in current_working_chnum before the process call. current_working_chnum is a global, so it can also be used inside the ividMC implementation (that way, you don't have to get the information from inside the codec by modifying the ividMC APIs). The codec has no concept of a channel, and a channel runs to completion. So you can use current_working_chnum to identify the corresponding siuInst and access the appropriate shared memory or barrier memory for that channel.

    A core that participates in two channels would need two siuInst instances. Please let us know if this works for you.
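
    A rough sketch of this idea, not the actual MCSDK code: the SiuInst layout, run_channel, siu_swbarr and fake_process are hypothetical, and the typedefs are stand-ins so it compiles off-target. Only the name current_working_chnum comes from the SIU code; its type here is an assumption.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Stand-ins for the XDAIS / IVIDMC types (assumption, off-target only). */
    typedef int32_t  XDAS_Int32;
    typedef uint32_t XDAS_UInt32;
    typedef void     XDAS_Void;
    typedef int      IVIDMC_SWBARR;

    #define MAX_CHANNELS 4                       /* hypothetical limit */

    /* Hypothetical per-channel state; a real one would hold barrier memory. */
    typedef struct {
        XDAS_UInt32 barrier_hits;
    } SiuInst;

    static SiuInst g_siu_inst[MAX_CHANNELS];
    static int     current_working_chnum = -1;   /* -1: no channel active */

    /* swbarr callback with the unchanged IVIDMC signature. The channel is
       taken from the global instead of from a parameter. */
    static XDAS_Void siu_swbarr(XDAS_Int32 coreID, IVIDMC_SWBARR id,
                                XDAS_UInt32 cnt)
    {
        (void)coreID; (void)id; (void)cnt;
        assert(current_working_chnum >= 0 &&
               current_working_chnum < MAX_CHANNELS);
        g_siu_inst[current_working_chnum].barrier_hits++;
    }

    /* Stand-in for the codec process call: barrier before and after. */
    static void fake_process(void)
    {
        siu_swbarr(0, 0, 2u);
        siu_swbarr(0, 1, 2u);
    }

    /* Framework side: record the channel, then run it to completion. */
    static void run_channel(int chnum, void (*process)(void))
    {
        current_working_chnum = chnum;
        process();
        current_working_chnum = -1;
    }
    ```

    This relies on each channel running to completion on the core: if another channel could preempt between setting the global and the callback, the lookup would pick the wrong siuInst.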

    Regards,

    Vivek

  • Hi Vivek,

    I am creating a transcoding solution that has a very flexible configuration and allows the balancer to run encoders (master/slave/single core) on any core.

    When multiple encoders need to run on one core, the DSP program will start a separate task for each.

    The tasks will run simultaneously, so several "channels" can be active at the same time and the suggested approach will not work.

    Regards,

    Andrey Lisnevich

  • Andrey,

    Are you creating multiple BIOS tasks in each core (1 per channel)?

    Regards,

    Vivek

  • Vivek,

    Yes, that is what I am going to do. The tasks of the master encoders should:

    1) Wait and receive new input frame from decoder

    2) Resize (optionally)

    3) Encode (process)

    The tasks of slave encoders will wait for master and call "process" simultaneously.

    Regards,

    Andrey Lisnevich

  • Andrey,

    I am not sure if you are thinking about even or uneven partitioning of the codec across cores. For example, if a single-core encoder takes 130%, you could partition the codec across two cores to use a) 65%, 65% or b) 100%, 30%.

    When the codec is partitioned across multiple cores, we have a software barrier before and after the process call, where we expect the master and slave cores to synchronize. So when the partitioning is done as 65%, 65%, the load is balanced equally. For the 100%, 30% case, however, the slave core ends up waiting 100% because it has to synchronize with the master after its frame processing is complete. It is theoretically possible to use the remaining 70% of the slave core concurrently by partnering it with another core in a different encode. This can be accomplished by the ideas you brought up: a) using notify, or b) context switching from the barrier wait to a different BIOS task, running that task until it hits a barrier, reverting to the original task, and so on. However, it is tricky to get this to work, because it could force the master cores to stall at the barrier.

    Would the following approach work? Let's say a single encode takes 130% of a core and you want to run 3 instances on 4 cores. Can you partition each encode equally into 4 slices and load balance so that each core does about 32.5% per channel? That way, all 3 channels can be encoded. The downside is that you'll have 4 slices instead of 2. The upside is that it is trivial to implement, each core runs to completion for a given channel, there is no need to change the ividmc APIs, and you can use the current_working_chnum idea described in the previous post. Please think about this option and let us know.
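
    As a quick check of the even split, here is a sketch of the arithmetic. It ignores multicore overhead, and the function name per_core_load_tenths is hypothetical.

    ```c
    #include <assert.h>

    /* Load on each core in tenths of a percent when every encode is split
       evenly across all cores. Integer math avoids floating point. */
    static int per_core_load_tenths(int instances, int load_percent, int cores)
    {
        assert(instances > 0 && load_percent >= 0 && cores > 0);
        return instances * load_percent * 10 / cores;
    }
    ```

    One 130% encode split over 4 cores gives 325 tenths (32.5%) per core, and three such encodes give 975 tenths (97.5%) per core, which leaves only 2.5% per core for overhead.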

    Regards,

    Vivek

  • Hi Vivek,

    In this discussion I am not concerned with partitioning at all (except when giving real-life use cases).

    The main problem is this: if two or more encoders were initialized with the same callbacks and run concurrently in tasks, I can't distinguish between them (the SYS/BIOS task scheduler can switch from one task to another at any time).

    But I must be able to distinguish between them, because each transcoder should work with its own barriers and shared memory.

    The only solution I see without changing the API is static code generation of callback functions for each encoder/algorithm. But it is not a perfect solution: it involves overhead in development, support, debugging, code size, etc.

    Regards,

    Andrey Lisnevich

  • Andrey,

    Like you suggested, something like below should work...

    1. Enhance barrier API to add ch_id..

    XDAS_Void siuVidMc_Swbarr(XDAS_Int32 ch_id, XDAS_Int32 core_id, IVIDMC_SWBARR sync_point, XDAS_UInt32 swbarr_cnt)

    2. Define Swbarr_chXX APIs as follows to register with the codec callbacks.

    XDAS_Void siuVidMc_Swbarr_ch1(XDAS_Int32 core_id, IVIDMC_SWBARR sync_point, XDAS_UInt32 swbarr_cnt)
    {
        XDAS_Int32 ch_id = 1;
        siuVidMc_Swbarr(ch_id, core_id, sync_point, swbarr_cnt);
    }

    XDAS_Void siuVidMc_Swbarr_ch2(XDAS_Int32 core_id, IVIDMC_SWBARR sync_point, XDAS_UInt32 swbarr_cnt)
    {
        XDAS_Int32 ch_id = 2;
        siuVidMc_Swbarr(ch_id, core_id, sync_point, swbarr_cnt);
    }

    Would this unblock you from making progress? We use the ividmc APIs across multiple codecs (H264 HP Decoder, H264 HP Encoder, MPEG2 Encoder etc). So, in order to change the API, we would need a re-release of all the dependent codecs, testing, and then another release of the MCSDK_Video application, which is a major effort.

    Regards,

    Vivek

  • Hi Vivek,

    Yes. I am using a very similar approach now. It unblocks further development.

    I understand that an API change is rather complex and requires additional effort. I will wait for a new release of the codecs and MCSDK Video with the ability to distinguish between encoders in callbacks. It will make my DSP program better.

    Regards,

    Andrey Lisnevich