H.264 HP encoder for C6678 cache coherence



Hi,

I am using the latest H.264 HP encoder for the C6678 platform for multi-core encoding, and I am trying to implement correct cache coherence for the encoder.

As far as I can see, the samples from the encoder and from MCSDK video use Cache_wbInvAll() to do the job. This is not quite correct, because it affects the cache of all encoders, decoders, and algorithms running concurrently on the same core. I am therefore implementing the cache management recommendations from this page: http://processors.wiki.ti.com/index.php/Cache_Management

Master encoder algorithm:

Cache_wbInv for input buffer

Cache_inv for output buffer

swbarr call to wait for slave cores

process call

Slave encoder algorithm:

swbarr to wait for master core

shmmap_sync with IVIDMC_SHMEM_KEY_CLEAN

Cache_inv for input buffer

Cache_inv for output buffer

process call

I noticed that if I do not call shmmap_sync(encoderId, IVIDMC_SHMEM_KEY_CLEAN, NULL) before the "process" call on the slave cores, the resulting video stream has artifacts: the image trembles (except for the first, top slice, which AFAIK is processed by the master encoder).

Questions:

1) Which items should be written back/invalidated before and after the process call on the master and slave cores?

2) Is it normal to call shmmap_sync on slave cores before process? It looks like a workaround, so I believe something is incorrect either in my code or inside the encoder.

shmmap_sync implementation:

static XDAS_Int32 shmmap_syncMultiProc(XDAS_Int32 coreID, IVIDMC_SHMEMKEY shmem_key, XDAS_Int32 *shmem_base)
{
    /* Serialize cache maintenance on the shared regions across cores. */
    IArg gateKey = GateMP_enter(sharedMemoryGate);

    if (shmem_key == IVIDMC_SHMEM_KEY_CLEAN) {
        /* KEY_CLEAN: write back and invalidate every shared region. */
        int key;
        for (key = IVIDMC_SHMEMKEY_FIRST; key < IVIDMC_SHMEM_NUM_KEYS; key++) {
            Cache_wbInv(sharedMemoryBasesMultiProc[key], sharedMemorySizesMultiProc[key], Cache_Type_ALL, TRUE);
        }
    } else {
        /* Any other key: write back and invalidate only that region. */
        Cache_wbInv(shmem_base, sharedMemorySizesMultiProc[shmem_key], Cache_Type_ALL, TRUE);
    }

    GateMP_leave(sharedMemoryGate, gateKey);

    return 0;
}

Regards,

Andrey Lisnevich

  • Hi Andrey,

    If you just invalidate (no write-back) all the keys on the slave cores before process, it should be fine.

    We will check whether shmemsync is mandatory for slave cores and get back to you.

    Regards

    Rama
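
The invalidate-only variant suggested above could look roughly like the sketch below. It is a host-side illustration, not the encoder's code: `NUM_KEYS`, `ShmRegion`, `shm_regions`, `stub_cache_inv` and `shm_invalidate_all` are hypothetical names, and the stub stands in for the SYS/BIOS call `Cache_inv(base, size, Cache_Type_ALL, TRUE)` so that the loop can be compiled and checked off target.

```c
#include <stddef.h>

#define NUM_KEYS 4  /* assumed count; the real code uses IVIDMC_SHMEM_NUM_KEYS */

/* Hypothetical region table, mirroring sharedMemoryBasesMultiProc and
 * sharedMemorySizesMultiProc from the original post. */
typedef struct {
    void  *base;
    size_t size;
} ShmRegion;

static ShmRegion shm_regions[NUM_KEYS];
static int stub_inv_calls;  /* number of times the stub was invoked */

/* Host-side stub. On target this would be
 * Cache_inv(base, size, Cache_Type_ALL, TRUE) from ti/sysbios/hal/Cache.h. */
static void stub_cache_inv(void *base, size_t size)
{
    (void)base;
    (void)size;
    stub_inv_calls++;
}

/* Invalidate (no write-back) every allocated shared region.
 * Returns the number of regions that were invalidated. */
static int shm_invalidate_all(void)
{
    int key;
    int done = 0;

    for (key = 0; key < NUM_KEYS; key++) {
        if (shm_regions[key].base == NULL || shm_regions[key].size == 0) {
            continue;  /* skip keys that were never allocated */
        }
        stub_cache_inv(shm_regions[key].base, shm_regions[key].size);
        done++;
    }
    return done;
}
```

Note that an invalidate without write-back discards any dirty lines in the range, so this variant only fits a core that has not modified those regions since its last write-back.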

  • Hi Andrey,

    We have checked this in the code. Below are the things you need to take care of for every process call.

    1) WbInv the input and output buffers.

    2) WbInv all the memtabs for the instance (this is because some of the regions are accessed directly from DDR by all CPUs).

    3) Sync all shared memories (this is because the slave cores read a few of the shared parameters updated by the master at the start of frame processing, before shmemsync).

    Regards

    Rama

  • Hi Rama,

    As far as I understand, the output buffer should just be invalidated. Write-back is not necessary since it does not contain any input data.

    What do you mean by "sync all shared memories"? Is it an invalidate, or a write-back followed by an invalidate?

    Also, do you have similar information for the multi-core encoding case, for master and slave? For example, as far as I understand, write-back of the input buffer should be done only on the cores that filled it with YUV data (i.e. modified it). Other cores should just invalidate.

    Regards,

    Andriy Lysnevych

  • Hi Andriy,

    Yes, just invalidating the output buffer is sufficient.

    For shared memory sync, it is write-back followed by invalidate.

    For the multicore scenario we also need to follow the 3 steps mentioned in the earlier message. But for input data, if EDMA is used to copy the input data, we can just invalidate on all the cores (including the master).

    Regards

    Rama
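
As a compact summary of the answers in this thread, the operations to issue before each process call can be written down as a small lookup. This is only a sketch of the advice as stated above, not TI reference code: `CacheOp`, `CachePlan` and `plan_before_process` are hypothetical names, and on target each entry would map to `Cache_inv` or `Cache_wbInv` on the corresponding buffer.

```c
#include <stdbool.h>

/* The two cache operations discussed in the thread. */
typedef enum {
    OP_INV,    /* invalidate only */
    OP_WBINV   /* write-back followed by invalidate */
} CacheOp;

/* Operation to apply to each kind of memory before a process call. */
typedef struct {
    CacheOp input;    /* input (YUV) buffer */
    CacheOp output;   /* output (bitstream) buffer */
    CacheOp memtabs;  /* all memtabs of the encoder instance */
    CacheOp shared;   /* all shared memory regions (shmem sync) */
} CachePlan;

/* Build the plan for one core, master or slave alike.
 * input_copied_by_edma: true if EDMA, not the CPU, filled the input buffer. */
static CachePlan plan_before_process(bool input_copied_by_edma)
{
    CachePlan plan;

    /* Input: WbInv by default; invalidate is enough when EDMA wrote it. */
    plan.input = input_copied_by_edma ? OP_INV : OP_WBINV;
    /* Output: holds no input data, so invalidate is sufficient. */
    plan.output = OP_INV;
    /* Memtabs: some regions are accessed directly from DDR by all CPUs. */
    plan.memtabs = OP_WBINV;
    /* Shared memories: sync means write-back followed by invalidate. */
    plan.shared = OP_WBINV;

    return plan;
}
```

The same plan applies to every core because the last reply states that the three steps also hold in the multicore scenario; the EDMA flag is the only case distinction given.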