IPC HeapBufMP_free() performance

Hi,

I found a performance problem in my application which is caused by HeapBufMP_free().

I'm using a HeapBufMP instance for image buffer management on 8 cores. The problem is that HeapBufMP_free() invalidates the cache for the released data buffer inside a GateMP_enter()/GateMP_leave() block. This can take a relatively long time, depending on the size of the buffer. In my application, HeapBufMP_free() gets called simultaneously on all cores. Every core except the first caller then spins in GateMP_enter(), waiting for the previous core's cache operation to complete. In other words: only one core at a time can invalidate the cache of its data buffer.

In my opinion it should be OK to invalidate the cache of the data block before entering the gate.

I changed the code in HeapBufMP.c as follows and it seems to work:

Void ti_sdo_ipc_heaps_HeapBufMP_free(ti_sdo_ipc_heaps_HeapBufMP_Object *obj, 
        Ptr block, SizeT size)
{
    IArg key;

    Assert_isTrue(((UInt32)block >= (UInt32)obj->buf) &&
        ((UInt32)block < ((UInt32)obj->buf + obj->blockSize * obj->numBlocks)),
        ti_sdo_ipc_heaps_HeapBufMP_A_invBlockFreed);
    
    /* Assert that 'block' is block-aligned */
    Assert_isTrue((UInt32)block % obj->align == 0,
            ti_sdo_ipc_heaps_HeapBufMP_A_badAlignment);

#if 1 // new version
    /*
     *  Invalidate entire block to make sure stale cache data isn't
     *  evicted later. Done before entering the gate so that cores
     *  do not serialize on the cache operation.
     */
    if (obj->cacheEnabled) {
        Cache_inv(block, obj->attrs->blockSize, Cache_Type_ALL, TRUE);
    }
#endif

    /* Enter the gate */
    key = GateMP_enter((GateMP_Handle)obj->gate);

    ListMP_putTail((ListMP_Handle)obj->freeList, block);

    if (ti_sdo_ipc_heaps_HeapBufMP_trackAllocs) {
        /* Make sure the attrs are not in cache */
        if (obj->cacheEnabled) {
            Cache_inv(obj->attrs, sizeof(ti_sdo_ipc_heaps_HeapBufMP_Attrs),
                    Cache_Type_ALL, TRUE);
        }

        obj->attrs->numFreeBlocks++;

        /* Make sure the attrs are written out to memory */
        if (obj->cacheEnabled) {
            Cache_wbInv(obj->attrs, sizeof(ti_sdo_ipc_heaps_HeapBufMP_Attrs),
                    Cache_Type_ALL, TRUE);
        }
    }
    
#if 0 // original version
    /*
     *  Invalidate entire block to make sure stale cache data isn't
     *  evicted later
     */
    if (obj->cacheEnabled) {
        Cache_inv(block, obj->attrs->blockSize, Cache_Type_ALL, TRUE);
    }
#endif

    /* Leave the gate */
    GateMP_leave((GateMP_Handle)obj->gate, key);
}

HeapMemMP and maybe other modules seem to have the same behaviour.
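
The effect of the reordering can be illustrated on a host PC with a minimal sketch. Nothing here is the IPC API: a pthread mutex stands in for the GateMP, a counter stands in for the ListMP free list, and fake_cache_inv() is a hypothetical placeholder for Cache_inv() that only touches the caller's own block. All names are made up for illustration. It records how many workers are inside the slow step at once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NUM_WORKERS 8             /* stands in for the 8 cores */
#define BLOCK_SIZE  (64 * 1024)   /* assumed block size, illustration only */

/* Hypothetical stand-ins: a mutex for GateMP, a counter for the free list. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;
static int free_count;            /* protected by gate */
static atomic_int in_flight;      /* workers currently in the slow step */
static atomic_int max_in_flight;  /* highest overlap observed so far */

/* Placeholder for Cache_inv(): slow work on the caller's block only. */
static void fake_cache_inv(unsigned char *block, size_t size)
{
    int now = atomic_fetch_add(&in_flight, 1) + 1;
    int seen = atomic_load(&max_in_flight);
    while (now > seen &&
           !atomic_compare_exchange_weak(&max_in_flight, &seen, now)) {
        /* a failed exchange reloads 'seen'; retry until the max is recorded */
    }
    memset(block, 0, size);
    atomic_fetch_sub(&in_flight, 1);
}

/* Original ordering: the slow step runs while the gate is held. */
static void free_inv_inside_gate(unsigned char *block)
{
    pthread_mutex_lock(&gate);
    free_count++;
    fake_cache_inv(block, BLOCK_SIZE);
    pthread_mutex_unlock(&gate);
}

/* Proposed ordering: slow step first, gate held only for bookkeeping. */
static void free_inv_before_gate(unsigned char *block)
{
    fake_cache_inv(block, BLOCK_SIZE);
    pthread_mutex_lock(&gate);
    free_count++;
    pthread_mutex_unlock(&gate);
}

typedef void (*free_fn)(unsigned char *);
struct job { free_fn fn; unsigned char *block; };

static void *worker(void *arg)
{
    struct job *j = arg;
    j->fn(j->block);
    return NULL;
}

/* One free per worker; returns the highest overlap seen in the slow step. */
static int run_frees(free_fn fn)
{
    static unsigned char pool[NUM_WORKERS][BLOCK_SIZE];
    pthread_t tid[NUM_WORKERS];
    struct job jobs[NUM_WORKERS];

    free_count = 0;
    atomic_store(&in_flight, 0);
    atomic_store(&max_in_flight, 0);
    for (int i = 0; i < NUM_WORKERS; i++) {
        jobs[i].fn = fn;
        jobs[i].block = pool[i];
        int rc = pthread_create(&tid[i], NULL, worker, &jobs[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(tid[i], NULL);
    }
    return atomic_load(&max_in_flight);
}
```

Calling run_frees(free_inv_inside_gate) always reports an overlap of 1, because the gate serializes the slow step. run_frees(free_inv_before_gate) may report more than 1, depending on scheduling, while the free count still ends up at NUM_WORKERS in both cases. Build with -pthread. The sketch says nothing about whether the reordering is safe with respect to the real cache hardware; it only shows the change in gate hold time.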

Ralf

  • Ralf,

    I looked into this a bit, and we don’t see any obvious reason why the cache invalidates need to be limited to one core at a time. I just filed an enhancement request (SDOCM00096671) to get this looked at in detail and, if there are indeed no issues, to make this change in a future release.

    Thank you for reporting and suggesting this!

    Scott