Codec Engine Upgrade

Brandy Jabkiewicz

Hello,

I have been having problems with my codec engine "hanging" on occasion (version 2_25_05_16). I read another post about this that said it is a bug and it was fixed in version 2_26_02_11. (http://e2e.ti.com/support/embedded/f/354/p/64877/233972.aspx?PageIndex=2)

It looks like my code locks up in the same location. However, I have not modified any of the DSP code. I am currently working only with a DMAI application (video_encode_io1) that I have modified to target only my platform (instead of the generic one) and to run a maximum of 10 separate threads. For instance, the 10 threads each encode 300 frames. I see the lockup with version 2.25.05.16 after running the application several times.

Following the above post's advice, I am trying to upgrade to 2.26.02.11 in my dvsdk. I assumed that all I needed to do was update my "rules.make", and also the rules.make in the DMAI app, to point to the new CE. I then rebuilt my server and my application, moved them to the DaVinci, and executed. Now the DaVinci locks up every time, even with only one thread running!

Did I not integrate it right, or did I misunderstand the post? Please let me know what else you might need to help me debug this problem. I am wondering if it might be a memory mapping problem. However, I have left everything the way it arrived with the dvsdk, except that I changed my bootargs to give Linux 76M (so that loadmodules.sh would run).

bootargs=mem=76M console=ttyS0,115200n8 root=/dev/nfs nfsroot=172.17.1.152:/shares/davinci,nolock,tcp rw noinitrd ip=172.17.1.55:172.17.1.152:172.17.1.1:255.255.255.0::eth0:off vpif_display.ch2_numbuffers=0 vpif_display.ch3_numbuffer=0

I lost the log with the error from 2.25.05.16, but I am trying to reproduce it. Here is the debug log for version 2.26.02.11:

7028.bench_data_NewCE_1.txt

Thanks!

Brandy

over 15 years ago

0 Robert Tivy over 15 years ago

TI__Mastermind 18260 points

Brandy,

In getting the fix for the VIDANALYTICS bug (by upgrading to CE 2.26.02.11), you are also inheriting some new behaviour in CE 2.26.02.11 - there is no longer any automatic "registration" of buffers in the translation cache when calling Memory_getBufferPhysicalAddress(). Here are some details regarding that change:
    Headline:
        SDOCM00076172 - CE's OSAL API Memory_getBufferPhysicalAddress() should not add buffer to its physical/virtual address translation cache
    Details:
        A side-effect of Codec Engine's OSAL API Memory_getBufferPhysicalAddress()'s implementation is that when successful, it adds the passed buffer to an internal list of physical/virtual address translations. This list is then consulted for future calls to that API, as well as calls to the OSAL API Memory_getBufferVirtualAddress().

This translation list is also used when allocating/freeing contiguous buffers (through CMEM) with Memory_contigAlloc() (or Memory_alloc() with Memory_CONTIG{HEAP/POOL}). Such usage can be considered "controlled", since the buffer is explicitly allocated by the Memory module, whereas adding to this list through Memory_getBufferPhysicalAddress() is "uncontrolled" since presumably the buffer was allocated outside the Memory module (and therefore the Memory module won't know about the buffer and its kernel memory mapping going away).

This approach leads to potentially stale entries, since there is no "automatic" removal from the translation cache to go along with the automatic addition of Memory_getBufferPhysicalAddress(). Problems can result when the virt/phys mapping goes away (i.e., is destroyed in the Linux kernel) and the destroyed virtual address gets reused for a different mapping, yet the old "cached" entry remains in the CE Memory module.
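The stale-entry hazard described above can be illustrated with a toy model. This is only a sketch, not Codec Engine code: `toy_cache_add`, `toy_cache_lookup`, the table layout, and the slot count are hypothetical names and values made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a virtual->physical translation cache.
 * Hypothetical; this is NOT the Codec Engine implementation. */
#define TOY_CACHE_SLOTS 4

typedef struct {
    uint32_t virt;  /* base virtual address */
    uint32_t size;  /* length in bytes */
    uint32_t phys;  /* base physical address */
    int      used;  /* non-zero when the slot holds an entry */
} ToyEntry;

static ToyEntry toy_cache[TOY_CACHE_SLOTS];

/* Add a translation. Returns 0 on success, -1 if the table is full. */
int toy_cache_add(uint32_t virt, uint32_t size, uint32_t phys)
{
    assert(size > 0);
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (!toy_cache[i].used) {
            toy_cache[i] = (ToyEntry){ virt, size, phys, 1 };
            return 0;
        }
    }
    return -1;
}

/* Translate a virtual address using the first entry that covers it.
 * Returns 0 when no entry covers the address. */
uint32_t toy_cache_lookup(uint32_t addr)
{
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        const ToyEntry *e = &toy_cache[i];
        if (e->used && addr >= e->virt && addr - e->virt < e->size) {
            return e->phys + (addr - e->virt);
        }
    }
    return 0;
}
```

Suppose virtual address 0x40000000 is cached against physical 0x87000000, the kernel mapping is then destroyed, and the same virtual address is later reused for a different physical buffer. Since nothing removed the first entry, `toy_cache_lookup(0x40000000)` in this model still answers 0x87000000, which is the kind of wrong translation the release note is warning about.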

Another aspect of the current implementation is that there is a limit to the number of these translations that can exist simultaneously (an implementation-specific design detail), leading to the eviction of entries when the limit is reached. When Memory_getBufferPhysicalAddress() caches translations for entries not created by a Memory allocation API, this list can quickly reach its limit.

By removing the automatic ("uncontrolled") addition from Memory_getBufferPhysicalAddress() we can reduce the potential for the problems stated above, but unfortunately other problems arise. One of these "other problems" is related to the fact that users "expect" this behaviour, and by removing it we place a new requirement on the use of non-Memory-module-allocated buffers - users must call Memory_registerContigBuf()/Memory_unregisterContigBuf() for buffers given to CE but not allocated by CE's Memory module.

Prior to this change you were getting memory leaks during each run, which would result in the application crashing after a few runs. With this change you are getting translation errors right away. To correct those translation errors you need to call Memory_registerContigBuf() for buffers that are passed to the codec but not allocated through CMEM. See this blog entry: http://e2e.ti.com/blogs_/b/codec_engine/archive/2010/11/03/finer-details-physically-contiguous-memory.aspx

Regards,

- Rob

0 Brandy Jabkiewicz over 15 years ago in reply to Robert Tivy

Mastermind 6325 points

Rob,

Thanks. May I confirm my understanding?

Since I am using DMAI features, in this code, Buffer_create was using Memory_getBufferPhysicalAddress().

hInBuf = Buffer_create(Dmai_roundUp(inBufSize, BUFSIZEALIGN), (Buffer_Attrs *)&gfxAttrs);

This is no longer working as the user expects and I need to add a line in the DMAI code to register the Buffer. For example:

Buffer_Handle Buffer_create(Int32 size, Buffer_Attrs *attrs)
{
    Buffer_Handle hBuf;
    UInt32 objSize;

    if (attrs == NULL) {
        Dmai_err0("Must provide attrs\n");
        return NULL;
    }

    if (attrs->type != Buffer_Type_BASIC &&
        attrs->type != Buffer_Type_GRAPHICS) {

        Dmai_err1("Unknown Buffer type (%d)\n", attrs->type);
        return NULL;
    }

    objSize = attrs->type == Buffer_Type_GRAPHICS ? sizeof(_BufferGfx_Object) :
                                                    sizeof(_Buffer_Object);

    hBuf = (Buffer_Handle) calloc(1, objSize);

    if (hBuf == NULL) {
        Dmai_err0("Failed to allocate space for Buffer Object\n");
        return NULL;
    }

    _Buffer_init(hBuf, size, attrs);

    if (!attrs->reference) {
        hBuf->userPtr = (Int8*)Memory_alloc(size, &attrs->memParams);

        if (hBuf->userPtr == NULL) {
            printf("Failed to allocate memory.\n");
            free(hBuf);
            return NULL;
        }

        hBuf->physPtr = Memory_getBufferPhysicalAddress(hBuf->userPtr, 4, NULL);

        /* Proposed addition: explicitly register the buffer with CE */
        Memory_registerContigBuf(hBuf->userPtr, 4, hBuf->physPtr);

        Dmai_dbg3("Alloc Buffer of size %u at 0x%x (0x%x phys)\n",
                  (Uns) size, (Uns) hBuf->userPtr, (Uns) hBuf->physPtr);
    }

    hBuf->reference = attrs->reference;

    return hBuf;
}

So, another question: why is the size 4 and not the variable "size"? Or would it remain 4 in Memory_getBufferPhysicalAddress() and become size in the register call?

Int Buffer_delete(Buffer_Handle hBuf)
{
    Int ret = Dmai_EOK;

    if (hBuf) {
        if (!hBuf->reference && hBuf->userPtr) {

            /* Proposed addition: unregister with the size the buffer was created with */
            Memory_unregisterContigBuf(hBuf->userPtr, hBuf->origState.numBytes);

            Dmai_dbg3("Free Buffer of size %u at 0x%x (0x%x phys)\n",
                      (Uns) hBuf->origState.numBytes,
                      (Uns) hBuf->userPtr,
                      (Uns) hBuf->physPtr);

            if (!Memory_free(hBuf->userPtr, hBuf->origState.numBytes,
                             &hBuf->memParams)) {
                ret = Dmai_EFAIL;
            }
        }

        free(hBuf);
    }

    return ret;
}

Which brings me to another question - is there an updated DMAI I could integrate as well?

Thanks,

Brandy

0 Robert Tivy over 15 years ago in reply to Brandy Jabkiewicz

TI__Mastermind 18260 points

Modifying Buffer_create() with the addition of Memory_registerContigBuf() is an option. Another option is for your code to do the register after calling Buffer_create(), using the hBuf->userPtr & hBuf->physPtr members of the returned Buffer_Handle, along with the 'size' param to Buffer_create() (or, I see that Buffer_delete() uses hBuf->origState.numBytes for the size, which seems fine).

Brandy Jabkiewicz said:

hBuf->physPtr = Memory_getBufferPhysicalAddress(hBuf->userPtr, 4, NULL);

Memory_registerContigBuf(hBuf->userPtr, 4, hBuf->physPtr);

So, another question: why is the size 4 and not the variable "size"? Or would it remain 4 in Memory_getBufferPhysicalAddress() and become size in the register call?

The use of 4 for the size in Memory_getBufferPhysicalAddress() may be an artifact from some previous implementation. While it is not clean to use 4 instead of the actual size, it worked in the past, since the address returned by Memory_getBufferPhysicalAddress() is just the base address of the "buffer" and the base address doesn't change when a different size is passed. One could argue that the size parameter isn't even needed for Memory_getBufferPhysicalAddress(), and normally it wouldn't be, but the CE 2.25 implementation of Memory_getBufferPhysicalAddress() performed that auto-registration that we've discussed, and the size is needed for registration. In fact, with that auto-register in 2.25, using a fake size of 4 can lead to problems if the actual buffer size is used in a previous or later registration since two entries would be registered - one with baseAddr & size 4 and a different entry with baseAddr & actual size.

The correct implementation is to use the actual size in both Memory_getBufferPhysicalAddress() and Memory_registerContigBuf()/Memory_unregisterContigBuf(). One can get away with using just 4 for the Memory_getBufferPhysicalAddress(), but it's important that the correct size be used with Memory_registerContigBuf()/Memory_unregisterContigBuf(), so that an address somewhere in the middle of the buffer can be correctly translated. Just for the purpose of this discussion ... you might get away with using size 4 in Memory_registerContigBuf() if only the base buffer address is ever translated, but it's vital that you pass the same size to Memory_unregisterContigBuf(), else it would not match up to the registered buffer.
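The effect of the size argument can be sketched with a small stand-alone model. This is an illustration under assumptions, not the Codec Engine implementation: `model_register`, `model_unregister`, and `model_translate` are hypothetical stand-ins, and the model assumes that an unregister only matches an entry whose base address and size are both identical, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-alone model of explicit buffer registration.
 * Hypothetical names; NOT the Codec Engine Memory module. */
#define MODEL_SLOTS 8

typedef struct {
    uint32_t virt;
    uint32_t size;
    uint32_t phys;
    int      used;
} ModelReg;

static ModelReg model_regs[MODEL_SLOTS];

/* Record a virt/size/phys triple. Returns 0 on success, -1 if full. */
int model_register(uint32_t virt, uint32_t size, uint32_t phys)
{
    assert(size > 0);
    for (int i = 0; i < MODEL_SLOTS; i++) {
        if (!model_regs[i].used) {
            model_regs[i] = (ModelReg){ virt, size, phys, 1 };
            return 0;
        }
    }
    return -1;
}

/* Remove the entry whose base address AND size both match.
 * Returns 0 on success, -1 if no such entry exists. */
int model_unregister(uint32_t virt, uint32_t size)
{
    for (int i = 0; i < MODEL_SLOTS; i++) {
        if (model_regs[i].used &&
            model_regs[i].virt == virt && model_regs[i].size == size) {
            model_regs[i].used = 0;
            return 0;
        }
    }
    return -1;
}

/* Translate any address inside a registered buffer.
 * Returns 0 when the address is not covered (translation failure). */
uint32_t model_translate(uint32_t addr)
{
    for (int i = 0; i < MODEL_SLOTS; i++) {
        const ModelReg *r = &model_regs[i];
        if (r->used && addr >= r->virt && addr - r->virt < r->size) {
            return r->phys + (addr - r->virt);
        }
    }
    return 0;
}
```

In this model, registering base 0x40000000 with size 4 lets the base address translate, but an address 0x100 bytes into the buffer fails, and unregistering with the real size does not find the entry. Registering with the actual size makes both the base and mid-buffer addresses translate.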

Brandy Jabkiewicz said:
Which brings me to another question - is there an updated DMAI I could integrate as well?

I don't know about DMAI updates, but I'll try to poke around.

Regards,

- Rob

0 Brandy Jabkiewicz over 15 years ago in reply to Robert Tivy

Mastermind 6325 points

Hi Rob,

It still didn't seem to fix the situation. It looks like I am still not registering the buffers. When I look at the application code, it looks correct. When I look a bit deeper, it looks like the DMAI translates my simple buffer to a buffer_description object and then that gets passed to the encoder. Do you think this causes a change such that my registered virtual address is not being used? Or do you think this means that the DMAI maybe has other hidden buffers that are not registered?

Here is my log with the new CE and the registered buffers.

0624.NewCE_debug_2_withBufReg.txt

Also, if I change '4' to size, the code will not work at all - if I leave it at 4, it will work once (maybe) before it hangs.

So here is a summary of the behavior:

- if I do not register the buffers - CE always hangs

- if I register the buffers and change 4 to size in getBuffPhysAdd - CE always hangs

- if I register the buffers and leave 4 in the getBuffPhysAdd - CE might run once, then it will hang

What do you think? Any word on an updated DMAI or perhaps should I move this thread over to a DMAI topic? Is there a DMAI topic?

I really appreciate your help and promptness!

Here is the code that I added:

Buffer_Handle Buffer_create(Int32 size, Buffer_Attrs *attrs)
{
    ..... Code removed to save space ....

    if (!attrs->reference) {
        hBuf->userPtr = (Int8*)Memory_alloc(size, &attrs->memParams);

        if (hBuf->userPtr == NULL) {
            printf("Failed to allocate memory.\n");
            free(hBuf);
            return NULL;
        }

        hBuf->physPtr = Memory_getBufferPhysicalAddress(hBuf->userPtr, 4, NULL);
        //Added by BJabkiewicz: TI instructed that CE 2.26.2.11 requires buffers to be explicitly registered/unregistered.
        printf("Registering Buffer \n");
        Memory_registerContigBuf((Uint32)hBuf->userPtr, size, hBuf->physPtr);

        Dmai_dbg3("Alloc Buffer of size %u at 0x%x (0x%x phys)\n",
                  (Uns) size, (Uns) hBuf->userPtr, (Uns) hBuf->physPtr);
    }

    hBuf->reference = attrs->reference;

    return hBuf;
}

Int Buffer_delete(Buffer_Handle hBuf)
{
    Int ret = Dmai_EOK;

    if (hBuf) {
        if (!hBuf->reference && hBuf->userPtr) {
            //Added by BJabkiewicz: TI instructed that CE 2.26.2.11 requires buffers to be explicitly registered/unregistered.
            Memory_unregisterContigBuf((Uns)hBuf->userPtr, hBuf->origState.numBytes);

            Dmai_dbg3("Free Buffer of size %u at 0x%x (0x%x phys)\n",
                      (Uns) hBuf->origState.numBytes,
                      (Uns) hBuf->userPtr,
                      (Uns) hBuf->physPtr);

            if (!Memory_free(hBuf->userPtr, hBuf->origState.numBytes,
                             &hBuf->memParams)) {
                ret = Dmai_EFAIL;
            }
        }

        free(hBuf);
    }

    return ret;
}

This is the function call that eventually makes it to the codec engine: it takes my registered buffer and adds it to a IVIDEO1_BufDescIn to send to the encoder.

Int Venc1_process(Venc1_Handle hVe, Buffer_Handle hInBuf, Buffer_Handle hOutBuf)
{
    IVIDEO1_BufDescIn       inBufDesc;
    XDM_BufDesc             outBufDesc;
    XDAS_Int32              outBufSizeArray[1];
    XDAS_Int32              status;
    VIDENC1_InArgs          inArgs;
    VIDENC1_OutArgs         outArgs;
    XDAS_Int8              *inPtr;
    XDAS_Int8              *outPtr;
    BufferGfx_Dimensions    dim;
    UInt32                  offset = 0;
    Uint32                  bpp;

    assert(hVe);
    assert(hInBuf);
    assert(hOutBuf);
    assert(Buffer_getUserPtr(hInBuf));
    assert(Buffer_getUserPtr(hOutBuf));
    assert(Buffer_getSize(hInBuf));
    assert(Buffer_getSize(hOutBuf));
    assert(Buffer_getType(hInBuf) == Buffer_Type_GRAPHICS);

    bpp = ColorSpace_getBpp(BufferGfx_getColorSpace(hInBuf));

    BufferGfx_getDimensions(hInBuf, &dim);

    offset = (dim.y * dim.lineLength) + (dim.x * (bpp >> 3));
    assert(offset < Buffer_getSize(hInBuf));

    inPtr  = Buffer_getUserPtr(hInBuf) + offset;
    outPtr = Buffer_getUserPtr(hOutBuf);

    /* Set up the codec buffer dimensions */
    inBufDesc.frameWidth                = dim.width;
    inBufDesc.frameHeight               = dim.height;
    inBufDesc.framePitch                = dim.lineLength;

    /* Point to the color planes depending on color space format */
    if (BufferGfx_getColorSpace(hInBuf) == ColorSpace_YUV420PSEMI) {
        inBufDesc.bufDesc[0].bufSize    = hVe->minInBufSize[0];
        inBufDesc.bufDesc[1].bufSize    = hVe->minInBufSize[1];

        inBufDesc.bufDesc[0].buf        = inPtr;
        inBufDesc.bufDesc[1].buf        = inPtr + Buffer_getSize(hInBuf) * 2/3;
        inBufDesc.numBufs               = 2;
    }
    else if (BufferGfx_getColorSpace(hInBuf) == ColorSpace_UYVY) {
        inBufDesc.bufDesc[0].bufSize    = Buffer_getSize(hInBuf);
        inBufDesc.bufDesc[0].buf        = inPtr;
        inBufDesc.numBufs               = 1;
    }
    else {
        Dmai_err0("Unsupported color format of input buffer\n");
        return Dmai_EINVAL;
    }

    outBufSizeArray[0]                  = Buffer_getSize(hOutBuf);

    outBufDesc.numBufs                  = 1;
    outBufDesc.bufs                     = &outPtr;
    outBufDesc.bufSizes                 = outBufSizeArray;

    inArgs.size                         = sizeof(VIDENC1_InArgs);
    inArgs.inputID                      = GETID(Buffer_getId(hInBuf));

    /* topFieldFirstFlag is hardcoded. Used only for interlaced content */
    inArgs.topFieldFirstFlag            = 1;

    outArgs.size                        = sizeof(VIDENC1_OutArgs);

    /* Encode video buffer */
    status = VIDENC1_process(hVe->hEncode, &inBufDesc, &outBufDesc, &inArgs,
                             &outArgs);

.....Code removed to save space ....
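The pointer arithmetic in the listing above matters here because the addresses handed to the codec are offsets into the buffer, not necessarily its base. A minimal stand-alone sketch of the same arithmetic follows; `pixel_byte_offset` and `chroma_plane_offset` are hypothetical helper names, and the example numbers (a 16 bits-per-pixel UYVY line of 1440 bytes, a 720x480 semi-planar 4:2:0 frame) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) in a packed frame, mirroring
 * offset = (dim.y * dim.lineLength) + (dim.x * (bpp >> 3)).
 * Hypothetical helper; assumes bpp is a whole number of bytes. */
uint32_t pixel_byte_offset(uint32_t x, uint32_t y,
                           uint32_t line_length, uint32_t bpp)
{
    assert(bpp % 8 == 0);
    return y * line_length + x * (bpp >> 3);
}

/* Offset of the chroma plane in a semi-planar 4:2:0 buffer, mirroring
 * inPtr + Buffer_getSize(hInBuf) * 2/3 from the listing. */
uint32_t chroma_plane_offset(uint32_t buf_size)
{
    return buf_size * 2 / 3;
}
```

With these assumed numbers, pixel (16, 2) sits 2 * 1440 + 16 * 2 = 2912 bytes into the buffer, and for a 518400-byte 720x480 4:2:0 buffer the chroma plane starts at byte 345600. Both are mid-buffer addresses, so they can only be translated if the buffer was registered with its full size.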

0 Robert Tivy over 15 years ago in reply to Brandy Jabkiewicz

TI__Mastermind 18260 points

Brandy Jabkiewicz said:
It still didn't seem to fix the situation. It looks like I am still not registering the buffers. When I look at the application code, it looks correct. When I look a bit deeper, it looks like the DMAI translates my simple buffer to a buffer_description object and then that gets passed to the encoder. Do you think this causes a change such that my registered virtual address is not being used? Or do you think this means that the DMAI maybe has other hidden buffers that are not registered?

Brandy,

I don't think the DMAI "repackaging" is causing any issues, nor does DMAI have other hidden buffers that are not registered.

First, from your CE_DEBUG log I can see that the buffers allocated by Buffer_create() are indeed CMEM buffers, which means that they are already getting registered during the allocation. Your explicit call to Memory_registerContigBuf() then becomes a no-op. To be more precise, the call to Memory_registerContigBuf() should be qualified with:
    if (params->type != Memory_CONTIGPOOL && params->type != Memory_CONTIGHEAP) {
        Memory_registerContigBuf(...);
    }
Given this, I don't see how there would be a problem when you change the size parameter of Memory_getBufferPhysicalAddress() from the fake 4 to the actual size.

Since you report that it works "better" with the explicit registration and the fake size of 4, that would lead me to believe that Buffer_create() is getting called for non-CMEM buffers and that the addition of explicit registration does help, but I'm not ready to say that. I believe that there's just a general problem (see below) that is exhibiting itself in different ways depending on apparently unrelated changes.

One thing that I zeroed in on is the address reported for the translation error - 0x8ba13140 (for one). The CE_DEBUG log shows that this address is in the DSP server's DDRALGHEAP memory section, which spans from 0x8ba00000 -> 0x8fa00000 (which we can see from the DSPLink config dump). This means that the codec is returning this codec-internally-allocated buffer, which is not good for the app to receive since it knows nothing of this buffer and there is no application mapping for it. I asked my supervisor about this and he says there have been "bad" codecs in the past that have done this, but he says your CE_DEBUG-based "package" prints indicate that you're using a stock codec that's not too old, but who knows.
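The range check behind that observation can be sketched as follows. The two boundary addresses and the failing address are the ones quoted above; the `AddrRange` type, the `addr_in_range` helper, and the choice to treat the upper bound as exclusive are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). Hypothetical helper type. */
typedef struct {
    uint32_t start;
    uint32_t end;
} AddrRange;

/* DDRALGHEAP span as read from the DSPLink config dump in this thread. */
static const AddrRange DDRALGHEAP_SPAN = { 0x8ba00000u, 0x8fa00000u };

/* True when addr lies inside the range. */
bool addr_in_range(uint32_t addr, AddrRange r)
{
    assert(r.start <= r.end);
    return addr >= r.start && addr < r.end;
}
```

Calling `addr_in_range(0x8ba13140u, DDRALGHEAP_SPAN)` returns true, which is what identifies the address in the translation error as a codec-internal buffer rather than one the application allocated.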

There is a Codecs Forum, and at this point I feel this problem needs to be brought over there (maybe a post just pointing to this reply, so they can see my analysis to this point).

The CE_DEBUG log from your initial post to this thread also showed the same DDRALGHEAP addresses being passed back to the ARM, but since I was focused on the change to Memory_getBufferPhysicalAddress() between CE 2.25 and CE 2.26.02.11 I didn't notice the fact that the ARM side was receiving a buffer that it didn't know about (and can't handle).

Regards,

- Rob

0 Brandy Jabkiewicz over 15 years ago in reply to Robert Tivy

Mastermind 6325 points

Hi Rob,

Thanks for all your help. I guess I did not realize this was in the Linux-specific forum! Good work anyhow!

I moved the question to here if you want to follow it and/or add any input:

http://e2e.ti.com/support/embedded/f/356/p/88287/305336.aspx#305336

Hopefully this is the right spot.

Is there anything else you can think of that I can do to be helpful? I am very weak on the subject of memory mapping and right now, the memory is mapped about the same as when the board came out of the box.

Brandy
