OMAP3530 Dsplink based application MMU Error

Hi, I'm a beginner DSP developer and I'm currently working on some basic filtering applications using the BeagleBoard. I've managed to build and install DSPLink, DSP/BIOS, Codec Engine, and LPM, and I've verified that all the basic example programs work. I'm now modifying the LOOP example: instead of having the DSP loop data back to the GPP, I'm trying to have it receive a buffer of data from the GPP, convolve the buffer with itself, and return the convolution result to the GPP. The main lines of code I've added are:

#define MAX_SIZE 256

#define ARRAY_SIZE 128

Short buffLocal[MAX_SIZE];

Short buffLocal2[MAX_SIZE];

memset(buffLocal, 0, sizeof(short)*MAX_SIZE);

memset(buffLocal2, 0, sizeof(short)*MAX_SIZE);

 

For some reason, if MAX_SIZE is larger than 202, the program crashes after the GPP sends the data to the DSP and is waiting to reclaim the data/channel from the DSP. The message I get is

"DSP MMU Error Fault! MMU_IRQSTATUS = [0x10]"

However, this error is not always logged in my serial output window, so I haven't been able to write down the interrupt number. I've been reading about DSPLink/DSP memory allocation, but I can't seem to wrap my head around what I need to do to make sure that I have enough memory; I'm assuming that the DSP is running out of memory. I followed these steps for installing and setting up DSPLink:

http://ossie.wireless.vt.edu/trac/wiki/BeagleBoard_CodecEngine

The website indicates that by setting mem=80M I have 48M for DSPLINK.

 

I know that the channel construct uses Char8, so I'm selecting the channel size as 2048 and making sure that MAX_SIZE*sizeof(short) < 2048, where sizeof(short) = 2 bytes. Also, MAX_SIZE is the allocated size of the arrays and ARRAY_SIZE is the number of elements actually stored in them. The convolution code is also simple enough:

convolution(buffLocal2, ARRAY_SIZE, buffLocal2, ARRAY_SIZE, buffLocal);

 

int convolution(bufferType* in1, int length1, bufferType* in2, int length2, bufferType* out)
{
    int i=0,j=0;
    //#pragma MUST_ITERATE(2,2)
    for (i=0; i < length1+length2-1; i++)
    {
        //#pragma MUST_ITERATE(2,2)
        for (j=0; j<length1; j++)
        {
            if (i-j >= 0)
            {
                //if (i==0)
                out[i] = out[i] + in1[j]*in2[i-j];
                //out[j]=in2[j];
                //printf("i=%d j=%d i=j=%d\n", i, j, i-j);
            }
        }
    }
    return 1;

}

Thanks. I would appreciate any help/insight I can get.
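
A side note on the convolution loop above: it only checks `i-j >= 0`, so for the larger values of `i` the index `i-j` can reach past `length2` and read elements of `in2` beyond the valid data. It also relies on `out` being zeroed by the caller. The following is a minimal sketch of a bounds-checked variant, not the poster's code; the name `convolution_checked` is hypothetical, and `bufferType` is assumed to be `short`, as stated later in the thread.

```c
#include <string.h>

typedef short bufferType;  /* assumption: matches the typedef given later in the thread */

/* Full linear convolution of in1 (length1 samples) with in2 (length2 samples).
 * out must have room for length1 + length2 - 1 elements and must not alias
 * either input. Returns 1 on success, 0 if a length is not positive. */
int convolution_checked(const bufferType *in1, int length1,
                        const bufferType *in2, int length2,
                        bufferType *out)
{
    int i, j;
    int outLength = length1 + length2 - 1;

    if (length1 <= 0 || length2 <= 0) {
        return 0;
    }

    /* Clear the accumulator here instead of relying on the caller. */
    memset(out, 0, sizeof(bufferType) * (size_t) outLength);

    for (i = 0; i < outLength; i++) {
        for (j = 0; j < length1; j++) {
            int k = i - j;                /* index into in2 */
            if (k >= 0 && k < length2) {  /* stay inside both inputs */
                out[i] = (bufferType) (out[i] + in1[j] * in2[k]);
            }
        }
    }
    return 1;
}
```

With ARRAY_SIZE = 128 on both inputs, the result has 255 elements, so an output array of MAX_SIZE = 256 elements is large enough to hold it.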

  • There are some articles about the OMAP3 DSP-side MMU configuration here:

       * http://tiexpressdsp.com/index.php/DSP_MMU_Faults

       * http://tiexpressdsp.com/index.php/OMAP3_DSP_MMU_Configuration

    Chris

  • Hi,

    It is not clear from the code you have pasted how the interaction with DSPLink changes when you add the convolution code. For example, it's not clear whether you are copying the contents of the buffer received from the GPP into your local buffer before performing the convolution on it, or whether you are copying the convolution result into the DSPLink CHNL buffer allocated from the POOL. Can you show the exact changes you made to the application to send the updated buffer back to the GPP?

    The MMU fault indicates that you're trying to access a memory address that is not mapped to the DSP. This can happen if you have some incorrect memory access on the DSP, or if the DSP is going into the weeds.

    Regards,
    Mugdha

  • Here's the code from the GPP side:

    // populate outgoing buffer with data
    if (DSP_SUCCEEDED (status)) {
        temp = LOOP_Buffers [0] ;
        for (i = 0 ; i < ARRAY_SIZE ; i++) {
            fillVar = gr_inst.rfDataInt[i];
            memcpy(temp, &fillVar, sizeof(fillVar));
            temp = temp + stepSize;
        }
    }

    convolution(gr_inst.rfDataInt, ARRAY_SIZE, gr_inst.rfDataInt, ARRAY_SIZE, gr_inst.rfDataOut);

    // The rest of the GPP code is the same as in the LOOP example, except that the verify function checks the result against the GPP convolution result.

     

    DSP Side

    All my modifications were in the loop execute function, so I've pasted the whole function here:

    Int TSKLOOP_execute(TSKLOOP_TransferInfo * info)
    {
        Int         status  = SYS_OK ;
        Char *      buffer  = info->buffers [0] ;
        Arg         arg     = 0 ;
        Uint32      i=0,k=0 ;
        Int         nmadus ;
        bufferType buffLocal[MAX_SIZE];
        bufferType buffLocal2[MAX_SIZE];
        /* Execute the loop for configured number of transfers
         * A value of 0 in numTransfers implies infinite iterations
         */

        //BCACHE_setMode(BCACHE_L1D, BCACHE_FREEZE);


        for (i = 0 ;
             (   ((info->numTransfers == 0) || (i < info->numTransfers))
              && (status == SYS_OK)) ;
             i++) {
            /* Receive a data buffer from GPP */
            status = SIO_issue(info->inputStream,
                               buffer,
                               info->bufferSize,
                               arg) ;
            if (status == SYS_OK) {
                nmadus = SIO_reclaim (info->inputStream,
                                      (Ptr *) &buffer,
                                      &arg) ;
                if (nmadus < 0) {
                    status = -nmadus ;
                    SET_FAILURE_REASON (status) ;
                }
                else {
                    info->receivedSize = nmadus ;
                }
            }
            else {
                SET_FAILURE_REASON(status);
            }
            /* Do processing on this buffer */
        
            if (status == SYS_OK) {
                memset(buffLocal, 0, sizeof(buffLocal));
                memcpy(buffLocal2, buffer, sizeof(bufferType)*ARRAY_SIZE);

                // to make sure that the convolution result is written back to external memory and is not in cache
                HAL_CACHE_WBALL;
                //HAL_CACHE_WB(buffLocal, sizeof(bufferType)*ARRAY_SIZE);
                memcpy(buffer, buffLocal, sizeof(bufferType)*ARRAY_SIZE);

                /* Add code to process the buffer here*/
            }
            else {
                SET_FAILURE_REASON(status);
            }
            /* Send the processed buffer back to GPP */
            if (status == SYS_OK) {
                status = SIO_issue(info->outputStream,
                                   buffer,
                                   info->receivedSize,
                                   arg);

                if (status == SYS_OK) {
                    nmadus = SIO_reclaim (info->outputStream,
                                          (Ptr *) &(buffer),
                                          &arg) ;
                    if (nmadus < 0) {
                        status = -nmadus ;
                        SET_FAILURE_REASON (status) ;
                    }
                }
                else {
                    SET_FAILURE_REASON (status) ;
                }
            }
        }
        return status ;
    }


  • Thanks for your reply. In my case, would I be looking at expanding the "DSPLINK (MEM)" segment? And do I need to worry about the data and program segments separately, since I'm assuming data and instructions are in two separate memory banks?

  • One more thing I forgot to mention in my code ... sorry for the multiple posts :(

    typedef short bufferType;

    I'm passing the channel size as 2048 in the same manner as the TI LOOP app

  • Hi,

    I suspect a stack overrun. Can you try making the local buffer arrays global instead of placing them on the stack, or increasing the task stack size when you create the task?

    The buffer sizes you are using are small enough that you should not need to increase the DSPLINKMEM size. If you did need to increase it, you would have seen failures from either POOL_open or PROC_attach:

    http://wiki.davincidsp.com/index.php/DSPLink_POOL_FAQs

    http://wiki.davincidsp.com/index.php/Changing_DSPLink_Memory_Map

    Regards,
    Mugdha
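
To make the stack-overrun suspicion concrete, here is a rough sketch of the arithmetic and of the "make the buffers global" fix. It assumes `bufferType` is a 2-byte `short`, as stated in the thread; the helper name `local_buffer_stack_bytes` is hypothetical, and the task's actual stack size depends on the DSP/BIOS configuration, which the thread does not state.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SIZE 256
typedef short bufferType;

/* File-scope (static) storage: these arrays are placed in a data section
 * at link time and no longer consume the task's stack. */
static bufferType buffLocal[MAX_SIZE];
static bufferType buffLocal2[MAX_SIZE];

/* Bytes the two arrays would occupy on the stack if they were declared
 * as locals inside the task function, for a given element count. */
size_t local_buffer_stack_bytes(size_t maxSize)
{
    return 2 * maxSize * sizeof(bufferType);
}

/* Clear both working buffers before each transfer is processed. */
void reset_buffers(void)
{
    memset(buffLocal, 0, sizeof(buffLocal));
    memset(buffLocal2, 0, sizeof(buffLocal2));
}
```

With 2-byte elements, MAX_SIZE = 256 puts 1024 bytes of arrays on the stack, while MAX_SIZE = 202 needs 808 bytes. A failure threshold that moves with MAX_SIZE is consistent with the locals overflowing a fixed-size task stack.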

  • Mugdhak, I finally got around to testing your suggestion of making the buffers global, and that fixed it ... you're a genius, thanks!!! Out of curiosity, how can I increase the task stack size? I'm assuming it's done by modifying the POOL_alloc call?

     

    POOL_alloc( SAMPLE_POOL_ID, (Ptr*)transInfo->buffer[i], transInfo->bufferSize );

     

    Would that be by changing "transInfo->bufferSize" into "transInfo->bufferSize+BufferSize1+BufferSize2"? Thanks again, you're a life saver :)

  • Hi,

    No, this is the DSP/BIOS TSK I'm talking about. That's created (in the loop sample) in file main.c, main ().

        tskLoopTask = TSK_create(tskLoop, NULL, 0);

    If you want to increase the stack size, you should pass a non-NULL attributes structure and set the stack size in it. Check the DSP/BIOS documentation for the TSK module for the correct parameters to use.

    Regards,
    Mugdha
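
As a sketch of what the last reply describes, the fragment below passes a non-NULL attributes structure to `TSK_create`. It assumes the DSP/BIOS 5.x TSK API as I recall it from the API reference (a `TSK_Attrs` structure with a `stacksize` field, initialized from the default `TSK_ATTRS`); check the names against the documentation for your DSP/BIOS version. The value 4096 and the macro `LOOP_TASK_STACK_SIZE` are illustrative and do not come from the thread. This is target-side code that builds only against the DSP/BIOS headers.

```c
#include <std.h>
#include <tsk.h>

/* Illustrative value only; size it to cover the task's local variables
 * plus call overhead. */
#define LOOP_TASK_STACK_SIZE 4096

/* Inside main () in main.c, replacing the original TSK_create call: */
TSK_Attrs attrs = TSK_ATTRS;              /* start from the configured defaults */
attrs.stacksize = LOOP_TASK_STACK_SIZE;   /* assumed field name; see lead-in */

tskLoopTask = TSK_create(tskLoop, &attrs, 0);
if (tskLoopTask == NULL) {
    /* Task creation failed, for example because the heap could not
     * supply a stack of the requested size. */
}
```

The stack for a dynamically created task is allocated from a DSP/BIOS memory segment, not from the DSPLink POOL, which is why changing the `POOL_alloc` size has no effect on it.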