Watchdog reset cause

I suspect the code below is causing a watchdog reset on our board, but so far I haven't been able to identify the root cause. Is there an unhandled condition in the interrupt handler that could leave the interrupt stuck? I'm also wondering whether there are any recommended methods for tracking this problem down using DSP/BIOS v5.41.

Any help or pointers would be greatly appreciated.

void HWI_uartIRQ(void)
{
    Uint16 value;
    Uint16 event;

    /* read the interrupt identification register */
    UART_FSET(URLCR, DLAB, 0);
    value = UART_RGET(URIIR);
    while ((value & 0x01) == 0)
    {
        /* interrupt pending */
        event = (value & 0x0e) >> 1;
        switch (event)
        {
            case UART_URIIR_IID_RCLINESTATUS:
                // read the line status register and record statistics
                value = UART_RGET(URLSR);
                uartStats.lsr++;
                if (value & 0x0002)
                {
                    // count the overrun error, cleared by reading LSR
                    uartStats.oe++;
                }
                if (value & 0x0004)
                {
                    // discard character with parity error
                    UART_RGET(URRBR);
                    uartStats.pe++;
                }
                if (value & 0x0008)
                {
                    // discard character with framing error
                    UART_RGET(URRBR);
                    uartStats.fe++;
                }
                if (value & 0x0010)
                {
                    // discard character with break indicator
                    UART_RGET(URRBR);
                    uartStats.bi++;
                }
                if (value & 0x0080)
                {
                    // discard character with receiver fifo error indicated
                    UART_RGET(URRBR);
                    uartStats.rfier++;
                }
                break;
            case UART_URIIR_IID_CHARTIMEOUT:
                // receiver time-out interrupt
                uartStats.rxTimeout++;

                // fall through

            case UART_URIIR_IID_RCDATA:
                // receiver data ready
                while( UART_FGET(URLSR, DR) )
                {
                    /* read data */
                    *pWrite++ = UART_RGET(URRBR);
                    uartStats.dr++;

                    /* wrap pointer */
                    if (pWrite >= rxBuffer + UART_BUFSIZE)
                    {
                        pWrite = rxBuffer;
                    }
                }
                /* notify application */
                SEM_postBinary(uartRxSem);
                break;
            case UART_URIIR_IID_TXEMPTY:
                // transmitter holding register empty
                uartStats.thre++;
                break;
            default:
                uartStats.unknown++;
                break;
        }
        value = UART_RGET(URIIR);   

    }

    return;
}

/*
   @details Initializes the on-board UART */
Uint16 uartInit (void)
{
    Uint16 tempbaud;

    /* allocate resources */
    if ((rxBuffer = MEM_alloc(sdram_heap, UART_BUFSIZE, 4)) == NULL)
    {
        LOG_0 (LOG_ERROR, "ERROR: failed to allocate rxBuffer");
        return(AP_FAIL);
    }
    pWrite = pRead = rxBuffer;
    if ((uartRxSem = SEM_create(0, NULL)) == NULL)
    {
        LOG_0 (LOG_ERROR, "ERROR: failed to allocate uartRxSem");
        return(AP_FAIL);
    }

    /* initialize stats */
    memset(&uartStats, 0, sizeof(UartGetStatsRsp));

    /* Set BSR to disable SP2MODE */
    CHIP_FSET(XBSR, SP2MODE, 0);

    /* disable all UART events */
    UART_FSET(URLCR, DLAB, 0);
    UART_RSET(URIER, UART_NOINT);

    /* reset and enable FIFO */
    UART_RSET(URFCR, 0x7);
    UART_RSET(URFCR, UART_FIFO_DMA0_TRIG14);

    /* set DLL and DLM to values appropriate for the required baudrate */
    tempbaud = (Uint16)((Uint32)UART_CLK_INPUT_100 * (Uint32)UART_BAUD_115200 >> 6);
    UART_FSET(URLCR, DLAB, 1);
    UART_RSET(URDLL, (tempbaud & 0xff));
    UART_RSET(URDLM, (tempbaud >> 8));
    UART_FSET(URLCR, DLAB, 0);

    /* setup word size, stop bits and parity */
    UART_RSET(URLCR, UART_WORD8 | UART_STOP1 | UART_DISABLE_PARITY);

    /* disable loopback */
    UART_RSET(URMCR, UART_NO_LOOPBACK);

    /* enable interrupts */
    IRQ_enable(IRQ_EVT_UART);
    UART_RSET(URIER, 0x5);

    /* take UART out of reset */
    UART_FSET(URPECR, URST, 1);

    return (AP_SUCCESS);
}
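
For reference, the decoding step at the top of the handler (test bit 0 for "pending", then mask with 0x0e and shift right by one) can be exercised on its own. The sketch below is host-side C, not the CSL code. The `IID_*` values are an assumption based on the usual 16550-style register layout and may not match the CSL's `UART_URIIR_IID_*` constants, so check them against the device documentation.

```c
#include <stdbool.h>

/* Assumed 16550-style interrupt IDs (bits 3:1 of the IIR, already shifted).
 * Hypothetical names; verify against the CSL header for the real device. */
enum {
    IID_MODEM_STATUS = 0,
    IID_TX_EMPTY     = 1,
    IID_RX_DATA      = 2,
    IID_LINE_STATUS  = 3,
    IID_CHAR_TIMEOUT = 6
};

/* Bit 0 of the IIR is 0 while an interrupt is pending. */
static bool iir_pending(unsigned iir)
{
    return (iir & 0x01u) == 0u;
}

/* Same expression the handler uses: (value & 0x0e) >> 1. */
static unsigned iir_event(unsigned iir)
{
    return (iir & 0x0eu) >> 1;
}
```

Under that assumed layout, any ID that reaches the handler's `default` case (for example a modem status event) is only counted, not cleared, so it is worth confirming that such sources stay disabled in URIER.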

 

  • The first thing I notice is that you do not use the 'interrupt' keyword or #pragma INTERRUPT on your ISR. That will probably freeze your program soon after the first interrupt, or otherwise cause a crash.

  • The configuration tool is being used with the "Use Dispatcher" option selected, so the interrupt keyword is not required AFAIK. I'm not seeing any crashing behavior in any case. I also implemented a throttle for this interrupt so that only a fixed number of events per interval are processed, and the watchdog reset was still seen. In addition, after reproducing the problem, breaking in the watchdog interrupt handler and inspecting the UART stats, the counts look normal, so it doesn't look like the problem is with this code.

    The watchdog reset was also seen after raising the priority of the task that services the watchdog timer to the highest possible. In another test, with the watchdog timer serviced at the software interrupt level, the reset did not occur (at least in our testing so far), so it looks like something at the software interrupt level is keeping the task from servicing the watchdog in time.
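
The throttle mentioned above is not shown in the thread. A minimal sketch of the idea, assuming a fixed per-interval budget that some periodic function refills, might look like the following. All names here (`UartThrottle`, `UART_EVENT_BUDGET`, `service_events`) are hypothetical, and the hardware loop is replaced by a simple counter so the logic can be run on a host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical budget: maximum UART events handled per throttle interval. */
#define UART_EVENT_BUDGET 32u

typedef struct {
    uint16_t remaining;  /* events still allowed in the current interval */
    uint32_t exhausted;  /* times the handler had to stop early */
} UartThrottle;

/* Called once per interval (for example from a periodic function). */
static void throttle_refill(UartThrottle *t)
{
    t->remaining = UART_EVENT_BUDGET;
}

/* Returns true if one more event may be processed in this interval. */
static bool throttle_take(UartThrottle *t)
{
    if (t->remaining == 0u) {
        t->exhausted++;
        return false;
    }
    t->remaining--;
    return true;
}

/* Stand-in for the ISR loop: 'pending' models how many events the
 * hardware would report. Returns how many were actually handled. */
static uint16_t service_events(UartThrottle *t, uint16_t pending)
{
    uint16_t handled = 0u;

    while (pending > 0u && throttle_take(t)) {
        pending--;
        handled++;
    }
    return handled;
}
```

On real hardware the handler would presumably also have to mask the UART interrupt when the budget runs out and unmask it on refill, since an event that is still pending may otherwise re-trigger the interrupt immediately.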

  • It's quite possible that the root cause of the watchdog reset is heap corruption. This application uses a 32K heap located in SDRAM, which is where the 512-byte circular buffer (rxBuffer in the code above) is allocated. Are there tools available to help track down this kind of issue?
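
One generic way to test the heap-corruption theory, independent of any DSP/BIOS tooling, is to surround the buffer with guard bytes and check them periodically. The sketch below is only an illustration of that technique: the `GuardedBuf` type, the guard length and the fill pattern are assumptions, and it uses a plain struct rather than the `MEM_alloc` call from the original code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE   512u   /* matches the 512-byte rxBuffer in the thread */
#define GUARD_LEN  8u     /* hypothetical guard size */
#define GUARD_BYTE 0xA5u  /* hypothetical fill pattern */

typedef struct {
    unsigned char front[GUARD_LEN];  /* guard before the buffer */
    unsigned char data[BUF_SIZE];    /* the circular buffer itself */
    unsigned char back[GUARD_LEN];   /* guard after the buffer */
} GuardedBuf;

/* Fill both guards with the pattern and clear the data area. */
static void guarded_init(GuardedBuf *g)
{
    memset(g->front, GUARD_BYTE, GUARD_LEN);
    memset(g->data, 0, BUF_SIZE);
    memset(g->back, GUARD_BYTE, GUARD_LEN);
}

/* True if every byte of one guard still holds the pattern. */
static bool guard_intact(const unsigned char *guard)
{
    size_t i;

    for (i = 0; i < GUARD_LEN; i++) {
        if (guard[i] != GUARD_BYTE) {
            return false;
        }
    }
    return true;
}

/* True if neither guard has been overwritten. */
static bool guarded_check(const GuardedBuf *g)
{
    return guard_intact(g->front) && guard_intact(g->back);
}
```

Calling `guarded_check` from a low-priority thread, or setting a breakpoint on its failure path, narrows down when a neighbouring write first lands outside the buffer. It does not identify the writer by itself.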

  • In the end this turned out to be related to the real-time analysis (RTA) tools running in the idle thread. The idle thread was always running when the problem occurred, and removing the RTA tools resolved it.