EDMA Interrupts stopped

Other Parts Discussed in Thread: TMS320C6711D

Design background:

A Master DSP (TMS320C6711D) is configured to collect data from two slave DSPs (TMS320C6711D) over McBSP at a 2.17 MHz clock rate. The Master DSP generates the clock and frame sync for the communication. EDMA is linked to McBSP receive.

EDMA Interrupt code is as below...

-----------------------------------------------------------------------------------------------------------------------

interrupt void edma_ISR()
{

    /* change the ping pong buffer by checking address of buffer in use. take the alternate one */
    if (*((volatile unsigned int *) (EVENTD_PARAMS + DST)) <
                                            (unsigned int) &mcbsp_rcv_buffer[PONG])
        available_rcv_buf = 1;
    else
        available_rcv_buf = 0;
   
    /* reset the interrupt */
    *(volatile unsigned int *)(CIPR) = 0x00002000;

    /* process the data*/   

    get_source_data();

    mcbsp0_rcv_complete = 1;
}

----------------------------------------------------------------------------------------------

EDMA Configuration is as below..............

void edma_init()
{

    available_rcv_buf = PONG;
   
    /* EDMA MAIN CHANNEL for McBSP0 receive (PING); the source is McBSP0_DRR */
    *((unsigned int *) (EVENTD_PARAMS + OPT)) = 0x303D0002;
    *((unsigned int *) (EVENTD_PARAMS + SRC)) = McBSP0_DRR;
    *((unsigned int *) (EVENTD_PARAMS + CNT)) = 30;
    *((unsigned int *) (EVENTD_PARAMS + DST)) = (unsigned int)&mcbsp_rcv_buffer[PING];
    *((unsigned int *) (EVENTD_PARAMS + IDX)) = 0;
    *((unsigned int *) (EVENTD_PARAMS + LNK)) = (EVENTN_PARAMS & 0xffff) ;

    /* EDMA RELOAD CHANNEL for Mcbsp receive (PONG) */
    *((unsigned int *) (EVENTN_PARAMS + OPT)) = 0x303D0002;
    *((unsigned int *) (EVENTN_PARAMS + SRC)) = McBSP0_DRR;
    *((unsigned int *) (EVENTN_PARAMS + CNT)) = 30;
    *((unsigned int *) (EVENTN_PARAMS + DST)) = (unsigned int) &mcbsp_rcv_buffer[PONG];
    *((unsigned int *) (EVENTN_PARAMS + IDX)) = 0;
    *((unsigned int *) (EVENTN_PARAMS + LNK)) = (EVENTO_PARAMS & 0xffff) ;

    /* EDMA RELOAD CHANNEL for Mcbsp receive (PING) */
    *((unsigned int *) (EVENTO_PARAMS + OPT)) = 0x303D0002;
    *((unsigned int *) (EVENTO_PARAMS + SRC)) = McBSP0_DRR;
    *((unsigned int *) (EVENTO_PARAMS + CNT)) = 30;
    *((unsigned int *) (EVENTO_PARAMS + DST)) = (unsigned int) &mcbsp_rcv_buffer[PING];
    *((unsigned int *) (EVENTO_PARAMS + IDX)) = 0;
    *((unsigned int *) (EVENTO_PARAMS + LNK)) = (EVENTN_PARAMS & 0xffff) ;

    *((unsigned int *) EER) = 0x00002000;        /* enable event 13 only (bit 13), McBSP0 Rcv */
    *((unsigned int *) CIER)= 0x00002000;        /* enable interrupt 13 only (bit 13) */

                     
}
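
For readers decoding the 0x303D0002 value written to each OPT word above, here is a small host-side sketch. The field positions are an assumption based on my reading of the C621x/C671x EDMA options layout and should be checked against the EDMA reference guide; the struct and function names are made up for illustration and are not from the original code.

```c
#include <stdint.h>

/* Hypothetical helper type. Field positions ASSUME the C621x/C671x
 * EDMA OPT layout; verify against the EDMA reference guide. */
typedef struct {
    unsigned pri;    /* bits 31:29, transfer priority                     */
    unsigned esize;  /* bits 28:27, 0 = 32-bit, 1 = 16-bit, 2 = 8-bit      */
    unsigned sum;    /* bits 25:24, source address update mode            */
    unsigned dum;    /* bits 22:21, destination address update mode       */
    unsigned tcint;  /* bit 20, transfer-complete interrupt enable         */
    unsigned tcc;    /* bits 19:16, transfer-complete code (CIPR bit)      */
    unsigned link;   /* bit 1, reload parameters from the LNK address      */
    unsigned fs;     /* bit 0, frame synchronization                       */
} edma_opt_fields;

/* Split a raw OPT word into its fields. */
static edma_opt_fields edma_decode_opt(uint32_t opt)
{
    edma_opt_fields f;
    f.pri   = (opt >> 29) & 0x7u;
    f.esize = (opt >> 27) & 0x3u;
    f.sum   = (opt >> 24) & 0x3u;
    f.dum   = (opt >> 21) & 0x3u;
    f.tcint = (opt >> 20) & 0x1u;
    f.tcc   = (opt >> 16) & 0xFu;
    f.link  = (opt >> 1)  & 0x1u;
    f.fs    =  opt        & 0x1u;
    return f;
}
```

Under that assumed layout, 0x303D0002 decodes to a fixed source, an incrementing destination, linking enabled, TCINT set and TCC = 13, which is consistent with the 0x00002000 mask used for CIPR, EER and CIER in the posted code.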

-----------------------------------------------------------------------------------------------------

Problem Symptoms seen from field:

After the unit had run for some months, EDMA interrupts stopped occurring.

------------------------------------------------------

Question and request for clues:

Is it possible for EDMA interrupts to stop occurring in this Master DSP? If so, what could lead to it?

Any clues would be highly helpful.

  • A few that I can think of:

    - McBSP receive in the Main DSP stopped working for some reason.

    - McBSP or EDMA configuration registers got corrupted.

    - EDMA interrupt got disabled.

    - Global interrupts got disabled.

    Based on the field inputs, the Main DSP is working fine in other respects: it is communicating with an HMI over an external UART, updating its heartbeat LED, communicating with a memory-mapped 8-bit device, etc.

    Based on these inputs, the third and fourth are ruled out.

    Has anyone seen this kind of halt in EDMA communication? Any clues are welcome.
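
If a unit can ever be caught in the failed state, a register snapshot could help separate the hypotheses listed above. The sketch below works on captured values rather than touching hardware. The SPCR bit positions for RRST and RRDY (bits 0 and 1) are assumptions to be checked against the McBSP reference guide; RFULL at bit 2 is as quoted later in this thread; bit 13 is taken from the posted edma_init(). The enum and function names are hypothetical.

```c
#include <stdint.h>

#define EVT13       (1u << 13)  /* event/interrupt bit used in the posted code */
#define SPCR_RRST   (1u << 0)   /* ASSUMED: 0 = receiver held in reset         */
#define SPCR_RRDY   (1u << 1)   /* ASSUMED: DRR holds unread data              */
#define SPCR_RFULL  (1u << 2)   /* receive overrun                             */

typedef enum {
    STALL_RX_IN_RESET,            /* McBSP receiver disabled                   */
    STALL_EVENT_DISABLED,         /* EER bit 13 cleared                        */
    STALL_INT_DISABLED,           /* CIER bit 13 cleared                       */
    STALL_INT_PENDING_UNSERVICED, /* CIPR bit 13 set but ISR not running       */
    STALL_RX_OVERRUN,             /* RFULL set, nobody is reading DRR          */
    STALL_NO_RX_ACTIVITY          /* nothing pending; look at clock/frame sync */
} stall_cause;

/* Map a register snapshot to the most likely cause, checked in order. */
static stall_cause classify_stall(uint32_t spcr, uint32_t eer,
                                  uint32_t cier, uint32_t cipr)
{
    if (!(spcr & SPCR_RRST)) return STALL_RX_IN_RESET;
    if (!(eer  & EVT13))     return STALL_EVENT_DISABLED;
    if (!(cier & EVT13))     return STALL_INT_DISABLED;
    if (cipr & EVT13)        return STALL_INT_PENDING_UNSERVICED;
    if (spcr & SPCR_RFULL)   return STALL_RX_OVERRUN;
    return STALL_NO_RX_ACTIVITY;
}
```

The order of the checks is a design choice: configuration corruption is tested first because it would explain every later symptom.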

  • Bhaskar,

    So the application runs continuously for months before failing?  Do they use any form of time-stamping that may not handle register overflows properly?  Are there any McBSP errors when everything stops?

    -Tommy

  •  

    As the incident happened in the field, we were not able to diagnose anything. There is no error checking inside this master DSP. After power cycling, everything is normal.

    With regard to EDMA/McBSP, the EDMA interrupt service routine is the only code that runs after these peripherals are initialized. Other peripheral interrupts are still occurring normally.

    This is the first time we have seen this kind of issue. Many units with the same firmware have been in the field for the past 10 months.

  • Bhaskar,

    That's a tough one to debug.  If the software seems ok because other systems continue to function correctly and the hardware works ok after reset, I would have to think that some random event like a power droop or temperature spike may have occurred and contributed to data corruption.  If the hardware fails again, it would be a good candidate for failure analysis.

    -Tommy

  • I just saw this thread...

    http://linux.omap.com/pipermail/davinci-linux-open-source/2007-September/003960.html

    Content:

    ------------------------------------------------------------------------------------------------------------

    Hi Juan,
    >
    > I tracked the read stall down to a Receive Overrun (RFULL) condition on the McBSP serial bus. If you look at the Audio Serial Port (ASP) Interface data sheet (SPRUE29), there is a Serial Port Control Register (SPCR - physical address 0x01E02008) that shows the transmit and receive status of data to and from the AIC33 codec. When the read operation works properly, Bit 2 (RFULL) of the SPCR register is never set. However, when the read stall occurs on our custom bread board (audio data path from Davinci to AIC33 is longer), this bit is always set:
    >
    > Serial Port Control Register (SPCR):
    > Bit2 RFULL Receive shift register full bit.
    >
    > 0 RBR is not in overrun condition.
    > 1 DRR is not read, RBR is full, and RSR is also full with new word.
    >
    >
    > Here's the description about RFULL from SPRUE29:
    > 2.4.5.1 Receive Overrun: RFULL
    >
    > RFULL = 1 in the serial port control register (SPCR) indicates that the receiver has experienced overrun and is in an error condition. RFULL is set when the following conditions are met:
    >
    > * DRR has not been read since the last RBR-to-DRR transfer.
    > * RBR is full and an RBR-to-DRR copy has not occurred.
    > * RSR is full and an RSR-to-RBR transfer has not occurred.
    >
    > The data arriving on DR is continuously shifted into RSR (Figure 21). Once a complete element is shifted into RSR, an RSR-to-RBR transfer can occur only if an RBR-to-DRR copy is complete. Therefore, if DRR has not been read by the CPU or the EDMA controller since the last RBR-to-DRR transfer (RRDY = 1), an RBR-to-DRR copy does not take place until RRDY = 0. This prevents an RSR-to-RBR copy. New data arriving on the DR pin is shifted into RSR, and the previous contents of RSR are lost. After the receiver starts running from reset, a minimum of three elements must be received before RFULL can be set, because there was no last RBR-to-DRR transfer before the first element.
    > This data loss can be avoided if DRR is read no later than two and a half CLKR cycles before the end of the third element (data C) in RSR, as shown in Figure
    --------------------------------------------------------------------------------------------------
    As per my understanding of the TI data sheet SPRU190D, the problem should go away when the next frame sync is seen.
    If the EDMA fails to read DRR, it should only lead to the loss of one element; otherwise the EDMA should keep receiving and generating interrupts.
    Am I wrong in this assumption?
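
To make the RSR/RBR/DRR description quoted above concrete, here is a small host-side model of the three-stage receive path. It is only a sketch of the behaviour as described in the quoted text, not of the actual silicon; all names are hypothetical.

```c
/* Illustrative model of the receive path described in the quote. */
typedef struct {
    int rsr_full;   /* RSR holds a complete element                 */
    int rbr_full;   /* RBR holds an element waiting for DRR         */
    int rrdy;       /* DRR holds an unread element (RRDY = 1)       */
    int rfull;      /* overrun flag as described in the quoted text */
    unsigned lost;  /* elements overwritten in RSR                  */
} mcbsp_rx_model;

/* Move data forward wherever the next stage is free, then update RFULL. */
static void rx_propagate(mcbsp_rx_model *m)
{
    if (m->rbr_full && !m->rrdy)     { m->rbr_full = 0; m->rrdy = 1; }
    if (m->rsr_full && !m->rbr_full) { m->rsr_full = 0; m->rbr_full = 1; }
    if (m->rbr_full && !m->rrdy)     { m->rbr_full = 0; m->rrdy = 1; }
    m->rfull = m->rsr_full && m->rbr_full && m->rrdy;
}

/* A complete element has been shifted in from the DR pin. */
static void rx_element_arrives(mcbsp_rx_model *m)
{
    if (m->rsr_full)
        m->lost++;              /* previous RSR contents are overwritten */
    m->rsr_full = 1;
    rx_propagate(m);
}

/* The CPU or the EDMA reads DRR. */
static void rx_drr_read(mcbsp_rx_model *m)
{
    m->rrdy = 0;
    rx_propagate(m);
}
```

In this model RFULL first appears after the third unread element, a fourth element overwrites the third, and a single DRR read clears RFULL again. That matches the expectation that a late read costs data but not the link, provided something still performs that read.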

  • Bhaskar,

    My interpretation is that the McBSP can overwrite the RSR if DRR is not read in time during the OVERRUN condition described in SPRU580G.  Further, the OVERRUN condition is cleared by either reading the DRR or resetting the receiver via the RRST control bit.

    However, if EDMA synchronization is lost during the OVERRUN condition, a subsequent frame sync would not necessarily generate another REVT to EDMA:

    SPRU580G said:
    RRDY directly drives the McBSP receive event to the DMA/EDMA controller (via REVT).

    Since RRDY is high during OVERRUN, it will not trigger another event until the OVERRUN state has been cleared.

    -Tommy
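
The point about RRDY driving REVT can be sketched with a toy model in which the EDMA only sees an event on a rising edge of RRDY. The "dropped event" below stands in for the hypothetical loss of synchronization; how such a loss could happen on real hardware is exactly the open question in this thread. All names are illustrative.

```c
/* Toy model: REVT is generated only when RRDY goes from 0 to 1. */
typedef struct {
    int rrdy;              /* 1 while DRR holds an unread element  */
    int event_latched;     /* pending REVT as seen by the EDMA     */
    unsigned transferred;  /* elements moved by the EDMA           */
    unsigned overruns;     /* elements that arrived with RRDY high */
} rx_link_model;

/* One element arrives; drop_event models a lost REVT (an assumption). */
static void link_element_arrives(rx_link_model *l, int drop_event)
{
    if (l->rrdy) {          /* RRDY already high: no new edge, no REVT */
        l->overruns++;
        return;
    }
    l->rrdy = 1;
    if (!drop_event)
        l->event_latched = 1;
}

/* The EDMA services a pending event by reading DRR. */
static void link_edma_service(rx_link_model *l)
{
    if (!l->event_latched)
        return;
    l->event_latched = 0;
    l->rrdy = 0;
    l->transferred++;
}

/* Run n normal element periods: arrival followed by EDMA service. */
static void link_run(rx_link_model *l, unsigned n)
{
    unsigned i;
    for (i = 0; i < n; i++) {
        link_element_arrives(l, 0);
        link_edma_service(l);
    }
}

/* A manual DRR read by the CPU clears RRDY and re-arms the edge. */
static void link_cpu_read_drr(rx_link_model *l)
{
    l->rrdy = 0;
}
```

In this model, a single lost event leaves RRDY high forever, so every later element counts as an overrun and the transfer count never advances until something reads DRR or resets the receiver.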

  • Is it possible for OVERRUN to happen when McBSP receive is linked to EDMA?

    Other inputs:

    -Code is executing from both internal and external SRAM. No L2 cache.

    -memcpy routines are used in the code.

  • "However, if EDMA synchronization is lost during the OVERRUN condition, a subsequent frame sync would not necessarily generate another REVT to EDMA:"

     

    How exactly can the EDMA lose synchronization with the McBSP?

    Assuming the EDMA didn't read DRR in time, it may lead to the OVERRUN bit being set.

    But one REVT is still pending for the EDMA to read DRR. Once the EDMA reads DRR, the OVERRUN bit should clear and communication should continue. We may miss one element in this scenario, but when the next frame sync comes (along with data), the EDMA count should be satisfied and interrupts should be generated. This should lead to packet errors, which we are not seeing.

  • Bhaskar,

    I agree that under normal circumstances, even if EDMA misses the real-time deadline for servicing McBSP, it should eventually service the event and clear the OVERRUN condition.

    My guess is that a loss of synchronization could be caused by register corruption in either the McBSP or EDMA.  Another possibility: if the EDMA setup requires the CPU to acknowledge/reset the parameters, an REVT event could be missed before the EDMA is primed again.

    -Tommy
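
Since the master currently has no error checking on this path, one option is a software stall detector driven from the existing heartbeat. The sketch below is a minimal, hardware-independent version. It assumes the ISR increments a counter (here called isr_count, which does not exist in the posted code) and that a tick function is called at a fixed rate; both are assumptions for illustration.

```c
/* Hypothetical stall detector for the EDMA receive interrupt. */
typedef struct {
    unsigned last_count;  /* ISR counter value seen on the previous tick */
    unsigned idle_ticks;  /* consecutive ticks without a new interrupt   */
    unsigned limit;       /* ticks of silence treated as a stall         */
} rx_watchdog;

/* Call once per heartbeat tick with the current ISR counter value.
 * Returns 1 when the link has been silent for 'limit' ticks. */
static int rx_watchdog_tick(rx_watchdog *w, unsigned isr_count)
{
    if (isr_count != w->last_count) {   /* progress since the last tick */
        w->last_count = isr_count;
        w->idle_ticks = 0;
        return 0;
    }
    if (++w->idle_ticks < w->limit)
        return 0;
    w->idle_ticks = 0;                  /* re-arm for the next stall */
    return 1;
}
```

When the function returns 1, the application could log SPCR, EER, CIER and CIPR for later analysis and then try a recovery such as a dummy DRR read, or a receiver reset via RRST followed by edma_init(). Which recovery is appropriate depends on the system and would need to be validated on the bench.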