PROCESSOR-SDK-AM335X: UART Receive FIFOLvl Register reads not reliable.

Part Number: PROCESSOR-SDK-AM335X
Other Parts Discussed in Thread: AMIC110, SEGGER

Intermittently, within an ISR, we're seeing incoherent data from the FIFOLvl register. For example, at the start of the interrupt it may read as 16 (the interrupt trigger level is set to 16), but when we loop reading RHR, using FIFOLvl as the indicator of when to stop (within the same interrupt), we might stop reading after 2 bytes.

Another variation of this problem is that, intermittently, FIFOLvl reads 0 on a receive timeout interrupt, even though our code intentionally left 2 bytes in the FIFO in the previous interrupt.

Are there any known issues or caveats regarding reading FIFOLvl?

We're using the UART described in Chapter 19, Universal Asynchronous Receiver/Transmitter, of the TI AM335x TRM Rev P (AM335x and AMIC110 Sitara Processors).

Thanks in advance,

Paul Hetherington

  • Please post which SDK you are using.
  • Can you share the ISR code that is having an issue?
  • We're using CodeBench / GCC from Mentor. Possibly related to our use of volatile pointers to registers?
  • I have other things in mind, but I'd prefer to review your code rather than speculate on every conceivable thing that could be wrong.
  • Excuse the mess. It's old, and instrumented for troubleshooting.

    void psd_Rx_ISR( PSD_DEVICE *dev )
    {
        volatile unsigned short *regLSR;
        volatile unsigned short *regFIFOLvl;
        volatile unsigned short *regIIR;
        volatile unsigned short *regIER;
        BOOL rxDone = FALSE;
        UINT8 rcvError, rcvData;
        unsigned short rxFifoThres = 2;
        unsigned short iir = 0;

        inISR++;
        sequenceCount++;

        addIsrList(RX_ISR);

        PSD_LINK *pLink = &dev->Link; // buffer and buffer info

    #ifdef ADDITIONAL_DIAGNOSTICS // diagnostic: log data received while we're not expecting data to different location.
        if(dev->Link.rxDone == true)
            pLink = &psdCommUnexpectedLink;
    #endif

        regLSR = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTLSR_OFFSET);
        regFIFOLvl = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTRXFIFO_LVL);
        regIIR = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTIIR_OFFSET);
        regIER = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTIER_OFFSET);

        rxDone = ((*regIIR) & AM3X_UARTIIR_RXTIMEOUT) == AM3X_UARTIIR_RXTIMEOUT;

        if(rxDone)
        {
            rxFifoThres = 0;
            if(*regFIFOLvl == 0) // this isn't supposed to happen on time-out. Read RHR to clear the time-out. (it won't be read below)
            {
                rcvData = (*psdChannel.uartControl->fnReadFIFO)(dev->pbUartBase);
                psdCommTimoutFIFOEmpty++;
            }
        }

        iir = *regIIR;

        pRxIsrInfo = &rxIsrInfo[rxIsrInfoIndex];
        rxIsrInfoIndex++;
        rxIsrInfoIndex &= 0x1f;

        pRxIsrInfo->iir = iir;
        pRxIsrInfo->it_type = (iir & 0x3f) >> 1;
        pRxIsrInfo->sequenceCount = sequenceCount;
        pRxIsrInfo->fifoLvl = *regFIFOLvl;
        pRxIsrInfo->lsr = lineStatusRegister;
        pRxIsrInfo->fifoBytesRead = 0;

        if(psdCommFirstRx)
        { // first rcv interrupt FIFOLvl stats.
            uint32_t fifoLvl = *regFIFOLvl;
            if(fifoLvl < 16)
                fifoLvl = fifoLvl + dummy;
            psdCommFirstRxIntCount++;
            psdCommFirstRxIntFifoLvlLast = fifoLvl;
            if(fifoLvl < psdCommFirstRxIntFifoLvlMin)
                psdCommFirstRxIntFifoLvlMin = fifoLvl;
            else if (fifoLvl > psdCommFirstRxIntFifoLvlMax)
                psdCommFirstRxIntFifoLvlMax = fifoLvl;
            psdCommFirstRxIntFifoLvlSum += fifoLvl; // for the avg. Do the math outside of the interrupt.
            psdCommRxFIFOShort = 0;
        }

        //if(*regLSR & AM3X_UARTLSR_OVERRUN_ERROR) // RXOE (Overrun Error)
        if(lineStatusRegister & AM3X_UARTLSR_OVERRUN_ERROR) // RXOE (Overrun Error)
        {
            psdCommRXOE = true;
            psdCommRXOECount++;
            psdCommRXOETime = get_32kHz_tick_count();
            psdCommIntDisableDurationPrior = intDisableDurationLast;
            psdCommIntDisableTimeStart = intDisableDurationLastStart;
            psdCommRXOEFifoLvl = *regFIFOLvl;
        }

        if ( *regLSR & AM3X_UARTLSR_FIFO_ERROR ) // RXFIFOSTS (RXBI|RXFE|RXPE) (Break, Framing, or Parity)
        {
            psdCommRXFIFOSTS = true;
            psdCommRegLSR = *regLSR & UART_LSR_ERRORS; // LSR with tx status bits masked off.

            while ( *regLSR & AM3X_UARTLSR_FIFO_ERROR )
            {
                while ( *regFIFOLvl != 0 ) // RXFIFOSTS (AM3X_UARTLSR_FIFO_ERROR) will clear when there are no bytes in the FIFO with these errors.
                {
                    rcvError = (*psdChannel.uartControl->fnReadFIFO)( dev->pbUartBase);
                    psdCommRXFIFOSTSCount++;
                }
            }
        }
        else // no errors, read data
        {
            asm volatile ("" : : : "memory");
            while ( *regFIFOLvl > rxFifoThres ) //leave chars in FIFO so Rx Timeout can occur
            {
                rcvData = (*psdChannel.uartControl->fnReadFIFO)( dev->pbUartBase);
                pRxIsrInfo->fifoBytesRead++;
                if(pLink->rxDataIndex < RX_BUFF_LEN)
                {
                    pLink->rxDataBuff[pLink->rxDataIndex++] = rcvData;
                }
            }

            if ( rxDone )
            {
                pLink->rxDone = TRUE;
                pLink->rxDataLength = pLink->rxDataIndex;
                pLink->rxDataIndex = 0;
    #ifndef ADDITIONAL_DIAGNOSTICS // diagnostic: don't disable the receiver between xfers. Recording data received.
                (*psdChannel.uartControl->fnDisableReceive)( dev->pbUartBase );
    #endif
                // (void)OSActivateHISR( (OS_HISR *)&dev->devHisr );
            }
            else
            {
                if(pRxIsrInfo->fifoBytesRead < pRxIsrInfo->fifoLvl - 2)
                    psdCommRxFIFOShort++;
            }
        }
        rcvError = rcvError - 0; //gets rid of compiler warning.

        inISR--;
        psdCommFirstRx = false;
    }
  • The memory barrier is not actually in place. (asm volatile ("" : : : "memory");)
  • I haven't spotted any major issues yet, though there are a few things I didn't understand in the code. For example, let's look at this snippet:

    while ( *regFIFOLvl > rxFifoThres ) //leave chars in FIFO so Rx Timeout can occur
    {
        rcvData = (*psdChannel.uartControl->fnReadFIFO)( dev->pbUartBase);
        pRxIsrInfo->fifoBytesRead++;
        if(pLink->rxDataIndex < RX_BUFF_LEN)
        {
            pLink->rxDataBuff[pLink->rxDataIndex++] = rcvData;
        }
    }

    Why are you leaving characters in the FIFO?
  • We use the receive timeout to indicate the end of an incoming message. Our understanding is that the receive timeout will not (should not anyway) happen unless there is still data in the receive FIFO. This is actually one of the issues we're seeing and don't understand. Intermittently we have a timeout and the FIFO is empty. That's judging by our read of FIFOLvl. I don't have confidence that we're reading FIFOLvl correctly though.
  • continued... so for RHR interrupts, rxFifoThres = 2. For the time-out interrupt, rxFifoThres = 0;
  • PAUL HETHERINGTON said:
    Our understanding is that the receive timeout will not (should not anyway) happen unless there is still data in the receive FIFO

    Correct.

    PAUL HETHERINGTON said:
    Intermittently we have a timeout and the FIFO is empty.

    Looks like your FIFO_ERROR code will empty the FIFO.  Perhaps you're hitting that condition?

    PAUL HETHERINGTON said:
    I don't have confidence that we're reading FIFOLvl correctly though.

    Could there be an issue with your fnReadFIFO() function?  Can you share that function too?  Also, please share the definitions for AM3X_UARTLSR_OFFSET, AM3X_UARTRXFIFO_LVL, AM3X_UARTIIR_OFFSET, and AM3X_UARTIER_OFFSET.  I'm also curious about the type used in the definition of pbUartBase.

    TCR         0x00800080        

  • static unsigned char fnReadFIFO( unsigned char *uBase )
    {
        unsigned char ch = *(uBase + AM3X_UARTRHR_OFFSET);

        return(ch);
    }

    #define AM3X_UARTIER_OFFSET 0x04
    #define AM3X_UARTIIR_OFFSET 0x08
    #define AM3X_UARTRXFIFO_LVL 0x64
    #define AM3X_UARTLSR_OFFSET 0x14

    RXFIFO_LVL is 16 bits with the msbyte reserved. We're not masking out the msbyte. If it was ever not 0, we'd fail.
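
The masking concern above can be made concrete with a small sketch. It assumes, as stated in the thread, that the level occupies the low byte of the 16-bit register at offset 0x64 and that the high byte is reserved. The helper name `uart_rx_fifo_level` and the mask macro are illustrative and not part of the original driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same offset as AM3X_UARTRXFIFO_LVL in the thread. */
#define UART_RXFIFO_LVL_OFFSET 0x64u
/* Assumption: level lives in bits 7:0, bits 15:8 are reserved. */
#define UART_RXFIFO_LVL_MASK   0x00FFu

/* Read the RX FIFO level and discard the reserved high byte, so a stray
 * non-zero reserved bit cannot inflate the reported level. */
static inline uint16_t uart_rx_fifo_level(volatile uint8_t *uart_base)
{
    assert(uart_base != NULL);
    volatile uint16_t *reg =
        (volatile uint16_t *)(uart_base + UART_RXFIFO_LVL_OFFSET);
    return (uint16_t)(*reg & UART_RXFIFO_LVL_MASK);
}
```

With this helper, a raw register value of 0xAB10 would be reported as a level of 0x10. Masking only guards against reserved bits reading non-zero; it would not explain a level that reads too low.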
  • unsigned char *pUartBase
  • "Looks like your FIFO_ERROR code will empty the FIFO. Perhaps you're hitting that condition?" A breakpoint there and logging both indicate that is not the case. We have never seen those errors. Overruns, yes, but we resolved that.
  • Have you tried running this UART code in isolation? In other words, could some other thread somehow be interfering (e.g. making bad memory accesses, etc.)? Also, can you give a rough idea of how often these issues are observed? For example 1 in 100 bytes? 1 in 10,000? 1 in 1,000,000? Have you tried sending data patterns into the device? Are you receiving all of it, or do you end up with missing or repeated data?
  • We have not tried it in isolation, but could possibly set that up. We exchange 100 messages/sec and the response is 34 bytes; it might take hours for this failure to happen, sometimes 10 hours. We haven't tried patterns. The link runs at 1.5 Mbit/s with the FIFO trigger level at 16 bytes.
  • Good case caught below:

    - 34 byte incoming message

    - RHR int below on FIFOLvl 16

    - only 2 bytes read from FIFO. Should have been 14 (or more if data arrived during int)

    - Either FIFOLvl register momentarily reported the wrong value or our read faulted or was corrupted. 

    - Search for //@@@ comments below

    void psd_Rx_ISR( PSD_DEVICE *dev ) //@@@ First of 3 interrupts for incoming 34 byte message
                                       //@@@ RHR interrupt on 16 byte FIFO level
    {
        volatile unsigned short *regLSR;
        volatile unsigned short *regFIFOLvl;
        volatile unsigned short *regIIR;
        volatile unsigned short *regIER;
        BOOL rxDone = FALSE;
        UINT8 rcvError, rcvData;
        unsigned short rxFifoThres = 2; //@@@ rxFifoThres = 2 on breakpoint below (good)
        unsigned short iir = 0;

        inISR++; //@@@ on breakpoint below, inISR = 1 (good)
        sequenceCount++; //@@@ on breakpoint below, sequenceCount = 1 (first int on incoming 34 byte message) (good)

        addIsrList(RX_ISR);

        PSD_LINK *pLink = &dev->Link; // buffer and buffer info

    #ifdef ADDITIONAL_DIAGNOSTICS // diagnostic: log data received while we're not expecting data to different location.
        if(dev->Link.rxDone == true)
            pLink = &psdCommUnexpectedLink;
    #endif

        regLSR = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTLSR_OFFSET);
        regFIFOLvl = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTRXFIFO_LVL);
        regIIR = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTIIR_OFFSET);
        regIER = (volatile unsigned short *)(dev->pbUartBase + AM3X_UARTIER_OFFSET);

        rxDone = ((*regIIR) & AM3X_UARTIIR_RXTIMEOUT) == AM3X_UARTIIR_RXTIMEOUT;

        if(rxDone)
        {
            rxFifoThres = 0;
            if(*regFIFOLvl == 0) // this isn't supposed to happen on time-out. Read RHR to clear the time-out. (it won't be read below)
            {
                rcvData = (*psdChannel.uartControl->fnReadFIFO)(dev->pbUartBase); //@@@ breakpoint not hit here
                psdCommTimoutFIFOEmpty++;
            }
        }

        iir = *regIIR;

        pRxIsrInfo = &rxIsrInfo[rxIsrInfoIndex];
        rxIsrInfoIndex++;
        rxIsrInfoIndex &= 0x1f;

        pRxIsrInfo->iir = iir; //@@@ pRxIsrInfo->iir = 196 on breakpoint below. (good)
        pRxIsrInfo->it_type = (iir & 0x3f) >> 1; //@@@ = 2 (RHR int good)
        pRxIsrInfo->sequenceCount = sequenceCount;
        pRxIsrInfo->fifoLvl = *regFIFOLvl; //@@@ on breakpoint below, pRxIsrInfo->fifoLvl = 16 (Good)
        pRxIsrInfo->lsr = lineStatusRegister; //@@@ = 97
        pRxIsrInfo->fifoBytesRead = 0; //@@@ = 2 (should have been 14, or more if data arrived during this isr)

        if(psdCommFirstRx)
        { // first rcv interrupt FIFOLvl stats.
            uint32_t fifoLvl = *regFIFOLvl;
            if(fifoLvl < 16)
                fifoLvl = fifoLvl + dummy;
            psdCommFirstRxIntCount++;
            psdCommFirstRxIntFifoLvlLast = fifoLvl;
            if(fifoLvl < psdCommFirstRxIntFifoLvlMin)
                psdCommFirstRxIntFifoLvlMin = fifoLvl;
            else if (fifoLvl > psdCommFirstRxIntFifoLvlMax)
                psdCommFirstRxIntFifoLvlMax = fifoLvl;
            psdCommFirstRxIntFifoLvlSum += fifoLvl; // for the avg. Do the math outside of the interrupt.
            psdCommRxFIFOShort = 0;
        }

        //if(*regLSR & AM3X_UARTLSR_OVERRUN_ERROR) // RXOE (Overrun Error)
        if(lineStatusRegister & AM3X_UARTLSR_OVERRUN_ERROR) // RXOE (Overrun Error)
        {
            psdCommRXOE = true; //@@@ breakpoint not hit here
            psdCommRXOECount++;
            psdCommRXOETime = get_32kHz_tick_count();
            psdCommIntDisableDurationPrior = intDisableDurationLast;
            psdCommIntDisableTimeStart = intDisableDurationLastStart;
            psdCommRXOEFifoLvl = *regFIFOLvl;
        }

        if ( *regLSR & AM3X_UARTLSR_FIFO_ERROR ) // RXFIFOSTS (RXBI|RXFE|RXPE) (Break, Framing, or Parity)
        {
            psdCommRXFIFOSTS = true; //@@@ breakpoint not hit here
            psdCommRegLSR = *regLSR & UART_LSR_ERRORS; // LSR with tx status bits masked off.

            while ( *regLSR & AM3X_UARTLSR_FIFO_ERROR )
            {
                while ( *regFIFOLvl != 0 ) // RXFIFOSTS (AM3X_UARTLSR_FIFO_ERROR) will clear when there are no bytes in the FIFO with these errors.
                {
                    rcvError = (*psdChannel.uartControl->fnReadFIFO)( dev->pbUartBase);
                    psdCommRXFIFOSTSCount++;
                }
            }
        }
        else // no errors, read data
        {
            //@@@ *regFIFOLvl was recorded as 16 above, rxFifoThres was still 2 on breakpoint below
            while ( *regFIFOLvl > rxFifoThres ) //leave chars in FIFO so Rx Timeout can occur
            {
                //@@@ only two bytes read here. (Not good).
                //@@@ rxFifoThres = 2 on breakpoint below.
                //@@@ One of the following happened.
                //@@@ - the FIFOLvl register contained value 2 or less after two reads here. How could that be possible?
                //@@@ - the read of FIFOLvl register didn't happen correctly
                //@@@ - the read of FIFOLvl register was corrupted.
                rcvData = (*psdChannel.uartControl->fnReadFIFO)( dev->pbUartBase);
                pRxIsrInfo->fifoBytesRead++;
                if(pLink->rxDataIndex < RX_BUFF_LEN)
                {
                    pLink->rxDataBuff[pLink->rxDataIndex++] = rcvData;
                }
            }
            //@@@
            //@@@
            //@@@ mouse over *regFIFOLvl when at breakpoint below and it's 32 !
            //@@@ In total 34 bytes arrived (correct number) and 32 would be correct at this point
            //@@@ if 18 more arrived after this interrupt was raised at 16 and only 2 were
            //@@@ read above.
            //@@@
            //@@@
            //@@@

            if ( rxDone )
            {
                pLink->rxDone = TRUE;
                pLink->rxDataLength = pLink->rxDataIndex;
                pLink->rxDataIndex = 0;
    #ifndef ADDITIONAL_DIAGNOSTICS // diagnostic: don't disable the receiver between xfers. Recording data received.
                (*psdChannel.uartControl->fnDisableReceive)( dev->pbUartBase );
    #endif
                // (void)OSActivateHISR( (OS_HISR *)&dev->devHisr );
            }
            else
            {
                if(pRxIsrInfo->fifoBytesRead < pRxIsrInfo->fifoLvl - 2)
                    psdCommRxFIFOShort++; //@@@ breakpoint hit here, pRxIsrInfo->fifoBytesRead = 2, pRxIsrInfo->fifoLvl = 16
            }
        }
        rcvError = rcvError - 0; //gets rid of compiler warning.

        inISR--;
        psdCommFirstRx = false;
    }
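
The raw values captured at the breakpoint (iir = 196, lsr = 97) can be decoded with a short host-side sketch. The bit positions below are assumptions based on the usual AM335x UART register layout (IIR IT_TYPE in bits 5:1, LSR receive error flags in bits 1 to 4 and bit 7) and should be checked against the TRM. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed IIR layout: bit 0 IT_PENDING, bits 5:1 IT_TYPE, bits 7:6 FCR mirror. */
#define IIR_IT_TYPE_RHR        0x02u  /* receive holding register (trigger level) */
#define IIR_IT_TYPE_RX_TIMEOUT 0x06u  /* receive timeout */

/* Assumed LSR layout. */
#define LSR_RXFIFOE   0x01u  /* 1 = at least one byte in the RX FIFO */
#define LSR_RXOE      0x02u  /* overrun error */
#define LSR_RXPE      0x04u  /* parity error */
#define LSR_RXFE      0x08u  /* framing error */
#define LSR_RXBI      0x10u  /* break indication */
#define LSR_RXFIFOSTS 0x80u  /* at least one error somewhere in the RX FIFO */

/* Extract the interrupt type; equivalent to (iir & 0x3f) >> 1 in the ISR. */
static unsigned iir_it_type(uint16_t iir)
{
    return (iir & 0x3Eu) >> 1;
}

/* Non-zero if any receive error flag is set. */
static int lsr_has_rx_error(uint16_t lsr)
{
    return (lsr & (LSR_RXOE | LSR_RXPE | LSR_RXFE | LSR_RXBI | LSR_RXFIFOSTS)) != 0u;
}

/* Non-zero if the RX FIFO holds at least one byte. */
static int lsr_has_rx_data(uint16_t lsr)
{
    return (lsr & LSR_RXFIFOE) != 0u;
}
```

Under these assumptions, 196 (0xC4) decodes to interrupt type 2 (RHR), and 97 (0x61) shows data present with no receive error flags, which agrees with the annotations in the listing.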

  • Please follow the instructions in this section of the wiki to collect an rd1 file from your target:

    http://processors.wiki.ti.com/index.php/AM335x_Clock_Tree_Tool#Importing_Data_from_Actual_Hardware

    Please zip it and attach it to the thread so I can review.

    Also, please measure VDD_CORE.  Is it 1.100V?  Is it stable?  Can you connect your oscilloscope to it over a long period and set it to trigger on a falling edge at 1.05V?  Your observed issue is sufficiently rare that it makes me wonder if you have some kind of transient power issue that makes things behave strangely.

  • Also, are you able to reproduce this issue on a development board such as the BeagleBone Black or AM335x EVM?
  • Installed CCS and CTT.

    Getting this error following step 6 Load am335x-ctt.dss in the scripting console by executing "loadJSFile <path-to-dss-file>/am335x-ctt.dss" in the link above.
    js:> loadJSFile \C:\tools\TI\am335x-ctt.dss
    Could not open session. No matching devices found. (C:\tools\TI\am335x-ctt.dss#232)
    js:>
  • line 232 in am335x-ctt.dss
    debugSessionDAP = ds.openSession("*","CS_DAP_M3");
  • Change it to CS_DAP_DebugSS.
  • Make sure that after creating your ccxml file you have launched the configuration, though do not connect to any cores.
  • same error with
    debugSessionDAP = ds.openSession("*","CS_DAP_DebugSS");
  • Post a screenshot of your CCS. Something isn't right...
  • Oh, I see. It's because you're using a Segger debugger and not the TI debugger. I think it will still work if you make line 232:

    debugSessionDAP = ds.openSession("*","CortxA8");

    In general I don't like connecting to the Cortex A8 to get this data. If the MMU is enabled then it will end up reading the wrong info. Let's give it a try. If you have any TI emulators like XDS100v2, XDS110, etc. that would be better.
  • I recommend that you make a small change to DPLL_CORE configuration. Your M4 output is currently 125 MHz, though the typical configuration for this clock is 200 MHz. From a register perspective, this corresponds to the programming of CM_DIV_M4_DPLL_CORE at address 0x44E00480. Your register dump shows a value of 0x00000230, but I recommend programming this register to a value of 0x0000022A.

    Can you please make that change, verify the register is at the expected value, and then see if that has any impact on the issue?

    Also, did you get a chance to look at VDD_CORE? At a minimum it would be good to get a sanity measurement at run-time that it is 1.100V, though better yet you should monitor with a scope to see if there is any kind of transient issue.
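
The arithmetic behind the recommended register value can be sketched as follows. The sketch assumes the M4 divider sits in the low five bits of CM_DIV_M4_DPLL_CORE and that the DPLL_CORE DCO output is 2000 MHz. The latter is inferred from the two frequencies quoted above (125 MHz and 200 MHz), not read from the register dump. Function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the M4 high-speed divider occupies bits 4:0. */
#define CM_DIV_M4_DIV_MASK 0x1Fu
/* Assumption: DCO frequency consistent with 125 MHz at /16 and 200 MHz at /10. */
#define DPLL_CORE_DCO_MHZ  2000u

/* Extract the divider field from a CM_DIV_M4_DPLL_CORE register value. */
static uint32_t m4_divider(uint32_t cm_div_m4)
{
    return cm_div_m4 & CM_DIV_M4_DIV_MASK;
}

/* Resulting M4 clock in MHz for a given register value and DCO frequency. */
static uint32_t m4_clock_mhz(uint32_t cm_div_m4, uint32_t dco_mhz)
{
    uint32_t div = m4_divider(cm_div_m4);
    assert(div != 0u); /* a zero divider is not a meaningful setting here */
    return dco_mhz / div;
}
```

With these assumptions, 0x230 gives a divider of 16 and 125 MHz, while 0x22A gives a divider of 10 and 200 MHz.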
  • OK will try the clock change. Looking into measuring VDD_CORE. Not conveniently done.
  • After clearing the rx fifo using the RX_FIFO_CLEAR_BIT in FCR, do you have to poll the same bit to know when the clear has finished? The description in Table 19-40 says "1h = Clears receive FIFO and resets its counter logic to 0. Returns to 0 after clearing the FIFO". We use this to clear the FIFO after sending. If this operation didn't complete until after some bytes were received, would be trouble. (Although I don't think this is happening to us. I think our FIFOLvl read is intermittently wrong)

    Still trying to get help on measuring VCORE.

  • Hi Brad, I didn't try the clock change yet, but will. I caught another instance and captured a screen shot that shows the same thing I explained last week. Just better evidence.  Problem not stack corruption, not register overwrites by higher priority interrupt, not sequence of operations issue wrt a volatile pointer. It really looks like the register read itself didn't happen correctly.

    The reference manual describes a special procedure needed to access the timer count register TCCR (section pasted in the attached zip file). I believe it's because the register is changing and is in a different clock domain. I'm wondering if this is what's happening here with this UART FIFOLvl register read. It's similar to the TCCR case in that the code is reading a register that's counting. The reference manual doesn't go into as much detail regarding UART register access as it does with the timer register access. Is it possible that there is a special procedure required to read the FIFOLvl register? Attachment: FirstInt16BytesRead1.zip

    See attached  FIFOLvlReadFailureCommented.zip. 4 files included:

    1. FIFOLvlReadFailreCommented.png: screenshot of the debugger at a breakpoint catching the failure. It shows that *regFIFOLvl returned an incorrect value twice: 0 in one case, probably 0 in the other.

    2. psd_rx_ISR.c: the source I was running.

    3. psd_rx_ISR_MixedAsmC.txt: mixed C and asm, generated live at the breakpoint.

    4. ReadingTimerTCCR.txt: snippet of the reference manual explaining the special procedure to read the timer TCCR value (the counting register).

  • Hi Brad, I added more info and a question in a Reply to one of my own Replies, inadvertently. Please see.
  • PAUL HETHERINGTON said:
    After clearing the rx fifo using the RX_FIFO_CLEAR_BIT in FCR, do you have to poll the same bit to know when the clear has finished? The description in Table 19-40 says "1h = Clears receive FIFO and resets its counter logic to 0. Returns to 0 after clearing the FIFO". We use this to clear the FIFO after sending. If this operation didn't complete until after some bytes were received, would be trouble. (Although I don't think this is happening to us. I think our FIFOLvl read is intermittently wrong)

    When/why are you clearing the FIFO?  In general that doesn't sound like a normal thing that should be done.  I know you have been having issues related to the FIFO level, but I'm concerned you could actually be causing issues rather than solving issues if you have code doing this.  For example, in the scenario where you get the timeout interrupt and are expecting two bytes of data but instead see zero, could something have cleared the FIFO?

  • PAUL HETHERINGTON said:
    It really looks like the register read itself didn't happen correctly.

    Have you had a chance to update the DPLL that I mentioned?  I wonder if that could possibly be related.  Another thought that comes to mind, at least temporarily, is to implement a special "read FIFO level" routine where you keep reading the register until you get the same value twice in a row.

    PAUL HETHERINGTON said:
    The reference manual describes a special procedure needed to access the timer count register TCCR.

    The TCRR special procedure is something entirely different.  The goal there is to get a consistent view of the full 32 bits in the case that you're doing two 16-bit accesses.  The FIFO level register is simply a 16-bit register, so those concerns don't apply.

  • I did make the clock change and the code crashed early. It was a memory access exception. I didn't follow up beyond that. Your clock question still not answered. Is it possible that the clock configuration we're using is not correct if the application is going to use the UART?
  • The code is (was) setting that bit following every write to the TxFIFO. Agreed seems abnormal, I don't know why it was done. The communication protocol is send a message, wait for response. No overlap. However, what happens after that last byte is questionable. I removed the RX FIFO clear and the problem persisted. I also disabled interrupts during this isr and the problem persisted. Eliminates possibility of corruption from another ISR. Also made sure optimization was disabled and inspected the asm code, volatile pointers are being used as expected and no funny ordering issue happening. I think your clock configuration question is the best theory. Is ours valid when using the UART?
  • Also, evidence that the FIFO wasn't cleared by RX_FIFO_CLEAR_BIT on this error: the debugger display of *regFIFOLvl showed the correct value. I think it was 33 in the last screenshot I sent.
  • Re the special readFIFOLevel(): that's a thought. But the level could be increasing from incoming data. It might have to be something like

    while(*regFIFOLvl < (FIFOLvlReadLast - 1)) ; // spin waiting for valid level read. Only works if invalid lvl reads are always zero, or less than the actual.

    because there is incoming data adding to the FIFO while we're reading it.

    Or, on FIFOLvlTrig interrupts (set to 16 at the moment), just read 14 bytes all the time. Don't look at FIFOLvl. Doesn't matter if we leave more than 2 behind. On the timeout interrupt, read while RXFIFOE in LSR is 1.

    But still hoping to solve this.
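
The fixed-count alternative described above can be sketched as follows. Register access is routed through hypothetical `read_lsr`/`read_rhr` hooks so the logic can be exercised on a host with a fake FIFO. The constants 16 and 2 are the trigger level and leave-behind count from the thread, and LSR bit 0 (RXFIFOE) is assumed to read 1 while at least one byte is in the RX FIFO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LSR_RXFIFOE      0x01u /* assumed: 1 = at least one byte in the RX FIFO */
#define RX_TRIGGER_LEVEL 16u   /* FIFO trigger level used in the thread */
#define RX_LEAVE_BEHIND  2u    /* bytes left so the RX timeout can still fire */

/* Hypothetical register-access hooks (not part of the original driver). */
struct uart_ops {
    uint8_t (*read_lsr)(void *ctx);
    uint8_t (*read_rhr)(void *ctx);
    void *ctx;
};

/* Trigger-level interrupt: read a fixed count and never consult RXFIFO_LVL. */
static size_t drain_on_trigger(const struct uart_ops *u, uint8_t *dst, size_t cap)
{
    assert(u != NULL && dst != NULL);
    const size_t want = RX_TRIGGER_LEVEL - RX_LEAVE_BEHIND;
    size_t n = 0;
    while (n < want && n < cap) {
        dst[n++] = u->read_rhr(u->ctx);
    }
    return n;
}

/* Timeout interrupt: pop bytes until LSR reports the FIFO empty. */
static size_t drain_on_timeout(const struct uart_ops *u, uint8_t *dst, size_t cap)
{
    assert(u != NULL && dst != NULL);
    size_t n = 0;
    while ((u->read_lsr(u->ctx) & LSR_RXFIFOE) != 0u) {
        uint8_t byte = u->read_rhr(u->ctx); /* always pop, even if dst is full */
        if (n < cap) {
            dst[n++] = byte;
        }
    }
    return n; /* number of bytes stored in dst */
}

/* Host-side stand-in for the hardware FIFO, used only to exercise the logic. */
struct fake_fifo {
    uint8_t data[64];
    size_t head;
    size_t count;
};

static uint8_t fake_lsr(void *ctx)
{
    struct fake_fifo *f = ctx;
    return f->count != 0u ? LSR_RXFIFOE : 0u;
}

static uint8_t fake_rhr(void *ctx)
{
    struct fake_fifo *f = ctx;
    if (f->count == 0u) {
        return 0u; /* reading an empty FIFO: return a dummy byte */
    }
    f->count--;
    return f->data[f->head++];
}
```

With 16 bytes queued in the fake FIFO, `drain_on_trigger` takes 14 and leaves 2, and a following `drain_on_timeout` takes the remaining 2. This avoids the level register entirely, so it works around the symptom without explaining it.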
  • Another thought: could the UART interface be going into a low-power mode? I'm reading the CM_WKUP_UART0_CLKCTRL register when the error happens and it's always 2: IDLEST = 0, MODULEMODE = 2, i.e. module fully functional and enabled. But is there something else I should be checking?
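
The CLKCTRL reading above can be decoded with a small sketch. It assumes the common AM335x CM_*_CLKCTRL layout, with MODULEMODE in bits 1:0 and IDLEST in bits 17:16; these positions and the helper names are assumptions to be checked against the TRM.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions in CM_WKUP_UART0_CLKCTRL. */
#define CLKCTRL_MODULEMODE_MASK 0x3u
#define CLKCTRL_IDLEST_SHIFT    16u
#define CLKCTRL_IDLEST_MASK     0x3u

#define MODULEMODE_ENABLE 2u /* module explicitly enabled */
#define IDLEST_FUNCTIONAL 0u /* module fully functional */

static uint32_t clkctrl_modulemode(uint32_t clkctrl)
{
    return clkctrl & CLKCTRL_MODULEMODE_MASK;
}

static uint32_t clkctrl_idlest(uint32_t clkctrl)
{
    return (clkctrl >> CLKCTRL_IDLEST_SHIFT) & CLKCTRL_IDLEST_MASK;
}

/* Non-zero when the module is enabled and reports itself fully functional. */
static int clkctrl_module_running(uint32_t clkctrl)
{
    return clkctrl_modulemode(clkctrl) == MODULEMODE_ENABLE &&
           clkctrl_idlest(clkctrl) == IDLEST_FUNCTIONAL;
}
```

A register value of 2 decodes to MODULEMODE = 2 and IDLEST = 0 under these assumptions, matching the reading reported in the thread.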
  • PAUL HETHERINGTON said:
    Re the special readFIFOLevel(): that's a thought. But the level could be increasing from incoming data. It might have to be something like

    while(*regFIFOLvl < (FIFOLvlReadLast - 1)) ; // spin waiting for valid level read. Only works if invalid lvl reads are always zero, or less than the actual.

    because there is incoming data adding to the FIFO while we're reading it.

    In general, I expect your reading of the register will be orders of magnitude faster than you are actually receiving data, so even if you hit a boundary case where the count happens to increase between the two accesses, that should be quickly and easily solved by a 3rd access which should read the same as the second.  I was thinking more along the lines of:

    FIFOLvlReadLast = readReg();
    FIFOLvlReadCurrent = readReg();
    while (FIFOLvlReadLast != FIFOLvlReadCurrent)
    {
        FIFOLvlReadLast = FIFOLvlReadCurrent;
        FIFOLvlReadCurrent = readReg();
    }

    I think something simple like this might be a good test to try out.  If you wanted to make it more robust you could add another condition, e.g. like "maxTries" where you at most loop 10 times or whatever number you decide is reasonable.
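
The re-read loop with the suggested "maxTries" bound can be sketched as follows. `read_stable`, `reg_read_fn`, and the scripted reader are hypothetical names; the scripted reader stands in for the hardware register so the loop can be run on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook that performs one read of the register of interest. */
typedef uint16_t (*reg_read_fn)(void *ctx);

/* Re-read until two consecutive reads agree or max_tries extra reads have
 * been spent. Returns the most recent value; *stable (if non-NULL) reports
 * whether the last two reads matched. */
static uint16_t read_stable(reg_read_fn read_reg, void *ctx,
                            unsigned max_tries, int *stable)
{
    assert(read_reg != NULL);
    uint16_t last = read_reg(ctx);
    uint16_t current = read_reg(ctx);
    unsigned tries = 0;

    while (last != current && tries < max_tries) {
        last = current;
        current = read_reg(ctx);
        tries++;
    }
    if (stable != NULL) {
        *stable = (last == current);
    }
    return current;
}

/* Host-side stand-in: returns a scripted sequence, repeating the last value. */
struct scripted_reads {
    const uint16_t *values;
    size_t count;
    size_t next;
};

static uint16_t scripted_read(void *ctx)
{
    struct scripted_reads *s = ctx;
    assert(s != NULL && s->count > 0u);
    uint16_t v = s->values[s->next];
    if (s->next + 1u < s->count) {
        s->next++;
    }
    return v;
}
```

If a scripted sequence starts with a single bogus 0 followed by 16s, the loop settles on 16 after one extra read. The `stable` flag lets the caller count cases where the bound was hit, which would be a useful data point on how often the register misreads.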

  • PAUL HETHERINGTON said:
    I did make the clock change and the code crashed early. It was a memory access exception. I didn't follow up beyond that. Your clock question still not answered. Is it possible that the clock configuration we're using is not correct if the application is going to use the UART?

    Please post the rd1 file corresponding to this configuration.  I'd like to look at it more closely.  If you have more details on the crash that might also be useful.  At this point in time, an issue related to the clock configuration is my top suspect, though I'd still like to know the VDD_CORE voltage.  How are you powering VDD_CORE?  Is it a PMIC like the TPS65217C/D?  Perhaps at a minimum we could read the associated configuration register from the PMIC and at least see what we think the VDD_CORE is at.

  • VDD_CORE is coming from the TPS65261-1RHB LX3 pin. It's labeled as 1.1 V. I'm still waiting for help to measure it; it's not accessible in the system I have. Attachment: VDD_CORE.zip

    I tried the clock change by changing its value in a debugger configuration file. When debugging, that's how its set.  Otherwise, a bootloader sets it.  In order to use Code Composer to get the register dump,  I'd have to have the boot loader updated.  Will look into that and will repeat what I had tried as a sanity check and to see if I can get more info on the crash.

  • If it is a lot of work to get the measurements, I suggest looking at both VDD_CORE and VDD_MPU while you're "under the hood" (that's assuming it's an incremental effort for the second measurement). It looks like you have things setup for 1.1V operation, but it will be good to sanity check.

    I recommend changing the bootloader itself with the appropriate clock configuration.
  • New rd1 file with CM_DIV_M4_DPLL_CORE set to 0x22a (HSDIVIDER = 0x0a)  attached. This time the system booted and ran as usual. I must have done something wrong the first time I tried.  I was able to disconnect CodeBench  and the code continued running. Connected with Code Composer and did the register dump. Unfortunately the problem we're chasing persisted. Attachment: am335x-ctt_2019-03-19_092513.zip

  • PAUL HETHERINGTON said:

    New rd1 file with CM_DIV_M4_DPLL_CORE set to 0x22a (HSDIVIDER = 0x0a)  attached. This time the system booted and ran as usual. I must have done something wrong the first time I tried.  I was able to disconnect CodeBench  and the code continued running. Connected with Code Composer and did the register dump. Unfortunately the problem we're chasing persisted.

    This looks better.  I recommend that you make this update permanent.  This is the recommended DPLL_CORE configuration from the TRM as well as what's typically implemented.
  • PAUL HETHERINGTON said:
    Another thought: could the UART interface be going into a low-power mode? I'm reading the CM_WKUP_UART0_CLKCTRL register when the error happens and it's always 2: IDLEST = 0, MODULEMODE = 2, i.e. module fully functional and enabled. But is there something else I should be checking?

    What value is programmed into the UART's SYSC register?  For the time being, I suggest using a value of SYSC = 0x0008 (no-idle).
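
As a quick check of what SYSC = 0x0008 selects, the following sketch decodes the idle-mode field. It assumes the usual AM335x UART SYSC layout (AUTOIDLE in bit 0, IDLEMODE in bits 4:3, with IDLEMODE = 1 meaning no-idle); the names are illustrative and the layout should be confirmed in the TRM.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed UART SYSC field positions. */
#define SYSC_AUTOIDLE       0x0001u
#define SYSC_IDLEMODE_SHIFT 3u
#define SYSC_IDLEMODE_MASK  0x3u

/* Assumed IDLEMODE encodings. */
enum sysc_idlemode {
    SYSC_FORCE_IDLE        = 0,
    SYSC_NO_IDLE           = 1,
    SYSC_SMART_IDLE        = 2,
    SYSC_SMART_IDLE_WAKEUP = 3
};

/* Extract the IDLEMODE field from a SYSC register value. */
static enum sysc_idlemode sysc_idlemode(uint16_t sysc)
{
    return (enum sysc_idlemode)((sysc >> SYSC_IDLEMODE_SHIFT) & SYSC_IDLEMODE_MASK);
}

/* Non-zero when the module is set to no-idle with auto-idle disabled. */
static int sysc_is_no_idle(uint16_t sysc)
{
    return sysc_idlemode(sysc) == SYSC_NO_IDLE && (sysc & SYSC_AUTOIDLE) == 0u;
}
```

Under these assumptions, 0x0008 decodes to IDLEMODE = 1 (no-idle) with AUTOIDLE clear.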

  • 0x0008 is what we're using.
  • Hi Paul,

    Did you get any voltage measurements yet? If not, have you tried the "multiple read method"? It would be good to at least have a workaround in place. That would also be a good data point to understand that the register is just temporarily returning the wrong value, and it's not that the register itself is corrupted.

    Best regards,
    Brad