RTOS/AM5728: PCIe MSI Interrupt Overlap

Part Number: AM5728

Tool/software: TI-RTOS

Hello,


Not sure if this is an AM5728 question or TI-RTOS question, but I guess I need to start someplace....


We have an AM5728 custom board running TI-RTOS on the DSP cores, using PCIe to talk to an FPGA.  The PCIe
link is running at Gen2 and we are using MSI interrupts from the EP (Altera FPGA) to the RC (DSP).  The DSP
driver code is modeled closely on the PCIe sample code provided in the PDK.


The PCIe engine in the FPGA sends an MSI interrupt when a data transfer (EP->RC) completes.  This allows
the RC to know when data has arrived and can then process it.  We can transfer data from two possible
streams.  Data from a single stream seems to work just fine.  However, when both streams are active the
DSP will stop getting the MSI interrupt (everything else remains working, just the MSI interrupt stops)
and a complete reload/restart of the DSP is required.  I can see the MSI memory write coming across the
PCIe interface (we have a Lecroy Protocol Analyzer connected), so I know the MSI is still working from
the EP, it is just not getting into the DSP.


I checked all the RC MSI/Interrupt setup registers and none of them change between working and non-working cases. 
I started to dig deeper into what happens when we get an MSI interrupt and found the following. 

PCIECTRL_TI_CONF_IRQSTATUS_MSI - this register is set when you get an INTx or MSI interrupt.
PCIECTRL_PL_MSI_CTRL_INT_STATUS_N - this register is set to indicate the MSI interrupt number.

These are used by the example code to read and clear the interrupts with the functions, Pcie_getPendingFuncInts()
and Pcie_clrPendingFuncInts(). 

I've also found forum posts that reference the PCIECTRL_TI_CONF_IRQ_EOI register.  Supposedly it must
be written to enable further MSI interrupts, but the sample code does not appear to use it, and it
does not appear to be required since MSI interrupts are working in general.

So, I started looking at the IRQSTATUS_MSI and MSI_CTRL_INT_STATUS registers more closely when I was
in the failure case. It seemed like the MSI_CTRL_INT_STATUS was set but the IRQSTATUS_MSI was not. This
got me to look at the timing of the two interrupts.  It appeared that if two MSI interrupts happened
within less than 4.5 usec of each other, the MSI interrupt would stop working. And, in fact, if I were to clear
the MSI_CTRL_INT_STATUS register (by writing its same value back to it) the MSI interrupt would start
working again.
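
Writing a status value back to clear it is the usual write-1-to-clear (W1C) pattern. Below is a minimal host-side sketch of that behaviour. It uses a plain struct as a stand-in for the hardware register, and the type and function names are hypothetical, not PDK APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a write-1-to-clear (W1C) status register.
 * This is an illustration only; it does not touch real hardware. */
typedef struct {
    uint32_t status;
} w1c_reg_t;

/* Hardware side of the model: an incoming event latches one or more bits. */
static void w1c_latch(w1c_reg_t *r, uint32_t bits)
{
    assert(r != NULL);
    r->status |= bits;
}

/* Software side of the model: every 1 written clears that bit,
 * every 0 written leaves the bit untouched. */
static void w1c_write(w1c_reg_t *r, uint32_t value)
{
    assert(r != NULL);
    r->status &= ~value;
}

/* The recovery step described above: read the register and write the same
 * value back, which clears exactly the bits that were observed as set. */
static uint32_t w1c_ack_all(w1c_reg_t *r)
{
    assert(r != NULL);
    uint32_t seen = r->status;
    w1c_write(r, seen);
    return seen;
}
```

Writing back only the value that was read means a bit that latches after the read is not cleared by accident, at least in this model.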

So, that got me looking at the sample code function PlatformMsiIntxIsr().  The code starts with a call to
Pcie_getPendingFuncInts() to read the two registers IRQSTATUS_MSI and MSI_CTRL_INT_STATUS.  Then bits from
those are checked to see if this was indeed an MSI interrupt and if it was the MSI interrupt expected.  I
also added code to read a memory location to detect which of the two possible data transfers completed.
Then at the end of the function Pcie_clrPendingFuncInts() was called to clear the interrupt and presumably
enable future interrupts.  Note that Pcie_clrPendingFuncInts() does not write to the EOI register.

So, I did a test: I moved the call to Pcie_clrPendingFuncInts() from the end of the ISR function to
immediately after the call to Pcie_getPendingFuncInts() at the start of the ISR.  This greatly improved things.

The code would run much longer before breaking, but would eventually stop servicing MSI
interrupts as before.  So, then I changed the code to call Pcie_clrPendingFuncInts() not only at the
start of the ISR but also at the end.  This allowed the code to run longer still, but I do
miss interrupts (probably because IRQSTATUS_MSI is not set correctly), and MSI interrupts still eventually stop.

It appears there can be a race condition when you get two MSI interrupts very close together.  Perhaps this
was the purpose of the EOI register: to prevent more MSI interrupts until the SW is done servicing the
current MSI interrupt.  However, the sample code does not seem to use it.
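
One possible explanation for the stuck state can be sketched as a small software model. It is a guess rather than anything confirmed by the TRM or by TI. The model assumes that the controller only raises the summary bit (and pulses the CPU interrupt) when the MSI vector goes from all-zero to non-zero, and that the ISR clears the vector first and the summary bit second. All names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, NOT taken from the TRM. */
typedef struct {
    uint32_t vector;     /* stands in for MSI_CTRL_INT_STATUS */
    bool     summary;    /* stands in for the MSI bit of IRQSTATUS_MSI */
    unsigned irq_edges;  /* number of pulses delivered to the CPU */
} msi_model_t;

/* Assumption: a pulse is generated only when the vector was idle (all zero)
 * and the summary bit is not already set. */
static void msi_arrive(msi_model_t *m, uint32_t bit)
{
    assert(m != NULL);
    bool was_idle = (m->vector == 0);
    m->vector |= bit;
    if (was_idle && !m->summary) {
        m->summary = true;
        m->irq_edges++;
    }
}

/* Software clears vector bits (write-1-to-clear in the model). */
static void sw_clear_vector(msi_model_t *m, uint32_t bits)
{
    assert(m != NULL);
    m->vector &= ~bits;
}

/* Software clears the summary bit. */
static void sw_clear_summary(msi_model_t *m)
{
    assert(m != NULL);
    m->summary = false;
}

/* Stuck state seen in the failure case: vector set, summary clear. */
static bool msi_is_stuck(const msi_model_t *m)
{
    assert(m != NULL);
    return m->vector != 0 && !m->summary;
}
```

In this model, an MSI that arrives between the two clears leaves the vector set with the summary bit clear, and no further pulses are generated until the vector is cleared again. That matches the observed symptom, but the real hardware may behave differently.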

So, I guess I am looking for suggestions on how to make this more reliable, that is, how to handle closely
spaced MSI interrupts.

Thanks,

Chris

  • The RTOS team have been notified. They will respond here.

    FYI: AM57x related RTOS questions should be posted on the Sitara forum, so you posted correctly.
  • Hi Chris,

    Really appreciate your contribution to the TI Processor RTOS software and this forum, especially the debug work and proposed solutions! I need to look into how to improve the reliability of closely spaced MSI interrupts and will get back to you.

    Regards, Eric
  • Chris,

    For the PCIECTRL_TI_CONF_IRQ_EOI register, the TRM says "Software End-Of-Interrupt: Allows the generation of further pulses on the interrupt line, if an new interrupt event is pending, when using the pulsed output. Unused when using the level interrupt line (depending on module integration)." Legacy interrupts are level triggered and MSI is edge triggered. I need to check why we don't write to EOI and whether doing so helps with the missed interrupts.

    Since you have a way to generate closely spaced MSI interrupts and have modified the MSI reception ISR code, is your code based on our Processor SDK RTOS PCIe example? If yes, would you be able to share it so that we can reproduce the issue?

    Regards, Eric
  • I've been thinking about this some more. In any hardware based interrupt there is going to be a problem when a new interrupt comes in before the current interrupt is done being serviced. Since we are trying to use the same interrupt for two different FPGA/DSP PCIe transfers, it is really up to the FPGA to not violate that servicing time. Obviously the shorter the time the better, so it would be best to understand the process completely in order to be confident that the ISR time is minimized. I don't think the EOI would help, since it won't hold off the FPGA from asserting the second MSI interrupt too quickly. In fact, I'm not really sure why the EOI is needed. So, we are looking at ways in the FPGA to detect and prevent this problem.

    For your question, the MSI interrupt is coming from our endpoint implementation in the FPGA, so I can't really share that with you in a meaningful way. If I were going to test this without an FPGA, I might use two EVMs with one as an RC and the other as an EP. Have the RC request data from the EP, which when complete will trigger an MSI. Then have a timer in the EP send a second block closely spaced to the first. With too short a timer interval, the RC should miss the interrupt and possibly get into the bad state.
  • Hi,

    I don't want to hijack your thread, but the issue sounds somewhat familiar. I have problems with WiFi cards connected via PCIe to an AM5728. I'm running Linux, not the RTOS, so this might be more of a HW problem.

    The behaviour I observe at the higher level is that the WiFi card stops working after some time and I get a crash dump. Going deeper, I observe that I no longer get any MSIs. (The Intel WiFi cards have a programmable timer interrupt; when I keep it enabled I can see that this interrupt also stops arriving.) I documented my tests in this thread: https://e2e.ti.com/support/embedded/linux/f/354/t/564867

    I will now have a look into the Registers you mentioned and see if I can get similar changes in behaviour.

    Regards,

    Michael

  • Hi again,

    Here is a dump of the registers.

    After boot:

    PCIECTRL_PL_MSI_CTRL_ADDRESS = 0xae61f000
    PCIECTRL_PL_MSI_UPPER_ADDRESS = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_ENABLE_0 = 0x00000003
    PCIECTRL_PL_MSI_CTRL_INT_MASK_0 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_STATUS_0 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_ENABLE_1 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_MASK_1 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_STATUS_1 = 0x00000000
    PCIECTRL_TI_CONF_REVISION = 0x500a7200
    PCIECTRL_TI_CONF_IRQ_EOI = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_RAW_MAIN = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_MAIN = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_RAW_MSI = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_MSI = 0x00000000
    PCIECTRL_TI_CONF_IRQENABLE_SET_MSI = 0x0000001f
    PCIECTRL_TI_CONF_DEVICE_TYPE = 0x00000004

    After the driver crashes:

    PCIECTRL_PL_MSI_CTRL_ADDRESS = 0xae633000
    PCIECTRL_PL_MSI_UPPER_ADDRESS = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_ENABLE_0 = 0x00000003
    PCIECTRL_PL_MSI_CTRL_INT_MASK_0 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_STATUS_0 = 0x00000002
    PCIECTRL_PL_MSI_CTRL_INT_ENABLE_1 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_MASK_1 = 0x00000000
    PCIECTRL_PL_MSI_CTRL_INT_STATUS_1 = 0x00000000
    PCIECTRL_TI_CONF_REVISION = 0x500a7200
    PCIECTRL_TI_CONF_IRQ_EOI = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_RAW_MAIN = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_MAIN = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_RAW_MSI = 0x00000000
    PCIECTRL_TI_CONF_IRQSTATUS_MSI = 0x00000000
    PCIECTRL_TI_CONF_IRQENABLE_SET_MSI = 0x0000001f
    PCIECTRL_TI_CONF_DEVICE_TYPE = 0x00000004

    As you can see PCIECTRL_PL_MSI_CTRL_INT_STATUS_0 is not 0 but PCIECTRL_TI_CONF_IRQSTATUS_MSI is.

    My simple hack was now to reset PCIECTRL_PL_MSI_CTRL_INT_STATUS_0 (as in 'write 0x2 to it') at the end of the interrupt handler. Et voila: I have stable wifi.

    The question would now be: How can this bug be fixed properly?

    Regards,

    Michael

  • Me again,

    I compared the manual with the (Linux) code.

    How it's written in the TRM:

    Typical MSI interrupt service routine on the RC-configured PCIe controller:
    • Remote EP function transmits an MSI write, with an address and format previously assigned by the RC SW using configuration accesses.
    • MSI write access is routed to the RC (local controller)
    • Access is identified as an MSI thanks to its unique address, and is routed to the internal "MSI controller" instead of being routed to the AXI master.
    • PCIECTRL_TI_CONF_IRQSTATUS_MSI[4] MSI bit goes to 1
    • MSI interrupt line is asserted
    • SW reads PCIECTRL_TI_CONF_IRQSTATUS_MSI status register, and identifies an (unspecified) MSI event, as opposed to a PCI legacy event.
    • SW accesses the MSI related PL registers, which identifies a given event for a given EP function.
    • SW accesses this EP function over PCIe to process the interrupt: a dedicated interrupt service routine
    • SW clears the vector in the MSI PL register
    • SW clears the PCIECTRL_TI_CONF_IRQSTATUS_MSI status bit to 0, assuming there is no other outstanding event in the MSI queue.
    • MSI interrupt line is deasserted, assuming there is no other asserted event.
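
    The software side of the sequence above can be sketched in plain C against a simulated register pair. This is only an illustration under stated assumptions: the struct, the handler type and the function names are hypothetical, and the "registers" are ordinary memory, not the real PCIECTRL registers. Bit 4 as the MSI bit comes from the TRM text quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSI_SUMMARY_BIT (1u << 4)  /* IRQSTATUS_MSI[4], per the TRM text */

/* Hypothetical stand-ins for the two status registers. */
typedef struct {
    uint32_t irqstatus_msi;  /* models PCIECTRL_TI_CONF_IRQSTATUS_MSI */
    uint32_t msi_vector;     /* models PCIECTRL_PL_MSI_CTRL_INT_STATUS_N */
} fake_pcie_regs_t;

typedef void (*msi_handler_t)(unsigned vector_index, void *ctx);

/* Example handler: records which vectors were serviced in a bitmask. */
static void record_vector(unsigned vector_index, void *ctx)
{
    *(uint32_t *)ctx |= (1u << vector_index);
}

/* Follows the TRM order: identify the MSI, service each pending vector,
 * clear that vector, and clear the summary bit only when no vector is
 * left pending. Returns the number of vectors serviced. */
static unsigned msi_isr_trm_order(fake_pcie_regs_t *regs,
                                  msi_handler_t handler, void *ctx)
{
    assert(regs != NULL && handler != NULL);
    unsigned handled = 0;

    if ((regs->irqstatus_msi & MSI_SUMMARY_BIT) == 0)
        return 0;  /* not an MSI event */

    uint32_t pending = regs->msi_vector;
    for (unsigned i = 0; i < 32; i++) {
        if (pending & (1u << i)) {
            handler(i, ctx);                 /* service the EP function */
            regs->msi_vector &= ~(1u << i);  /* models the W1C vector clear */
            handled++;
        }
    }

    /* "assuming there is no other outstanding event in the MSI queue" */
    if (regs->msi_vector == 0)
        regs->irqstatus_msi &= ~MSI_SUMMARY_BIT;

    return handled;
}
```

    In this sketch, a vector bit that appears while the handler runs is not in the snapshot, so the summary bit is left set and the event is not silently dropped. Whether the real controller re-raises the interrupt in that case is not something this model can show.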

    And this is the code in the Linux Kernel:

    The dra7xx specific irq handler is here: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/pci/host/pci-dra7xx.c#n186

    The designware specific MSI handler is here: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/pci/host/pcie-designware.c?h=linux-4.9.y#n258

    So the MSI handler iterates through the vector and clears one bit at a time and then returns. The IRQ Handler just clears the PCIECTRL_TI_CONF_IRQSTATUS_MSI bit and returns.

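    From the TRM step above about clearing the status bit "assuming there is no other outstanding event in the MSI queue" (the part I marked in yellow), I understand that one should check whether the MSI vector (aka PCIECTRL_PL_MSI_CTRL_INT_STATUS_N) really is zero before zeroing PCIECTRL_TI_CONF_IRQSTATUS_MSI, and if it is not, handle that MSI.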

    As 'dw_handle_msi_irq' is currently implemented, it clears the MSI vector bit before handling the IRQ. By swapping these two statements I get slightly better stability.

    I have now added a while loop to the handler so it only exits when PCIECTRL_PL_MSI_CTRL_INT_STATUS_N is 0. That seems to run stable.

    irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
    {
        unsigned long val;
        int i, pos, irq;
        irqreturn_t ret = IRQ_NONE;

        for (i = 0; i < MAX_MSI_CTRLS; i++) {
            /* Read the MSI status vector of controller i. */
            dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4,
                        (u32 *)&val);
            /* Added loop: keep going until the vector reads back as zero. */
            while (val != 0) {
                ret = IRQ_HANDLED;
                pos = 0;
                while ((pos = find_next_bit(&val, 32, pos)) != 32) {
                    irq = irq_find_mapping(pp->irq_domain,
                                   i * 32 + pos);
                    /* Clear this vector bit, then dispatch its handler. */
                    dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS +
                                i * 12, 4, 1 << pos);
                    generic_handle_irq(irq);
                    pos++;
                }
                /* Re-read to catch MSIs that arrived in the meantime. */
                dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12,
                        4, (u32 *)&val);
            }
        }
        return ret;
    }

    Regards,

    Michael

    (Okay, in the end I hijacked the thread. Sorry for that.)

  • PCIe has now been running stably over the weekend. But I had to use code which explicitly clears the MSI vector after clearing the PCIECTRL_TI_CONF_IRQSTATUS_MSI status bit.

    I also want to add that I think there are contradicting statements in TRM section 24.9.4.6.2.2.1:

    The PCIECTRL_TI_CONF_IRQSTATUS_MSI status bit MSI must remain set as long as a vector is still
    set. Once the last vector has been cleared, the MSI bit clears automatically.

    vs

    SW clears the PCIECTRL_TI_CONF_IRQSTATUS_MSI status bit to 0, assuming there is no other
    outstanding event in the MSI queue.

    Regards,

    Michael

  • Great info you have provided here Michael.  I think we had slightly different issues.  I had a case where I got two of the same interrupt too close together and you had two different MSI interrupts too close together.  Since you basically implemented a polling loop inside your ISR, you could maybe get stuck in the ISR if you had a case where interrupts came too quickly.  TI attempted to handle GPIO based interrupts with a slightly different approach.  When they got the interrupt they would read the entire 32 bit status word, then clear it.  The core would then service the interrupts it cared about, but since it cleared all the interrupts the ones not cared about would also be cleared.  This becomes a problem when you have two different cores pending on interrupts from the same GPIO bank. 

    You could do something similar here.  Save off the value of the MSI register, clear it, service those MSI interrupts that were set.  However, you'd still have the issue of a second interrupt coming in while servicing.  Maybe this is an appropriate use of a SWI?  I have not yet used SWI's.  Perhaps in this case the HWI should only read the MSI register, send the value to the SWI, then clear the MSI and IRQ Status registers.  This way the HWI is returned to service as quickly as possible, you don't have any possible polling loop, and the SWI could have a queue to service more MSI values from the HWI.  Now, how to implement that, I do not know.
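
    As a rough illustration of the hand-off described above, here is a minimal single-producer, single-consumer queue in plain C. The idea is that the HWI pushes each MSI vector snapshot and the SWI pops them later. This is a sketch under assumptions, not a TI-RTOS implementation: the type and function names and the queue depth are hypothetical, no SYS/BIOS APIs are used, and real hardware may additionally need memory barriers between writing a slot and publishing the new head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSI_QUEUE_DEPTH 8u  /* must be a power of two; value is arbitrary */

typedef struct {
    volatile uint32_t head;            /* written only by the producer (HWI) */
    volatile uint32_t tail;            /* written only by the consumer (SWI) */
    uint32_t slots[MSI_QUEUE_DEPTH];   /* saved MSI vector snapshots */
} msi_queue_t;

/* Called from the HWI: store one vector snapshot.
 * Returns false when the queue is full, so the caller can count overruns. */
static bool msi_queue_push(msi_queue_t *q, uint32_t vector_bits)
{
    assert(q != NULL);
    if (q->head - q->tail == MSI_QUEUE_DEPTH)
        return false;
    q->slots[q->head % MSI_QUEUE_DEPTH] = vector_bits;
    q->head = q->head + 1u;  /* publish only after the slot is written */
    return true;
}

/* Called from the SWI: fetch the oldest snapshot.
 * Returns false when there is nothing left to service. */
static bool msi_queue_pop(msi_queue_t *q, uint32_t *vector_bits)
{
    assert(q != NULL && vector_bits != NULL);
    if (q->head == q->tail)
        return false;
    *vector_bits = q->slots[q->tail % MSI_QUEUE_DEPTH];
    q->tail = q->tail + 1u;
    return true;
}
```

    With this split, the HWI would read the MSI register, push the value, clear the MSI and IRQ status registers and post the SWI. The SWI would then call msi_queue_pop() in a loop until it returns false. A full queue shows up as a failed push rather than as a silently lost interrupt.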

    Thanks,

    Chris

  • I guess I wasn't clear on that part: I only have the Wifi Card attached and it only has one Interrupt assigned. So it resembles pretty much your case.

    But you're right. I also don't like using the while loop inside the ISR.
    Maybe clearing the MSI after reading the vector, before handling it, might work. But for now I'll wait for a suggestion/fix from TI.

    Michael
  • So, I thought I would try the SWI approach to see if it helped. The idea was to make the HWI as short as possible. I came up with this as the PCIe HWI:

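    void PlatformMsiIntxIsr(uintptr_t vhandle)
    {
        /* Raise a GPIO so the ISR duration is visible on the oscilloscope. */
        AM57xxGpio::gpioWrite(AM57xxGpio::FPGA_RST_N, HIGH);

        /* Check the interrupt status register for a pending MSI (bit 4). */
        UInt32 pendingBits = *(volatile UInt32*)(0x51802034);
        if ((pendingBits & 0x00000010) == 0x00000010)
        {
            /* Read the MSI vector and hand it to the SWI. */
            uint32_t msiBits = *(volatile UInt32*)(0x51800830);
            Swi_or(pcieSwi, msiBits);

            /* Clear the MSI vector, then clear the MSI interrupt. */
            *(volatile UInt32*)(0x51800830) = msiBits;
            *(volatile UInt32*)(0x51802034) = 0x00000010;
        }

        AM57xxGpio::gpioWrite(AM57xxGpio::FPGA_RST_N, LOW);
    }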

    So, check for an MSI interrupt, read the MSI vector and send to the SWI, clear the MSI vector and clear the MSI interrupt. The gpio writes are to allow me to observe the ISR execution time on the oscilloscope.

    Can't get much simpler than that I would expect. The results are mixed. It did definitely reduce the minimum observed ISR service time, but the maximum time is still somewhat disappointing. The minimum service time is 600 nsec and the maximum time is 6.84 usec. So, while the min time saw a nice improvement, the max time did not. I'm not sure why the ISR sees such a large variance in execution time. We would still have to have at least a 7 usec ISR delay in our FPGA to make sure interrupts never overlap.
  • BTW, my original HWI has a min service time of 3.8 usec and a max of 13.4 usec. So, that is a decent improvement.