RTOS/TMS320C6670: Missing MSI

Part Number: TMS320C6670

Tool/software: TI-RTOS

Hello!

I have a system in which a C6670 communicates with a Spartan-6 FPGA over PCIe. We use MSI to interrupt the DSP.

The system has been in use for several years and has been quite stable. Recently, however, we have hit an MSI issue that is hard to trace. Under certain circumstances, during intensive PCIe traffic, the PCIe subsystem appears to fail to trigger an interrupt. Monitoring the MSI0_IRQ* registers, I see:

(*0x21800100)	0x0000000C
(*0x21800104)	0x00000004
(*0x21800108)	0x00000005
(*0x2180010c)	0x00000005

So according to MSI0_IRQ_ENABLE_SET (0x21800108 = 0x5), bits 0 and 2 are enabled, that is, MSI vectors 0 and 16. According to both MSI0_IRQ_STATUS (0x21800104) and MSI0_IRQ_STATUS_RAW (0x21800100), bit 2 (i.e. 0x4, MSI vector 16) is pending. However, the interrupt is not triggered. As an experiment, I cleared bit 2 by writing 4 to 0x21800104, then retriggered it by writing 16 to MSI_IRQ (0x21800054). I see the bit of interest get cleared and then asserted again, but the interrupt does not occur.
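To cross-check register readings like the ones above, the bit-to-vector mapping can be written as a small helper. This is only a sketch, assuming the layout from SPRUGS6 Table 2-10 (bit n of the register for PCIESS event 4+i corresponds to MSI vector i + 8n). The names msi_vector_from_bit and msi_vectors_from_mask are hypothetical and are not part of the CSL or the PCIe LLD.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a CSL/LLD call): MSI vector for a given
 * register index (0..7, i.e. PCIESS events 4..11) and bit position (0..3). */
static uint32_t msi_vector_from_bit(uint32_t evt_idx, uint32_t bit_pos)
{
    assert(evt_idx < 8u && bit_pos < 4u);
    return evt_idx + 8u * bit_pos;
}

/* Expand a 4-bit status or enable value into the MSI vectors it covers.
 * Returns the number of vectors written to out[] (at most 4). */
static unsigned msi_vectors_from_mask(uint32_t evt_idx, uint32_t mask,
                                      uint32_t out[4])
{
    unsigned n = 0;
    uint32_t bit_pos;

    for (bit_pos = 0; bit_pos < 4u; ++bit_pos) {
        if (mask & (1u << bit_pos)) {
            out[n++] = msi_vector_from_bit(evt_idx, bit_pos);
        }
    }
    return n;
}
```

For register index 0 (MSI0_*), a value of 0x5 decodes to vectors 0 and 16, 0x4 to vector 16, and 0xC to vectors 16 and 24.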

MSIs are routed through the EventCombiner; the relevant part of the config is:

EventCombiner.eventGroupHwiNum[0] = 7;
EventCombiner.eventGroupHwiNum[1] = 8;
EventCombiner.eventGroupHwiNum[2] = 9;
EventCombiner.eventGroupHwiNum[3] = 10;
 
EventCombiner.events[17].fxn    = '&pcie_isr';
EventCombiner.events[17].arg    = 0;
EventCombiner.events[17].unmask = false;

EventCombiner.events[18].fxn    = '&pcie_isr';
EventCombiner.events[18].arg    = 4;
EventCombiner.events[18].unmask = false;
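As a quick sanity check of this routing, the group and bit position of an event can be computed by hand. The sketch below assumes the usual C66x arrangement of 128 system events combined in four groups of 32; the HWI numbers are copied from the configuration above, and the helper names (event_group, event_bit_mask, event_hwi) are hypothetical, not SYS/BIOS APIs.

```c
#include <assert.h>
#include <stdint.h>

/* HWI number per event group, copied from eventGroupHwiNum[] above. */
static const int group_hwi[4] = { 7, 8, 9, 10 };

/* Event group (0..3) that a system event (0..127) belongs to. */
static unsigned event_group(unsigned event_id)
{
    assert(event_id < 128u);
    return event_id / 32u;
}

/* Bit mask of the event inside its group's 32-bit flag/mask register. */
static uint32_t event_bit_mask(unsigned event_id)
{
    assert(event_id < 128u);
    return (uint32_t)1u << (event_id % 32u);
}

/* HWI that services the combined event of the group containing event_id. */
static int event_hwi(unsigned event_id)
{
    return group_hwi[event_group(event_id)];
}
```

Under these assumptions, events 17 and 18 both fall into group 0 and share HWI 7, with masks 0x00020000 and 0x00040000, which are the bits to look at in the group 0 event flag and mask registers when checking whether the event reaches the CorePack.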

Both are enabled at runtime and are definitely operational. The point is that MSI vector 16 stops triggering after certain activity. I have checked what looks relevant to me:

Still, I could not find a reason why the MSI ISR is not triggered.

Please advise what else I should check.

Thanks

  • One more observation. It looks like all 4 MSIs of event 4 became nonfunctional: I tried vector 24 with no luck. However, when I wrote 4 (event 4 covers MSI interrupts 0, 8, 16, 24) to IRQ_EOI, the whole machinery was unblocked immediately. I do explicitly write to EOI in my ISR:

    void pcie_isr( xdc_UArg arg )
    {
        volatile CSL_Pciess_appRegs * const pcieAppRegs = (volatile CSL_Pciess_appRegs *)CSL_PCIE_CONFIG_REGS;
        u_int32 core_num = CSL_chipReadReg(CSL_CHIP_DNUM);
        u_int32 evt_idx, evt_num, msi_vector;
        int bit;
    
    
        /*********************************************************************
         * When using PCIe LLD to clear interrupt in PCIe subsystem it seems
         * that ISR is fired twice per trigger. Multiple reports on the issue
         * found on e2e. The workaround so far is to use direct register
         * manipulations.
         **********************************************************************/
    
        /*
         * Refer Table 2-10 in SPRUGS6:
         * Interrupt        Interrupt Description
         * Event Number
         *    4             MSI interrupts 0,  8, 16, 24 (EP/RC modes)
         *    5             MSI interrupts 1,  9, 17, 25 (EP/RC modes)
         *    6             MSI interrupts 2, 10, 18, 26 (EP/RC modes)
         *    7             MSI interrupts 3, 11, 19, 27 (EP/RC modes)
         *    8             MSI Interrupts 4, 12, 20, 28 (EP/RC modes)
         *    9             MSI Interrupts 5, 13, 21, 29 (EP/RC modes)
         *    10            MSI Interrupts 6, 14, 22, 30 (EP/RC modes)
         *    11            MSI Interrupts 7, 15, 23, 31 (EP/RC modes)
         *
         *    Core0 receives MSI 0-8-16-24 and 4-12-20-28 --> events 4 and 8
         *    Core1 receives MSI 1-9-17-25 and 5-13-21-29 --> events 5 and 9
         *
         * Do not confuse these event Id with interrupt controller event ids,
         * which are 17 and 18 for events 4, 5, 6, 7 and 8, 9, 10, 11
         * respectively. PCIESS event number is needed to access MSIX_IRQ,
         * MSI_IRQ_STATUS registers, which are array of [8]. Also this event
         * number is needed to signal End Of Interrupt (EOI).
         * Core number is found reading register. HWI for events 4, 5, 6, 7
         * receives arg = 0, while HWI for events 8, 9, 10, 11 receives arg = 4.
         * That is defined through SYS/BIOS config. For register arrays indexing
         * we use evt_idx, which is 0..7 for events 4..11.
         */
        evt_idx = core_num + arg;
        evt_num = evt_idx  + 4;
    
        //#if DEBUG_PCIE && DEBUG_HIGH
        //Log_print2( Diags_USER1, "pcie_isr(): core_num=%d, arg = %d", core_num, arg );
        //#endif
    
        /**********************************************************************
         * There are 8 MSIX_IRQ registers, we index them with evt_idx
         * found just above. Every register holds a 4-bit field. Each bit matches
         * MSI vectors 0, 8, 16, 24 in register[0], vectors 1, 9, 17, 25 in
         * register[1] and so on. Here we find the first vector being set,
         * giving priority to lower vectors.
         * Writing 1 to appropriate bit clears that interrupt status.
         * If there are more pending interrupt vectors, they will get processed
         * in the next invocation of the HWI.
         **********************************************************************/
        for ( bit = 1; bit <= 8; bit <<= 1 )
        {
            if ( bit & pcieAppRegs->MSIX_IRQ[evt_idx].MSI_IRQ_STATUS )
            {
                //pcieAppRegs->MSIX_IRQ[evt_idx].MSI_IRQ_STATUS = bit;
                break;
            }
        }
    
        /*
         * Calculate MSI vector depending on bit found in MSI_IRQ_STATUS
         * and PCIe event index.
         */
        switch ( bit )
        {
        case 1:
            msi_vector = evt_idx;
            break;
        case 2:
            msi_vector = evt_idx +  8;
            break;
        case 4:
            msi_vector = evt_idx + 16;
            break;
        case 8:
            msi_vector = evt_idx + 24;
            break;
        default:
            #if DEBUG_PCIE && DEBUG_HIGH
            Log_print1( Diags_USER1, "pcie_isr(): invalid bit = 0x%X ", bit );
            #endif
            return;
        }
    
        #if DEBUG_PCIE && DEBUG_HIGH
        Log_print3( Diags_USER1, "pcie_isr on Core%d: event %d, MSI vector %d ", core_num, evt_num, msi_vector );
        #endif
    
        /*
         * Check whether callback routine is registered and call it if so.
         * Note that the callback function is still executed in HWI context.
         * For larger code it might be reasonable to post SWI from callback routine.
         */
        if ( NULL != msi_cb_fcn[msi_vector] )
        {
            msi_cb_fcn[msi_vector]();
        }
        #if DEBUG_PCIE && DEBUG_HIGH
        else
        {
            Log_print1( Diags_USER1, "pcie_isr(): no callback routine for MSI vector %d ", msi_vector );
        }
        #endif
        /*************************************************************************
         * End Of Interrupt must be written explicitly. Note, that MSI interrupt
         * event number for interrupt vector is from PCIESS perspective, i.e
         * {4..11} (ref. SPRUGS6D, Table 2-10). Not to be confused with
         * CorePack Primary Interrupt event number (ref. SPRS689D, Table 7-38),
         * where event number 17 is PCIEXpress_MSI_INTn. This number is used to
         * define HWI trigger source.
         *************************************************************************/
        pcieAppRegs->MSIX_IRQ[evt_idx].MSI_IRQ_STATUS = bit;
        pcieAppRegs->IRQ_EOI = evt_num;
    
        return;
    }
    

    Please advise what could be the reason that the write to EOI from the ISR sometimes does not take effect.

    Thanks.

  • Yet another observation. The trouble seems to happen when more than one bit is set in the status register. Then I observe a really weird thing:

    31 119 706,    "pcie_isr(): core_num=0, arg = 0, msi_irq_status = 0x9, bit = 0x1",      // 0x9=1001 - two bits set
    31 119 706,    "pcie_isr on Core0: event 4, MSI vector 0 ",                             // 0x1 bit selected for processing
    31 119 712,        msi_iqr_status = 0x8,                                                // After clearing 0x1 just before leaving ISR 0x8 remains in the register
    31 119 713,    "pcie_isr(): core_num=0, arg = 0, msi_irq_status = 0x0, bit = 0x10",     // BANG! ISR invoked, but there is no bit in MSI0_IRQ_STATUS
    

    It looks like the MSI0_IRQ_STATUS bit was lost and the ISR was invoked with no pending interrupt.
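One thing this trace suggests checking: when the ISR posted earlier is entered with MSI_IRQ_STATUS equal to 0, its loop ends with bit = 0x10, the switch takes the default branch, and the function returns before IRQ_EOI is written. The sketch below shows an alternative flow under the assumption that a missing EOI is what blocks the event (consistent with the manual EOI write unblocking it, but not confirmed here). It services every bit found in one status snapshot and writes EOI on every exit path. The register block is a plain-C mock so that the logic can be run on a host; mock_msi_regs_t, mock_w1c, mock_write_eoi and service_msi_event are hypothetical names, not CSL or LLD APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side mock of the registers involved (hypothetical, illustration only). */
typedef struct {
    uint32_t status;      /* models MSIn_IRQ_STATUS (write 1 to clear) */
    uint32_t last_eoi;    /* last value written to IRQ_EOI             */
    unsigned eoi_writes;  /* number of EOI writes seen                 */
} mock_msi_regs_t;

/* Model of a write-1-to-clear access to the status register. */
static void mock_w1c(mock_msi_regs_t *r, uint32_t mask)
{
    r->status &= ~mask;
}

/* Model of a write to IRQ_EOI. */
static void mock_write_eoi(mock_msi_regs_t *r, uint32_t evt_num)
{
    r->last_eoi = evt_num;
    r->eoi_writes++;
}

/* Service all bits found in one snapshot of the status register.
 * handled[] receives the MSI vectors serviced; the count is returned.
 * EOI is written even when the snapshot is empty (spurious entry). */
static unsigned service_msi_event(mock_msi_regs_t *r, uint32_t evt_idx,
                                  uint32_t handled[4])
{
    unsigned n = 0;
    uint32_t bit_pos;
    uint32_t pending;

    assert(evt_idx < 8u);
    pending = r->status & 0xFu;

    for (bit_pos = 0; bit_pos < 4u; ++bit_pos) {
        uint32_t mask = 1u << bit_pos;
        if (pending & mask) {
            mock_w1c(r, mask);                     /* acknowledge this vector   */
            handled[n++] = evt_idx + 8u * bit_pos; /* a callback would run here */
        }
    }

    mock_write_eoi(r, evt_idx + 4u);  /* PCIESS event number 4..11 */
    return n;
}
```

With a status of 0x9 this handles vectors 0 and 24 in a single invocation and writes EOI once; with a status of 0x0 it handles nothing but still writes EOI, which is the path that the default branch of the posted ISR skips.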