TMS320C6670: C66x: what happens on PCIe completer delay?

Part Number: TMS320C6670


Hello experts!

I'd like to ask for clarification on PCIe operation. In our design a C6670 is connected to a Spartan 6 over a PCIe link. The FPGA has a DMA engine, which we use to upload large amounts of data to the DSP's DDR3. I see some suspicious misbehavior when those transfers are mixed with PIO read requests from the DSP.

Suppose there is a PIO read request in the code, such as a dereference of an address within the PCIe data space, which is translated to something like an LDx instruction. As I understand it, the PCIe subsystem issues a Read Request TLP and waits for the completion. What happens to the DSP during this time? Does it enter some kind of stall? Does it accept any interrupts during this time?
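
To make the question concrete, the kind of PIO read meant here can be sketched as below. This is only an illustration: `pio_read32` is a hypothetical helper name, and on the DSP the pointer would be an address inside the PCIe data space, so the call would normally compile down to a single load.

```c
#include <stdint.h>

/* Hypothetical helper: one blocking 32-bit PIO read.
 * The volatile qualifier keeps the compiler from caching or dropping
 * the access, so every call performs a real load from the address. */
static inline uint32_t pio_read32(const volatile uint32_t *reg)
{
    return *reg;
}
```

The function returns only once the load has produced a value, which is exactly the window this question is about.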

Now imagine the Read Request TLP was delivered to the EP, but the EP was busy transferring a large chunk of DMA data, so the completion is delayed considerably. What happens to the DSP in this case?

Thanks in advance.

  • The team is notified. They will post their feedback directly here.

    BR
    Tsvetolin Shulev
  • Well, it took me some time to find the reason.
    In fact, I could not confirm whether the completer delay had any influence. What I found is that under certain conditions the PCIe engine in the FPGA was broken: it asserted start of frame on the transaction interface and locked up in that state without a corresponding end-of-frame signal. I bet that is a severe violation of the spec, but I wonder whether the DSP can detect such abuse and take recovery actions.
  • Hi,

    From your description, it looks like the PCIe link is still alive but the FPGA side is locked up. You may look at the DSP's PCIe user guide, section 2.16 Error Handling: 2.16.2.2 PCI Express Baseline Error Handling and 2.16.2.3 PCI Express Advanced Error Reporting. Do you see any errors reported in the registers below?

    Register        Address
    STATUS_COMMAND  0x21801004
    DEV_STAT_CTRL   0x21801078
    LINK_STAT_CTRL  0x21801080
    PCIE_CERR       0x21801110
    PCIE_UNCERR     0x21801104
    ROOT_ERR_ST     0x21801130

    Regards, Eric
  • Hello Eric,

    Thank you for the suggestion.

    I have looked at those registers. When the FPGA locks up, I see

    STATUS_COMMAND  0x21801004  0x00100546  
    DEV_STAT_CTRL   0x21801078  0x0002281F  NFATAL_ERR
    LINK_STAT_CTRL  0x21801080  0x30110008  DLL_ACTIVE
    PCIE_CERR       0x21801110  0x00000000
    PCIE_UNCERR     0x21801104  0x00004000  CMPL_TMOT_ST
    ROOT_ERR_ST     0x21801130  0x0000002C  NFERR   MULT_FNF    ERR_FNF
    

    Compared to normal operation, only the last 3 error registers get nonzero values, and the first 3 registers are exactly the same.
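
    The flag names in the table can be cross-checked with a few bit masks. The sketch below assumes the standard PCI Express capability and Advanced Error Reporting bit positions; the macro names are labels made up for this sketch, not identifiers from the TI CSL.

```c
#include <stdint.h>

/* Bit positions follow the standard PCI Express capability and AER layouts.
 * The macro names are this sketch's own labels, not TI CSL identifiers. */
#define DEV_STAT_NFATAL_ERR   (1u << 17)  /* Device Status: non-fatal error detected */
#define LINK_STAT_DLL_ACTIVE  (1u << 29)  /* Link Status: data link layer active */
#define UNCERR_CMPL_TMOT_ST   (1u << 14)  /* AER uncorrectable: completion timeout */
#define ROOT_ERR_ERR_FNF      (1u << 2)   /* fatal/non-fatal error received */
#define ROOT_ERR_MULT_FNF     (1u << 3)   /* multiple fatal/non-fatal errors received */
#define ROOT_ERR_NFERR        (1u << 5)   /* non-fatal error message received */

/* Returns 1 when every bit of mask is set in value, 0 otherwise. */
static inline int flag_set(uint32_t value, uint32_t mask)
{
    return (value & mask) == mask;
}
```

    With the values above, `flag_set(0x00004000u, UNCERR_CMPL_TMOT_ST)` is true, and 0x0000002C is exactly `ROOT_ERR_NFERR | ROOT_ERR_MULT_FNF | ROOT_ERR_ERR_FNF`.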

    Originally the problem was found by my colleague. He noticed that during some long DMA transfers executed by the DMA engine on the FPGA side (i.e. the FPGA as EP sends posted write requests to the DSP as RC), the timer interrupt on the DSP was missing events. We traced this down to read activity on PCIe: the DSP was sending non-posted read requests to the FPGA and waiting for the completions. So my original guess was that while waiting for a completion the DSP is stalled and does not process interrupt requests. This is the question of the original post, and I am still interested to know what happens when the DSP reads a memory-mapped device but that read takes a long time to complete.

    With some simple crutches I let the completions come back faster, and around that time I found that the SoF/EoF sequence was in fact broken too.

    We load the program to the DSP with an emulator and debug in CCS. On the first attempts it looked as if the MSI interrupt was missing, which is understandable when the SoF/EoF sequence is broken. However, very often we see that emulation gets suspended and the following message appears in the console:

    C66xx_0: Trouble Reading Memory Block at 0x8dadd4 on Page 0 of Length 
    0x554: (Error -1060 @ 0x8DADF8) Device is not responding to the request. 
    Reset the device, and retry the operation. If error persists, confirm 
    configuration, power-cycle the board, and/or try more reliable JTAG 
    settings (e.g. lower TCLK). (Emulation package 7.0.188.0) 

    It looks like this: [screenshot of the CCS debug session]

    and I can't do anything further in my debug session.

    In the PCIe UG it is mentioned that PCIe errors can be used to trigger interrupts. I am going to try that; however, it looks like I would be unable to use Log_print, as emulation gets dropped.

    Could you please comment on the above?

    Thanks.

  • Hi,

    The PCIe read request from the DSP to the FPGA is a CPU read. If the read takes a long time to return while the timer interrupt occurs, I would think the interrupt has priority and the ISR should be entered. Does the ISR do any PCIe-related work that can't finish because the PCIe read has not returned (so the ISR is missed)? Or is the ISR never entered at all?

    "C66xx_0: Trouble Reading Memory Block at 0x8dadd4 on Page 0 of Length
    0x554: (Error -1060 @ 0x8DADF8) Device is not responding to the request" =====> The C6670 L2 has 1 MB, so who is using this region, your code or some data? The screenshot you posted shows the PC in MSMC; I am not sure about the relationship between the two.

    From the interrupt perspective, Table 2-10 PCIESS Interrupt Events in the PCIe user guide lists an error interrupt. In the C6670 datasheet this is event 48, PCIEXpress_ERR_INT (protocol error interrupt), a secondary interrupt routed to the CIC. You can implement a PCIe ISR for it.

    Regards, Eric
  • Hello Eric,

    Thank you for your continued help.

    Most of the program code resides in MSMCSRAM, according to

    Program.sectMap[".text"]        = "MSMCSRAM"; 
    Program.sectMap[".const"]       = "MSMCSRAM"; 
    Program.sectMap[".qmss"]        = "MSMCSRAM"; 
    Program.sectMap[".cppi"]        = "MSMCSRAM";
    Program.sectMap[".fardata"]     = "MSMCSRAM"; 
    Program.sectMap[".switch"]      = "MSMCSRAM"; 
    Program.sectMap[".vecs"]        = "MSMCSRAM"; 
    Program.sectMap[".cinit"]       = "MSMCSRAM"; 
    Program.sectMap[".cio"]         = "MSMCSRAM"; 
    Program.sectMap[".bss"]         = "MSMCSRAM"; 
    Program.sectMap[".rodata"]      = "MSMCSRAM"; 
    Program.sectMap[".neardata"]    = "MSMCSRAM";
    
    Program.sectMap["platform_lib"] = "MSMCSRAM";
    
    Program.sectMap["SystemHeap"]   = "L2SRAM";
    Program.sectMap[".far"]         = "L2SRAM"; 
    Program.sectMap[".stack"]       = "L2SRAM";
    Program.sectMap[".localBuf"]    = "L2SRAM"; 
    Program.sectMap["tcp3DriverSection"] = "L2SRAM";
    
    Program.sectMap["DataHeap"]         = "DDR3";
    Program.sectMap["CalHeap"]          = "DDR3";
    Program.sectMap[".Wavep"]           = "DDR3";
    Program.sectMap[".capture_meas"]    = "DDR3";

    I have to say that the address of the location that could not be read seems to be different on every occasion. What is important is that I cannot recover debugging unless I close and reopen the debug session.

    As to a possible lock-up of the timer ISR due to PCIe activity, that's not the case. In my test scenario the ISR increments a counter every 156.25 us and posts an SWI on every 4th run, giving an effective period of 625 us. Please look at the following log.

    I have a timer firing every 625 us and printing the counter. I set up a DMA transfer to be executed by the DMA engine on the FPGA side and immediately issue four read requests to the FPGA. The completions of these read requests interfere with the DMA flow and eventually break the DMA engine. That's my problem and I'm looking to solve it. However, it looks to me that when the DSP is in the middle of the memory read sequence, it does not handle interrupts.

      Time, us      Log message                             Sequence number
    4 760 130       main(): FPGA revision = 20170714:101,   3,
    4 803 483       lte_timers_processor(): clk = 0x6edc,   4,
    4 803 484       main(): FPGA revision = 20060000:0,     5,
    4 846 837       main(): FPGA revision = 20060000:0,     6,
    4 890 190       main(): FPGA revision = 20060000:0,     7,
    4 890 500       lte_timers_processor(): clk = 0x6ee0,   8,
    4 891 125       lte_timers_processor(): clk = 0x6ee4,   9,
    4 891 750       lte_timers_processor(): clk = 0x6ee8,   10,
    4 892 375       lte_timers_processor(): clk = 0x6eec,   11,

    At sequence number 3 we see a successful read of the FPGA version register (the value is correct). The next line, seq. #4, is the output of the timer ISR; note its time was 4 803 483 us. Next we see three completions for the outstanding read requests; they are broken as I described, and that is not important. The really important thing is that there is a pretty large delay between the reads of seq. #5 to #6 and #6 to #7, about 40 000 us each. And what is worse, the time difference between the occurrences of the timer processor ISR, seq. #4 and #8, is about 90 000 us. Note also that the clk counter is incrementing by 4, so there are no missed sequence numbers. If we look at seq. #8, 9, 10, when all PCIe troubles are over, there is exactly 625 us difference between them.

    This makes me think that when there is an outstanding memory read, the DSP cannot be interrupted in the middle of that cycle.
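
    The gaps quoted above can be recomputed directly from the log timestamps. This is a small host-side sketch; the array and the helper name `gap_us` are made up for the illustration.

```c
#include <stdint.h>

/* Log timestamps in microseconds, copied from the log above.
 * Index 0 corresponds to sequence number 3, index 8 to sequence number 11. */
static const uint32_t log_time_us[] = {
    4760130u, 4803483u, 4803484u, 4846837u, 4890190u,
    4890500u, 4891125u, 4891750u, 4892375u
};

/* Time elapsed between two log entries, given their sequence numbers (3..11). */
static uint32_t gap_us(unsigned from_seq, unsigned to_seq)
{
    return log_time_us[to_seq - 3u] - log_time_us[from_seq - 3u];
}
```

    With these numbers, `gap_us(5, 6)` and `gap_us(6, 7)` both give 43 353 us, `gap_us(4, 8)` gives 87 017 us, and `gap_us(8, 9)` gives 625 us. The gap between seq. #3 and #4 is also 43 353 us.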

    Now I'm going to speculate. My understanding of the interrupt sequence is that BIOS saves the registers that could be affected by ISR execution, then spawns the ISR and restores the registers on ISR completion. Imagine there is a memory read from PCIe or another unpredictable peripheral into a register. The read request has been sent, but no response has arrived yet. I guess the DSP is stalling in that state, but again, I only speculate. Imagine an interrupt request arrives in the middle of this sequence. The peripheral has not supplied the completion yet, so BIOS could not save the right value before spawning the ISR. Suppose the ISR was invoked and the completion arrived after that: it would write into the requested register, but that register now belongs to another context.

    Could you please comment on this?

    Meanwhile I am trying the error interrupt. The bad thing is that a PCIe failure disconnects the emulator, so I cannot see Log_prints. I want to try a reset of the PCIe subsystem, but I had no success with it in numerous prior attempts. We had a plan to restart the DSP application and needed to restart the PCIe link, and never succeeded in that. If you have any hint, that would be very welcome too.

    Thanks in advance.

  • Please comment on what could go wrong if a memory read is delayed. Thanks.
  • Hello Eric,
    I've managed to set up secondary event handling. To make sure it was operational, I used the CpIntc_postSysInt( 0, 48 ); function call to trigger event 48, PCIEXpress_ERR_INT, on CIC0. I saw the relevant message printed. However, when I run the system, this error event is not detected, even though I see the error bit set in the PCIE_UNCERR register at 0x21801104.
    Could you please suggest what else I should set up to catch that error?
    Thank you.
  • Hi,

    We have an example that routes legacy INTA (50 PCIEXpress_Legacy_INTA) to the CPU; see the code in pdk_c667x_2_0_x\packages\ti\boot\examples\pcie\pcieboot_interrupt\src. This would be similar to your case (48 PCIEXpress_ERR_INT). Let me know if the ISR can be entered.

    Regards, Eric
  • Hello Eric,

    Thank you for the suggestion, but it looks like I need something beyond that.

    I know how to capture PCIe interrupts, as I have the MSI function operational in my system, though MSIs are primary events. Recently I finally mastered the routing of secondary events like the error event under consideration. Moreover, I have an EDMA event routed in a similar way and it behaves properly. And to make sure, I triggered the system event manually, and it was okay.

    So I suspect that PCIe error reporting has to be enabled or set up somehow. I would appreciate a pointer to the proper reading or a better example.

    Thanks in advance.

  • Hello Eric,

    I've tried to look for relevant settings and came up with the following:

    pcieRet_e setup_pcie_error_intr(void)
    {
        pcieRet_e retval;
    
        pcieRegisters_t             setRegs;
        pcieErrIrqEnableSetReg_t    errIrqEnableSet;
        pcieRootErrCmdReg_t         rootErrCmd;
    
    
        // Clear all register structures
        memset( &setRegs,         0, sizeof(setRegs)         );
        memset( &errIrqEnableSet, 0, sizeof(errIrqEnableSet) );
        memset( &rootErrCmd,      0, sizeof(rootErrCmd)      );
    
        errIrqEnableSet.errCorr     = 1;
        errIrqEnableSet.errFatal    = 1;
        errIrqEnableSet.errNonFatal = 1;
        errIrqEnableSet.errSys      = 1;
    
        setRegs.errIrqEnableSet = &errIrqEnableSet;
    
        rootErrCmd.ferrRptEn    = 1;
        rootErrCmd.nferrRptEn   = 1;
    
        setRegs.rootErrCmd      = &rootErrCmd;
    
        if ( pcie_RET_OK != (retval = Pcie_writeRegs(handle, pcie_LOCATION_LOCAL, &setRegs) ) )
        {
        #if DEBUG_ERRORS
            System_printf("setup_pcie_error_intr: enabling error IRQ failed!\n");
        #endif // DEBUG_ERRORS
            return retval;
        }
    
        CpIntc_enableSysInt( CSL_CP_INTC_0, CSL_INTC0_PCIEXPRESS_ERR_INT );
    
        return pcie_RET_OK;
    }

    Also I have in my config the following:

    EventCombiner.eventGroupHwiNum[0] = 7;
    EventCombiner.eventGroupHwiNum[1] = 8;
    EventCombiner.eventGroupHwiNum[2] = 9;
    EventCombiner.eventGroupHwiNum[3] = 10;
     
    /* Create handlers for PCIe MSI statically */
    /*
     * Note that event IDs 17 and 18 are from the CorePac perspective. Do not
     * confuse them with MSI interrupt event numbers from the PCIESS perspective.
     * The Hwi is created implicitly for the Event Combiner in the enabled state.
     * We set up Event Combiner callbacks in the disabled state and enable them
     * in the interrupt setup routine.
     */
    EventCombiner.events[17].fxn    = '&pcie_isr';
    EventCombiner.events[17].arg    = 0;
    EventCombiner.events[17].unmask = false;
    
    EventCombiner.events[18].fxn    = '&pcie_isr';
    EventCombiner.events[18].arg    = 4;
    EventCombiner.events[18].unmask = false;
    
    /* ISR for Eth Receive, triggered by Event 48 */
    EventCombiner.events[48].fxn    = '&eth_rx_isr';
    EventCombiner.events[48].arg    = 0;
    EventCombiner.events[48].unmask = false;
    
    
    
    /* System event of EDMA routed through CIC0 */
    CpIntc.sysInts[36].fxn          = '&edma_isr';
    CpIntc.sysInts[36].arg          = 36;
    CpIntc.sysInts[36].hostInt      = 0;
    CpIntc.sysInts[36].enable       = true;
    CpIntc.mapHostIntToEventCombinerMeta( CpIntc.sysInts[36].hostInt );
    
    /* ISR for CIC0_OUT0 event 56 */
    EventCombiner.events[56].fxn    = CpIntc.dispatch;
    EventCombiner.events[56].arg    = CpIntc.sysInts[36].hostInt;
    EventCombiner.events[56].unmask = true;
    
    CpIntc.sysInts[48].fxn          = '&pcie_err_isr';
    CpIntc.sysInts[48].arg          = 48;
    CpIntc.sysInts[48].hostInt      = 1;
    CpIntc.sysInts[48].enable       = true;
    CpIntc.mapHostIntToEventCombinerMeta( CpIntc.sysInts[48].hostInt );
    //CpIntc.mapHostIntToHwiMeta( CpIntc.sysInts[48].hostInt, 11 );
    
    /* ISR for CIC0_OUT1 event 57 */
    EventCombiner.events[57].fxn    = CpIntc.dispatch;
    EventCombiner.events[57].arg    = CpIntc.sysInts[48].hostInt;
    EventCombiner.events[57].unmask = true;

    I tested MSIs and EDMA, and they operate as expected. I've tried to trigger the system event through CpIntc_postSysInt( 0, 48 ); and it does trigger the required ISR.

    In ROOT_ERR_ST at 0x21801130 I see 0x0000002C, which is NFERR | MULT_FNF | ERR_FNF, and I see 0x00004000 in PCIE_UNCERR at 0x21801104. However, the ISR was not triggered. If you have any idea, please suggest.

    Registers view:

    0x21801104	0x00000000
    0x218001c8	0x0000000F
    0x218001c4	0x00000000
    0x218001c0	0x00000000
    0x21801078	0x0000281F
    0x21801130	0x00000000
    0x21801108	0x00000000
    0x2180112c	0x00000006
    

    Thanks in advance.

  • Hi,

    From your description, it looks like the system interrupt is set up properly. The PCIe user guide describes the ERR interrupt registers in sections 3.1.72 to 3.1.75. I saw you have 0x1c8 set; if an interrupt has happened, you should see the bit set in 0x1c0 and 0x1c4. Even if you don't set up anything in 0x1c8, you should see the bit set in 0x1c0 (RAW STATUS) when an interrupt has happened.

    You can manually poke 0x1c0 and 0x1c4 to set a bit, which should invoke the PCIe ERR interrupt ISR. This can confirm the PCIe ERR interrupt setup.
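
    A poke of this kind could be sketched as below. The helper name `pcie_err_irq_poke_raw` is hypothetical, and the 0x21800000 base for the application registers is assumed from the 0x218001c0 address used in this thread. Which bit to set is left to the caller.

```c
#include <stdint.h>

/* Byte offset of ERR_IRQ_STATUS_RAW within the PCIe application registers. */
#define ERR_IRQ_STATUS_RAW_OFFSET  0x1C0u

/* Hypothetical self-test helper: write the given bits to ERR_IRQ_STATUS_RAW.
 * app_regs points at the start of the application register block; on the
 * C6670 this is assumed to be (volatile uint32_t *)0x21800000. */
static void pcie_err_irq_poke_raw(volatile uint32_t *app_regs, uint32_t bits)
{
    app_regs[ERR_IRQ_STATUS_RAW_OFFSET / sizeof(uint32_t)] = bits;
}
```

    If the CIC and Event Combiner routing is correct and the poked bit is enabled in 0x1c8, the error ISR should run right after the write.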

    Then, can you confirm that after the error happened in the real system, 0x1c0 and 0x1c4 are both zero? How do you know that an error happened? I saw 0x1104 is 0 (so no uncorrectable error happened?), and why is 112c set to 0 (reporting is not enabled)? Reading User Guide section 2.16 may give more ideas about what was missed.

    Regards, Eric
  • Hello Eric,

    Thank you for your continued support.

    First of all, perhaps there was a misreading about ROOT_ERR_CMD. The very last line of the register readings is 0x2180112c 0x00000006, i.e. I have bits 1 and 2 set, corresponding to NFERR_RPT_EN and FERR_RPT_EN, but not CERR_RPT_EN. It costs me nothing to enable that too, but anyway, I had fatal/non-fatal error reporting enabled, and in ROOT_ERR_ST at 0x21801130 I saw 0x0000002C, which is NFERR | MULT_FNF | ERR_FNF.

    Registers upon init

    *((unsigned *)0x21801104)	unsigned int	0x00000000 (Hex)	0x21801104	
    *((unsigned *)0x218001c8)	unsigned int	0x0000000F (Hex)	0x218001C8	
    *((unsigned *)0x218001c4)	unsigned int	0x00000000 (Hex)	0x218001C4	
    *((unsigned *)0x218001c0)	unsigned int	0x00000000 (Hex)	0x218001C0	
    *((unsigned *)0x21801078)	unsigned int	0x0000281F (Hex)	0x21801078	
    *((unsigned *)0x21801108)	unsigned int	0x00000000 (Hex)	0x21801108	
    *((unsigned *)0x2180112c)	unsigned int	0x00000007 (Hex)	0x2180112C	
    *((unsigned *)0x21801130)	unsigned int	0x00000000 (Hex)	0x21801130	
    

    Registers after error transfer

    *((unsigned *)0x21801104)	unsigned int	0x00004000 (Hex)	0x21801104	
    *((unsigned *)0x218001c8)	unsigned int	0x0000000F (Hex)	0x218001C8	
    *((unsigned *)0x218001c4)	unsigned int	0x00000000 (Hex)	0x218001C4	
    *((unsigned *)0x218001c0)	unsigned int	0x00000020 (Hex)	0x218001C0	
    *((unsigned *)0x21801078)	unsigned int	0x0002281F (Hex)	0x21801078	
    *((unsigned *)0x21801108)	unsigned int	0x00000000 (Hex)	0x21801108	
    *((unsigned *)0x2180112c)	unsigned int	0x00000007 (Hex)	0x2180112C	
    *((unsigned *)0x21801130)	unsigned int	0x0000002C (Hex)	0x21801130	
    

    I had 0x00000020 in 0x218001c0, so there was already an error: ERR_AER, the ECRC error raw status. However, as you suggested, I manually poked the raw register at 1C0 and a miracle happened: the error ISR was triggered! I've made sure ERR_IRQ_ENABLE_SET at 1C8 was 0xF, so it should react to ERR_CORR, ERR_NONFATAL, ERR_FATAL and ERR_SYS. There might be something more to set up, but I'm lost already.
    If you have any better idea, please suggest.

    Thanks in advance.

  • Well, I think the key issue is that the ERR Interrupt Enabled Status Register (ERR_IRQ_STATUS) at 1C4 is all zero, even though I tried to enable interrupts by writing to the ERR Interrupt Enable Set Register (ERR_IRQ_ENABLE_SET) at 1C8. Somehow that had no effect...
  • Hello,

    It's me again. Consider Table 3-1 PCI Express Application Registers of the KeyStone Architecture Peripheral Component Interconnect Express (PCIe) User Guide, SPRUGS6D, September 2013. There are 4 interesting registers:

    1C0h ERR_IRQ_STATUS_RAW Raw ERR Interrupt Status Register Section 3.1.72
    1C4h ERR_IRQ_STATUS ERR Interrupt Enabled Status Register Section 3.1.73
    1C8h ERR_IRQ_ENABLE_SET ERR Interrupt Enable Set Register Section 3.1.74
    1CCh ERR_IRQ_ENABLE_CLR ERR Interrupt Enable Clear Register Section 3.1.75

    To my understanding, ERR_IRQ_STATUS at 1C4h is in effect a mask register, enabling or disabling specific events. Also to my understanding, that register is manipulated through the set/clear register pair ERR_IRQ_ENABLE_SET at 1C8h and ERR_IRQ_ENABLE_CLR at 1CCh. Writing 1 to the former should enable events, and writing 1 to the latter should disable them. However, what I see is that a write to ERR_IRQ_ENABLE_SET at 1C8h has no effect on ERR_IRQ_STATUS at 1C4h.

    Initial state:

    *((unsigned *)0x218001c0)	unsigned int	0x00000000 (Hex)	0x218001C0	
    *((unsigned *)0x218001c4)	unsigned int	0x00000000 (Hex)	0x218001C4	
    *((unsigned *)0x218001c8)	unsigned int	0x00000000 (Hex)	0x218001C8	
    *((unsigned *)0x218001cc)	unsigned int	0x00000000 (Hex)	0x218001CC	
    

    Writing 0x3F to 1C8h ERR_IRQ_ENABLE_SET:

    *((unsigned *)0x218001c0)	unsigned int	0x00000000 (Hex)	0x218001C0	
    *((unsigned *)0x218001c4)	unsigned int	0x00000000 (Hex)	0x218001C4	
    *((unsigned *)0x218001c8)	unsigned int	0x0000003F (Hex)	0x218001C8	
    *((unsigned *)0x218001cc)	unsigned int	0x0000003F (Hex)	0x218001CC	
    

    Writing 0x30 to 1CCh ERR_IRQ_ENABLE_CLR

    *((unsigned *)0x218001c0)	unsigned int	0x00000000 (Hex)	0x218001C0	
    *((unsigned *)0x218001c4)	unsigned int	0x00000000 (Hex)	0x218001C4	
    *((unsigned *)0x218001c8)	unsigned int	0x0000000F (Hex)	0x218001C8	
    *((unsigned *)0x218001cc)	unsigned int	0x0000000F (Hex)	0x218001CC	
    

    Note that the set/clear pair changes as expected, but has no effect on ERR_IRQ_STATUS at 1C4h.

    Please advise what else I could try.

    Thanks in advance.

  • Okay, what finally helped was enabling the Power Management Event interrupt together with the system error enables in ROOT_CTRL_CAP at 108C.
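
    As a rough sketch of that setting: the bit positions below follow the Root Control register of the standard PCI Express capability, and the macro and function names are made up for this sketch. The full address 0x2180108C is an assumption based on the 0x21801000 base of the other registers in this thread.

```c
#include <stdint.h>

/* Root Control bits per the standard PCI Express capability layout.
 * The names are this sketch's own labels, not TI CSL identifiers. */
#define ROOT_CTRL_SERR_ON_CORR    (1u << 0)  /* system error on correctable error */
#define ROOT_CTRL_SERR_ON_NFATAL  (1u << 1)  /* system error on non-fatal error */
#define ROOT_CTRL_SERR_ON_FATAL   (1u << 2)  /* system error on fatal error */
#define ROOT_CTRL_PME_INT_EN      (1u << 3)  /* PME interrupt enable */

/* Read-modify-write that sets the four enables and keeps all other bits.
 * On the C6670 the caller would pass (volatile uint32_t *)0x2180108C,
 * which is an assumed address (see the note above). */
static void root_ctrl_enable_err_reporting(volatile uint32_t *root_ctrl_cap)
{
    *root_ctrl_cap |= ROOT_CTRL_SERR_ON_CORR | ROOT_CTRL_SERR_ON_NFATAL |
                      ROOT_CTRL_SERR_ON_FATAL | ROOT_CTRL_PME_INT_EN;
}
```

    The same write could also go through the PCIe LLD; the raw pointer form is used here only to keep the sketch short.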

    Another observation is that ERR_IRQ_STATUS at 1C4h is in no way an "Enabled" register, as it is described in the user guide. My observation is that whenever I write the ERR_IRQ_STATUS_RAW register at 1C0 manually, the enabled bits get asserted at 1C4 according to the mask in 1C8.

    Could you please clarify the meaning and operation of ERR_IRQ_STATUS at 1C4h?

    Somehow we stepped aside from the original question: what happens when the core issues a memory read but the peripheral takes a long time to respond? Please let me know whether I should ask about that elsewhere.

    Thanks in advance.

  • Hi,

    ERR_IRQ_ENABLE_SET is used to enable the interrupt.
    ERR_IRQ_ENABLE_CLR is used to disable the interrupt.
    ERR_IRQ_STATUS_RAW is mainly for debug purposes: even if you don't set ERR_IRQ_ENABLE_SET, it sets the bit when there is an interrupt.
    ERR_IRQ_STATUS sets the bit when there is an interrupt AND the bit is enabled through ERR_IRQ_ENABLE_SET. It is also used to clear the interrupt.
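
    The relationship between the four registers can be sketched as a small software model. This is plain C that mimics the description above, not the hardware itself, and the type and function names are made up for the sketch.

```c
#include <stdint.h>

/* Software model of the ERR interrupt register group. */
typedef struct {
    uint32_t raw;     /* ERR_IRQ_STATUS_RAW (1C0h): set regardless of the mask */
    uint32_t enable;  /* mask read back from ERR_IRQ_ENABLE_SET/CLR (1C8h/1CCh) */
} err_irq_model_t;

/* Writing 1s to ERR_IRQ_ENABLE_SET enables the matching events. */
static void err_irq_enable_set(err_irq_model_t *m, uint32_t bits) { m->enable |= bits; }

/* Writing 1s to ERR_IRQ_ENABLE_CLR disables the matching events. */
static void err_irq_enable_clr(err_irq_model_t *m, uint32_t bits) { m->enable &= ~bits; }

/* An error event, or a manual poke of the raw register, sets raw bits. */
static void err_irq_raise(err_irq_model_t *m, uint32_t bits) { m->raw |= bits; }

/* ERR_IRQ_STATUS (1C4h): raw events that are also enabled. */
static uint32_t err_irq_status(const err_irq_model_t *m) { return m->raw & m->enable; }
```

    In this model an enable mask of 0x0F with a raw value of 0x20 gives a status of 0, which matches the readings posted earlier (1C8h = 0x0000000F, 1C0h = 0x00000020, 1C4h = 0x00000000).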

    Thanks for finding out that ROOT_CTRL_CAP enables the interrupt. So I think you have the error interrupt working now?

    When an interrupt happens, the OS needs to save registers and restore them after serving the ISR so that code execution can continue. If the PCIe read stalls the CPU for a long time, there may be some issue, but I am not sure what will happen. Will the ISR be missed, or will the ISR be served but with the wrong registers saved, so that code execution goes wrong?

    Regards, Eric
  • Hello,

    Your explanation makes perfect sense. What I meant is that the ERR_IRQ_STATUS register is referred to as the ERR Interrupt Enabled Status Register. That "Enabled" twisted my mind.

    Yes, I have the error reporting chain operational now. I have not attempted any recovery action so far, but at least I have a tool to tell that something went wrong with PCIe. At this point that's more important to me.

    As to the original question, my observation is that the CPU stalls for as long as there is an outstanding read request. In an earlier post I showed that timer interrupt events scheduled 156.25 us apart were delayed by approximately 40 000 us, which is close to the default 50 ms completion timeout I've seen somewhere in the manual. So my feeling is that the CPU is completely stalled, which makes me think that a PCIe read in a time-critical app is dangerous. I wish I were wrong.

    Thank you very much for guiding me through this case. If you have any further considerations about memory read delay, please share. After that I'm going to mark this thread resolved.

    Thanks again.