XIO2001: help figuring out slow memory mapped I/O on a Dell 7090 desktop

Part Number: XIO2001

Hello,

We're hoping that an engineer at TI can help us figure out why an XIO2001 PCIe-to-PCI bridge is slow on a specific Dell desktop.

For comparison, here are some 32-bit read times using memory mapped I/O:

  • Dell XE2: 2 to 5 us
  • Dell 7090: 49 to 64 us

Unfortunately, the Dell XE2 is now obsolete, so we're testing the Dell 7090 as a possible replacement, which is when we ran into this issue.

Both of those desktops use the same PCIe to PCI bridge, the Texas Instruments XIO2001 PCI Express-to-PCI Bridge (0x104c:0x8240).

My hope is that the BIOS is just configuring the bridge improperly on the Dell 7090, but I'm not sure that's the case and would like guidance on finding the cause of the latency.

I took a quick look at the XIO2001 datasheet and it's over my head. It looks like there are various clock speeds (from 25 MHz to 66 MHz) that are configurable based on hardwired pins. However, I'm not sure how to query the device to see which frequency is in use (or to determine why the I/O is fast on one PC but slow on another). If I knew that the BIOS was configuring something wrong, I could go to Dell with proof and ask for a BIOS fix. If it turns out that Dell wired the XIO2001 incorrectly on the Dell 7090, then we'd just have to find another desktop for this system instead.

We see this both in QNX and Linux.

Thanks

  • Hi Devin,

    The response time can only have increased because of the BIOS or a different clock configuration on the PCB. However, I believe there must be a reason why the response time was slowed down. Even if we knew the exact reason, speeding up the response time could have other negative side effects.

    Regards, Nasser

  • Such large differences are not caused by different clocks, but by configuration changes that affect things like caching, prefetching, or bursts.

    Please show the lspci -vv output for the XIO2001 and the PCI device, on both machines.

    Does your device driver change any XIO configuration registers? How does the device driver map the MMIO memory ranges?

  • Hi Nasser,

    I disagree. The Dell XE2, with the same bridge, has significantly faster transfers without having any noticeable negative side effects.

    Thanks,

    Devin

  • Hi Clemens,

    Our device driver does not change any XIO configuration registers.

    MMIO is mapped the same on XE2 and 7090 as follows in the test code:

    // --------------------------------------------------------------
    int fd = open("/dev/mem", O_RDWR);
    volatile unsigned int *ioBase = (unsigned int *) mmap(0, 4, PROT_READ | PROT_WRITE,
                                                      MAP_SHARED, fd, (off_t)(bar0 + 0x1000)); // DT330_STATUS is at 0x1000
    // --------------------------------------------------------------
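    For context, per-read times like the ones quoted in this thread could be collected with a loop such as the one below. This is only a sketch: the helper name `time_reads_ns` and the use of `CLOCK_MONOTONIC` are assumptions, and the actual test code may measure differently. It accepts any `volatile` 32-bit pointer, so it could be pointed at the `ioBase` mapping above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: average duration of one 32-bit read, in nanoseconds.
 * 'reg' may point at an MMIO mapping (e.g. ioBase) or at ordinary memory. */
static double time_reads_ns(volatile uint32_t *reg, unsigned iterations)
{
    struct timespec start, end;
    uint32_t sink = 0;

    assert(reg != NULL && iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned i = 0; i < iterations; i++)
        sink ^= *reg;             /* volatile: one real read per iteration */
    clock_gettime(CLOCK_MONOTONIC, &end);
    (void)sink;

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / iterations;
}
```

    Averaging over many iterations hides the overhead of the two clock calls, which matters when a single read takes only a few microseconds.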

    I can see that there are differences in the configurations between XE2 and 7090, although I don't understand what the implications are or if they would explain the speed difference.

    XE2 (2 to 5 us 32-bit reads):
    ------------------------------------------------
    ubuntu@ubuntu:~$ lspci -vv -P -PP -nn
    <snip>
    00:1c.1/03:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240] (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Bus: primary=03, secondary=04, subordinate=04, sec-latency=32
    I/O behind bridge: [disabled]
    Memory behind bridge: f7b00000-f7bfffff [size=1M]
    Prefetchable memory behind bridge: [disabled]
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B-
    PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: <access denied>

    ubuntu@ubuntu:~$ sudo lspci -d 0x104c:0x8240 -xxx
    03:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge
    00: 4c 10 40 82 07 00 10 00 00 00 04 06 10 00 01 00
    10: 00 00 00 00 00 00 00 00 03 04 04 20 f1 01 a0 22
    20: b0 f7 b0 f7 f1 ff 01 00 00 00 00 00 00 00 00 00
    30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 00 12 00
    40: 0d 48 00 00 28 10 c1 05 01 50 03 06 08 00 40 00
    50: 05 70 88 00 00 00 00 00 00 00 00 00 00 00 00 00
    60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    70: 10 00 72 00 02 80 90 05 00 20 19 00 11 3c 06 00
    80: 40 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00
    90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    b0: 00 00 00 00 00 00 07 00 00 00 00 00 00 00 00 00
    c0: 01 00 00 03 08 01 12 00 00 20 14 32 00 00 00 00
    d0: 28 10 c1 05 5f 02 00 86 00 00 00 00 40 00 00 00
    e0: 00 00 00 00 00 00 00 00 43 04 08 00 7f 00 c0 01
    f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ------------------------------------------------

    7090 (49 to 64 us 32-bit reads):
    ------------------------------------------------
    ubuntu@ubuntu:~$ lspci -vv -P -PP -nn
    <snip>
    00:1c.0/01:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240] (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
    I/O behind bridge: [disabled]
    Memory behind bridge: 71400000-714fffff [size=1M]
    Prefetchable memory behind bridge: [disabled]
    Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
    PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: <access denied>

    ubuntu@ubuntu:~$ sudo lspci -d 0x104c:0x8240 -xxx
    02:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge
    00: 4c 10 40 82 07 00 10 00 00 00 04 06 00 00 01 00
    10: 00 00 00 00 00 00 00 00 02 03 03 00 f1 01 a0 22
    20: d0 70 d0 70 f1 ff 01 00 00 00 00 00 00 00 00 00
    30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 00 02 00
    40: 0d 48 00 00 00 00 00 00 01 50 03 06 08 00 40 00
    50: 05 70 88 00 00 00 00 00 00 00 00 00 00 00 00 00
    60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    70: 10 00 72 00 02 80 90 05 20 20 19 00 11 3c 06 00
    80: 42 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00
    90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    b0: 00 00 00 00 00 00 07 00 00 00 00 00 00 00 00 00
    c0: 01 00 00 02 08 01 12 00 00 20 14 32 00 00 00 00
    d0: 00 00 00 00 5f 02 00 86 00 00 00 00 40 00 00 00
    e0: 00 00 00 00 00 00 00 00 43 04 08 00 7f 00 c0 01
    f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ------------------------------------------------

    Thanks,

    Devin

  • Hello,

    I plotted the differences in config space in a Google spreadsheet. I should be able to run the commands listed below to make the 7090 config space match the XE2 config space.

    My questions are:

    1) Where can I find out more information about what these config space registers do?

    2) Are these all the standard "PCI config" space or are there some that are unique to the XIO2001?

    3) Are there any of these that I should not be running? For example, if the BARs are stored between 0x10 and 0x28, should I exclude those?

    4) Are there others that I should be checking (outside of the range that lspci -xxx covers)?

    5) Are there any here that might explain the differences in speed (or similarly, could any of the lspci output listed previously explain the speed difference)?

    sudo setpci -v -s 02:00.0 0C.w=0x10

    sudo setpci -v -s 02:00.0 18.w=0x03
    sudo setpci -v -s 02:00.0 19.w=0x04
    sudo setpci -v -s 02:00.0 1a.w=0x04
    sudo setpci -v -s 02:00.0 1b.w=0x20

    sudo setpci -v -s 02:00.0 20.w=0xb0
    sudo setpci -v -s 02:00.0 21.w=0xf7
    sudo setpci -v -s 02:00.0 22.w=0xb0
    sudo setpci -v -s 02:00.0 23.w=0xf7

    sudo setpci -v -s 02:00.0 3e.w=0x12

    sudo setpci -v -s 02:00.0 44.w=0x28
    sudo setpci -v -s 02:00.0 45.w=0x10
    sudo setpci -v -s 02:00.0 46.w=0xc1
    sudo setpci -v -s 02:00.0 47.w=0x05

    sudo setpci -v -s 02:00.0 78.w=0x00

    sudo setpci -v -s 02:00.0 80.w=0x40

    sudo setpci -v -s 02:00.0 c3.w=0x03

    sudo setpci -v -s 02:00.0 d0.w=0x28
    sudo setpci -v -s 02:00.0 d1.w=0x10
    sudo setpci -v -s 02:00.0 d2.w=0xc1
    sudo setpci -v -s 02:00.0 d3.w=0x05
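    The same comparison can be done in code rather than in a spreadsheet. A minimal sketch, assuming both `lspci -xxx` dumps have already been parsed into byte arrays (the helper name `diff_config` is made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: record the offsets at which two config-space dumps
 * differ. Up to 'max' offsets are stored in 'out'; the return value is the
 * total number of differing bytes, even if it exceeds 'max'. */
static size_t diff_config(const uint8_t *a, const uint8_t *b, size_t len,
                          uint8_t *out, size_t max)
{
    size_t count = 0;

    assert(a != NULL && b != NULL && len <= 256);

    for (size_t off = 0; off < len; off++) {
        if (a[off] != b[off]) {
            if (out != NULL && count < max)
                out[count] = (uint8_t)off;
            count++;
        }
    }
    return count;
}
```

    Applied to the two dumps posted earlier, this reports the same offsets as the commands above: 0x0c, 0x18 to 0x1b, 0x20 to 0x23, 0x3e, 0x44 to 0x47, 0x78, 0x80, 0xc3, and 0xd0 to 0xd3.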

    Thanks,

    Devin

  • You did not show the configuration space of the actual PCI device. But I guess that it has a single range of uncacheable memory.

    Do not change the bus numbers or memory addresses.

    Some registers are defined in the PCI(e) specification; some are specific to the XIO2001.

    Please note that to write single bytes, you must use .b, not .w.
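    To illustrate why the access width matters: a 16-bit write also replaces the neighboring byte. The sketch below works on an in-memory copy of config space, and the helper names are illustrative only; real writes would go through setpci or the operating system's PCI configuration interface.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper: 8-bit write, touches only the byte at 'off'. */
static void cfg_write8(uint8_t *cfg, uint8_t off, uint8_t val)
{
    cfg[off] = val;
}

/* Illustrative helper: 16-bit write. Config space is little-endian, so the
 * low byte lands at 'off' and the high byte at 'off + 1'. */
static void cfg_write16(uint8_t *cfg, uint8_t off, uint16_t val)
{
    assert(off < 0xFF);
    cfg[off]     = (uint8_t)(val & 0xFF);
    cfg[off + 1] = (uint8_t)(val >> 8);
}
```

    With the 7090 values (0x80 = 0x42, 0x81 = 0x01), a byte write of 0x40 to 0x80 leaves 0x81 untouched, whereas a word write of 0x40 to 0x80 would also set 0x81 to 0x00.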

    0C: The cache line size has an effect only when writing entire cache lines.

    3E: Matters only for old VGA cards; ignore it.

    44/D0: Subsystem ID: identifies the motherboard; ignore it.

    78: The maximum payload size matters only for cached reads/writes.

    80: Bits 0/1 enable power management; this can actually make a difference. I guess the ASPM setting in the BIOS affects this.

    C3: read only
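    For reference, in the dumps above the byte at 0x80 is the low byte of the PCI Express Link Control register (the PCI Express capability starts at 0x70), and bits 1:0 of that register form the ASPM Control field defined by the PCIe specification. A small sketch that decodes the field; the function name is illustrative:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ASPM Control field: bits 1:0 of the PCIe Link Control register. */
#define LNKCTL_ASPM_MASK 0x0003u

/* Illustrative helper: name the ASPM states enabled in a Link Control value. */
static const char *aspm_state(uint16_t lnkctl)
{
    switch (lnkctl & LNKCTL_ASPM_MASK) {
    case 0x0: return "disabled";
    case 0x1: return "L0s";
    case 0x2: return "L1";
    default:  return "L0s+L1";
    }
}
```

    The 7090 value 0x0142 decodes to "L1", while the XE2 value 0x0140 decodes to "disabled".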

  • Hi Clemens,

    The PCI device itself has the same configuration space settings between the XE2 and the 7090 (aside from their memory addresses). So I didn't think it was worthwhile dumping that data here (although, I should have mentioned that they were identical, sorry for that).

    I just tried changing the register at 0x80 on the XIO2001 from 0x42 to 0x40 and... it works!

    The time for a 32-bit read on the 7090 in Linux just dropped to 7 us!

    Where can I find more information about the register at 0x80? Is that specific to the XIO2001?

    I'll review the BIOS for general power saving features. It should currently be set to "highest performance", but I'll confirm that. As for PCI settings, I know there's only one available, which enables or disables the PCI port.

    I'll try this in QNX with our real-time application, but in Linux this looks very promising.

    Thanks so much for taking the time to review my issue and offer some guidance.

    Setting this in the BIOS is preferable if you do not want to change your software. The 7090 Service Manual says that the setting can be found in System setup options / Power / Active State Power Management.

    ASPM is defined by the PCIe specification; also see en.wikipedia.org/wiki/Active_State_Power_Management. In Linux, you can change it with the pcie_aspm kernel parameter.

  • Hi Clemens,

    Unfortunately this BIOS setting doesn't help. It could very well be a bug in the Dell BIOS though.

    The setting that you found has three possible values: Auto, Disabled, and L1 Only. I just tried all three and confirmed that 0x80 of the XIO2001 is still 0x42 in every case. If any of the BIOS options changed the 0x80 value to 0x40, it would probably be a proper solution.

    I tried reaching out to Dell through their community forums but encountered more difficulty than help. I'll see if I can reach someone helpful at Dell through another channel.

    In the meantime, I want to try manually writing 0x40 to 0x80 in a separate piece of software that runs on our system during boot-up. If that fixes the problem for us then I'll gladly accept this as an answer (and continue pushing Dell to address this with a BIOS update in parallel).

    If you see any other options in the BIOS that you want me to try then let me know. The one that you found definitely looks like it should be the proper option though. The 7090 that I'm working with does have the latest BIOS applied as well.

    Thanks

  • Clemens,

    Is 0x80 unique to the XIO2001 or do all PCI bridges need to implement it? Is there an XIO2001 datasheet that I could review or is 0x80 part of the PCIe spec?

    To ensure that ASPM is disabled, is clearing 0x2 sufficient? Or are there other bits that we should be checking and clearing in 0x80 as well? Is 0x2 for ASPM L1? Is there a bit for ASPM L0?

    I would like to make an internal fix for this that is generically applied to all PCI bridges (class 06, subclass 04, reg 00). However, if 0x80 is unique to the XIO2001 then I don't want to be clearing bits on other PCI bridges if those bits do not correspond to disabling ASPM.

    Also, shouldn't 0x10 be the PCIe Link Control Register instead of 0x80?

    Thanks

  • Clemens,

    I got my hands on a PCIe specification and it makes sense now. Previously I only worked with PCI devices. PCIe uses a Capability Pointer to point to a list of capabilities, so I need to traverse that list to find the capability and then the offset within that capability to the Link Control register.
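    The traversal described above can be sketched as follows. The code works on a 256-byte in-memory copy of config space so that it stays independent of any particular OS API; the function and macro names are assumptions made for illustration, not taken from an existing driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS           0x06   /* Status register */
#define PCI_STATUS_CAP_LIST  0x10   /* bit 4: capability list present */
#define PCI_CAP_PTR          0x34   /* Capabilities Pointer */
#define PCI_CAP_ID_EXP       0x10   /* PCI Express capability ID */
#define PCI_EXP_LNKCTL       0x10   /* Link Control, relative to the capability */
#define PCI_EXP_LNKCTL_ASPM  0x03   /* ASPM Control, bits 1:0 */

/* Return the config-space offset of capability 'id', or 0 if absent. */
static uint8_t find_capability(const uint8_t cfg[256], uint8_t id)
{
    assert(cfg != NULL);
    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    uint8_t pos = cfg[PCI_CAP_PTR] & 0xFC;        /* low 2 bits are reserved */
    for (int hops = 0; pos >= 0x40 && hops < 48; hops++) {
        if (cfg[pos] == id)
            return pos;
        pos = cfg[pos + 1] & 0xFC;                /* next-capability pointer */
    }
    return 0;
}

/* Return the offset of Link Control, or 0 if there is no PCIe capability. */
static uint8_t find_link_control(const uint8_t cfg[256])
{
    uint8_t cap = find_capability(cfg, PCI_CAP_ID_EXP);
    if (cap == 0 || cap > 0xFF - PCI_EXP_LNKCTL - 1)
        return 0;
    return (uint8_t)(cap + PCI_EXP_LNKCTL);
}

/* Clear ASPM Control in the in-memory copy; return 1 if anything changed. */
static int clear_aspm(uint8_t cfg[256])
{
    uint8_t off = find_link_control(cfg);
    if (off == 0 || !(cfg[off] & PCI_EXP_LNKCTL_ASPM))
        return 0;
    cfg[off] &= (uint8_t)~PCI_EXP_LNKCTL_ASPM;
    return 1;
}
```

    With the 7090 dump, the walk visits 0x40, 0x48, 0x50, and 0x70, which places Link Control at 0x80 and explains why that offset worked on the XIO2001. Because the offset is looked up rather than hard-coded, devices without a PCI Express capability are left untouched. On real hardware the reads and writes would go through the OS's config-space access; on Linux the raw config space is exposed as /sys/bus/pci/devices/<address>/config.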

    Thanks for your help. I should be able to come up with a generic solution for this issue.

    It's too bad Dell isn't helpful on addressing this in their BIOS. The ASPM setting is failing to disable ASPM.

    Thanks