DM814x PCIe or PCI hotplug driver support

Other Parts Discussed in Thread: XIO2001

Our setup is the following:

DM814x Board PCI-Express <--> TI XIO2001 PCIe-to-PCI Bridge <--> FPGA (PCI device)

The boot-up sequence is as follows:

  1. U-boot boots Linux OS
  2. Linux OS boot-up includes PCI bus initialization and enumeration and filesystem mounting.  In this case, only the PCIe RC and PCIe-to-PCI bridge will be detected.
  3. Once filesystem is mounted, the FPGA gets programmed.
  4. PCI bus re-enumeration is done using “echo 1 > /sys/bus/pci/rescan”

However, the FPGA is not detected after the PCI bus re-enumeration.  Only the PCI Express Root Complex and the XIO2001 PCIe-to-PCI bridge appear when listing the PCI devices on the bus.  Here is the lspci output at that point.

# ./lspci -vv

00:00.0 PCI bridge: Texas Instruments Device 8888 (rev 01) (prog-if 00 [Normal decode])

       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
       Latency: 0, Cache Line Size: 64 bytes
       Region 0: Memory at <ignored> (32-bit, non-prefetchable)
       Region 1: Memory at <ignored> (32-bit, prefetchable)
       Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
       Memory behind bridge: 20000000-200fffff
       Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
       BridgeCtl: Parity+ SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
               PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
       Capabilities: [40] Power Management version 3
               Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
               Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
       Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
               Address: 0000000000000000  Data: 0000
       Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
               DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                       ExtTag- RBE+ FLReset-
               DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                       RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                       MaxPayload 128 bytes, MaxReadReq 512 bytes
               DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
               LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s, Latency L0 <2us, L1 <64us
                       ClockPM- Surprise- LLActRep+ BwNot-
               LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
                       ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
               LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
               RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
               RootCap: CRSVisible-
               RootSta: PME ReqID 0200, PMEStatus- PMEPending-
               DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
               DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
               LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                        Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                        Compliance De-emphasis: -6dB
               LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                        EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
       Capabilities: [100 v1] Advanced Error Reporting
               UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
               UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
               UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
               CESta:  RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
               CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
               AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-

 

01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge (prog-if 00 [Normal decode])
       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
       Latency: 0, Cache Line Size: 64 bytes
       Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
       Memory behind bridge: 20000000-200fffff
       Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
       BridgeCtl: Parity+ SERR- NoISA- VGA- MAbort+ >Reset- FastB2B-
               PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
       Capabilities: [40] Subsystem: Gammagraphx, Inc. (or missing ID) Device 0000
       Capabilities: [48] Power Management version 3
               Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
               Bridge: PM- B3+
       Capabilities: [50] MSI: Enable- Count=1/16 Maskable- 64bit+
               Address: 0000000000000000  Data: 0000
       Capabilities: [70] Express (v2) PCI/PCI-X Bridge, MSI 00
               DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                       ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
               DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                       RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- BrConfRtry-
                       MaxPayload 128 bytes, MaxReadReq 512 bytes
               DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
               LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <16us
                       ClockPM+ Surprise- LLActRep- BwNot-
               LnkCtl: ASPM Disabled; Disabled- Retrain- CommClk-
                       ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
               LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
               DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
               DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
               LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                        Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                        Compliance De-emphasis: -6dB
               LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                        EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
       Capabilities: [100 v1] Advanced Error Reporting
               UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
               UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
               UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout+ NonFatalErr+
               CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
               AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
 

If the system is then rebooted, the FPGA device is detected on the PCI bus.

# ./lspci -tv

-[0000:00]---00.0-[01-02]----00.0-[02]--+-00.0  Altera Corporation Device 0010

According to the PCI Express Root Complex Driver User Guide (http://processors.wiki.ti.com/index.php/TI81XX_PSP_PCI_Express_Root_Complex_Driver_User_Guide),

Note-3: TI81XX PCIe hardware does not support hot plug and if an EP directly connected to the TI81XX RC goes down (e.g., powered down or disabled), the complete PCIe h/w initialization needs to be repeated and PCI enumeration re-triggered. This is NOT SUPPORTED by the RC driver and will require code modification to handle such cases.

Does anybody have the correct instructions on how to modify the RC driver so that a rescan of the PCI bus can be achieved?

Also, in our setup, is programming the FPGA considered hot-plugging from the perspective of the PCIe RC, or only from the perspective of the PCIe-to-PCI bridge (XIO2001)? I’m trying to determine whether the rescan of the PCI bus should be done from the PCI Express RC (echo 1 > /sys/bus/pci/rescan) or just from the PCIe-to-PCI bridge (echo 1 > /sys/bus/pci/devices/0000:01:00.0/rescan).  So far, neither approach resolves the issue.
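
For reference, the two rescan entry points mentioned above differ only in the sysfs file that receives the write. Below is a minimal C sketch of both, assuming only the standard sysfs layout. The helper names `pci_rescan_path` and `pci_trigger_rescan` and the `sysfs_root` parameter are made up for illustration; the latter exists only so the helpers can be tried against a scratch directory.

```c
#include <stdio.h>

/*
 * Build the sysfs rescan path. With bdf == NULL the bus-wide file
 * <root>/bus/pci/rescan is used; otherwise the per-device file,
 * e.g. bdf = "0000:01:00.0" for the XIO2001 in the lspci dump.
 * Pass "/sys" as sysfs_root on the target.
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
int pci_rescan_path(char *buf, size_t len, const char *sysfs_root,
                    const char *bdf)
{
    int n;

    if (bdf)
        n = snprintf(buf, len, "%s/bus/pci/devices/%s/rescan",
                     sysfs_root, bdf);
    else
        n = snprintf(buf, len, "%s/bus/pci/rescan", sysfs_root);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" to the given file, as "echo 1 > path" would. */
int pci_trigger_rescan(const char *path)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = (fputs("1\n", f) >= 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Both paths end up in the same generic PCI core rescan code, so this only changes where the scan starts, not what the RC driver does.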


Any help is appreciated.

Regards,
Maynard 

  • Hello,

    As stated in the TRM: 19.1.7 Features Not Supported [PCIe]: Built-in hardware support for hot plug.
    There is no support in the RC driver either (see http://processors.wiki.ti.com/index.php/TI81XX_PCI_Express_Root_Complex_Driver_User_Guide#Features, as you pointed out).

    PCIe hot plug is not officially supported by TI on this platform.

    I personally think that implementing this could require significant effort (an experienced PCIe engineer would be needed).

    I am sorry, this functionality is not available.

    BR

    Vladimir

  • Hi Vladimir,

    Thanks for the reply.

    Yes, I understand that PCIe hot plug is not officially supported on this platform.  However, I found a post in which Hemant Pedanekar (TI employee) provided a solution, on Jan 23, 2012 at 11:09AM, that helped one person.  I know that the PCIe Root Complex driver modifications he suggested are untested, but it looks like the original poster (Dave Foster92145) followed them and got it working.  Here’s the link to the discussion.

    http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/716/t/154734.aspx?pi69806=1

    I implemented his instructions (except for the optional IRQ handling of the PCIe reset).  However, I am getting the same result as the other person (Christian Shroeder) who tried to follow them.

    http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/716/t/154734.aspx?pi70909=2

    Our FPGA does not send a reset command to the DM814x board.

    It looks like doing an “echo 1 > /sys/bus/pci/rescan” does not call the ti81xx_pcie_scan() function, which is what would re-enumerate the PCI bus.

    Would you know if Hemant Pedanekar is still available to answer some follow-up questions I have regarding his suggestion?

    Regards,
    Maynard


  • Hello Maynard,

    I wrote to him about that. In the meantime, please post your follow-up questions.

    Thank you.

    BR

    Vladimir

  • Hi Vladimir,

    Here are my follow-up questions to Hemant’s instructions.

    1. Are the RC driver changes that allow the PCI bus to be rescanned only applicable if a local reset is applied to the PCIe module?

    1a. If the answer is yes to Question #1, does it require a hardware reset (e.g. GPIO line to the PCIe reset signal) or can it be accomplished by toggling the PCRM LRST bit?  Is it possible to toggle the PCRM LRST bit through devmem2 utility?

    1b. If the answer is no to Question #1, your instruction #3 said to toggle the PCRM LRST bit.  Where should this be done?

    2. Is the optional step to implement the PMR interrupt handler necessary if the rescanning of the bus is done manually through sysfs (/sys/bus/pci/rescan)?

    3. Were you expecting ti81xx_pcie_scan() to be called after invoking “echo 1 > /sys/bus/pci/rescan”?  Looking at the code under linux/drivers/pci/probe.c, triggering a rescan through sysfs calls the function pci_rescan_bus().  I do not see any link between this function and ti81xx_pcie_scan().  Here is the function from which I expected ti81xx_pcie_scan() to be called.

     /**
     * pci_rescan_bus - scan a PCI bus for devices.
     * @bus: PCI bus to scan
     *
     * Scan a PCI bus and child buses for new devices, adds them,
     * and enables them.
     *
     * Returns the max number of subordinate bus discovered.
     */
    unsigned int __ref pci_rescan_bus(struct pci_bus *bus)
    {
            unsigned int max;
            struct pci_dev *dev;

            max = pci_scan_child_bus(bus);

            down_read(&pci_bus_sem);
            list_for_each_entry(dev, &bus->devices, bus_list)
                    if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
                        dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
                            if (dev->subordinate)
                                    pci_bus_size_bridges(dev->subordinate);
            up_read(&pci_bus_sem);

            pci_bus_assign_resources(bus);
            pci_enable_bridges(bus);
            pci_bus_add_devices(bus);

            return max;
    }

    Regards,
    Maynard

  • Hi Vladimir,

    By any chance, were you able to forward the follow-up questions I posted on Jan 25 to Hemant?  We would like to have an answer to this problem before we complete the PCB layout for our boards.

    Regards,
    Maynard
     

  • Hello Maynard,

    We are still waiting for feedback, and I am sorry about that. I cannot guarantee anything, but I will make an effort to get your enquiry answered. If I have something, I will post it here. Sorry for the inconvenience.

    BR

    Vladimir

  • Hi Vladimir,

    Are there any updates on the questions I posted for Hemant about this Linux driver workaround to get PCI hotplug working for our FPGAs?

    Thanks,
    Maynard
     

  • Hi Maynard,

    We are currently trying to check this one once again. Sorry that you had to face this tough issue.

    BR

    Vladimir

  • Maynard,

    Extremely sorry for the late reply.

    One of the major differences I see between your setup and the one that worked is that the working setup had the FPGA directly connected as a PCIe device, while in your case the XIO2001 bridge sits in between.

     

    Though I can see that the s/w workarounds will eventually still be required on the RC side, we first need to find out whether the XIO2001 is capable of detecting a PCI device that is connected after POR.

     

    Particularly, in the following sequence:

     

    1. U-boot boots Linux OS
    2. Linux OS boot-up includes PCI bus initialization and enumeration and filesystem mounting.  In this case, only the PCIe RC and PCIe-to-PCI bridge will be detected.
    3. Once filesystem is mounted, the FPGA gets programmed.
    4. PCI bus re-enumeration is done using “echo 1 > /sys/bus/pci/rescan”

    Here, at step 3 above: it is possible that the XIO2001 itself is not capable of detecting a device which becomes present after reset, or it may have some timing constraints (not sure).

    Have you checked the above using the same XIO2001 system with the FPGA connected to a PC, for example (one supporting hotplug)? If this setup works with a PC, we can rule out the possibility I mentioned for step 3 above.

    If such a setup is not possible in your case h/w wise, or the FPGA is not detected even with the PC setup, I suggest the following (the above sequence updated with the required modifications):

    NOTE: Ensure that the RC PCIe driver code is updated as per the modifications I suggested on E2E (other thread), and that the PM interrupt handler is also implemented to reconfigure the h/w, since generating an XIO reset will also reset the RC side PCIe h/w.

     

    1)      U-Boot, Linux up on RC

    2)      RC detects XIO

    3)      FPGA programmed

    4)      Generate a reset to XIO

    • Again, I haven’t seen XIO spec but it may be possible to just toggle some reset line (hw)
    • Or, send in band PCIe reset to XIO

    5)      After this, the RC side PCIe must be reinitialized and PCIe re-enumeration must be triggered after link is up.

    Hemant

  • The XIO2001 itself does not support hot plug, so my expectation is that moving the solution to a PC in which the PCIe root port complex does support it would yield the same results.

    The XIO2001 will propagate a PCIe PERST to PCI, which is likely the reason that this solution does function after the FPGA is programmed and the system is reset. Keep in mind the newly programmed FPGA (now a PCI device) may need a PRST itself before it will respond to config transactions passed to it from the XIO2001, which it does not receive as part of the 'hot plug' scenario, but does as part of any subsequent PCIe subsystem reset.

    My suggestion in this case would be to have SW assert the XIO2001's secondary bus reset after the FPGA is configured and prior to re-running the bus scan. This can be accomplished by toggling (0:1:0)  the SRST bit in the XIO2001 Bridge Control Register (Bit 6 @ 0x3E). Also, ensure that the SBUS_RESET_MASK bit in the XIO2001 Control and Diagnostic Register 1 (0xC4, bit 10) is not set prior to toggling the secondary bus reset. I'd assert the secondary bus reset for ~100ms to ensure the FPGA has time to latch it and be ready for de-assertion and subsequent PCI traffic.
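
The secondary bus reset described above could, for instance, be driven from user space through the bridge's sysfs `config` file (writing it requires root). This is only a sketch under assumptions: the function and macro names are hypothetical, the XIO2001 offsets are the ones quoted in this post rather than checked against the data manual, and the Linux PCI core is not informed of the reset, so a bus rescan still has to follow.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define BRIDGE_CONTROL      0x3E        /* Bridge Control Register (16 bit) */
#define BRIDGE_CTL_SRST     0x0040u     /* bit 6: secondary bus reset */
#define XIO2001_CTRL_DIAG1  0xC4        /* Control and Diagnostic Register 1 */
#define SBUS_RESET_MASK     (1u << 10)  /* must be 0 for SRST to take effect */

/* Config space is little-endian, so assemble values byte by byte. */
static int cfg_read(int fd, off_t off, unsigned nbytes, uint32_t *val)
{
    uint8_t b[4] = { 0 };

    if (nbytes > 4 || pread(fd, b, nbytes, off) != (ssize_t)nbytes)
        return -1;
    *val = b[0] | (b[1] << 8) | ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}

static int cfg_write16(int fd, off_t off, uint16_t val)
{
    uint8_t b[2] = { (uint8_t)(val & 0xff), (uint8_t)(val >> 8) };

    return pwrite(fd, b, 2, off) == 2 ? 0 : -1;
}

/*
 * Pulse SRST (0 -> 1 -> 0) on the bridge whose config file is config_path,
 * e.g. "/sys/bus/pci/devices/0000:01:00.0/config", holding it for hold_ms.
 * Returns 0 on success, -1 on I/O error or if SBUS_RESET_MASK is set.
 */
int xio2001_secondary_bus_reset(const char *config_path, unsigned hold_ms)
{
    struct timespec hold = { hold_ms / 1000, (long)(hold_ms % 1000) * 1000000L };
    uint32_t diag, ctl;
    int rc = -1;
    int fd = open(config_path, O_RDWR);

    if (fd < 0)
        return -1;
    if (cfg_read(fd, XIO2001_CTRL_DIAG1, 4, &diag) || (diag & SBUS_RESET_MASK))
        goto out;                           /* unreadable, or reset is masked */
    if (cfg_read(fd, BRIDGE_CONTROL, 2, &ctl))
        goto out;
    if (cfg_write16(fd, BRIDGE_CONTROL, (uint16_t)(ctl | BRIDGE_CTL_SRST)))
        goto out;
    nanosleep(&hold, NULL);                 /* keep the reset asserted */
    if (cfg_write16(fd, BRIDGE_CONTROL, (uint16_t)(ctl & ~BRIDGE_CTL_SRST)))
        goto out;
    rc = 0;
out:
    close(fd);
    return rc;
}
```

Following the suggestion above, a call such as `xio2001_secondary_bus_reset("/sys/bus/pci/devices/0000:01:00.0/config", 100)` would be issued after the FPGA is configured and before the rescan is requested.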

  • Yes, I suspected that hotplug would not be supported in XIO and hence some reset has to be applied as mentioned in step 4 in my post. If it is possible to just reset the secondary bus and get the PCI device detected, it would be even better since the PCIe link would be intact and just re-enumeration would suffice.

    Hemant

  • Hello everyone:

    Maynard Cabiente said:

    Here are my follow-up questions to Hemant’s instructions.

    1. Are the RC driver changes that allow the PCI bus to be rescanned only applicable if a local reset is applied to the PCIe module?

    1a. If the answer is yes to Question #1, does it require a hardware reset (e.g. GPIO line to the PCIe reset signal) or can it be accomplished by toggling the PCRM LRST bit?  Is it possible to toggle the PCRM LRST bit through devmem2 utility?

    1b. If the answer is no to Question #1, your instruction #3 said to toggle the PCRM LRST bit.  Where should this be done?

    2. Is the optional step to implement the PMR interrupt handler necessary if the rescanning of the bus is done manually through sysfs (/sys/bus/pci/rescan)?

    3. Were you expecting that ti81xx_pcie_scan() will be called after invoking “echo 1 > /sys/bus/pci/rescan”?  Looking at the code under linux/drivers/pci/probe.c, triggering a rescan through sysfs will call the function pci_rescan_bus().  I’m not sure if there is a reference link between this function and the ti81xx_pcie_scan().  Here’s the function that I expected the ti81xx_pcie_scan() will get called.

    These questions are spot on! I found myself asking the same questions this Friday, and after studying the situation, I have now reached a point where I need help. Let me explain my situation, and let's hope that Hemant, or any other knowledgeable person, can help us fix this inconvenient limitation of the system.

    This is my HW setup:

    TIDM8168 (RC) ---> FPGA (EP)

    Simple. Directly connected. I can successfully make both systems interact with no problems as long as enumeration is only done once. Once the EP has been enumerated in my system, if it goes down, the only solution is rebooting the RC.

    I was happy to find Hemant's directions, and I thought I could easily fix the situation, but I was wrong. I implemented everything except for the PMR interrupt handler. Unfortunately, my modification did not work, so I started digging a little bit. Looking at the code, it seemed to me that the ti81xx_pcie_scan function was never called as a result of doing echo 1 > /sys/bus/pci/rescan. Some prints confirmed that it is not. In my kernel code this function is only called from within bios32.c, in the pcibios_init_hw function, upon system initialization. So I forced pci_rescan_bus in pci/probe.c to call that scan function. Sadly, this modification led to a big kernel crash when trying a manual bus rescan.

    I reverted the changes and went back to the original kernel code (the kernel I am using can be found in ti-ezsdk_dm816x-evm_5_03_01_15). I modified that original kernel code, making some variables in pcie-ti81xx.c global, exporting the necessary symbols, and moving all of the defines to pcie-ti81xx.h so that I could build a test kernel module (attached to this post).

    After a fresh boot of the system with that kernel, and the module in the filesystem, if I load the module, I can extract the following conclusions:

    dump1.txt (attached) I know the status of the listed registers when everything is good

    dump2.txt (attached) I know that the reset command is doing something since the registers are being reset

    dump3.txt (attached) After trying to re-configure the bus, a register stays with a different value (S0_L+PCI_COMMAND=0x00100546 instead of 0x00100000) . Is it important?

    This test sequence shows me that there is something that I am missing:

    1. I boot the system with no EP connected

    2. I start up the EP

    3. I load the test module

    4. I  unload it

    5. I request a manual rescan

    6. The EP is found

    7. I can safely load and unload the EP's driver

    8. I stop the EP

    9. I load+unload the test module

    10. I request a manual rescan

    11. The EP is still listed as a PCIe Device by the system (this isn't looking good).

    12. I repeat steps 2 to 6

    13. The kernel reports a crash when trying to load the EP's driver

    Please, help me find out what I am doing wrong/what I am missing. I am really looking forward to hearing from you. Thanks in advance:

    Xabier.

    pci_reset_diagnose.tar.gz
  • Xabier,

    At step 8 above, the PCIe link will go down and the PCIe registers won't be accessible. That is the most likely reason for the crash you see at step 13.

    I suggest you do the PCIe re-initialization (clock setup, LTSSM etc in the RC driver initialization) after step 8. IMO this can be triggered as part of PCIe PM reset interrupt.

    Once the h/w init is done, trigger (call) the scan function. Then proceed to repeat EP driver load.

    One gap I see though - you were able to unload test module at step 9 without any crash, probably because it didn't do any PCIe reads (of EP register)? In such case, you can keep 9 as is in sequence.

    Hemant

  • Hemant,

    Thank you very much for your quick reply.

    In my test I am making use of two different modules:

    1.- test_module (used at steps 3, 4, 9 and 12)

    2.- EP driver module (used at steps 7 and 13)

    The test_module does nothing when it is unloaded, and that is why step 9 happens without any crash. test_module's only purpose is to trigger PCIe re-initialization manually instead of doing it via a PCIe PM reset interrupt. Source code is attached to my previous message.

    Am I doing anything wrong during the PCIe re-initialization? I don't see why triggering it via a module would be different from doing it via an interrupt, so I guess that I am making some kind of mistake in my initialization... please, correct me if I'm wrong.

  • Xabier,

    Using interrupt is optional if you are ok with manually triggering re-init.

    You will need to ensure the PCIe initialization is done as I mentioned above (after step 8), before proceeding to call scan routine in the RC driver (or update the scan routine to do clock enable etc there itself as mentioned earlier).
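
One way to honor that ordering is to poll for link-up with a timeout and call the scan routine only once the link reports up, rather than relying on a fixed delay. A rough, hardware-independent sketch follows. `wait_for_link`, the callback types and the `fake_link` stand-in are illustrative names and not part of the TI81xx driver; in a driver the check callback would read the link state register and the delay callback would wrap `msleep()`.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*link_up_fn)(void *ctx);   /* hypothetical: true once link is up */
typedef void (*delay_fn)(unsigned ms);   /* hypothetical: sleeps for ms, may be NULL */

/*
 * Poll link_up() every step_ms until it returns true or timeout_ms has
 * elapsed. Returns true if the link came up, false on timeout.
 */
bool wait_for_link(link_up_fn link_up, void *ctx, delay_fn delay,
                   unsigned timeout_ms, unsigned step_ms)
{
    unsigned waited = 0;

    if (step_ms == 0)
        step_ms = 1;        /* a zero step would never reach the timeout */

    for (;;) {
        if (link_up(ctx))
            return true;
        if (waited >= timeout_ms)
            return false;
        if (delay)
            delay(step_ms);
        waited += step_ms;
    }
}

/* Stand-in for a real link check: reports "up" after a fixed number of polls. */
struct fake_link {
    unsigned polls_left;
};

bool fake_link_up(void *ctx)
{
    struct fake_link *l = ctx;

    if (l->polls_left == 0)
        return true;
    l->polls_left--;
    return false;
}
```

The caller would run the h/w re-initialization first, then proceed to the scan routine only if `wait_for_link()` returns true, and report an error otherwise.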

    Hemant

  • Hemant,

    This is what step 9 AND step 3 are actually doing:

    ==============================================================

        /* Assert local reset */
        omap2_prm_set_mod_reg_bits(TI81XX_PCI_LRST_MASK,
                TI81XX_PRM_DEFAULT_MOD,
                TI81XX_RM_RSTCTRL);

        msleep(1000);

        /* De-assert local reset after module enable */
        omap2_prm_clear_mod_reg_bits(TI81XX_PCI_LRST_MASK,
                TI81XX_PRM_DEFAULT_MOD,
                TI81XX_RM_RSTCTRL);

        /* 100ms */
        msleep(100);

        /*
         * TI81xx devices do not support h/w autonomous link up-training to GEN2
         * from GEN1 in either EP/RC modes. The software needs to initiate speed
         * change.
         */
        __raw_writel(DIR_SPD | __raw_readl(
                    reg_virt + SPACE0_LOCAL_CFG_OFFSET + PL_GEN2),
                reg_virt + SPACE0_LOCAL_CFG_OFFSET + PL_GEN2);

        /*
         * Initiate Link Training. We will delay for L0 as specified by
         * standard, but will still proceed and return success irrespective of
         * L0 status as this will be handled by explicit L0 state checks during
         * enumeration.
         */
         __raw_writel(LTSSM_EN_VAL | __raw_readl(reg_virt + CMD_STATUS),
                 reg_virt + CMD_STATUS);

         /* 100ms */
         msleep(100);

        /*
         * Identify ourselves as 'Bridge' for enumeration purpose. This also
         * avoids "Invalid class 0000 for header type 01" warnings from "lspci".
         *
         * If at all we want to restore the default class-subclass values, the
         * best place would be after returning from pci_common_init ().
         */
        __raw_writew(PCI_CLASS_BRIDGE_PCI,
                reg_virt + SPACE0_LOCAL_CFG_OFFSET + PCI_CLASS_DEVICE);

        /*
         * Prevent the enumeration code from assigning resources to our BARs. We
         * will set up them after the scan is complete.
         */
        disable_bars();

        set_outbound_trans(res[0].start, res[0].end);

        /* Enable 32-bit IO addressing support */
        __raw_writew(PCI_IO_RANGE_TYPE_32 | (PCI_IO_RANGE_TYPE_32 << 8),
                reg_virt + SPACE0_LOCAL_CFG_OFFSET + PCI_IO_BASE);

        /*
         * FIXME: The IO Decode size bits in IO base and limit registers are
         * writable from host any time and during enumeration, the Linux PCI
         * core clears the lower 4-bits of these registers while writing lower
         * IO address. This makes IO upper address and limit registers to show
         * value as '0' and not the actual value as configured by the core
         * during enumeration. We need to re-write bits 0 of IO limit and base
         * registers again. Need to find if a post configuration hook is
         * possible. An easier and clear but possibly inefficient WA is to snoop
         * each config write and restore 32-bit IO decode configuration.
         */

        /*
         * Setup as PCI master, also clear any pending  status bits.
         * FIXME: Nolonger needed as post-scan fixup handles this (see below).
         */
    #if 0
        __raw_writel((__raw_readl(reg_virt + SPACE0_LOCAL_CFG_OFFSET
                        + PCI_COMMAND)
                | CFG_PCIM_CSR_VAL),
                reg_virt + SPACE0_LOCAL_CFG_OFFSET + PCI_COMMAND);
    #endif

        if (legacy_irq >= 0) {
            __raw_writel(0xf, reg_virt + IRQ_ENABLE_SET);
        } else {
            __raw_writel(0xf, reg_virt + IRQ_ENABLE_CLR);
            pr_warning(DRIVER_NAME ": INTx disabled since no legacy IRQ\n");
        }

        get_and_clear_err();

        pr_info (DRIVER_NAME ": enable_bar configs in progres...");
        /*
         * Setup as PCI master, also clear any pending  status bits.
         */
        __raw_writel((__raw_readl(reg_virt + SPACE0_LOCAL_CFG_OFFSET
                        + PCI_COMMAND)
                    | CFG_PCIM_CSR_VAL),
                reg_virt + SPACE0_LOCAL_CFG_OFFSET + PCI_COMMAND);

        set_inbound_trans();

        msleep(100);

    ==============================================================================

    Since step 7 is being successfully completed and step 13 isn't, I think there is something else left to be done... but what is it?

  • Sorry, I didn't see the code you attached.

    Looks good to me.

    I suspect the link is not up. Can you also dump the DEBUG0 register from test module? Offset is 0x1728 from local base IIRC. The last 5 bits should be 0x11 before proceeding to do rescan (note that you may need to call the scan routine explicitly from the module itself after h/w setup and after link is up).
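
The check described above amounts to masking the low five bits of the DEBUG0 value and comparing them against 0x11. A small sketch with hypothetical helper names follows. Treating an all-ones reading as "register not reachable" is an added assumption (reads that never reach the hardware commonly return all ones), not something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define LTSSM_STATE_MASK  0x1fu   /* low 5 bits of DEBUG0 */
#define LTSSM_STATE_UP    0x11u   /* value expected before a rescan, per the post */

/* Extract the link state field from a raw DEBUG0 reading. */
static inline uint32_t ltssm_state(uint32_t debug0)
{
    return debug0 & LTSSM_STATE_MASK;
}

/* Decide whether a DEBUG0 reading indicates that the link is up. */
bool pcie_link_is_up(uint32_t debug0)
{
    if (debug0 == 0xFFFFFFFFu)      /* assumed: register was not readable */
        return false;
    return ltssm_state(debug0) == LTSSM_STATE_UP;
}
```

For example, a reading of 0x03000011 decodes to state 0x11 (link up), while 0x00000000 decodes to 0x00 (link not up).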

    Hemant

    Hemant,

    Sorry for taking so long to reply. I wanted to make sure I did all the tests correctly.

    Attached to this message you will find the revised code for the test_module. Still not working :(

    Here is the information you were asking for:

    New test procedure:

    1. System boot / EP disconnected
    2. load+unload test_module
        DEBUG0 Before reset command:
        0x00006B00
        DEBUG0 After reset + init:
        0x0000DD00
    3. Bring the EP up
    4. load+unload test_module
        DEBUG0 Before reset command:
        0x03009611
        DEBUG0 After reset + init:
        0x03001E11
    5. echo 1 > /sys/bus/pci/rescan (EP is detected)
    6. load EP driver. OK!
    7. unload EP driver. OK!
    8. Bring EP down
    9. load+unload test_module
        DEBUG0 Before reset command:
        0xFFFFFFFF (What??)
        DEBUG0 After reset + init:
        0x00009F00
    10. echo 1 > /sys/bus/pci/rescan (EP has not disappeared from /sys/bus/pci/devices)
    11. Bring the EP back up
    12. load+unload test_module
        DEBUG0 Before reset command:
        0x0300C911
        DEBUG0 After reset + init:
        0x03007911
    13. echo 1 > /sys/bus/pci/rescan (No message about EP detection, device still present in /sys/bus/pci/devices)
    14. load EP driver. Crash:


    Unhandled fault: Precise External Abort on non-linefetch (0x1008) at 0xd099d800
    Internal error: : 1008 [#1]
    last sysfs file: /sys/bus/pci/rescan
    Modules linked in: ik_dma(+) test_module [last unloaded: test_module]
    CPU: 0    Not tainted  (2.6.37 #22)
    PC is at probe+0x204/0xbe8 [ik_dma]
    LR is at ioremap_page_range+0xbc/0x1b0
    pc : [<bf029940>]    lr : [<c019cb18>]    psr: a0000013
    sp : cc1a1c70  ip : cc1a1c00  fp : cc1a1cf4
    r10: 00000000  r9 : bf02ac34  r8 : cc1d5000
    r7 : bf02b458  r6 : bf02b458  r5 : 00000000  r4 : cc354440
    r3 : d099d800  r2 : cc354440  r1 : d099e000  r0 : d099c000
    Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
    Control: 10c5387d  Table: 8cbac019  DAC: 00000015
    Process insmod (pid: 1066, stack limit = 0xcc1a02e8)
    Stack: (0xcc1a1c70 to 0xcc1a2000)
    1c60:                                     00000000 cc07a7e0 cc1a1cac cc1a1c88
    1c80: c0113310 c0112f60 cc1a1ccc cc1a1c98 c0113e3c c02f592c cca47ed0 cc07a7e0
    1ca0: cc1a1ccc cc1a1cb0 c0113478 00000000 cca47ed0 cc07a7e0 cc1a1ce0 00000000
    1cc0: cc1d5068 cc1d5000 cc1a1cec cc1a1d1c cc1d50f0 cc1d5060 cc1d5000 cc165300
    1ce0: bf02b25c 00000000 cc1a1d14 cc1a1cf8 c01b7c84 bf029748 bf02b25c cc1d5000
    1d00: 00000000 bf02b22c cc1a1d44 cc1a1d18 c01b808c c01b7c38 cc1a1d44 bf02b22c
    1d20: cc1d5000 bf02b2fc cc1d5060 cc1d5060 c0429dec c01ed514 cc1a1d74 cc1a1d48
    1d40: c01ed3ec c01b8024 00000000 cc1d5060 cc1d5094 bf02b25c c01ed514 cc165300
    1d60: bf02e000 00000000 cc1a1d94 cc1a1d78 c01ed5a8 c01ed354 00000000 cc1a1d98
    1d80: bf02b25c c01ed514 cc1a1dbc cc1a1d98 c01ecb8c c01ed520 cc840e38 cc066830
    1da0: cc844c00 00000018 bf02b25c c03e7bbc cc1a1dcc cc1a1dc0 c01ed238 c01ecb30
    1dc0: cc1a1dfc cc1a1dd0 c01ec394 c01ed220 bf02b1f8 c03e7bbc bf02b22c 00000018
    1de0: bf02b25c c03e7bbc c03f85c0 00000000 cc1a1e1c cc1a1e00 c01ed8a4 c01ec2e4
    1e00: bf02b22c 00000018 bf02b25c c03e7bbc cc1a1e3c cc1a1e20 c01b82cc c01ed838
    1e20: 00000000 00000018 bf02b334 00000000 cc1a1e4c cc1a1e40 bf02e030 c01b8298
    1e40: cc1a1e94 cc1a1e50 c0031374 bf02e00c 00000000 00000001 cc1a1e84 00000000
    1e60: 00000018 bf02b334 00000000 00000000 00000018 bf02b334 00000000 cc1a1f5c
    1e80: 00000000 00000000 cc1a1fa4 cc1a1e98 c0093f28 c0031344 00000000 00000001
    1ea0: ffffffff 00000006 c02fbfcc c02fbfcc c007ee84 c03da2e4 c02fbed0 bf02b340
    1ec0: d09954ec d09954f0 000000ae d09979e3 d0995a68 cc1a0000 00008000 c02fbf80
    1ee0: d0992000 000059ef d09956d0 d09955ef d0997328 cc2d2d00 00003480 000035f0
    1f00: 00000000 00000000 00000017 00000018 0000000f 00000000 0000000b 00000000
    1f20: 6e72656b 00006c65 00000000 00000000 00000000 00000000 00000000 00000000
    1f40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    1f60: 00000000 00000000 00000000 00000000 00000000 00000006 cc1a1fa4 00000000
    1f80: 00008000 00000003 00000080 c003d668 cc1a0000 00000000 00000000 cc1a1fa8
    1fa0: c003d4c0 c0093e6c 00000000 00008000 00012018 000059ef 00012008 00000001
    1fc0: 00000000 00008000 00000003 00000080 00012008 00000000 00012018 00000000
    1fe0: beb88ecf beb88c6c 00008da0 402397d4 60000010 00012018 00000000 00000000
    Backtrace:
    [<bf02973c>] (probe+0x0/0xbe8 [ik_dma]) from [<c01b7c84>] (local_pci_probe+0x58/0xc0)
    [<c01b7c2c>] (local_pci_probe+0x0/0xc0) from [<c01b808c>] (pci_device_probe+0x74/0x98)
     r7:bf02b22c r6:00000000 r5:cc1d5000 r4:bf02b25c
    [<c01b8018>] (pci_device_probe+0x0/0x98) from [<c01ed3ec>] (driver_probe_device+0xa4/0x1cc)
     r7:c01ed514 r6:c0429dec r5:cc1d5060 r4:cc1d5060
    [<c01ed348>] (driver_probe_device+0x0/0x1cc) from [<c01ed5a8>] (__driver_attach+0x94/0x98)
    [<c01ed514>] (__driver_attach+0x0/0x98) from [<c01ecb8c>] (bus_for_each_dev+0x68/0x94)
     r7:c01ed514 r6:bf02b25c r5:cc1a1d98 r4:00000000
    [<c01ecb24>] (bus_for_each_dev+0x0/0x94) from [<c01ed238>] (driver_attach+0x24/0x28)
     r7:c03e7bbc r6:bf02b25c r5:00000018 r4:cc844c00
    [<c01ed214>] (driver_attach+0x0/0x28) from [<c01ec394>] (bus_add_driver+0xbc/0x254)
    [<c01ec2d8>] (bus_add_driver+0x0/0x254) from [<c01ed8a4>] (driver_register+0x78/0x158)
    [<c01ed82c>] (driver_register+0x0/0x158) from [<c01b82cc>] (__pci_register_driver+0x40/0xb0)
     r7:c03e7bbc r6:bf02b25c r5:00000018 r4:bf02b22c
    [<c01b828c>] (__pci_register_driver+0x0/0xb0) from [<bf02e030>] (pci_fpga_init+0x30/0x38 [ik_dma])
     r7:00000000 r6:bf02b334 r5:00000018 r4:00000000
    [<bf02e000>] (pci_fpga_init+0x0/0x38 [ik_dma]) from [<c0031374>] (do_one_initcall+0x3c/0x1bc)
    [<c0031338>] (do_one_initcall+0x0/0x1bc) from [<c0093f28>] (sys_init_module+0xc8/0x1740)
    [<c0093e60>] (sys_init_module+0x0/0x1740) from [<c003d4c0>] (ret_fast_syscall+0x0/0x30)
    Code: e34b0f02 eb4b2c74 eaffffec e2833b06 (e593c000)
    ---[ end trace acfa266ae3614547 ]---
    Segmentation fault

  • Xabier,

    I suggest you remove step 10 above, as there is no use in scanning until the EP is up. The PCIe link will only be established once the directly connected EP is up. Step 12 is where the link is actually up, and only then should you proceed to re-enumeration.

    As I think you have already found out, ti81xx_pcie_scan() is not called through the sysfs path. This means the function needs to be called explicitly, either from the test module (in which case it will need to be exported from the kernel) or by creating a dedicated sysfs entry and invoking it before step 14.

    Hemant

  • Thank you for the suggestion. My first try didn't go that well...

    I have modified pcie-ti81xx.c so that it stores the sys variable in a global iku_sys variable, and I have created an iku_pci_scan() function that calls ti81xx_pci.scan(0, iku_sys);.

    The test_module is now calling iku_pci_scan() function, but sadly the result is this:

    ========================================================================

    Internal error: Oops - undefined instruction: 0 [#1]
    last sysfs file: /sys/devices/platform/davinci_emac.0/net/eth0/ifindex
    Modules linked in: test_module(+)
    CPU: 0    Not tainted  (2.6.37 #24)
    PC is at ti814x_muxmodes+0xa1c/0x2db8
    LR is at __setup_pci_setup+0x8/0xc
    pc : [<c002b870>]    lr : [<c002604c>]    psr: 20000013
    sp : cc26fe00  ip : c1b23040  fp : cc26fe14
    r10: 00000000  r9 : bf003000  r8 : c03f85c0
    r7 : 00000000  r6 : bf000948  r5 : ccb7fc40  r4 : 00000000
    r3 : ffffffff  r2 : c1b23040  r1 : cc867840  r0 : 00000000
    Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
    Control: 10c5387d  Table: 8cb38019  DAC: 00000015
    Process insmod (pid: 991, stack limit = 0xcc26e2e8)
    Stack: (0xcc26fe00 to 0xcc270000)
    fe00: c03f8ad0 00000018 cc26fe24 cc26fe18 c0055898 c0055918 cc26fe3c cc26fe28
    fe20: bf000320 c0055878 00000000 00000018 cc26fe4c cc26fe40 bf003028 bf000230
    fe40: cc26fe94 cc26fe50 c0031374 bf00300c 00000000 00000001 cc26fe84 00000000
    fe60: 00000018 bf000948 00000000 00000000 00000018 bf000948 00000000 cc26ff5c
    fe80: 00000000 00000000 cc26ffa4 cc26fe98 c0093f28 c0031344 00000000 00000001
    fea0: ffffffff 00000006 c02fbfcc c02fbfcc c007ee84 c03da2e4 c02fbed0 bf000954
    fec0: d096bab4 d096bab8 00000050 d096cb40 d096bf48 cc26e000 00004000 c02fbf80
    fee0: d096b000 00001b47 d096bc78 d096bbb7 d096c8e0 cc205a80 00000a6c 00000aec
    ff00: 00000000 00000000 00000012 00000013 0000000a 00000000 00000007 00000000
    ff20: 6e72656b 00006c65 00000000 00000000 00000000 00000000 00000000 00000000
    ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    ff60: 00000000 00000000 00000000 00000000 00000000 00000006 cc26ffa4 00000000
    ff80: 00004000 00000003 00000080 c003d668 cc26e000 00000000 00000000 cc26ffa8
    ffa0: c003d4c0 c0093e6c 00000000 00004000 00012018 00001b47 00012008 00000001
    ffc0: 00000000 00004000 00000003 00000080 00012008 00000000 00012018 00000000
    ffe0: bee98ece bee98c6c 00008da0 402707d4 60000010 00012018 00000000 00000000
    Backtrace:
    [<c005590c>] (ti81xx_pcie_scan+0x0/0x68) from [<c0055898>] (iku_pci_scan+0x2c/0x30)
     r5:00000018 r4:c03f8ad0
    [<c005586c>] (iku_pci_scan+0x0/0x30) from [<bf000320>] (hw_setup_function+0xfc/0x13c [test_module])
    [<bf000224>] (hw_setup_function+0x0/0x13c [test_module]) from [<bf003028>] (my_start_init+0x28/0x34 [test_module])
     r5:00000018 r4:00000000
    [<bf003000>] (my_start_init+0x0/0x34 [test_module]) from [<c0031374>] (do_one_initcall+0x3c/0x1bc)
    [<c0031338>] (do_one_initcall+0x0/0x1bc) from [<c0093f28>] (sys_init_module+0xc8/0x1740)
    [<c0093e60>] (sys_init_module+0x0/0x1740) from [<c003d4c0>] (ret_fast_syscall+0x0/0x30)
    Code: 00000000 00000000 00000000 00000000 (7c010001)
    ---[ end trace 3ec6497b0f00505a ]---
    Segmentation fault

    =============================================================

    I'm still looking into it. Any suggestions will be welcome.

  • I might be wrong, but it seems to me that the problem might be that ti81xx_pcie_scan is calling pci_scan_bus_parented, which is calling pci_create_bus.

    This pci_create_bus function does dev = kzalloc(sizeof(*dev), GFP_KERNEL);, which seems to be something you wouldn't want to do twice without freeing the resources allocated the previous time the function ran.

    The tricky part is that, along with this kzalloc, multiple lists are handled along the way... are we sure that ti81xx_pcie_scan is the right function to call?

  • Ha ha! Got It! :) :) :)

    The call to ti81xx_pcie_scan is NOT necessary. The rescan didn't work because we were only resetting the processor's registers, not the kernel's awareness of the PCI system, so an inconsistency was created when resetting and reinitialising that part.

    The fix is quite easy, actually... all of the devices in the PCI device tree must be removed before commanding the rescan.

    The manual way to do it is by issuing the following command in a console:

    echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove
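
    As a rough sketch, the remove-then-rescan sequence could be wrapped in a small shell function. The function name pci_remove_and_rescan and the SYSFS_ROOT override are made up for illustration (the override only exists so the function can be dry-run against a scratch directory); the two sysfs files themselves are the ones used in this thread.

    ```shell
    # Hypothetical helper: drop a PCI device (and everything behind it) from
    # the kernel's device tree, then ask the kernel to rescan the bus.
    # SYSFS_ROOT is an assumption added for dry runs; it defaults to /sys.
    pci_remove_and_rescan() {
        dev="${1:-0000:00:00.0}"
        root="${SYSFS_ROOT:-/sys}"
        remove_file="$root/bus/pci/devices/$dev/remove"

        # Only request removal if the kernel currently knows about the device.
        if [ -e "$remove_file" ]; then
            echo 1 > "$remove_file" || return 1
        fi

        # Trigger re-enumeration of the whole PCI bus.
        echo 1 > "$root/bus/pci/rescan"
    }
    ```

    Calling pci_remove_and_rescan with no argument targets 0000:00:00.0, the root complex in our setup; it would be run after the FPGA has been programmed and the link is back up.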

  • Xabier,

    That is great. I am very glad that the issue is solved. Thank you very much for your hard work and for your detailed feedback.

    Many thanks, Hemant. Your expertise is greatly appreciated.

    BR

    Vladimir

  • Xabier,

    This is good. Thanks for the update.

    Regarding the rescan: if ti81xx_pcie_scan() is not getting called, please ensure that the post-enumeration fixups (as applicable) are done after the scan. For example, the latest RC driver sets up inbound access and IO configuration after the scan; these are needed to enable access from the EP and I/O access, respectively.

    Hemant

  • Thank you for the tip, Hemant:

    Line 200 of the last attached module calls the set_inbound_trans() function... is anything else needed aside from this?

  • Xabier,

    If you want IO access (very unlikely in this case), then there is an IO filter configuration done in the scan routine. Note that these changes were added in the latest code, so I am not sure whether they apply in your case. It would be best to look at the ti81xx scan routine and check what is done (if anything) after the PCI scan routine (pci_scan_bus()) returns, and whether it is required in your case.

    Hemant

  • Hemant,

    Thank you, I will take that into consideration whenever I get back to polishing everything up.

    Just so that I know, in case I ever encounter any other problem... when you talk about "latest code", are you referring to this repository:

    http://arago-project.org/git/projects/?p=linux-omap3.git;a=summary

    or is there any better place to check for changes in the kernel sources for this platform?

  • Xabier,

    That is correct; specifically, check the ti81xx-master branch.

    Hemant

  • Hello Again:

    OK, so I have been working on some code modifications that would enable us to trigger the fix for our problem when the PCIe link goes down using interrupts.

    I wrote an interrupt handler, but I never saw it get called. It might be that I'm doing something wrong, but I find it interesting that once the PCIe link goes down, if I try to manually read the PMRST_IRQ_STATUS register (the one that should trigger the interrupt) with devmem2 0x510001D4, I get the following error:

    -------------------------------------------------------------

    /dev/mem opened.Unhandled fault: Precise External Abort on non-linefetch (0x1018) at 0x402271d4

    Memory mapped at address 0x40227000.
    Bus error

    -----------------------------------------------------------

    Does this mean that that interrupt will never be able to work?

  • Xabier,

    This is expected behavior when accessing the status register after the interrupt: basically, the PCIe module goes into reset on receiving the peer link down.

    Now, the issue is that your interrupt handler is not getting called. Have you enabled the PM reset interrupt? I think bit 3 in the register at offset 0x1d8 should be set to 1 explicitly; it is not set by default.
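
    For a quick experiment from user space, the address and mask for that enable bit can be worked out as in the sketch below. The base address 0x51000000 is only inferred from the devmem2 address quoted earlier (0x510001D4 for the status register), the variable and function names are made up, and whether a plain write of the mask is safe depends on the register being write-1-to-set, which should be checked against the TRM.

    ```shell
    # Assumption: base inferred from the devmem2 address used in this thread
    # (0x510001D4 = base + 0x1D4 for the status register).
    PCIE_APP_BASE=0x51000000
    PMRST_ENABLE_OFFSET=0x1d8   # offset quoted above
    PMRST_ENABLE_BIT=3          # bit number quoted above

    # Print the absolute register address as hex.
    pmrst_enable_addr() {
        printf '0x%08X\n' $(( $PCIE_APP_BASE + $PMRST_ENABLE_OFFSET ))
    }

    # Print the single-bit mask for the enable bit as hex.
    pmrst_enable_mask() {
        printf '0x%X\n' $(( 1 << $PMRST_ENABLE_BIT ))
    }

    # Example (not run here, and only valid while the link is up):
    #   devmem2 "$(pmrst_enable_addr)" w "$(pmrst_enable_mask)"
    ```

    Since the module goes into reset on link down, the write would presumably have to happen while the link is still up, for example right after enumeration; in the driver itself the same bit could be set with a register write at the same offset.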

    Hemant