DRA722: PCIe issue on some reboots

Part Number: DRA722

Hi

We are facing a strange issue with the PCIe PHY: roughly one boot in ten, one of the root ports fails to report its class code (lspci shows Class 0000 instead of 0604). What could be the issue?

The details are below.

  • 6-node case (lspci lists all 6 nodes):
DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0000:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0000:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
0001:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0001:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0001:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
DRAGON:~#
DRAGON:~# devmem 0x51001000 32
0x8888104C
DRAGON:~# devmem 0x51801000 32
0x8888104C
DRAGON:~# devmem 0x51001008 32
0x06040001
DRAGON:~# devmem 0x51801008 32
0x06040001
DRAGON:~#

LINK Status:
DRAGON:~# devmem 0x5180210c 32
0x00010000
DRAGON:~#
DRAGON:~# devmem 0x5100210c 32
0x00010000
DRAGON:~#
DRAGON:~# devmem 0x51002104 32
0x00000045
DRAGON:~#
DRAGON:~# devmem 0x51802104 32
0x00000045
  • 4-node case (lspci lists only 4 nodes):
DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0000:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0000:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
0001:00:00.0 Class 0000: Device 104c:8888 (rev 01)
DRAGON:~#
DRAGON:~# devmem 0x51001000 32
0x8888104C
DRAGON:~# devmem 0x51801000 32
0x8888104C
DRAGON:~# devmem 0x51001008 32
0x06040001
DRAGON:~# devmem 0x51801008 32
0x00000001
DRAGON:~#
DRAGON:~# devmem 0x51002104 32
0x00000045
DRAGON:~#
DRAGON:~# devmem 0x51802104 32
0x00000000

LINK Status:

DRAGON:~# devmem 0x5100210c 32
0x00010000
DRAGON:~# devmem 0x5180210c 32
0x00000000
DRAGON:~#
  • CTRL_CORE_PHY_POWER_PCIESS1 and CTRL_CORE_PHY_POWER_PCIESS2 (PHY power supply control, control module), 4-node case:
DRAGON:~# devmem 0x4a003c40 32 --> PCIESS1
0x0500C000
DRAGON:~#
DRAGON:~# devmem 0x4a003c44 32 --> PCIESS2
0x00000000
  • CTRL_CORE_PHY_POWER_PCIESS1 and CTRL_CORE_PHY_POWER_PCIESS2 (PHY power supply control, control module), 2-node case:

DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0001:00:00.0 Class 0604: Device 104c:8888 (rev 01)
DRAGON:~#
DRAGON:~# devmem 0x4a003c40 32
0x0500C000
DRAGON:~# devmem 0x4a003c44 32
0x00000000

  • CTRL_CORE_PHY_POWER_PCIESS1 and CTRL_CORE_PHY_POWER_PCIESS2 (PHY power supply control, control module), 6-node case:

DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0000:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0000:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
0001:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0001:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0001:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
DRAGON:~# devmem 0x4a003c40 32
0x0500C000
DRAGON:~# devmem 0x4a003c44 32
0x00000000

DRAGON:~#

  • Without the Errata i870 workaround:

DRAGON:~# devmem 0x4a003c18 32
0x00000000

  • With the Errata i870 workaround:

DRAGON:~# devmem 0x4a003c18 32
0x00000003

  • lspci -vvv output for the failing root port (Class 0000):

0001:00:00.0 Class 0000: Device 104c:8888 (rev 01)
        !!! Invalid class 0000 for header type 01
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 0
        Region 0: Memory at <unassigned> (32-bit, prefetchable) [disabled]
        Region 1: Memory at <unassigned> (32-bit, prefetchable) [disabled]
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00000000-00000fff [size=4K]
        Memory behind bridge: 00000000-000fffff [size=1M]
        Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0
                        ExtTag- RBE+
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
                        ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                RootCap: CRSVisible-
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR-
                         10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd-
                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                         AtomicOpsCtl: ReqEn- EgressBlck-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
                RootCmd: CERptEn- NFERptEn- FERptEn-
                RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
                         FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
                ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
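
For reference, the raw words read with devmem at 0x51001000/0x51801000 and 0x51001008/0x51801008 can be split into the fields that lspci prints. The sketch below assumes the standard PCI configuration header layout (vendor/device ID at offset 0x00, revision/class code at offset 0x08); the function names `decode_id` and `decode_class` are only illustrative helpers, not part of any tool.

```shell
#!/bin/sh
# Split the 32-bit word at config offset 0x00 into vendor and device IDs.
decode_id() {
    printf 'vendor=%04x device=%04x\n' \
        $(( $1 & 0xffff )) $(( ($1 >> 16) & 0xffff ))
}

# Split the 32-bit word at config offset 0x08 into class, prog-if and revision.
decode_class() {
    printf 'class=%04x progif=%02x rev=%02x\n' \
        $(( ($1 >> 16) & 0xffff )) $(( ($1 >> 8) & 0xff )) $(( $1 & 0xff ))
}

decode_id    0x8888104C   # prints: vendor=104c device=8888
decode_class 0x06040001   # prints: class=0604 progif=00 rev=01 (good port)
decode_class 0x00000001   # prints: class=0000 progif=00 rev=01 (failing port)
```

Decoded this way, the failing read at 0x51801008 keeps the revision byte (01) but has the class bytes cleared, which matches the "Class 0000 ... (rev 01)" line from lspci.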

We are using PCIe0 lane 0 and PCIe1 lane 0.
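
To catch the failing boots automatically, something like the following sketch could be run once per boot. It is only an assumption-laden helper: `check_ctrl` and `READ_REG` are hypothetical names, the addresses are the ones read by hand in this post, and a non-zero word at the 0x...210c address is treated as "link up" purely because that is what the good and bad dumps above show.

```shell
#!/bin/sh
# Hypothetical helper. READ_REG defaults to devmem and can be pointed at a
# stub for dry runs on a host without the hardware.
READ_REG=${READ_REG:-devmem}

# check_ctrl LABEL CLASS_ADDR LINK_ADDR
# Reads the class/revision word and the link status word, then reports OK
# only if the class is 0604 (PCI bridge) and the link word is non-zero.
check_ctrl() {
    class=$($READ_REG "$2" 32)
    link=$($READ_REG "$3" 32)
    if [ $(( $class >> 16 )) -eq $(( 0x0604 )) ] && [ $(( $link )) -ne 0 ]; then
        echo "$1 OK class_word=$class link_word=$link"
    else
        echo "$1 FAIL class_word=$class link_word=$link"
    fi
}
```

On the target it would be called as `check_ctrl PCIESS1 0x51001008 0x5100210c` and `check_ctrl PCIESS2 0x51801008 0x5180210c` (the label-to-address mapping is assumed from the lspci domains), with the output appended to a log so that good and bad boots can be compared.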

Regards,

Chandra