Hi,
We are facing a strange issue with the PCIe PHY: about one time out of 10, it is unable to get the CLASS (lspci reports Class 0000 for 0001:00:00.0). What could be the issue?
The details are below.
- 6 node case:
DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0000:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0000:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
0001:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0001:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0001:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
DRAGON:~#
DRAGON:~# devmem 0x51001000 32
0x8888104C
DRAGON:~# devmem 0x51801000 32
0x8888104C
DRAGON:~# devmem 0x51001008 32
0x06040001
DRAGON:~# devmem 0x51801008 32
0x06040001
DRAGON:~#
LINK Status:
DRAGON:~# devmem 0x5180210c 32
0x00010000
DRAGON:~#
DRAGON:~# devmem 0x5100210c 32
0x00010000
DRAGON:~#
DRAGON:~# devmem 0x51002104 32
0x00000045
DRAGON:~#
DRAGON:~# devmem 0x51802104 32
0x00000045
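For reference, the two config-space words read above can be split into their fields with plain shell arithmetic. This is only a sketch: it assumes the standard PCI header layout (vendor/device ID at offset 0x00, revision/class at offset 0x08), and the function names `decode_id` and `decode_class` are made up for illustration.

```shell
#!/bin/sh
# Decode the dword at config offset 0x00:
# vendor ID in bits 15:0, device ID in bits 31:16.
decode_id() {
    val=$(( $1 ))
    printf 'vendor=%04x device=%04x\n' \
        $(( val & 0xFFFF )) $(( (val >> 16) & 0xFFFF ))
}

# Decode the dword at config offset 0x08:
# revision in bits 7:0, programming interface in bits 15:8,
# class (base class + subclass, as printed by lspci) in bits 31:16.
decode_class() {
    val=$(( $1 ))
    printf 'class=%04x prog_if=%02x rev=%02x\n' \
        $(( (val >> 16) & 0xFFFF )) $(( (val >> 8) & 0xFF )) $(( val & 0xFF ))
}

decode_id 0x8888104C      # prints: vendor=104c device=8888
decode_class 0x06040001   # prints: class=0604 prog_if=00 rev=01
```

Class 0604 is a PCI-to-PCI bridge, which matches what lspci prints for both root ports in this case.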
- 4 node case:
DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0000:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0000:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
0001:00:00.0 Class 0000: Device 104c:8888 (rev 01)
DRAGON:~#
DRAGON:~# devmem 0x51001000 32
0x8888104C
DRAGON:~# devmem 0x51801000 32
0x8888104C
DRAGON:~# devmem 0x51001008 32
0x06040001
DRAGON:~# devmem 0x51801008 32
0x00000001
DRAGON:~#
DRAGON:~# devmem 0x51002104 32
0x00000045
DRAGON:~#
DRAGON:~# devmem 0x51802104 32
0x00000000
LINK Status:
DRAGON:~# devmem 0x5100210c 32
0x00010000
DRAGON:~# devmem 0x5180210c 32
0x00000000
DRAGON:~#
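To compare the link registers of the two cases, the values can be decoded as below. This is a sketch under assumptions: offset 0x104 is taken to be the TI_CONF DEVICE_CMD register with LTSSM_EN in bit 0 and the LTSSM state in bits 7:2, and offset 0x10c to be PHY_CS with LINK_UP in bit 16. Please check the bit positions against the TRM. `decode_link` is a hypothetical helper name.

```shell
#!/bin/sh
# $1 = value read at offset 0x104 (assumed DEVICE_CMD)
# $2 = value read at offset 0x10c (assumed PHY_CS)
# Assumed layout: DEVICE_CMD bit 0 = LTSSM_EN, bits 7:2 = LTSSM state,
#                 PHY_CS bit 16 = LINK_UP.
decode_link() {
    cmd=$(( $1 ))
    phy_cs=$(( $2 ))
    printf 'ltssm_en=%d ltssm_state=0x%02x link_up=%d\n' \
        $(( cmd & 1 )) $(( (cmd >> 2) & 0x3F )) $(( (phy_cs >> 16) & 1 ))
}

decode_link 0x00000045 0x00010000   # PCIe0 in both cases, PCIe1 in the 6 node case
decode_link 0x00000000 0x00000000   # PCIe1 in the 4 node case
```

With these assumptions, the first call prints `ltssm_en=1 ltssm_state=0x11 link_up=1` and the second prints `ltssm_en=0 ltssm_state=0x00 link_up=0`, so in the failing case PCIe1 reads back as if link training was never enabled.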
- CTRL_CORE_PHY_POWER_PCIESS1 & CTRL_CORE_PHY_POWER_PCIESS2 --> Power supply control module: 4 node case
DRAGON:~# devmem 0x4a003c40 32 --> PCIESS1
0x0500C000
DRAGON:~#
DRAGON:~# devmem 0x4a003c44 32 --> PCIESS2
0x00000000
- CTRL_CORE_PHY_POWER_PCIESS1 & CTRL_CORE_PHY_POWER_PCIESS2 --> Power supply control module: 2 node case
DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0001:00:00.0 Class 0604: Device 104c:8888 (rev 01)
DRAGON:~#
DRAGON:~# devmem 0x4a003c40 32
0x0500C000
DRAGON:~# devmem 0x4a003c44 32
0x00000000
- CTRL_CORE_PHY_POWER_PCIESS1 & CTRL_CORE_PHY_POWER_PCIESS2 --> Power supply control module: 6 node case
DRAGON:~# lspci
0000:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0000:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0000:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
0001:00:00.0 Class 0604: Device 104c:8888 (rev 01)
0001:01:00.0 Class 0200: Device 14e4:b340 (rev 01)
0001:01:00.1 Class 0200: Device 14e4:b340 (rev 01)
DRAGON:~# devmem 0x4a003c40 32
0x0500C000
DRAGON:~# devmem 0x4a003c44 32
0x00000000
DRAGON:~#
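The PHY power register values can be broken down in the same way. This sketch assumes the field layout I believe the Linux PIPE3 PHY driver uses (power command in bits 21:14, clock frequency in MHz in bits 31:22), which should be verified against the TRM. `decode_phy_power` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed layout of CTRL_CORE_PHY_POWER_PCIESSx:
#   bits 31:22 = PHY clock frequency in MHz
#   bits 21:14 = power command (0x03 = TX and RX powered on, 0x00 = off)
decode_phy_power() {
    val=$(( $1 ))
    printf 'clk_freq=%d cmd=0x%02x\n' \
        $(( (val >> 22) & 0x3FF )) $(( (val >> 14) & 0xFF ))
}

decode_phy_power 0x0500C000   # PCIESS1: prints clk_freq=20 cmd=0x03
decode_phy_power 0x00000000   # PCIESS2: prints clk_freq=0 cmd=0x00
```

Under that assumption PCIESS2 reads as powered off in the 2, 4 and 6 node cases alike, so this register by itself does not distinguish the failing case from the working one.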
- Without the Errata i870 workaround:
DRAGON:~# devmem 0x4a003c18 32
0x00000000
- With the Errata i870 workaround:
DRAGON:~# devmem 0x4a003c18 32
0x00000003
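To capture the same registers in one pass on every boot, so that good and bad boots can be diffed, something like the following could be used. It is only a sketch: the addresses are the ones read manually above, `rd` and `dump_pcie_regs` are hypothetical helper names, and it relies on the same `devmem` tool.

```shell
#!/bin/sh
# Read one 32-bit register with the devmem tool used above.
rd() { devmem "$1" 32; }

# Print the config-space and TI_CONF registers of both PCIe instances,
# followed by the control module registers shown in this post.
dump_pcie_regs() {
    for base in 0x51000000 0x51800000; do
        echo "PCIe instance at $base"
        for off in 0x1000 0x1008 0x2104 0x210c; do
            addr=$(printf '0x%08x' $(( base + off )))
            echo "  $addr: $(rd "$addr")"
        done
    done
    for addr in 0x4a003c40 0x4a003c44 0x4a003c18; do
        echo "control module $addr: $(rd "$addr")"
    done
}

# Usage on the target: dump_pcie_regs > /tmp/pcie_regs.txt
```

Saving one snapshot per boot and comparing a failing boot against a working one should show whether anything other than the PCIe1 registers differs.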
- lspci -vvv
0001:00:00.0 Class 0000: Device 104c:8888 (rev 01)
!!! Invalid class 0000 for header type 01
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 0
Region 0: Memory at <unassigned> (32-bit, prefetchable) [disabled]
Region 1: Memory at <unassigned> (32-bit, prefetchable) [disabled]
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 00000000-00000fff [size=4K]
Memory behind bridge: 00000000-000fffff [size=1M]
Prefetchable memory behind bridge: 00000000-000fffff [size=1M]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (downgraded), Width x1 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
RootCap: CRSVisible-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd-
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
RootCmd: CERptEn- NFERptEn- FERptEn-
RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
We are using PCIe0 lane 0 and PCIe1 lane 0.
Regards,
Chandra