Linux/TDA2PXEVM: NVMe SSD write speed low performance

Part Number: TDA2PXEVM

Tool/software: Linux

Dear TI,

we have two Intel Optane 900P NVMe SSDs - one is a 2.5" U.2 drive and the other is an HHHL form-factor card.
We connected the SSD to the TDA2P EVA board and we have issues with the write speed (this applies to both SSDs).

The SSD is connected over PCIe and the EVA board sees the SSD as a GEN2 x2 PCIe device.
When we use the dd command to write data to the SSD, we get a write speed of about 200MBps on average.

The dd command is as follows:

dd if=/dev/zero of=/run/media/nvme0n1p1/test.raw bs=128k count=10k


We tested both SSDs on two different PCs (one PC has a GEN2 PCIe controller and the other one has a GEN3 PCIe controller).
In both cases the write speed was 1.4GBps on average. On the PC the SSD is configured as a GEN2 or GEN3 x4 PCIe device.
Based on the test results on the PC, we were expecting to get a write speed of 600MBps to 700MBps when we use the EVA board.

Can you help us with this issue?


Regards,
Stefan
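
As a rough cross-check of the 600MBps to 700MBps expectation, the line rate of the negotiated link can be worked out by hand. The sketch below is only an illustration: the function name `pcie_8b10b_mbps` is made up here, and the formula covers GEN1/GEN2 links only (they use 8b/10b encoding; GEN3 uses 128b/130b and is not handled). The result is the rate before TLP/DLLP packet overhead, so real payload throughput is lower.

```shell
#!/bin/sh
# Line rate of a PCIe GEN1/GEN2 link in MB/s (decimal), before packet
# overhead. With 8b/10b encoding only 8 of every 10 bits carry data.
# $1 = per-lane transfer rate in MT/s (2500 for GEN1, 5000 for GEN2)
# $2 = number of lanes
pcie_8b10b_mbps() {
    echo $(( $1 * $2 * 8 / 10 / 8 ))
}

pcie_8b10b_mbps 5000 2   # GEN2 x2, as seen on the EVA board -> 1000
pcie_8b10b_mbps 5000 4   # GEN2 x4, as on the GEN2 PC       -> 2000
```

So the GEN2 x2 link width alone does not rule out 600MBps to 700MBps; the measured 200MBps is well below what the link can carry.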

  • Hi Stefan,

    What's the intended use-case? Reason I ask is dd may not be the correct test to measure throughput, since there may be CPU cycles involved.
    When I try the same test case at my end, I see a throughput (across both lanes) of 260MBps, and ~300MBps when the bs=2MB and the size is 1K.
    Could you provide the CPU load when you run the test case at your end on the EVM?

    Regards
    Shravan
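
One way to make dd numbers easier to compare between the PC and the EVM is to include the final flush in the timing, so that data still sitting in the page cache is not counted as written. The sketch below is a suggestion, not something used in this thread: `write_test` is a hypothetical helper, and it relies on the `conv=fsync` option of dd. GNU dd also has `oflag=direct` to bypass the page cache; whether the dd on the EVM filesystem supports it is not known from the thread.

```shell
#!/bin/sh
# Write test that includes the final flush in the reported rate.
# $1 = output file, $2 = block size, $3 = block count
# conv=fsync makes dd call fsync() on the output file before exiting.
write_test() {
    dd if=/dev/zero of="$1" bs="$2" count="$3" conv=fsync
}

# Small self-check against a temporary file. On the EVM the target would
# be a file on the mounted SSD, e.g. /run/media/nvme0n1p1/test.raw.
tmp=$(mktemp)
write_test "$tmp" 64k 16
wc -c < "$tmp"
rm -f "$tmp"
```

Running `htop` alongside, as requested above, shows how much of the A15 time the copy itself consumes.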
  • Hi Shravan,

    we intend to write two camera streams (1920x1080 60fps) to the SSD.

    On our side, we measured the write speed with your dd command and got only 150MBps over the two-lane PCIe GEN2 link.
    The A15 load goes from 30% to 50% on both cores (measured using htop).

    As we increase the count in the dd command, the write speed decreases, for example:
    bs=2M count=2k  - write speed ~140MBps
    bs=2M count=3k  - write speed ~124MBps
    bs=2M count=4k  - write speed ~116MBps

    When we try:
    bs=2M count=300  - the write speed varies between 103MBps and 256MBps. We ran the dd command six times with count=300 and this is what we got:

    256 MBps
    164 MBps
    211 MBps
    103 MBps
    233 MBps
    125 MBps

    We also observed that an L3 error appears when we invoke dd for the first time after EVM power-up:
    [   51.007506] omap_l3_noc 44000000.ocp: L3 application error: target 5 mod:1 (unclearable)
    [   51.016556] omap_l3_noc 44000000.ocp: L3 debug error: target 5 mod:1 (unclearable)

    Regards,
    Stefan
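
The spread of the six count=300 runs above is easier to judge as a mean with minimum and maximum. A minimal sketch, using only the six numbers from the post; the helper name `summarise` is made up for illustration:

```shell
#!/bin/sh
# Mean, minimum and maximum of whitespace-separated numbers on stdin.
summarise() {
    awk '{ for (i = 1; i <= NF; i++) {
               v = $i + 0; n++; sum += v
               if (n == 1 || v < min) min = v
               if (n == 1 || v > max) max = v } }
         END { if (n) printf "n=%d mean=%.1f min=%d max=%d\n", n, sum / n, min, max }'
}

# The six results reported for bs=2M count=300, in MBps.
echo "256 164 211 103 233 125" | summarise
# prints: n=6 mean=182.0 min=103 max=256
```

The slowest run is well under half the fastest, so single dd runs of this size say little on their own.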

  • Hi Stefan,

    So for this use case, the minimum desired throughput is 470MBps?
    Can I have the lspci -vv output on the PC? I essentially want to know the read-request and write size on the PC.
    It will also help if I can have the lspci -vv output on the EVM.

    Regards
    Shravan
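
The 470MBps figure can be reproduced from the stated use case of two 1920x1080 60fps streams. The thread does not give the pixel format, so the 2 bytes per pixel used below (as in a YUV 4:2:2 format) is an assumption, and `stream_bytes_per_sec` is a name made up for this sketch:

```shell
#!/bin/sh
# Raw data rate of N uncompressed video streams in bytes per second.
# $1 = width, $2 = height, $3 = frames per second,
# $4 = bytes per pixel (ASSUMPTION: 2, not stated in the thread),
# $5 = number of streams
stream_bytes_per_sec() {
    echo $(( $1 * $2 * $3 * $4 * $5 ))
}

rate=$(stream_bytes_per_sec 1920 1080 60 2 2)
echo "$rate B/s"                     # 497664000 B/s
echo "$(( rate / 1000000 )) MB/s"    # 497 MB/s  (decimal megabytes)
echo "$(( rate / 1048576 )) MiB/s"   # 474 MiB/s (binary megabytes)
```

With that assumption the requirement is about 474 MiB/s of sustained writes, which is close to the 470MBps quoted above.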

  • Hi Shravan,

    yes, the desired minimal throughput should be at least 470MBps.

    Please find attached below the files with the output of the lspci -vv command from the TDA2P with the SSD and from the PC with the SSD.

    Regards,
    Stefan
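
The dumps that follow are long; the fields relevant to the read-request and write size are the `MaxPayload`/`MaxReadReq` line under `DevCtl` and the `LnkSta` line. A small filter such as the one below pulls them out. This is only a sketch: `pcie_summary` is a hypothetical helper, and the pattern assumes the usual `lspci -vv` layout where device header lines start at column 0.

```shell
#!/bin/sh
# Reduce `lspci -vv` output (read from stdin) to the device header lines
# plus the configured payload/read-request sizes and the link status.
pcie_summary() {
    grep -E '^[0-9a-f]{2}:[0-9a-f]{2}\.[0-7] |MaxPayload [0-9]+ bytes, MaxReadReq|LnkSta:'
}

# On the target:  lspci -vv | pcie_summary
# Self-contained check using three lines taken from the EVM dump:
printf '%s\n' \
  '01:00.0 Non-Volatile memory controller: Intel Corporation Device 2700 (prog-if 02 [NVM Express])' \
  '        Latency: 0, Cache Line Size: 64 bytes' \
  '                MaxPayload 128 bytes, MaxReadReq 512 bytes' | pcie_summary
```

In the dumps, the NVMe endpoint on the PC reports MaxPayload 256 bytes on an 8GT/s x4 link, while on the TDA2P it reports MaxPayload 128 bytes behind a root port linked at 5GT/s x2.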

    lspci_pc.txt
    sudo lspci -vv
    00:00.0 Host bridge: Intel Corporation Intel Kaby Lake Host Bridge (rev 05)
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
    	Latency: 0
    	Capabilities: [e0] Vendor Specific Information: Len=10 <?>
    
    00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) (prog-if 00 [VGA controller])
    	DeviceName:  Onboard IGD
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0, Cache Line Size: 64 bytes
    	Interrupt: pin A routed to IRQ 131
    	Region 0: Memory at f6000000 (64-bit, non-prefetchable) [size=16M]
    	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
    	Region 4: I/O ports at f000 [size=64]
    	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
    	Capabilities: [40] Vendor Specific Information: Len=0c <?>
    	Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
    		DevCap:	MaxPayload 128 bytes, PhantFunc 0
    			ExtTag- RBE+
    		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
    			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
    			MaxPayload 128 bytes, MaxReadReq 128 bytes
    		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
    		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
    		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
    	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
    		Address: fee02004  Data: 4024
    	Capabilities: [d0] Power Management version 2
    		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [100 v1] #1b
    	Capabilities: [200 v1] Address Translation Service (ATS)
    		ATSCap:	Invalidate Queue Depth: 00
    		ATSCtl:	Enable-, Smallest Translation Unit: 00
    	Capabilities: [300 v1] #13
    	Kernel driver in use: i915
    	Kernel modules: i915
    
    00:14.0 USB controller: Intel Corporation 200 Series PCH USB 3.0 xHCI Controller (prog-if 30 [XHCI])
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0
    	Interrupt: pin A routed to IRQ 121
    	Region 0: Memory at f7130000 (64-bit, non-prefetchable) [size=64K]
    	Capabilities: [70] Power Management version 2
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
    		Address: 00000000fee80004  Data: 4021
    	Kernel driver in use: xhci_hcd
    
    00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0
    	Interrupt: pin A routed to IRQ 132
    	Region 0: Memory at f714d000 (64-bit, non-prefetchable) [size=4K]
    	Capabilities: [50] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
    		Address: 00000000fee04004  Data: 4023
    	Kernel driver in use: mei_me
    	Kernel modules: mei_me
    
    00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode] (prog-if 01 [AHCI 1.0])
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0
    	Interrupt: pin A routed to IRQ 122
    	Region 0: Memory at f7148000 (32-bit, non-prefetchable) [size=8K]
    	Region 1: Memory at f714c000 (32-bit, non-prefetchable) [size=256]
    	Region 2: I/O ports at f090 [size=8]
    	Region 3: I/O ports at f080 [size=4]
    	Region 4: I/O ports at f060 [size=32]
    	Region 5: Memory at f714b000 (32-bit, non-prefetchable) [size=2K]
    	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
    		Address: fee10004  Data: 4023
    	Capabilities: [70] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
    	Kernel driver in use: ahci
    	Kernel modules: ahci
    
    00:1c.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #5 (rev f0) (prog-if 00 [Normal decode])
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0, Cache Line Size: 64 bytes
    	Interrupt: pin A routed to IRQ 16
    	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
    	I/O behind bridge: 00002000-00002fff
    	Memory behind bridge: c8000000-c81fffff
    	Prefetchable memory behind bridge: 00000000c8200000-00000000c83fffff
    	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
    		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
    		DevCap:	MaxPayload 256 bytes, PhantFunc 0
    			ExtTag- RBE+
    		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
    			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
    			MaxPayload 128 bytes, MaxReadReq 128 bytes
    		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
    		LnkCap:	Port #5, Speed 8GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <4us
    			ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
    		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk-
    			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
    			Slot #0, PowerLimit 0.000W; Interlock- NoCompl+
    		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
    			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
    		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
    			Changed: MRL- PresDet- LinkState-
    		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
    		RootCap: CRSVisible-
    		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
    		DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Via WAKE# ARIFwd+
    		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
    		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
    			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
    			 Compliance De-emphasis: -6dB
    		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
    			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
    		Address: 00000000  Data: 0000
    	Capabilities: [90] Subsystem: ASUSTeK Computer Inc. Device 8694
    	Capabilities: [a0] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
    		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    	Kernel driver in use: pcieport
    	Kernel modules: shpchp
    
    00:1c.5 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #6 (rev f0) (prog-if 00 [Normal decode])
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0, Cache Line Size: 64 bytes
    	Interrupt: pin B routed to IRQ 17
    	Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
    	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
    		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
    		DevCap:	MaxPayload 256 bytes, PhantFunc 0
    			ExtTag- RBE+
    		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
    			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
    			MaxPayload 128 bytes, MaxReadReq 128 bytes
    		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
    		LnkCap:	Port #6, Speed 8GT/s, Width x1, ASPM not supported, Exit Latency L0s <1us, L1 <16us
    			ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
    		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
    			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
    		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
    			Slot #9, PowerLimit 10.000W; Interlock- NoCompl+
    		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
    			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
    		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
    			Changed: MRL- PresDet- LinkState+
    		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
    		RootCap: CRSVisible-
    		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
    		DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd+
    		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
    		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
    			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
    			 Compliance De-emphasis: -6dB
    		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
    			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
    		Address: 00000000  Data: 0000
    	Capabilities: [90] Subsystem: ASUSTeK Computer Inc. Device 8694
    	Capabilities: [a0] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
    		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [100 v1] Advanced Error Reporting
    		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
    		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
    		UESvrt:	DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
    		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
    		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
    		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    	Capabilities: [140 v1] Access Control Services
    		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
    		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
    	Capabilities: [220 v1] #19
    	Kernel driver in use: pcieport
    	Kernel modules: shpchp
    
    00:1d.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #9 (rev f0) (prog-if 00 [Normal decode])
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0, Cache Line Size: 64 bytes
    	Interrupt: pin A routed to IRQ 16
    	Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
    	Memory behind bridge: f7000000-f70fffff
    	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
    		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
    		DevCap:	MaxPayload 256 bytes, PhantFunc 0
    			ExtTag- RBE+
    		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
    			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
    			MaxPayload 256 bytes, MaxReadReq 128 bytes
    		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
    		LnkCap:	Port #9, Speed 8GT/s, Width x4, ASPM not supported, Exit Latency L0s <1us, L1 <16us
    			ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
    		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
    			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    		LnkSta:	Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
    		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
    			Slot #12, PowerLimit 25.000W; Interlock- NoCompl+
    		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
    			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
    		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
    			Changed: MRL- PresDet- LinkState+
    		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
    		RootCap: CRSVisible-
    		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
    		DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd+
    		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd+
    		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
    			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
    			 Compliance De-emphasis: -6dB
    		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
    			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
    		Address: 00000000  Data: 0000
    	Capabilities: [90] Subsystem: ASUSTeK Computer Inc. Device 8694
    	Capabilities: [a0] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
    		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [100 v1] Advanced Error Reporting
    		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
    		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
    		UESvrt:	DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
    		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
    		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
    		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
    	Capabilities: [140 v1] Access Control Services
    		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
    		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
    	Capabilities: [220 v1] #19
    	Kernel driver in use: pcieport
    	Kernel modules: shpchp
    
    00:1f.0 ISA bridge: Intel Corporation 200 Series PCH LPC Controller (B250)
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0
    
    00:1f.2 Memory controller: Intel Corporation 200 Series PCH PMC
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0
    	Region 0: Memory at f7144000 (32-bit, non-prefetchable) [size=16K]
    
    00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio
    	Subsystem: ASUSTeK Computer Inc. Device 86ca
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 32, Cache Line Size: 64 bytes
    	Interrupt: pin A routed to IRQ 133
    	Region 0: Memory at f7140000 (64-bit, non-prefetchable) [size=16K]
    	Region 4: Memory at f7120000 (64-bit, non-prefetchable) [size=64K]
    	Capabilities: [50] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
    		Address: 00000000fee08004  Data: 4023
    	Kernel driver in use: snd_hda_intel
    	Kernel modules: snd_hda_intel
    
    00:1f.4 SMBus: Intel Corporation 200 Series PCH SMBus Controller
    	Subsystem: ASUSTeK Computer Inc. Device 8694
    	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Interrupt: pin A routed to IRQ 11
    	Region 0: Memory at f714a000 (64-bit, non-prefetchable) [size=256]
    	Region 4: I/O ports at f040 [size=32]
    	Kernel modules: i2c_i801
    
    00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
    	Subsystem: ASUSTeK Computer Inc. Ethernet Connection (2) I219-V
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0
    	Interrupt: pin A routed to IRQ 123
    	Region 0: Memory at f7100000 (32-bit, non-prefetchable) [size=128K]
    	Capabilities: [c8] Power Management version 3
    		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
    	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    		Address: 00000000fee40004  Data: 4021
    	Capabilities: [e0] PCI Advanced Features
    		AFCap: TP+ FLR+
    		AFCtrl: FLR-
    		AFStatus: TP-
    	Kernel driver in use: e1000e
    	Kernel modules: e1000e
    
    02:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 04) (prog-if 00 [Normal decode])
    	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0, Cache Line Size: 64 bytes
    	Interrupt: pin A routed to IRQ 10
    	Bus: primary=02, secondary=03, subordinate=03, sec-latency=32
    	Secondary status: 66MHz+ FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
    	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
    		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
    		Address: 0000000000000000  Data: 0000
    	Capabilities: [78] Power Management version 3
    		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [80] Express (v1) PCI-Express to PCI/PCI-X Bridge, MSI 00
    		DevCap:	MaxPayload 128 bytes, PhantFunc 0
    			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+
    		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
    			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ BrConfRtry-
    			MaxPayload 128 bytes, MaxReadReq 512 bytes
    		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq- AuxPwr- TransPend-
    		LnkCap:	Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <2us
    			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
    		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
    			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
    	Capabilities: [c0] Subsystem: ASUSTeK Computer Inc. ASM1083/1085 PCIe to PCI Bridge
    	Capabilities: [100 v1] Virtual Channel
    		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
    		Arb:	Fixed- WRR32- WRR64- WRR128-
    		Ctrl:	ArbSelect=Fixed
    		Status:	InProgress-
    		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
    			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
    			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
    			Status:	NegoPending- InProgress-
    	Kernel modules: shpchp
    
    04:00.0 Non-Volatile memory controller: Intel Corporation Device 2700 (prog-if 02 [NVM Express])
    	Subsystem: Intel Corporation Device 3901
    	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    	Latency: 0, Cache Line Size: 64 bytes
    	Interrupt: pin A routed to IRQ 16
    	Region 0: Memory at f7010000 (64-bit, non-prefetchable) [size=16K]
    	Expansion ROM at f7000000 [disabled] [size=64K]
    	Capabilities: [40] Power Management version 3
    		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    	Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
    		Vector table: BAR=0 offset=00002000
    		PBA: BAR=0 offset=00003000
    	Capabilities: [60] Express (v2) Endpoint, MSI 00
    		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <4us
    			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
    		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
    			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
    			MaxPayload 256 bytes, MaxReadReq 512 bytes
    		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
    		LnkCap:	Port #0, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 unlimited
    			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
    		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
    			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
    		LnkSta:	Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
    		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
    		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
    			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
    			 Compliance De-emphasis: -6dB
    		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
    			 EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    	Capabilities: [100 v1] Advanced Error Reporting
    		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
    		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
    		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
    		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
    		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
    		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    	Capabilities: [150 v1] Virtual Channel
    		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
    		Arb:	Fixed- WRR32- WRR64- WRR128-
    		Ctrl:	ArbSelect=Fixed
    		Status:	InProgress-
    		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
    			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
    			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
    			Status:	NegoPending- InProgress-
    	Capabilities: [180 v1] Power Budgeting <?>
    	Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
    		ARICap:	MFVC- ACS-, Next Function: 0
    		ARICtl:	MFVC- ACS-, Function Group: 0
    	Capabilities: [270 v1] Device Serial Number 55-cd-2e-41-4e-3e-75-a0
    	Capabilities: [2a0 v1] #19
    	Kernel driver in use: nvme
    
    

    lspci_tda2p.txt
    root@dra7xx-evm:~# lspci -vv
    00:00.0 PCI bridge: Texas Instruments Device 8888 (rev 01) (prog-if 00 [Normal decode])
            Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
            Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
            Latency: 0, Cache Line Size: 64 bytes
            Interrupt: pin A routed to IRQ 456
            Region 0: Memory at 20100000 (32-bit, non-prefetchable) [size=1M]
            Region 1: Memory at 20020000 (32-bit, non-prefetchable) [size=64K]
            Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
            Memory behind bridge: 20200000-202fffff
            Prefetchable memory behind bridge: 20300000-203fffff
            Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
            BridgeCtl: Parity+ SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                    PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
            Capabilities: [40] Power Management version 3
                    Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
                    Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
            Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
                    Address: 00000000ae883000  Data: 0000
            Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
                    DevCap: MaxPayload 256 bytes, PhantFunc 0
                            ExtTag- RBE+
                    DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                            MaxPayload 128 bytes, MaxReadReq 512 bytes
                    DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                    LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
                            ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
                    LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk+
                            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                    LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
                    RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
                    RootCap: CRSVisible-
                    RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                    DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
                    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
                    LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                             Compliance De-emphasis: -6dB
                    LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
            Capabilities: [100 v2] Advanced Error Reporting
                    UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                    UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                    UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                    CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                    CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                    AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
            Kernel driver in use: pcieport
    
    01:00.0 Non-Volatile memory controller: Intel Corporation Device 2700 (prog-if 02 [NVM Express])
            Subsystem: Intel Corporation Device 3901
            Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
            Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
            Latency: 0, Cache Line Size: 64 bytes
            Interrupt: pin A routed to IRQ 488
            Region 0: Memory at 20200000 (64-bit, non-prefetchable) [size=16K]
            [virtual] Expansion ROM at 20300000 [disabled] [size=64K]
            Capabilities: [40] Power Management version 3
                    Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                    Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
            Capabilities: [50] MSI-X: Enable- Count=32 Masked-
                    Vector table: BAR=0 offset=00002000
                    PBA: BAR=0 offset=00003000
            Capabilities: [60] Express (v2) Endpoint, MSI 00
                    DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <4us
                            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                    DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                            MaxPayload 128 bytes, MaxReadReq 512 bytes
                    DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                    LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s, Exit Latency L0s <4us, L1 unlimited
                            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                    LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                    LnkSta: Speed 5GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                    DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                    LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                             Compliance De-emphasis: -6dB
                    LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
            Capabilities: [100 v1] Advanced Error Reporting
                    UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                    UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                    UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                    CESta:  RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                    CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                    AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
            Capabilities: [150 v1] Virtual Channel
                    Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                    Arb:    Fixed- WRR32- WRR64- WRR128-
                    Ctrl:   ArbSelect=Fixed
                    Status: InProgress-
                    VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                            Status: NegoPending- InProgress-
            Capabilities: [180 v1] Power Budgeting <?>
            Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
                    ARICap: MFVC- ACS-, Next Function: 0
                    ARICtl: MFVC- ACS-, Function Group: 0
            Capabilities: [270 v1] Device Serial Number 55-cd-2e-41-4e-3e-75-a0
            Capabilities: [2a0 v1] #19
            Kernel driver in use: nvme
            Kernel modules: nvme
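
    For reference, the link parameters in the lspci output above (LnkSta: Speed 5GT/s, Width x2; MaxPayload 128 bytes) put an upper bound on what any write test can reach. The sketch below estimates that bound. The 8b/10b factor applies to Gen1/Gen2 links. The 24 bytes of per-packet overhead (framing, sequence number, header, LCRC) is an assumed round figure, and flow-control and ACK traffic are ignored, so the result is a ceiling and not a prediction. All function names are illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* Raw data rate of a Gen1/Gen2 PCIe link in decimal MB/s.
 * gt_per_s is the per-lane signalling rate in GT/s (2.5 or 5.0).
 * 8b/10b encoding carries 8 data bits in every 10 transferred bits. */
double pcie_raw_mb_per_s(double gt_per_s, int lanes)
{
    double data_bits_per_s;

    assert(lanes > 0);
    data_bits_per_s = gt_per_s * 1e9 * 8.0 / 10.0;
    return data_bits_per_s / 8.0 / 1e6 * lanes;
}

/* Fraction of each TLP that is payload. overhead_bytes is an ASSUMED
 * per-packet cost (framing + sequence number + header + LCRC). */
double tlp_efficiency(int payload_bytes, int overhead_bytes)
{
    assert(payload_bytes > 0 && overhead_bytes >= 0);
    return (double)payload_bytes / (payload_bytes + overhead_bytes);
}

/* Prints the estimate for the link reported by lspci: 5GT/s, x2, MaxPayload 128. */
void print_link_estimate(void)
{
    double raw = pcie_raw_mb_per_s(5.0, 2);
    double eff = tlp_efficiency(128, 24);

    printf("raw %.0f MB/s, payload ceiling %.0f MB/s\n", raw, raw * eff);
}
```

    Under these assumptions, print_link_estimate() reports a raw rate of 1000 MB/s and a payload ceiling of about 842 MB/s for the Gen2 x2 link.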
    
    

  • Hi Stefan,

    As suspected, the dd numbers understate the actual throughput.

    I created a small app (written in C) that writes to a pre-allocated file on the SSD. With it I'm getting a throughput of 415MBps.

    Please follow the below steps:

    1. Create a file on the SSD with fallocate, using the command below (preallocating the file avoids I/O overhead during the write)

        fallocate -l 1G /run/media/nvme/test.raw # Creates 1GB file

    2. Copy the attached throughput.c file to the target and compile it there

        gcc throughput.c # creates the executable a.out

    3. Execute the script updatefreq.sh. This sets the scaling governor to performance mode, so the A15 runs at its maximum frequency

    4. Execute a.out.

    You will notice about 980MB being transferred in about 2.2s, which is roughly a throughput of 440MBps.
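
    The arithmetic behind that figure can be checked with a short sketch. It assumes a 4-byte int (as on the A15 target) and uses illustrative function names. It also keeps MiB (1,048,576 bytes) and decimal MB (1,000,000 bytes) apart, since the two differ by almost 5%.

```c
#include <assert.h>

/* Bytes written by throughput.c: `writes` calls, each writing an array
 * of `elems` elements of `elem_size` bytes. */
unsigned long long bytes_written(unsigned writes, unsigned long elems,
                                 unsigned elem_size)
{
    return (unsigned long long)writes * elems * elem_size;
}

/* Throughput in MiB/s (1 MiB = 1,048,576 bytes). */
double mib_per_s(unsigned long long bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / (1024.0 * 1024.0) / seconds;
}

/* Throughput in decimal MB/s (1 MB = 1,000,000 bytes). */
double mb_per_s(unsigned long long bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / 1e6 / seconds;
}
```

    bytes_written(256, 1000000, 4) gives 1,024,000,000 bytes, which is about 976.6 MiB. Over 2.2 s that works out to roughly 444 MiB/s, or about 465 MB/s in decimal units, so the 440MBps figure corresponds to the MiB reading.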

    Regards

    Shravan

    throughput.c.txt
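    #include <unistd.h>
    #include <fcntl.h>
    #include <time.h>
    #include <stdio.h>

    /* 1,000,000 ints (4,000,000 bytes with a 4-byte int); static so the
       buffer does not sit on the stack */
    static int arr[1000000];

    int main(void)
    {
        int val = 0;
        clock_t t;
        double time_taken;
        /* test.raw must already exist (created with fallocate) */
        int filedesc = open("/media/nvme/test.raw", O_WRONLY);

        if (filedesc < 0) {
            perror("open");
            return 1;
        }

        /* clock() counts CPU time used by this process, not elapsed time */
        t = clock();
        /* 256 writes of 4,000,000 bytes = 1,024,000,000 bytes (~976.6 MiB) */
        while (val < 256) {
            if (write(filedesc, arr, sizeof(arr)) < 0) {
                perror("write");
                close(filedesc);
                return 1;
            }
            val++;
        }
        t = clock() - t;

        time_taken = ((double)t) / CLOCKS_PER_SEC;
        printf("fun() took %f seconds to execute \n", time_taken);
        close(filedesc);
        return 0;
    }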
    

    updatefreq.sh.txt
    echo "performance" > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
    

  • Hi Shravan,

    we followed the steps you described in your previous reply and compiled your test application.
    We only get about 290MB/s write speed.
    We also tested your application with 2GB and 3GB file sizes (we modified your example and created both files with fallocate),
    and the write speeds are again about 290MB/s.

    We noticed that the test application uses clock() to measure time. As far as we know, clock() measures only the CPU time
    used by the process (almost all of it system time here), not elapsed time. When we invoke your test application with the time command, we get:

    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.528974 seconds to execute
    real    0m 5.63s
    user    0m 0.00s
    sys     0m 3.53s

    Should the speed calculation use the real time, which is 5.63s in this case?

    We also wrote our own test application (very similar to yours), in which we use the gettimeofday function to measure time.
    Is it correct to use gettimeofday to measure elapsed time, instead of clock()?
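
    Both gettimeofday and clock_gettime report elapsed time rather than CPU time. For measuring intervals, clock_gettime with CLOCK_MONOTONIC is generally preferred because it is not affected by adjustments to the system clock. A minimal sketch of a wall-clock timed write loop follows; the function names are illustrative and are not taken from either test application.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Seconds between two CLOCK_MONOTONIC readings. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Writes buf to fd `count` times and returns the elapsed wall-clock
 * time in seconds, or -1.0 if a write fails. */
double timed_writes(int fd, const void *buf, size_t len, int count)
{
    struct timespec start, end;
    int i;

    assert(buf != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < count; i++) {
        if (write(fd, buf, len) < 0)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_seconds(start, end);
}
```

    The returned value corresponds to the "real" line of the time command for the write loop alone, which is the figure a throughput calculation should divide by.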

    Regards,
    Stefan

  • Hi Stefan,


    You're right, clock() only measures CPU time (system time in this case), whereas we should be measuring real time. I ran the application under the 'time' command-line utility, which gives both system and real time, and I've posted the results below.

    Please note: delete the existing test.raw file first, and then allocate a new test.raw file with fallocate. fallocate reserves the blocks in the FS up front, which reduces the I/O overhead during the write. With the scaling governor set to performance, I consistently achieve a throughput of 420-430MBps (the application writes 976.6MB of data; I ran it 5 times and each run was in that range).


    Regards

    Shravan

  • Hi Shravan,

    we repeated the write test on two boards.
    The first board is TDA2P EVA and the second is our custom board based on TDA2P.

    We tested the write speed seven times on each board. The results are attached below.
    We observed that the TDA2P EVA is slightly faster than our custom board, but the real
    time is still longer than yours.

    We also have concerns about the use of fallocate, because we are going to write video streams
    to the SSD without knowing how big the resulting file will be.
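
    One possible way to keep preallocation when the final size is unknown is to reserve space in fixed chunks as the stream grows and to trim the unused tail when recording stops. The sketch below uses posix_fallocate and ftruncate for this. The 256 MiB chunk size and the function names are assumptions for illustration, and whether chunked reservation keeps the throughput benefit seen with a single fallocate call has not been measured in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit targets (glibc) */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical chunk size: space is reserved 256 MiB at a time. */
#define PREALLOC_CHUNK ((off_t)256 * 1024 * 1024)

/* Ensures blocks are reserved up to at least `needed` bytes.
 * *reserved tracks how far the file has been preallocated so far.
 * Returns 0 on success or the error number from posix_fallocate(). */
int reserve_up_to(int fd, off_t needed, off_t *reserved)
{
    assert(reserved != NULL);
    while (*reserved < needed) {
        int err = posix_fallocate(fd, *reserved, PREALLOC_CHUNK);
        if (err != 0)
            return err;
        *reserved += PREALLOC_CHUNK;
    }
    return 0;
}

/* Trims the unused part of the last chunk when recording stops.
 * posix_fallocate() extends the file size, so the tail must be cut off. */
int finish_recording(int fd, off_t bytes_written)
{
    return ftruncate(fd, bytes_written);
}
```

    The writer would call reserve_up_to() before each write with the offset that write will reach, and finish_recording() once with the total number of bytes actually written.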

    We are going to use our custom TDA2P board in our final product.

    Is there a way to read out the SOC revision on TDA2P? Maybe we are using different revisions of SOCs.

    Regards,
    Stefan



    tda2p_eva_write_speed.txt
    // TDA2P EVA
    
    root@dra7xx-evm:~/ssd_write_test# echo "performance" > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.263484 seconds to execute
    real    0m 3.44s
    user    0m 0.00s
    sys     0m 2.26s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.293916 seconds to execute
    real    0m 3.04s
    user    0m 0.00s
    sys     0m 2.29s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.279101 seconds to execute
    real    0m 3.55s
    user    0m 0.00s
    sys     0m 2.28s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.300654 seconds to execute
    real    0m 3.53s
    user    0m 0.00s
    sys     0m 2.30s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.291808 seconds to execute
    real    0m 3.42s
    user    0m 0.00s
    sys     0m 2.29s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.247803 seconds to execute
    real    0m 3.93s
    user    0m 0.00s
    sys     0m 2.25s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 2.286498 seconds to execute
    real    0m 3.50s
    user    0m 0.00s
    sys     0m 2.29s
    
    

    tda2p_custom_write_speed.txt
    // TDA2P CUSTOM
    
    root@dra7xx-evm:~/ssd_write_test# echo "performance" > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    [   51.006279] omap_l3_noc 44000000.ocp: L3 application error: target 5 mod:1 (unclearable)
    [   51.014428] omap_l3_noc 44000000.ocp: L3 debug error: target 5 mod:1 (unclearable)
    fun() took 3.464953 seconds to execute
    real    0m 4.60s
    user    0m 0.00s
    sys     0m 3.47s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.498544 seconds to execute
    real    0m 4.71s
    user    0m 0.02s
    sys     0m 3.48s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.482949 seconds to execute
    real    0m 4.54s
    user    0m 0.00s
    sys     0m 3.49s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.445009 seconds to execute
    real    0m 4.66s
    user    0m 0.00s
    sys     0m 3.45s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.447198 seconds to execute
    real    0m 4.44s
    user    0m 0.00s
    sys     0m 3.45s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.501788 seconds to execute
    real    0m 5.06s
    user    0m 0.02s
    sys     0m 3.49s
    root@dra7xx-evm:~/ssd_write_test# rm /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# fallocate -l 1G /run/media/nvme0n1/test.raw
    root@dra7xx-evm:~/ssd_write_test# time ./a.out
    fun() took 3.433157 seconds to execute
    real    0m 5.77s
    user    0m 0.00s
    sys     0m 3.44s
    
    

  • Hi Shravan,

    is there any chance you could provide us with your whole SD card image (boot partition and rootfs partition)?

    We want to be 100% sure that we have the same software as you do.

    If you cannot upload it to the forum, we can provide you with upload links to our servers. We would like to confirm that our HW is functioning correctly, and to check for possible differences in SW.

    Regards,
    Stefan

  • Hi Stefan,

    Sorry about the delayed response. I ran the use-cases below and observed the following results (note that I'm allocating 2GB files, since files larger than 2GB fail due to 32-bit addressing on J6):

    1. Fallocate one 2GB file, write to the file -- throughput is ~420-450MBps
    2. Fallocate one 2GB file, write to the file, run sync (to flush the cache) -- throughput is ~380-410MBps
    3. Fallocate five 2GB files, write to the files -- throughput is ~340-380MBps

    So while fallocate does improve throughput (without fallocate, throughput is < 290MBps), there could be issues w.r.t. caching of buffers.
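
    The gap between use-cases 1 and 2 suggests that part of the data is still in the page cache when the write loop returns. A test program can account for this by flushing inside the timed region. The sketch below is one way to do that with fsync; the function names are illustrative, and retry on EINTR is left out for brevity.

```c
#include <assert.h>
#include <unistd.h>

/* Writes exactly len bytes, continuing after short writes.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Writes the buffer `count` times, then blocks in fsync() until the
 * kernel has pushed the file's dirty pages to the device. Timing this
 * whole call includes the flush, so the result reflects the device and
 * not only the page cache. Returns 0 on success, -1 on error. */
int write_and_flush(int fd, const char *buf, size_t len, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        if (write_all(fd, buf, len) != 0)
            return -1;
    }
    return fsync(fd);
}
```

    Timing write_and_flush() with a wall-clock timer should give numbers comparable to use-case 2 rather than use-case 1.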

    Can you provide the RAM size on the board (my EVM has 4GB ram)? Can you also run the below command to get the frequency at which your board runs?

    omapconf show opp

    I think there may be certain bottlenecks that need further inspection. I will keep you posted if I find anything.

    Regards
    Shravan
  • Hi Shravan,

    we no longer have the TDA2P EVM board. From now on we only use our custom TDA2P based board.
    Our board is designed based on TDA2P EVM.
    We have 4GB of RAM (2GB connected to EMIF1 and 2GB connected to EMIF2).

    From Linux we are able to see only ~520MB of RAM (533244 kB).
    We didn't modify the memory map segment definition in Vision SDK.

    The output of omapconf is attached below.

    Regards,
    Stefan

    omapconf.txt
    root@dra7xx-evm:~# omapconf show opp
    OMAPCONF (rev v1.73-17-g578778b built Thu Dec 28 05:15:12 IST 2017)
    
    HW Platform:
      Generic DRA74X (Flattened Device Tree)
      DRA76X ES1.0 GP Device (STANDARD performance (1.0GHz))
    Error: I2C Read failed
    Error: I2C Read failed
    Error: I2C Read failed
      UNKNOWN POWER IC
    
    SW Build Details:
      Build:
        Version:  _                    _                    
      Kernel:
        Version: 4.4.84
        Author: rtrk@rtrkn187-lin
        Toolchain: gcc version 5.3.1 20160113 (Linaro GCC 5.3-2016.02)
        Type: #53 SMP PREEMPT
        Date: Thu Jul 12 13:17:48 CEST 2018
    
    |-----------------------------------------------------------------------------------|
    |                        | Temperature | Voltage | Frequency      | OPerating Point |
    |-----------------------------------------------------------------------------------|
    | VDD_CORE / VDD_CORE0   | 40C / 104F  | NA      |                | NOM             |
    |   L3                   |             |         |  266  MHz      |                 |
    |   DMM                  |             |         |  266  MHz      |                 |
    |   EMIF1                |             |         |  266  MHz      |                 |
    |   EMIF2                |             |         |  266  MHz      |                 |
    |     LP-DDR2            |             |         |  666  MHz      |                 |
    |   L4                   |             |         |  266  MHz      |                 |
    |   IPU1                 |             |         | (2128 MHz) (1) |                 |
    |     Cortex-M4 Cores    |             |         | (1064 MHz) (1) |                 |
    |   IPU2                 |             |         |  2128 MHz      |                 |
    |     Cortex-M4 Cores    |             |         |  1064 MHz      |                 |
    |   DSS                  |             |         |  192  MHz      |                 |
    |   BB2D                 |             |         | (2128 MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_MPU / VDD_CORE1    | 42C / 107F  | NA      |                | NOM             |
    |   MPU (CPU1 ON)        |             |         |  1000 MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_GPU / VDD_CORE2    | 41C / 105F  | NA      |                | HIGH            |
    |   GPU                  |             |         |  532  MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_DSPEVE / VDD_CORE3 | 40C / 104F  | NA      |                | NOM             |
    |   DSP1                 |             |         |  850  MHz      |                 |
    |   DSP2                 |             |         |  850  MHz      |                 |
    |   EVE1                 |             |         |  535  MHz      |                 |
    |   EVE2                 |             |         |  535  MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_IVA / VDD_CORE4    | 41C / 105F  | NA      |                | HIGH            |
    |   IVA                  |             |         |  532  MHz      |                 |
    |                        |             |         |                |                 |
    |-----------------------------------------------------------------------------------|
    
    Notes:
      (1) Module is disabled, rate may not be relevant.
    

  • Hi Stefan,

    I'm not sure I understand what you mean by 'from Linux we are able to see only ~520MB RAM'. Are you adding a boot argument to the kernel such as mem=512M? If yes, is there a particular reason? The disparity in numbers could be because of caching (in our case high-mem free is 3.2GB and low-mem free is 600MB). Could you send the output of the command below?

    cat /proc/meminfo


    Also, can you send the omapconf show opp output after running the updatefreq.sh script? I want to see what the ARM frequency is when the scaling governor is set to performance.
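
    If the memory figures need to be logged from inside a test program rather than copied by hand, the fields of /proc/meminfo can be read with a few lines of C. This is a minimal sketch with illustrative function names; it assumes the usual "Key:   value kB" line layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parses one /proc/meminfo line such as "MemTotal:  534016 kB".
 * Returns 1 and stores the value in *kb if the line starts with `key`
 * immediately followed by ':', otherwise returns 0. */
int parse_meminfo_line(const char *line, const char *key, unsigned long *kb)
{
    size_t klen;

    assert(line != NULL && key != NULL && kb != NULL);
    klen = strlen(key);
    if (strncmp(line, key, klen) != 0 || line[klen] != ':')
        return 0;
    return sscanf(line + klen + 1, "%lu", kb) == 1;
}

/* Looks up `key` in /proc/meminfo; returns the value in kB, or -1 if
 * the file cannot be opened or the key is not present. */
long meminfo_kb(const char *key)
{
    char line[128];
    unsigned long kb;
    long result = -1;
    FILE *f = fopen("/proc/meminfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_meminfo_line(line, key, &kb)) {
            result = (long)kb;
            break;
        }
    }
    fclose(f);
    return result;
}
```

    Calling meminfo_kb("MemTotal"), meminfo_kb("HighFree") and meminfo_kb("LowFree") before each write test would record the same values that are being compared between the two boards here.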

    Regards
    Shravan

  • Hi Shravan,

    'from Linux we are able to see only ~520MB RAM' means that when we run the htop or top command, it shows that 520MB of RAM is available.

    We use the default uEnv.txt file provided by VisionSDK and we didn't modify the default boot arguments. The default boot arguments look as follows:
    args_mmc=part uuid mmc 0:2 uuid; setenv bootargs "console=ttyS0,115200n8 vram=16M root=PARTUUID=${uuid} rw rootwait ip=none mem=1024M"

    The contents of meminfo and the output of omapconf show opp are attached below.

    Regards,
    Stefan

    meminfo.txt
    root@dra7xx-evm:~# cat /proc/meminfo
    MemTotal:         534016 kB
    MemFree:          283484 kB
    MemAvailable:     410144 kB
    Buffers:            8320 kB
    Cached:           115016 kB
    SwapCached:            0 kB
    Active:            55588 kB
    Inactive:          80452 kB
    Active(anon):      14092 kB
    Inactive(anon):     6856 kB
    Active(file):      41496 kB
    Inactive(file):    73596 kB
    Unevictable:           0 kB
    Mlocked:               0 kB
    HighTotal:        259072 kB
    HighFree:         158116 kB
    LowTotal:         274944 kB
    LowFree:          125368 kB
    SwapTotal:             0 kB
    SwapFree:              0 kB
    Dirty:                 0 kB
    Writeback:             0 kB
    AnonPages:         12700 kB
    Mapped:            15016 kB
    Shmem:              8252 kB
    Slab:              30376 kB
    SReclaimable:      19512 kB
    SUnreclaim:        10864 kB
    KernelStack:        1016 kB
    PageTables:          596 kB
    NFS_Unstable:          0 kB
    Bounce:                0 kB
    WritebackTmp:          0 kB
    CommitLimit:      267008 kB
    Committed_AS:     175240 kB
    VmallocTotal:     245760 kB
    VmallocUsed:           0 kB
    VmallocChunk:          0 kB
    CmaTotal:         204800 kB
    CmaFree:          128800 kB
    

    omap_conf.txt
    root@dra7xx-evm:~# omapconf show opp
    OMAPCONF (rev v1.73-17-g578778b built Thu Aug 31 13:16:54 IST 2017)
    
    HW Platform:
      Generic DRA74X (Flattened Device Tree)
      DRA76X ES1.0 GP Device (STANDARD performance (1.0GHz))
    Error: I2C Read failed
    Error: I2C Read failed
    Error: I2C Read failed
      UNKNOWN POWER IC
    
    SW Build Details:
      Build:
        Version:  _                    _                    
      Kernel:
        Version: 4.4.84
        Author: root@rtrkw850-lin
        Toolchain: gcc version 5.3.1 20160113 (Linaro GCC 5.3-2016.02)
        Type: #4 SMP PREEMPT
        Date: Wed Jul 11 16:14:52 CEST 2018
    
    |-----------------------------------------------------------------------------------|
    |                        | Temperature | Voltage | Frequency      | OPerating Point |
    |-----------------------------------------------------------------------------------|
    | VDD_CORE / VDD_CORE0   | 43C / 109F  | NA      |                | NOM             |
    |   L3                   |             |         |  266  MHz      |                 |
    |   DMM                  |             |         |  266  MHz      |                 |
    |   EMIF1                |             |         |  266  MHz      |                 |
    |   EMIF2                |             |         |  266  MHz      |                 |
    |     LP-DDR2            |             |         |  666  MHz      |                 |
    |   L4                   |             |         |  266  MHz      |                 |
    |   IPU1                 |             |         | (2128 MHz) (1) |                 |
    |     Cortex-M4 Cores    |             |         | (1064 MHz) (1) |                 |
    |   IPU2                 |             |         |  2128 MHz      |                 |
    |     Cortex-M4 Cores    |             |         |  1064 MHz      |                 |
    |   DSS                  |             |         |  192  MHz      |                 |
    |   BB2D                 |             |         | (2128 MHz) (1) |                 |
    |                        |             |         |                |                 |
    | VDD_MPU / VDD_CORE1    | 44C / 111F  | NA      |                | NOM             |
    |   MPU (CPU1 ON)        |             |         |  1000 MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_GPU / VDD_CORE2    | 44C / 111F  | NA      |                | HIGH            |
    |   GPU                  |             |         |  532  MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_DSPEVE / VDD_CORE3 | 43C / 109F  | NA      |                | NOM             |
    |   DSP1                 |             |         |  850  MHz      |                 |
    |   DSP2                 |             |         |  850  MHz      |                 |
    |   EVE1                 |             |         |  535  MHz      |                 |
    |   EVE2                 |             |         |  535  MHz      |                 |
    |                        |             |         |                |                 |
    | VDD_IVA / VDD_CORE4    | 45C / 113F  | NA      |                | HIGH            |
    |   IVA                  |             |         |  532  MHz      |                 |
    |                        |             |         |                |                 |
    |-----------------------------------------------------------------------------------|
    
    Notes:
      (1) Module is disabled, rate may not be relevant.