We are seeing sporadic corruption of SATA read data on the TMS320C6748, both on a custom circuit card assembly (CCA) and on the TMS320C6748 LCDK. In both cases the C6748 interfaces with a SATA II drive; we have tried several drives, including an M.2 SATA II solid-state drive and a WD SATA II HDD.
We are using the following software packages:
- XDC Tools 3.25.6.96
- SYS/BIOS 06.37.3.30 (includes FatFS R0.08a)
- C6748 PDK 2.00.00.00 (includes BIOSPSP 3.00.01.00 and NSP 1.10.00.03)
- EDMA 02.11.14.18
- DSPLIB 3.4.0.0
- MATHLIB 3.1.0.0
- C6000 Code Generation Tools (CGT) 7.4.21
- C6748 StarterWare 1.20.04.01
- SATA driver and Block Media driver from BIOSPSP 1.30.00.05
We are using Code Composer Studio 7.1.0.00016.
Some data points of interest:
- We see the same behavior on our target CCA hosting the TMS320C6748 and on the L138/C6748 Development Kit (LCDK, board rev A6). We originally suspected layout issues on our target, but once the same behavior appeared on the LCDK, we considered our target layout validated.
- When a sector read of the M.2 SATA II solid-state drive returns corrupted data, an external SATA analyzer connected in series (Teledyne LeCroy Sierra M6-2) shows the correct data on the link, even though our software application on the C6748 receives corrupted data.
- We have a software workaround in place that reads the same sector multiple times, which seems to alleviate the corruption. We added it at the Block Media layer, at the point where a SATA sector read is commanded. We re-read the sector until two successive reads return the same contents. Most of the time, the first 2 reads return the same data (i.e. no corruption). However, roughly 0.028% of the time a 3rd read is required (i.e. the data returned by the first 2 reads doesn’t match, but the 3rd read matches the 2nd). This works out to 280 instances out of 1,000,000 sector reads.
- If we extend the workaround to the SATA write path (i.e. multiple reads following a SATA write to validate the data written), we see very rare cases in which 2 successive corroborating reads do not match the data written. In this case, we perform the write again and validate it again. We have seen this at a rate of 0.0013% (i.e. 13 instances out of 1,000,000 sector writes).
- The M.2 SATA SSD and WD SATA HDD work perfectly when mounted by Linux, so the drives themselves are considered validated.
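For anyone trying to reproduce this, the read and write workarounds described in the list above can be sketched roughly as follows. This is a minimal illustration and not our actual Block Media code: the callback types `sector_read_fn` and `sector_write_fn`, the functions `read_sector_verified` and `write_sector_verified`, the 512-byte `SECTOR_SIZE`, and the retry limits are all hypothetical names and values chosen for the example. The `fake_drive` helpers are only a host-side stand-in for the SATA driver so the logic can be exercised without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u /* assumed sector size for this sketch */

/* Hypothetical single-sector driver hooks; both return 0 on success. */
typedef int (*sector_read_fn)(uint32_t lba, uint8_t *buf, void *ctx);
typedef int (*sector_write_fn)(uint32_t lba, const uint8_t *buf, void *ctx);

/* Re-read one sector until two successive reads return identical contents.
 * Returns the number of reads issued (2 or more), or -1 on a driver error
 * or if no two successive reads agree within max_reads. */
int read_sector_verified(sector_read_fn rd, void *ctx, uint32_t lba,
                         uint8_t *out, int max_reads)
{
    uint8_t prev[SECTOR_SIZE];
    int reads;

    assert(rd != NULL && out != NULL);
    if (max_reads < 2 || rd(lba, prev, ctx) != 0)
        return -1;
    for (reads = 2; reads <= max_reads; reads++) {
        if (rd(lba, out, ctx) != 0)
            return -1;
        if (memcmp(prev, out, SECTOR_SIZE) == 0)
            return reads;               /* two successive reads agree */
        memcpy(prev, out, SECTOR_SIZE); /* compare the next read to this one */
    }
    return -1;
}

/* Write one sector, then validate it with corroborated reads; rewrite if the
 * corroborated data differs from what was written.
 * Returns the number of writes issued, or -1 on failure. */
int write_sector_verified(sector_write_fn wr, sector_read_fn rd, void *ctx,
                          uint32_t lba, const uint8_t *data,
                          int max_reads, int max_writes)
{
    uint8_t check[SECTOR_SIZE];
    int attempt;

    assert(wr != NULL && data != NULL);
    for (attempt = 1; attempt <= max_writes; attempt++) {
        if (wr(lba, data, ctx) != 0)
            return -1;
        if (read_sector_verified(rd, ctx, lba, check, max_reads) < 0)
            return -1;
        if (memcmp(check, data, SECTOR_SIZE) == 0)
            return attempt;             /* read-back matches the written data */
    }
    return -1;
}

/* Host-side stand-in for the drive: one sector of storage, with an optional
 * single corrupted read to mimic the failure we observe. */
struct fake_drive {
    uint8_t media[SECTOR_SIZE];
    int read_count;
    int corrupt_on_read; /* 1-based index of the read to corrupt, 0 = never */
};

int fake_read(uint32_t lba, uint8_t *buf, void *ctx)
{
    struct fake_drive *d = ctx;
    (void)lba;
    memcpy(buf, d->media, SECTOR_SIZE);
    if (++d->read_count == d->corrupt_on_read)
        buf[0] ^= 0xFFu; /* flip bits in the returned copy only */
    return 0;
}

int fake_write(uint32_t lba, const uint8_t *buf, void *ctx)
{
    struct fake_drive *d = ctx;
    (void)lba;
    memcpy(d->media, buf, SECTOR_SIZE);
    return 0;
}
```

With `corrupt_on_read` set to 1, the first two reads disagree and the third matches the second, which is the 3-read case we see on roughly 0.028% of sector reads.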
We were suspicious of some of the Port PHY Control Register fields (e.g. RXEQ, RXCDR, RXTERM), but we have not found a combination of values that makes any appreciable difference in the corruption rate. Our current values are as follows:
RXEQ: 0x1 (Adaptive)
LB: 0x1 (Ultra High Bandwidth)
RXCDR: 0x6 (First order, threshold of 1 with fast lock)
RXTERM: 0x1 (Common point set to 0.8 VDDA)
Any assistance or direction would be appreciated.