Hi,
We have a C6678 talking to a Xilinx Virtex-6 over PCIe and have a curious problem. There are several kinds of PCIe operations involved: the DSP reading blocks of roughly 4 KB in bursts; the DSP writing 600-byte blocks in bursts; and the FPGA raising an MSI interrupt, which underneath is a 4-byte payload write to the DSP. (The block transfers are done as EDMA operations on 128-byte DBS channels, and are therefore broken into 128-byte packets by the PCIe peripheral.)
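To put numbers on the packet split: a minimal sketch, assuming an exactly 4096-byte read (the real blocks are only roughly that size), aligned buffers, and no further splitting at either end. The helper names are hypothetical and are not part of any TI or Xilinx API.

```c
#include <stddef.h>

/* Hypothetical helper: number of packets needed to move `bytes`
 * when each packet carries at most `max_payload` bytes. */
static size_t packets_for_transfer(size_t bytes, size_t max_payload)
{
    if (max_payload == 0) {
        return 0; /* guard against division by zero */
    }
    return (bytes + max_payload - 1) / max_payload; /* ceiling division */
}

/* Hypothetical helper: payload size of the final packet in the transfer. */
static size_t last_packet_bytes(size_t bytes, size_t max_payload)
{
    size_t rem;

    if (bytes == 0 || max_payload == 0) {
        return 0;
    }
    rem = bytes % max_payload;
    return rem ? rem : max_payload;
}
```

Under those assumptions a 4096-byte read is 32 packets of 128 bytes, and a 600-byte write is 5 packets: four of 128 bytes and a final one of 88 bytes.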
If we do only one thing at a time (read, write or MSI) and make sure they don't overlap in time, everything works fine. There is no evidence of link-level corruption and we have plenty of bandwidth.
If operations overlap, especially if MSIs arrive while read or write operations are in progress, then things start to fall over. The MSIs arrive every 40 usec, and the problems typically take only a few dozen to a few hundred of these periods to appear. Each MSI needs to be acknowledged with a 4-byte write to a memory-mapped register on the FPGA; otherwise they stop coming. So this is also a potential source of parallel transactions.
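Since a lost MSI shows up only as silence, one way to pin down when the stream stops is a small software watchdog on the DSP. This is a minimal sketch, not our actual code: the struct, the function names, and the assumption of a free-running 32-bit microsecond timestamp are all hypothetical. Only the 40 usec period comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSI_PERIOD_US 40u /* nominal MSI period, from the description above */

/* Hypothetical bookkeeping for the MSI stream. */
typedef struct {
    uint32_t last_msi_us; /* timestamp of the most recent MSI */
    uint32_t msi_count;   /* total MSIs seen so far */
} msi_watch_t;

/* Call from the MSI handler with the current timestamp in microseconds. */
static void msi_watch_on_interrupt(msi_watch_t *w, uint32_t now_us)
{
    w->last_msi_us = now_us;
    w->msi_count++;
}

/* Returns true if more than `max_missed_periods` nominal periods have
 * passed since the last MSI. Unsigned subtraction keeps the comparison
 * correct across a single wrap of the 32-bit timestamp. */
static bool msi_watch_stalled(const msi_watch_t *w, uint32_t now_us,
                              uint32_t max_missed_periods)
{
    uint32_t elapsed = now_us - w->last_msi_us;
    return elapsed > max_missed_periods * MSI_PERIOD_US;
}
```

Polling `msi_watch_stalled()` from a background loop and logging `msi_count` at the moment it first returns true would give the index of the last MSI that arrived, which could then be lined up against whatever the FPGA side recorded.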
The symptoms vary: sometimes we get Completion Timeout uncorrectable errors; more often there is no PCIe error at all but the MSIs just stop coming, possibly because one of them was not received. If we shorten the frame lengths, things work for longer but the problems don't go away.
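For anyone comparing symptoms, this is a minimal sketch of decoding a raw Uncorrectable Error Status value, assuming the standard bit layout of the PCIe Advanced Error Reporting capability. The helper name is hypothetical. How the raw value is read from the C6678 is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions in the PCIe AER Uncorrectable Error Status register,
 * as laid out in the PCI Express Base Specification. */
typedef struct {
    unsigned bit;
    const char *name;
} aer_bit_t;

static const aer_bit_t AER_UNCORR_BITS[] = {
    { 4,  "Data Link Protocol Error" },
    { 5,  "Surprise Down" },
    { 12, "Poisoned TLP" },
    { 13, "Flow Control Protocol Error" },
    { 14, "Completion Timeout" },
    { 15, "Completer Abort" },
    { 16, "Unexpected Completion" },
    { 17, "Receiver Overflow" },
    { 18, "Malformed TLP" },
    { 19, "ECRC Error" },
    { 20, "Unsupported Request" },
};

/* Hypothetical helper: stores the names of up to `max` set error bits in
 * `out` and returns the total number of known error bits that are set. */
static size_t aer_uncorr_decode(uint32_t status, const char **out, size_t max)
{
    size_t n = sizeof AER_UNCORR_BITS / sizeof AER_UNCORR_BITS[0];
    size_t found = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (status & (UINT32_C(1) << AER_UNCORR_BITS[i].bit)) {
            if (found < max) {
                out[found] = AER_UNCORR_BITS[i].name;
            }
            found++;
        }
    }
    return found;
}
```

With this layout, the Completion Timeout case corresponds to bit 14 of the status word. Logging the decoded names alongside a timestamp each time an error is flagged would show whether anything other than Completion Timeout ever appears.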
The DSP and FPGA are on the same custom PCB and there are no other devices on the bus. The FPGA's PCIe interface is based on Xilinx's PCIe core but provides a memory-mapped view into the output of a signal processing chain and some registers, i.e. it is not simply "RAM on a stick" as in the Xilinx example application. We also support burst transactions, which the Xilinx demo application did not. So if we tried to go "back to the demo" we'd have to rewrite our design not to use burst transactions, and the data rates would be a lot lower, so that isn't an attractive option.
We do not know if the problems are on the DSP end, the FPGA end, or some interaction of the two. We cannot yet reproduce the problem in simulation on the FPGA.
So my questions are:
(a) has anyone seen something like this before?
(b) are there any settings on the DSP side that might help, or at least help diagnose the issue?
(c) we have a Blackhawk XDS560v2 System Trace pod, and I have been wondering if it could give us a detailed trace of PCIe transactions, which might help construct a simulation case to break the FPGA core. Can we do this, and if so, how do I get that kind of capture?
Thanks in advance,
Gordon