Greetings!
Thought I'd put a general SRIO question out there to the experts at large. First, my specifics for background:
- custom board with TMS320C6678 connected to an FPGA
- simple point to point link, no other devices connected
- 4 SRIO lanes (channels) at 3.125 Gbps each
- DSP is master, FPGA is slave. We only use NREAD and NWRITE
- payload size is on the order of 8 KB in both directions
- the TX and RX data are independent, there are no system level data concurrency requirements
- the link and the data passing are all working fine; it's only the bandwidth that is an issue
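For reference, here is a rough sketch of the per-direction ceiling I am comparing against. It assumes the 4 channels are lanes of a single 4x port, 8b/10b line coding, and a 256-byte maximum payload per SRIO packet; it ignores header, CRC and control symbol overhead, so real numbers will be lower. The function names are just for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed, not measured: 8b/10b line coding and a 256-byte maximum
 * payload per SRIO packet. Packet header, CRC and control symbol
 * overhead are ignored, so this is an upper bound only. */
enum {
    LANES          = 4,     /* from the setup above, assumed to be one 4x port */
    LANE_MBAUD     = 3125,  /* 3.125 Gbaud per lane */
    MAX_PAYLOAD    = 256,   /* assumed maximum payload bytes per packet */
    TRANSFER_BYTES = 8192   /* roughly 8 KB per transfer */
};

/* Data rate per direction in Mbps after 8b/10b coding
 * (8 data bits carried per 10 line bits). */
long link_data_rate_mbps(long lanes, long lane_mbaud)
{
    assert(lanes > 0 && lane_mbaud > 0);
    return lanes * lane_mbaud * 8 / 10;
}

/* Number of packets needed to move `bytes`, rounding up. */
long packets_per_transfer(long bytes, long max_payload)
{
    assert(bytes >= 0 && max_payload > 0);
    return (bytes + max_payload - 1) / max_payload;
}

/* Print both figures for the link described above. */
void print_link_budget(void)
{
    printf("data rate per direction: %ld Mbps\n",
           link_data_rate_mbps(LANES, LANE_MBAUD));
    printf("packets per transfer:    %ld\n",
           packets_per_transfer(TRANSFER_BYTES, MAX_PAYLOAD));
}
```

Under those assumptions, each direction tops out at 10000 Mbps before packet overhead, and each 8 KB transfer breaks down into 32 packets.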
What I am seeing is that the bandwidth is fine in either direction while the other direction is
idle, but when RX and TX activity overlap, the throughput goes down. For example, RX /
NREAD data causes large gaps in the TX / NWRITE data (looking at FPGA signals on a logic
analyzer). We tried increasing the priority of the NWRITEs, but this made no difference. I suspect
that the NWRITE packets are getting stalled in the FPGA physical layer FIFO (perhaps while a
bunch of NREADs wait to be serviced via the transport layer, etc.), so I am looking into that. Note
that the source / sink in the FPGA can handle the full channel bandwidth in both directions, so
my custom FPGA logic should not be the cause. The FPGA SRIO core, on the other hand, could be.
But what I am now curious about is whether I am chasing a realistic goal.
I have looked at SPRABK5A and the SRIO throughput it shows makes sense, but it only shows
RX or TX alone.
Does anyone know of any real-world measurements of throughput during SRIO full duplex with a
C66x DSP as master? I'd just like to be able to rule out a fundamental issue with the DSP.
Then of course, if examples exist where the throughput on both RX and TX is good in full duplex,
the next question is: does anyone have suggestions on the best way to architect
the system / structure the data requests so as to not lose bandwidth?
Do I need to manage the data requests in the DSP so that NREADs don't block NWRITEs, or
vice versa? This would be unfortunate, since the RX and TX data streams are fully independent,
and there is no system requirement to have one side's behavior depend upon the other. I am
considering changing the RX side of the DSP to be a slave to the FPGA, but that's a fair amount of
design work I'd prefer to avoid if I don't have to.
Any insight would be mucho appreciated.
Dale