Hello,
We have a design for which we would like some confirmation.
The goal is to choose the best method for connecting a Xilinx Artix
FPGA to a TI TMS320C6671 DSP over Serial RapidIO (SRIO). The DSP SRIO
interface pins are connected to the FPGA GTP interface. We want to
achieve the fastest possible data transfer rate in both directions
(DSP (DDR3) -> FPGA, FPGA -> DSP (DDR3)).
I already have some concepts for this system, but I'm not sure if I
understand all the TI documents (SPRS756, SPRUGW1, SPRUGR9, SPRUGV8,
SPRUGS5) correctly. I'll try to present my ideas and ask for help in
rating them.
1.
First of all, a master-slave control connection (DSP as master, FPGA as
slave) is needed. It doesn't involve the DDR3 memory. To achieve this
mode of operation I want to program the LSU in the DSP's SRIO module
to send read and write requests to the FPGA. An already written
FPGA module will respond to those requests accordingly.
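To make the control path in point 1 more concrete, here is a minimal C model of the decoding step the FPGA slave would need for incoming requests. The ftype/ttype values are the ones I understand from the RapidIO logical I/O specification; the enum and function names are hypothetical and are not taken from any TI or Xilinx API.

```c
#include <assert.h>
#include <stdint.h>

/* Request kinds the FPGA slave logic would need to recognize.
 * Names are illustrative only. */
typedef enum {
    REQ_NREAD,
    REQ_NWRITE,
    REQ_NWRITE_R,
    REQ_SWRITE,
    REQ_DOORBELL,
    REQ_UNSUPPORTED
} req_kind_t;

/* Map the 4-bit ftype/ttype header fields to a request kind.
 * ftype 2 = request class, 5 = write class, 6 = streaming write,
 * 10 = doorbell. Anything else is reported as unsupported here. */
req_kind_t classify_request(uint8_t ftype, uint8_t ttype)
{
    assert(ftype <= 0xF && ttype <= 0xF);
    switch (ftype) {
    case 2:
        return (ttype == 0x4) ? REQ_NREAD : REQ_UNSUPPORTED;
    case 5:
        if (ttype == 0x4) return REQ_NWRITE;
        if (ttype == 0x5) return REQ_NWRITE_R;
        return REQ_UNSUPPORTED;
    case 6:
        return REQ_SWRITE;   /* SWRITE packets carry no ttype field */
    case 10:
        return REQ_DOORBELL;
    default:
        return REQ_UNSUPPORTED;
    }
}

/* NREAD, NWRITE_R and DOORBELL expect a response packet (ftype 13);
 * NWRITE and SWRITE do not. */
int needs_response(req_kind_t kind)
{
    return kind == REQ_NREAD || kind == REQ_NWRITE_R || kind == REQ_DOORBELL;
}
```

For the control connection, the LSU would presumably issue NREAD and NWRITE_R (or NWRITE) requests, so those are the cases the FPGA module has to answer.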
2.
High-throughput data transfers. We assume that the most straightforward
implementation here is with the FPGA acting as the master.
This is based on the following passage from the KeyStone Architecture
Serial Rapid IO (SRIO) User Guide (Literature Number: SPRUGW1B,
November 2012):
> This peripheral is an externally-driven slave module that is capable of acting as a master
> within the DSP chip. This means that an external device can push (burst write) data to
> the DSP as needed without having to generate an interrupt to the CPU or without
> relying on the DSP EDMA.
The concept is to let the FPGA do the job. Once the transfer has been
configured and triggered, the FPGA generates the necessary Rapid IO
packets to carry out the data transfer. Everything looked fine until
I checked the DSP "Functional Block Diagram" and asked myself whether
the Rapid IO packets from the FPGA will be automatically translated and
routed to DDR3.
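As a rough illustration of what "the FPGA generates the packets" means for a large block, the sketch below splits one transfer into per-packet write requests. It relies on the RapidIO limit of 256 bytes of payload per packet. The struct, the function name and the 32-bit address view are hypothetical, and the 8-byte alignment requirement is a simplifying assumption of this sketch, not something taken from the TI documents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SRIO_MAX_PAYLOAD 256u  /* max data payload of one RapidIO packet */

/* One write request the FPGA would have to emit (illustrative only). */
typedef struct {
    uint32_t dst_addr;  /* target address in the DSP memory map */
    uint32_t length;    /* payload bytes in this packet, 8..256 */
} wr_chunk_t;

/* Split [dst_addr, dst_addr + total) into packets of at most 256 bytes.
 * Returns the number of chunks written to out, or 0 if total is 0 or
 * the out array is too small.
 * Assumption: address and length are multiples of 8 bytes. */
size_t split_transfer(uint32_t dst_addr, uint32_t total,
                      wr_chunk_t *out, size_t out_cap)
{
    assert(dst_addr % 8u == 0 && total % 8u == 0);
    size_t n = 0;
    while (total > 0) {
        uint32_t len = (total < SRIO_MAX_PAYLOAD) ? total : SRIO_MAX_PAYLOAD;
        if (n == out_cap) {
            return 0;  /* caller's array cannot hold all chunks */
        }
        out[n].dst_addr = dst_addr;
        out[n].length = len;
        n++;
        dst_addr += len;
        total -= len;
    }
    return n;
}
```

For example, a 600-byte block at a (made-up) base address of 0x80000000 becomes three packets of 256, 256 and 88 bytes. Whether each of these is then routed to DDR3 without further configuration is exactly the open question.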
I found these threads:
http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/215645/761152.aspx#761152
http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/162646/594544.aspx#594544
The main question is whether this concept can be implemented and whether
it is the best way to achieve the highest data throughput. Also, is there
a need to configure anything in the Multicore Navigator?
At this time I think that the FPGA, acting as an external master on the
SRIO interface, should be able to perform this transfer. We already have
a master on the DSP PCIe interface that performs transfers to and from
DDR3 memory without any need to configure the Multicore Navigator, and
the SRIO interface, like the PCIe interface, is connected to the
TeraNet.
Another issue is confirmation from the DSP that the requested transfer
has finished. We need to be sure that the data from an NWRITE_R SRIO
transaction has not only been delivered to the DSP but has also been
successfully written to the DDR3 memory attached to the DSP. We need to
know the moment from which the fresh data in external memory can safely
be fetched by other modules (e.g. a PCIe-attached USB 3.0 controller or
one of the DSP cores).
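On the FPGA side, the bookkeeping for this could look like the C model below: every NWRITE_R is tagged with an 8-bit transaction ID, and a buffer is only reported as complete once a DONE response (status 0) has come back for each of them. All names are hypothetical. Note that this only tracks responses; whether a DONE response guarantees that the data has actually reached DDR3 is the question being asked here, and the sketch does not answer it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Tracks NWRITE_R requests that have been sent but not yet answered. */
typedef struct {
    bool pending[256];     /* indexed by the 8-bit transaction ID */
    unsigned outstanding;  /* requests still waiting for a response */
    bool error;            /* set if any response was not DONE */
} wr_tracker_t;

void tracker_init(wr_tracker_t *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
}

/* Record that an NWRITE_R with this TID was sent.
 * Returns false if the TID is still in use by an earlier request. */
bool tracker_sent(wr_tracker_t *t, uint8_t tid)
{
    if (t->pending[tid]) {
        return false;
    }
    t->pending[tid] = true;
    t->outstanding++;
    return true;
}

/* Record a response packet. status 0 means DONE; any other value is
 * treated as a failure. Returns false for an unexpected TID. */
bool tracker_response(wr_tracker_t *t, uint8_t tid, uint8_t status)
{
    if (!t->pending[tid]) {
        return false;
    }
    t->pending[tid] = false;
    t->outstanding--;
    if (status != 0) {
        t->error = true;
    }
    return true;
}

/* True once every request has been answered with DONE. */
bool tracker_all_done(const wr_tracker_t *t)
{
    return t->outstanding == 0 && !t->error;
}
```

Only after `tracker_all_done()` returns true would the FPGA signal the consumer (for instance with a doorbell) that the buffer may be read.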
To summarize, we would be very thankful if somebody could check the
concept and point out its weak points. The basic questions we would
like answered are the following:
a) Will the concept work?
b) Is the described method the preferred way to achieve the highest
possible throughput?
c) Is any Multicore Navigator programming necessary?
d) Will the response to an NWRITE_R packet be generated only after the
data has actually been written to the DDR3 attached to the DSP, or at
some earlier moment?