We have a design based on an MPC5200. We have two TI PCI1520 chips connected to its 33MHz PCI bus.
One PCI1520 connects to two PCMCIA slots we use for device expansion, and the other connects to two CompactFlash sockets for storage.
We're running Linux, and it comes with all the appropriate drivers for this hardware.
The problem is that we can't get the PCI1520 to perform PIO reads and writes any faster than 1MHz, which limits our raw transfer speed to a rather poor 2MB/s. A raw "dd" from the CF to /dev/null peaks at 1.7MB/s, and file reads over Ethernet peak at 1.4MB/s. The CF cards can sustain 6MB/s, and we'd like to get closer to that if we can.
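For anyone checking the arithmetic, here is a rough sketch of how the 2MB/s ceiling follows from a 1MHz cycle rate on a 16-bit data path, and what cycle time a 6MB/s target would imply. The function names are just for illustration, and it assumes one 16-bit word moves per PIO cycle with no other overhead.

```python
# Back-of-envelope PIO throughput, assuming one 16-bit word per I/O cycle
# and no per-sector or driver overhead (both are simplifying assumptions).

BYTES_PER_CYCLE = 2  # 16-bit PIO mode


def throughput_mb_s(cycle_rate_hz, bytes_per_cycle=BYTES_PER_CYCLE):
    """Raw transfer rate in MB/s (1 MB = 10^6 bytes) for a given cycle rate."""
    return cycle_rate_hz * bytes_per_cycle / 1e6


def required_cycle_ns(target_mb_s, bytes_per_cycle=BYTES_PER_CYCLE):
    """Longest cycle period, in ns, that still reaches target_mb_s."""
    return bytes_per_cycle / (target_mb_s * 1e6) * 1e9


if __name__ == "__main__":
    print(f"1 MHz cycles -> {throughput_mb_s(1_000_000):.1f} MB/s ceiling")
    print(f"6 MB/s needs a cycle every {required_cycle_ns(6.0):.0f} ns or less")
```

In other words, the measured 1.7MB/s is already close to what a 1MHz cycle rate allows, so the cycle period itself would have to drop to roughly a third of its current value to approach what the cards can sustain.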
The CompactFlash card is being run in 16-bit PIO mode. We have a minor suspicion that the PCI1520 might be able to run a bit faster in Memory Mode, but our drivers don't support that (and the only data sheet difference we can find is the 150ns vs 210ns read signal width). The slowdown doesn't seem to come from the active period of the bus cycle, though.
We have the PCI1520 programmed to the fastest IO speeds we can find. An oscilloscope shows IORDY going low for about 220ns and then remaining high for 800ns. That proves the CF card isn't slowing things down by inserting any wait states. I would have expected the IO cycles to be able to run closer together than this.
Meanwhile, on the PCI side of the bus, the PCI1520 is obeying the rules and "giving up" after 16 PCI bus cycles, forcing the PCI master to retry. This is called a "Delayed Transaction". The following trace shows the CF/PCMCIA "IORDY" in Yellow and the PCI "DEVSEL" signal in Blue. At the top of the picture can be seen the 256-word bursts as the CF disk sectors are being read, and the gaps between them.
It looks like the state machine in the PCI1520 is taking a while to start the PIO cycle, and can't get the data read in less than the 16 clocks (500ns), at which point it is required to force a "Delayed Transaction". This then wastes another 200-400ns or so while the PCI bus retries and then runs the next cycle.
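As a sanity check on those timings, here is a small sketch that converts PCI clocks to time and compares the 16-clock limit with the cycle period seen on the scope. It assumes a nominal 33MHz PCI clock and that the scope figures are in nanoseconds (about 220 low plus 800 high), which is what a roughly 1MHz cycle rate implies; the names are illustrative only.

```python
# Rough timing budget for one PIO read behind the PCI1520.
# Assumptions: nominal 33 MHz PCI clock; scope timings in nanoseconds.

PCI_CLOCK_HZ = 33_000_000
RETRY_LIMIT_CLOCKS = 16  # clocks before the bridge forces a retry


def pci_clocks_to_ns(clocks, clock_hz=PCI_CLOCK_HZ):
    """Duration of a number of PCI clocks, in nanoseconds."""
    return clocks * 1e9 / clock_hz


def ns_to_pci_clocks(ns, clock_hz=PCI_CLOCK_HZ):
    """Number of PCI clocks that fit in a duration given in nanoseconds."""
    return ns * clock_hz / 1e9


if __name__ == "__main__":
    strobe_low_ns = 220   # active part of the cycle, from the scope
    strobe_high_ns = 800  # idle part of the cycle, from the scope
    period_ns = strobe_low_ns + strobe_high_ns

    print(f"16 PCI clocks = {pci_clocks_to_ns(RETRY_LIMIT_CLOCKS):.0f} ns")
    print(f"One PIO cycle = {period_ns} ns "
          f"= {ns_to_pci_clocks(period_ns):.1f} PCI clocks")
    print(f"Idle fraction of each cycle = {strobe_high_ns / period_ns:.0%}")
```

If those assumptions hold, each PIO cycle spans about two retry windows' worth of PCI clocks, and most of it is idle time rather than the active strobe.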
If there was an option to change the PCI standard
Does anyone know if there's anything simple we can do to get the performance up? I realize that for the best performance we should be using the chip as a CardBus Bridge (capable of 66MB/s or whatever) but our hardware was finalised over 4 years ago and we have units in the field.
Thanks,