XIO2213B: How to improve the 1394b iso-packet performance

Part Number: XIO2213B

We have developed a 1394b card using the XIO2213B, and we want to simulate a 1394b camera on our PC. Every 15 ms, we generate a picture consisting of 36 isochronous packets (on channel 1), each 3716 bytes long. If we can send those packets within 5 ms, our target will process the picture correctly. We know that 1394b can send only one isochronous packet per channel every 125 µs, so we need 36 × 125 µs = 4.5 ms to send all of them. As it turns out, we can send all 36 packets in 4.5 ms most of the time, but sometimes it takes 7 ms or even 10 ms, and that causes an error. We think the problem is that the DMA engine on the XIO2213B sometimes cannot fetch all those packets from PC memory within 4.5 ms.

So we need some guidance on how to improve the isochronous packet performance. For example, since we are using only channel 1, can we expand the buffer for that channel on the XIO2213B?
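The timing budget described above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are made up for illustration, and the payload rate ignores packet headers and any bus overhead.

```python
# Figures taken from the question; the names are illustrative only.
CYCLE_US = 125             # one isochronous cycle (8 kHz)
PACKETS_PER_PICTURE = 36   # one packet per cycle on channel 1
PAYLOAD_BYTES = 3716       # payload per packet
DEADLINE_US = 5000         # the target needs the picture within 5 ms


def min_send_time_us(packets, cycle_us=CYCLE_US):
    """Shortest possible send time when exactly one packet goes out per cycle."""
    return packets * cycle_us


def slack_cycles(packets, deadline_us=DEADLINE_US, cycle_us=CYCLE_US):
    """Number of cycles that may be lost before the deadline is missed."""
    return deadline_us // cycle_us - packets


send_us = min_send_time_us(PACKETS_PER_PICTURE)
spare = slack_cycles(PACKETS_PER_PICTURE)
# Bits per microsecond is the same as Mbit/s.
payload_mbit_s = PAYLOAD_BYTES * 8 / CYCLE_US

print(f"minimum send time: {send_us} us")
print(f"spare cycles before the 5 ms deadline: {spare}")
print(f"payload rate while sending: {payload_mbit_s:.3f} Mbit/s")
```

With these numbers, only four cycles can be lost before the 5 ms deadline is missed, so a 7 ms transmission means at least twenty cycles went out without a packet.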

  • Isochronous packets are scheduled for a specific frame number; if the packet is not sent for any reason, the controller hardware reports an error.

    So if some packets are delayed, there is something wrong with the scheduling. There might be a bug in the controller driver or in your software, or your pipeline length might be too short.

    How exactly are you scheduling the packets?

  • All 36 packets are sent, but sometimes it takes more than 5 ms to send them.

    Every 15 ms, I build a descriptor chain for all 36 packets and then start the iso DMA. If the data does not need to change, I think I can build the descriptor chain just once and run the iso DMA of the XIO2213B every 15 ms. Even so, it still sometimes takes 7 ms for the XIO2213B to send all the packets.

    I am very interested in the pipeline length you mentioned. Can you give me some more information about it? Thank you very much.

  • That descriptor chain would be the pipeline.

    What OS are you using? What is the frame that the first descriptor in the chain is scheduled for? How exactly are you measuring the cost of sending, i.e., do you take the time of the last completed frame, or the time when the OS reports the completion?

  • The OS we are using is Windows XP. There is no particular schedule for the first frame; we are simulating a camera that takes a picture every 15 ms, so every 15 ms we generate 36 packets and send them as soon as possible.

    There are three devices connected by 1394b in our lab. A is the PC that generates the picture; B is our target device, which processes the picture; C is another PC that serves as the bus monitor. I use a self-written driver (copied from the Linux code) on A, and Unibrain FireWire on monitor C.

    If it takes A more than 5 ms to send one picture to B, then B will not have enough time to process the picture. I use device C to measure the sending time by taking the interval between the first frame received and the 36th frame received. After a total of 1000 pictures are sent, 995 of them are received within 4.5 ms, but 5 of them take more than 7 ms.

  • All the times are recorded by the software running on XP.
  • "no particular schedule" and "every 15 ms" contradict each other.

    Instead of 36 packets, generate 120 packets (or a multiple of that), with the remaining 84 ones being skip packets. Add new descriptors before the old ones run out, so that the pipeline never becomes empty.
    This is the only way to ensure that the pictures are actually sent each 15 ms.

  • If I prepare the descriptor chain for, e.g., 72 packets, how can I ensure that the second 36 packets are sent 15 ms after the first 36 packets are sent?
  • One packet is sent every frame, so at the fixed frame rate of 8 kHz, 120 packets = 15 ms. Of these, 36 are real packets, and 84 are skip packets.
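The slot arithmetic in this reply can be sketched as follows. This models only the schedule, not the actual OHCI descriptors or any driver API; the function and constant names are hypothetical.

```python
CYCLES_PER_SECOND = 8000   # fixed isochronous frame rate from the reply
PERIOD_MS = 15             # one picture every 15 ms
REAL_PACKETS = 36          # packets that carry picture data


def build_period(real_packets=REAL_PACKETS, period_ms=PERIOD_MS):
    """Return one slot per cycle: real packets first, then skip packets."""
    slots = CYCLES_PER_SECOND * period_ms // 1000
    if real_packets > slots:
        raise ValueError("more packets than cycles in one period")
    return ["data"] * real_packets + ["skip"] * (slots - real_packets)


period = build_period()
print(len(period), period.count("data"), period.count("skip"))
```

Queuing such periods back to back, and appending the next one before the current one runs out, keeps the chain from ever becoming empty, which is the point made above.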

  • I'll try this. Thank you.

  • My goal is not to send those packets at exactly 15 ms intervals. The camera we are simulating waits for the exposure command (an I/O signal) and then sends one picture. My goal is that, after the exposure command, I can send a picture within 5 ms.
    Since the XIO2213B can send the 36 packets within 5 ms most of the time, I think the cases where it fails are caused by the DMA engine failing to transfer all the data from PCI to 1394b in time.
    Is there any setting to guarantee bandwidth for the XIO2213B? I mean marking the XIO2213B as the most important PCI device in the PC and giving it the highest priority on the PCI bus, so that the bandwidth is guaranteed.
  • On the XIO2213B's internal PCI bus, there is no contention. PCI Express is a point-to-point connection, and the memory controller has more than enough bandwidth, so this is extremely unlikely to be a problem. However, the PCI-to-PCIe bridge makes reading data from main memory somewhat inefficient because it cannot perfectly predict how large read requests will be.

    To send the data after some event, (re-)starting the IT DMA should work perfectly fine.
    But the controller does prefetching, and might assume that there is enough time to do so. To ensure that the first packet can be prefetched, add one to three skip packets before the actual data.

    Can you check the returned timestamps? Are there dropped cycles in the middle of your transmission, or are the timestamps all consecutive?

    If you suspect that the XIO2213B cannot sustain the bandwidth, try to insert skip packets into the middle of the transmission, or make the packets smaller, if possible.
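The timestamp check suggested above could look roughly like this. It is a sketch that assumes the 16-bit timestamp layout of OHCI isochronous descriptors (upper 3 bits: low bits of the seconds counter, lower 13 bits: cycle count 0 to 7999); verify that layout against your driver before relying on it. The function names are hypothetical.

```python
CYCLES_PER_SECOND = 8000
SECONDS_WRAP = 8   # assumption: only 3 bits of the seconds counter are reported


def to_cycle_index(timestamp):
    """Convert an assumed 3-bit seconds / 13-bit cycle timestamp to a cycle index."""
    seconds = (timestamp >> 13) & 0x7
    cycle = timestamp & 0x1FFF
    return seconds * CYCLES_PER_SECOND + cycle


def find_gaps(timestamps):
    """Return (packet index, step in cycles) wherever the step is not exactly 1."""
    wrap = SECONDS_WRAP * CYCLES_PER_SECOND
    gaps = []
    for i in range(1, len(timestamps)):
        step = (to_cycle_index(timestamps[i])
                - to_cycle_index(timestamps[i - 1])) % wrap
        if step != 1:
            gaps.append((i, step))
    return gaps


# Made-up example: three consecutive cycles, then two cycles are missing.
example = [(1 << 13) | 7998, (1 << 13) | 7999, (2 << 13) | 0, (2 << 13) | 3]
print(find_gaps(example))
```

An empty result over the 36 completion timestamps would mean the packets went out on consecutive cycles and the delay lies before the first packet; any entry points to cycles dropped in the middle of the transmission.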