PCI Express bandwidth and dataflow.

Hi,

We have connected a DM8168 to an FPGA over PCI Express. The FPGA has a 2-lane Gen 1 interface.

We wrote a small driver to access the FPGA using EDMA - nothing fancy.

We did a bandwidth test and are currently limited to 120 MB/s.

According to the spec, the maximum bandwidth of a 2-lane Gen 1 link is 500 MB/s, so we expected more. :)
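
As a rough check of the 500 MB/s figure, here is a minimal sketch of the arithmetic. It assumes the usual Gen 1 numbers (2.5 GT/s per lane, 8b/10b encoding) and counts one direction only; the function name is made up for this illustration.

```c
#include <assert.h>

/* Raw line rate per lane in megatransfers per second (PCIe Gen 1: 2.5 GT/s). */
#define GEN1_MT_PER_S 2500.0

/*
 * Link bandwidth of a Gen 1 link before any packet overhead.
 * 8b/10b encoding carries 8 data bits in every 10 line bits.
 * Returns MB/s (1 MB = 10^6 bytes), per direction.
 */
double gen1_link_mb_per_s(int lanes)
{
    assert(lanes > 0);
    double data_mbit_per_s = GEN1_MT_PER_S * 8.0 / 10.0; /* 2000 Mbit/s per lane */
    return lanes * data_mbit_per_s / 8.0;                /* bits -> bytes */
}
```

For 2 lanes this gives 500 MB/s. Packet headers, acks and flow-control traffic are not included, so the rate a driver can actually reach is lower than this.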

We looked at the bus with a protocol analyser (see screenshot below) and noticed the following. (N = Netra, F = FPGA)

* N -> F: a packet is sent at timestamp 0.

* F -> N: the ack is sent back 708 ns later.

* F -> N: the FPGA sends an update flow control packet.

* N -> F: the Netra sends the next packet (after another small delay).

So it seems that the Netra waits for the ack before sending a new data packet. AFAIK, this isn't necessary: the Netra should keep transmitting until the window is exhausted, without waiting for each ack.
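
To see whether this pattern could explain the measured rate, here is a small sketch of the throughput when only one packet is in flight per round trip. The payload size per packet is not stated in this thread, so the 128 bytes used in the note after the code is purely an assumption, and the function name is hypothetical.

```c
#include <assert.h>

/*
 * Throughput when only one packet is in flight per round trip.
 * payload_bytes: data bytes carried by each packet (assumed, not measured).
 * round_trip_ns: time from the start of one packet to the start of the next.
 * Returns MB/s (1 MB = 10^6 bytes).
 */
double one_in_flight_mb_per_s(double payload_bytes, double round_trip_ns)
{
    assert(payload_bytes > 0.0 && round_trip_ns > 0.0);
    /* bytes per nanosecond equals GB/s; multiply by 1000 for MB/s */
    return payload_bytes / round_trip_ns * 1000.0;
}
```

With an assumed 128-byte payload and only the 708 ns ack delay, this gives about 181 MB/s. A rate of 120 MB/s would correspond to roughly 1067 ns between 128-byte packets, which is at least consistent with the extra delays seen after the ack.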

Have you seen this behaviour before? How can we solve it? Is this a limitation?

Theo

  • Theo,

    I am not sure about the ACK issue and will get back to you in a day, but I suspect there could also be an issue with the EDMA parameter settings.

    Can you provide the A, B and C counts you are setting for EDMA and the sync mode being used?

       Hemant

  • I'm using 1 MB:

    A = 1024, B = 1024, C = 1, INCR, ABSYNC

    I tried changing A and B (to 1920 and 1080/1200) without any change in the result.

    What would be the recommended settings?

    Theo

    *snip*

    static int acnt = 1024;
    static int bcnt = 1024;
    static int ccnt = 1;

    // source & destination
    edma_set_dest(dma_ch, (unsigned long)(dmaphyssrc1), INCR, W8BIT);
    edma_set_src(dma_ch, (unsigned long)(edma_dev_phys[1]), INCR, W8BIT);

    // indexes          - channel, src/dest bidx, src/dest cidx
    edma_set_src_index(dma_ch, acnt, acnt * bcnt);
    edma_set_dest_index(dma_ch, acnt, acnt * bcnt);

    // transfer params  - channel, acnt, bcnt, ccnt, bcntrld, sync mode
    edma_set_transfer_params(dma_ch, acnt, bcnt, ccnt, acnt, ABSYNC);

    // Enable the interrupts on channel 1 - TODO: check.
    edma_read_slot(dma_ch, &param_set);
    param_set.opt |= (1 << ITCINTEN_SHIFT);
    param_set.opt |= (1 << TCINTEN_SHIFT);
    param_set.opt |= EDMA_TCC(EDMA_CHAN_SLOT(dma_ch));
    edma_write_slot(dma_ch, &param_set);

    *snip*

  • Theo,

    I don't have recommended settings right now, but can you please try the combination below?

    acnt = 256
    bcnt = 4096
    ccnt = 1
    Mode: ABSYNC

    I have yet to get final numbers (still debugging and optimizing), but we are getting roughly 300+ MB/s with a Gen 2 x1 setup doing unidirectional EDMA transfers.

       Hemant

  • Hemant,

    I tried:

    * acnt = 1024, bcnt = 1024, ccnt = 1

    * acnt = 256, bcnt = 4096, ccnt = 1

    * acnt = 4096, bcnt = 256, ccnt = 1

    * acnt = 128, bcnt = 8192, ccnt = 1

    * acnt = 64, bcnt = 16384, ccnt = 1

    => All of them give the same result.
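
A quick check of the five combinations tried above, as a sketch. The struct and helper names are made up for this illustration and are not part of the EDMA API; the event count relies on AB-synchronized mode moving one frame (bcnt arrays of acnt bytes) per sync event.

```c
#include <assert.h>

/* Hypothetical helper type for illustration; not part of the EDMA API. */
struct xfer_geometry {
    unsigned acnt; /* bytes per array */
    unsigned bcnt; /* arrays per frame */
    unsigned ccnt; /* frames per block */
};

/* Total bytes moved by one complete transfer. */
unsigned long total_bytes(struct xfer_geometry g)
{
    assert(g.acnt > 0 && g.bcnt > 0 && g.ccnt > 0);
    return (unsigned long)g.acnt * g.bcnt * g.ccnt;
}

/* Sync events needed in AB-synchronized mode: one frame per event. */
unsigned events_absync(struct xfer_geometry g)
{
    return g.ccnt;
}

/* The five combinations listed above. */
const struct xfer_geometry tried[5] = {
    {1024, 1024, 1}, {256, 4096, 1}, {4096, 256, 1},
    {128, 8192, 1},  {64, 16384, 1},
};
```

Every entry works out to 1048576 bytes (1 MiB) moved on a single sync event, so from the link's point of view the combinations describe the same job. That may be part of why the measured rate does not change, although it does not by itself rule out a DMA-related cause.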

    I'm not sure that this issue is DMA related.

    Is it possible to provide the source code of your driver?

    Note: I'm currently using a Gen 1 card with 2 lanes; we'll switch over to Gen 2 with 2 lanes as soon as possible.

    Theo

  • Hemant,

    Any news about this issue?

    Theo

  • Theo,

    You can refer to the EP- and RC-side sample applications used with the PCIe Endpoint driver on DM8168 for the EDMA configuration used in data transfers. Please see the user guide at http://processors.wiki.ti.com/index.php/DM81xx_AM38xx_PCI_Express_Endpoint_Driver_User_Guide

    The page has a tarball of sample apps attached where you can look up the EDMA parameters. Note that the EDMA transfer from user space is initiated using a sample EDMA driver (also available from the same page). Also note that the EP application uses mmap to transfer data. Look for "THPT" in the application code, where the EDMA parameters are set and the data transfer (write to and/or read from the EP) is done.

    I don't have formal throughput numbers yet, but you should get roughly 350 MB/s in a single direction (read or write) from a DM816x EP connected to an x1 Gen 2 device.

       Hemant