This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

  • Resolved

Poor g_ether performance with OMAP4/Pandaboard

Hi,
 
I am testing Ethernet-over-USB performance with the CDC EEM protocol in the environment given below.
 
USB version: 2.0 (HS)
USB Host: Pandaboard
USB Device: Pandaboard
Host OS: Linux 3.0.20
Device OS: Linux 3.0.20
 
I am observing the following throughput with the iperf tool.
 
Host to Device: 48 Mbps
Device to Host: 165 Mbps
 
I looked into the poor Host-to-Device throughput and found a limitation in the OTG DMA controller: it cannot handle unaligned buffer addresses, so the driver falls back to PIO mode for the transfer. I worked around this by supplying a 4-byte-aligned buffer, and the Host-to-Device throughput improved to 132 Mbps. However, that is still low compared with the Device-to-Host throughput of 165 Mbps.
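To illustrate the kind of workaround described above, here is a minimal userspace sketch, not the actual musb or usbnet code. It assumes a 4-byte alignment requirement, as stated in the post. The names `DMA_ALIGN`, `is_dma_aligned` and `dma_safe_buffer` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the DMA engine needs 4-byte aligned addresses, as described in the post. */
#define DMA_ALIGN 4u

/* Returns 1 if buf could be handed to the DMA engine directly, 0 otherwise. */
static int is_dma_aligned(const void *buf)
{
    return ((uintptr_t)buf & (DMA_ALIGN - 1u)) == 0;
}

/*
 * Hypothetical helper: returns buf itself when it is already aligned.
 * Otherwise it copies the data into a freshly allocated bounce buffer
 * and sets *bounced to 1, so that the caller knows to free() the result.
 * Returns NULL if the bounce buffer cannot be allocated.
 */
static void *dma_safe_buffer(const void *buf, size_t len, int *bounced)
{
    assert(buf != NULL && bounced != NULL);

    *bounced = 0;
    if (is_dma_aligned(buf))
        return (void *)buf;

    /* malloc() returns memory aligned for any fundamental type, so at least 4 bytes. */
    void *copy = malloc(len ? len : 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, buf, len);
    *bounced = 1;
    return copy;
}
```

The extra copy costs CPU time, but in this scenario it keeps the transfer on the DMA path instead of falling back to PIO.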
 
Please let us know if there is any known issue in this scenario.
 
Using a USB protocol analyzer, I found that the device NAKs the Host's PING after OUT far too often. The device seems to be slow in reading the packets. Overall, only 3 bulk transactions happen in 1 USB frame, which seems to be the reason for the poor performance.
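As a rough sanity check on that observation, the sketch below estimates bulk payload throughput from the number of transactions per microframe. It assumes that "frame" here means a 125 us high-speed microframe and that every transaction carries a full 512-byte packet. The function name `bulk_payload_bps` is made up for this example.

```c
#include <assert.h>

#define HS_MICROFRAMES_PER_SEC 8000u /* one high-speed microframe every 125 us */
#define HS_BULK_MAX_PACKET     512u  /* max packet size of a high-speed bulk endpoint */

/*
 * Payload throughput in bits per second for a given number of
 * full-size bulk transactions per microframe.
 */
static unsigned long bulk_payload_bps(unsigned int txns_per_microframe)
{
    /* At most 13 full-size bulk transactions fit into one high-speed microframe. */
    assert(txns_per_microframe <= 13u);

    return (unsigned long)txns_per_microframe
           * HS_BULK_MAX_PACKET * 8ul * HS_MICROFRAMES_PER_SEC;
}
```

Under these assumptions, 3 transactions per microframe give 98,304,000 bit/s of raw bulk payload, and 132 Mbps corresponds to roughly 4 per microframe. iperf reports TCP payload, so the figures are only approximately comparable.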
 
Is there any OMAP4/Pandaboard-related hardware or driver issue behind this?
Any help would be appreciated.
Regards
Guna
 
  • Gunasekaran,

    You are correct that there is an issue with the DMA alignment.  The patches below take care of the misalignment and increase the USB-Ethernet throughput.  These patches are based on the 4AI.1.4 release for Blaze / Blaze Tablet (http://omappedia.org/wiki/4AI.1.4_OMAP4_Icecream_Sandwich_Release_Notes ), so you may need to modify them slightly to fit your Pandaboard codebase:

    http://review.omapzoom.org/22482 usb: musb: implement (un)map_urb_for_dma hooks
    http://review.omapzoom.org/22486 usb: hcd: Add a dma_align flag
    http://review.omapzoom.org/22489 usbnet: dma alignement fix
    http://review.omapzoom.org/22487 usb: musb: indicate DMA alignement requirement
    http://review.omapzoom.org/22488 usb: ehci-omap: indicate DMA alignement requirement (optional: for EHCI)

    What bootargs are you using on the Pandaboard?  Are you setting the vmalloc parameter?  By default, the 4AI.1.4 release for Blaze / Blaze Tablet uses vmalloc=768M.  While increasing the amount of address space available for virtual addressing, a high vmalloc value has the side effect of moving most userspace allocations into highmem, which causes extra overhead for creating the kernel mapping.  With a lower vmalloc value (such as 128M), the USB-Ethernet throughput improved in our testing.
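One way to check which vmalloc value a board was actually booted with is to look for `vmalloc=` in the kernel command line (on a running system, the contents of /proc/cmdline). The sketch below is a hypothetical helper that handles only plain byte counts and the K, M and G suffixes. The name `vmalloc_mib_from_cmdline` is not from any kernel or TI source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns the vmalloc= value in MiB found in a kernel command line
 * string, or -1 if the parameter is absent or has no numeric value.
 */
static long vmalloc_mib_from_cmdline(const char *cmdline)
{
    static const char key[] = "vmalloc=";
    const size_t key_len = sizeof(key) - 1;
    const char *p = cmdline;

    assert(cmdline != NULL);

    while ((p = strstr(p, key)) != NULL) {
        /* Only accept the key at the start of a parameter. */
        if (p == cmdline || p[-1] == ' ') {
            char *end;
            long val = strtol(p + key_len, &end, 10);

            if (end == p + key_len)
                return -1; /* no digits after "vmalloc=" */

            switch (*end) {
            case 'G': case 'g': return val * 1024;
            case 'M': case 'm': return val;
            case 'K': case 'k': return val / 1024;
            default:            return val / (1024L * 1024L); /* plain bytes */
            }
        }
        p += key_len;
    }
    return -1;
}
```

For example, passing the string "console=ttyO2,115200n8 vmalloc=768M" (an illustrative command line, not the one from this thread) returns 768.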

    Regards,

    Gina

    Please click the Verify Answer button on this post if it answers your question

    _______________________________________________________

    Be sure to read the OMAP4 and OMAP5 Forum Guidelines and FAQ 

  • In reply to Gina Glaser:

    Gina

    Thanks for your quick reply.

    As I mentioned, I already have a workaround that passes a 4-byte-aligned buffer address to the DMA. However, I will check your patches for a proper fix. The remaining issue is that, even without the alignment problem, the DMA seems to be slow in reading the packets. In the musb_gadget code, I found a comment which says "DMA is slow for RX/OUT for typical case (short_not_ok is 0, i.e, DMA mode 0)". For USB-Ethernet, short_not_ok is 0. Is there a real issue in the DMA, apart from the alignment issue, that makes the transfer slow for RX/OUT?

    I will also check the vmalloc value and get back to you.

    Thanks

    Guna

  • In reply to Gunasekaran Dharman:

    Hi Gina

     

    I tested with vmalloc=128M, but the Ethernet over USB performance for Host to Device didn't improve.

    The DMA transfer seems slow for RX/OUT even without the alignment issue. Why is that? Is there any known hardware issue?

     

    Regards

    Guna

     

     

  • In reply to Gunasekaran Dharman:

    Guna,

    The difference in USB transfer throughput between RX and TX is not due to the DMA alignment issue, but to the mode that is being used.  We use DMA mode 1 for TX transfers and DMA mode 0 for RX transfers, so RX will always be slower.  This is by design, since in the TX case, the MUSB driver knows how much data it is sending and can make use of the DMA done interrupt. 
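A simplified model can make the cost difference concrete. It is an illustration only, not the MUSB driver logic. It assumes that in DMA mode 0 the driver is notified and reprograms the DMA once per USB packet, while in DMA mode 1 a single DMA-done notification covers the whole transfer. The function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of USB packets needed for a transfer of len bytes.
 * A zero-length transfer still takes one (zero-length) packet.
 */
static size_t packets_for(size_t len, size_t maxpacket)
{
    assert(maxpacket > 0);
    return len == 0 ? 1 : (len + maxpacket - 1) / maxpacket;
}

/* Assumed mode 0 behaviour: one completion per packet. */
static size_t completions_mode0(size_t len, size_t maxpacket)
{
    return packets_for(len, maxpacket);
}

/* Assumed mode 1 behaviour: one DMA-done completion per transfer. */
static size_t completions_mode1(size_t len, size_t maxpacket)
{
    (void)len;
    (void)maxpacket;
    return 1;
}
```

In this model, a 1514-byte Ethernet frame over a 512-byte bulk endpoint costs three completions in mode 0 and one in mode 1. That matches the direction of the RX/TX gap described above, though the real driver has additional cases such as short packets.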

    Regards,

    Gina

