Kernel crash with USB composite Gadget

Other Parts Discussed in Thread: AM1808, DA8XX, OMAPL138

Hi,

I'm trying to build and run USB composite gadgets on the AM1808 Exp Kit board, using Linux kernel 2.6.37 from PSP 3.21.0.4.

I've only modified the kernel configuration to set the USB MUSB controller to peripheral mode and to compile the USB gadgets as modules. But after loading the g_cdc module (composite gadget, ETH+ACM) and configuring the usb0 network interface, the system crashes, always with the same backtrace, which seems to point to u_ether.c and the MUSB code:

Unable to handle kernel paging request at virtual address ffff3800
pgd = c0004000
[ffff3800] *pgd=c3ffe031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT
last sysfs file: /sys/devices/platform/musb-da8xx/musb-hdrc/gadget/net/usb0/flags
Modules linked in: g_cdc ipv6 minix
CPU: 0    Not tainted  (2.6.37 #8)
PC is at __kmalloc_track_caller+0x74/0xcc
LR is at __kmalloc_track_caller+0x48/0xcc
pc : [<c00b0578>]    lr : [<c00b054c>]    psr: a0000093
sp : c047fcd0  ip : c054c078  fp : c047fcfc
r10: 00000000  r9 : 00000001  r8 : bf05cac8
r7 : 00000020  r6 : a0000093  r5 : c3802500  r4 : ffff3800
r3 : 00000000  r2 : bf05cac8  r1 : 00000020  r0 : c3802500
Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005317f  Table: c30e8000  DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc047e270)
Stack: (0xc047fcd0 to 0xc0480000)
...
Backtrace:

[<c00b0504>] (__kmalloc_track_caller+0x0/0xcc) from [<c02b20e4>] (__alloc_skb+0x58/0xf0)
 r8:bf05cac8 r7:00000620 r6:00000020 r5:c305c6c0 r4:c3802180
[<c02b208c>] (__alloc_skb+0x0/0xf0) from [<bf05cac8>] (rx_submit+0xb8/0x1a4 [g_cdc])
[<bf05ca10>] (rx_submit+0x0/0x1a4 [g_cdc]) from [<bf05cf4c>] (rx_complete+0x250/0x258 [g_cdc])
[<bf05ccfc>] (rx_complete+0x0/0x258 [g_cdc]) from [<c02556d0>] (musb_g_giveback+0x150/0x188)
[<c0255580>] (musb_g_giveback+0x0/0x188) from [<c0256700>] (musb_g_rx+0x228/0x290)
 r7:c385ed1c r6:c3811320 r5:c3118ec0 r4:00002000
[<c02564d8>] (musb_g_rx+0x0/0x290) from [<c02525e4>] (musb_dma_completion+0x94/0x9c)
[<c0252550>] (musb_dma_completion+0x0/0x9c) from [<c02594f8>] (cppi41_completion+0x154/0x368)
 r6:00000000 r5:c385ec00 r4:c385ec00
[<c02593a4>] (cppi41_completion+0x0/0x368) from [<c0257cbc>] (da8xx_musb_interrupt+0xa8/0x264)
[<c0257c14>] (da8xx_musb_interrupt+0x0/0x264) from [<c007b7b4>] (handle_IRQ_event+0x2c/0xfc)
[<c007b788>] (handle_IRQ_event+0x0/0xfc) from [<c007dd10>] (handle_edge_irq+0x15c/0x1cc)
 r7:0000003a r6:c387fc00 r5:c047e000 r4:c048bc34
[<c007dbb4>] (handle_edge_irq+0x0/0x1cc) from [<c003207c>] (asm_do_IRQ+0x7c/0xa0)
 r8:00000001 r7:00000002 r6:04000000 r5:00000000 r4:0000003a
[<c0032000>] (asm_do_IRQ+0x0/0xa0) from [<c0032bec>] (__irq_svc+0x4c/0x9c)
Exception stack(0xc047fef8 to 0xc047ff40)
fee0:                                                       0007a0c0 00001f40
ff00: 00000000 20000013 c0488194 c04881a0 c0481bd4 c04a8a70 c04c2ed4 41069265
ff20: c0028ac4 c047ff64 1dcbf2ff c047ff40 c006c73c c003f914 20000013 ffffffff
 r5:febfd000 r4:ffffffff
[<c003f89c>] (davinci_enter_idle+0x0/0xa0) from [<c02707b8>] (cpuidle_idle_call+0xbc/0x11c)
 r4:c0488210
[<c02706fc>] (cpuidle_idle_call+0x0/0x11c) from [<c0034158>] (cpu_idle+0x74/0xe4)
 r8:c0028af8 r7:c04a8a70 r6:c0481bd4 r5:c0481da0 r4:c047e000
[<c00340e4>] (cpu_idle+0x0/0xe4) from [<c036f4e4>] (rest_init+0x8c/0xa4)
 r7:c0481bc8 r6:c002a380 r5:c04a8a3c r4:c047e000
[<c036f458>] (rest_init+0x0/0xa4) from [<c0008a14>] (start_kernel+0x284/0x2e4)
 r4:c04b057c
[<c0008790>] (start_kernel+0x0/0x2e4) from [<c0008034>] (stext+0x34/0x3c)
 r6:c002a784 r5:c04a8a98 r4:00053175
Code: e595c000 e59c4000 e3540000 15953010 (17943003)
...


The same happens with a "reduced" version of the Multi Gadget (g_multi, ETH+MSC), from which I removed ACM because of the AM1808 limit on the number of endpoints.

I had to make a minor change in f_rndis.c, but I do not think this is the cause, because with the simple g_ether gadget I have no problems. I had to comment out the following lines (as they already are in the standard kernel) because of a compilation issue:

#if 0
    if (rndis_set_param_vendor(rndis->config, vendorID,
                manufacturer))
        goto fail;
#endif


Has anybody tested USB composite gadgets on TI platforms?
Could it be due to a (HW) limitation of the MUSB controller?
Or could someone give me a hint or suggestion?

Thanks in advance. Regards,

Max



  • Dear TI experts and community,

    Any suggestion about this topic?

    It would be important to know at least whether it is a HW limitation, and whether it affects AM335x too.

    Thanks for any help.
    Best regards, Max

  • Dear all,

    I've run some further tests related to this issue with SDK 05.03.02. It comes with two different kernel configurations: da850_omapl138_defconfig (same as in 05.02.00) and the new tisdk_am180x-evm_defconfig. One of the main differences between them is that the first uses DMA for USB and the second uses PIO:

    # CONFIG_MUSB_PIO_ONLY is not set
    CONFIG_USB_TI_CPPI41_DMA_HW=y
    CONFIG_USB_TI_CPPI41_DMA=y

    CONFIG_MUSB_PIO_ONLY=y
    # CONFIG_USB_TI_CPPI41_DMA is not set

    So I tried this new configuration (without DMA), and the kernel does not crash any more!

    A critical issue related to USB+DMA is known, but TI experts clarified that it was only related to the host side and to interaction with some wifi devices (1, 2).
    I have not run performance tests yet, but I suppose the PIO configuration would affect them heavily, and they are really critical for my application.
    Moreover, the USB DMA configuration described in the Technical Reference Manual seems really tricky, and the notes in the Silicon Errata are even more so.

    I need further help, please.

    Thanks, in advance. Regards,

    Max

  • Max,

    This seems to be a SW issue in the v2.6.37-based source you have. Can you try updating the DMA driver to match the one available for AM18x on v2.6.33 at the link below? (last two patches)

    http://arago-project.org/git/projects/?p=linux-omapl1.git;a=shortlog;h=refs/heads/DAVINCIPSP_03.20.00.14

    Ajay

  • Ajay,

    I had a look but I am a little bit confused...

    Most of the code is already there: I'm using the latest SDK for AM180x (05.03.02.00), based on PSP 03.21.00.04 and kernel 2.6.37. The patch comments say they are backports from the 2.6.37 driver, so I would expect to have them in already.

    I've also downloaded the plain PSP 03.21.00.04, and its musb driver is the same as the one in AM180x SDK 05.03.02.00. On the other hand, the musb driver in AM37xx SDK 05.02.00.00, which is also based on PSP 03.21.00.04, looks really different! I had a look at the musb driver in AM335x SDK 05.03.02.00 (kernel 3.1.0) as well, and there are some more differences...

    Is there a repository with the latest musb driver for kernel 2.6.37?

    I will try to patch the code and test it. I'll let you know.

    Regards, Max

  • Hi Ajay,

    I've tried to integrate the patches you suggested. I made changes in cppi41.c, cppi41.h, cppi41_dma.c, cppi41_dma.h and da8xx.c. The patches to musb_core.c, musb_core.h and usb.h were already there. I ignored musb_host.c because I build MUSB in peripheral mode, so it is not used.

    I no longer see the kernel crash, but performance has decreased. I will do some more tests in the next few days.

    Will an official patch for kernel 2.6.37 be available from TI?
    Where can I find the arago repository for the kernel 2.6.37 that is included in the latest PSP and SDK?

    I have one question about some "funny" code I just found in cppi41_dma.c: it checks the gadget driver name ("g_ether" in the original source code, "g_file_storage" in the patched code you suggested) to decide how to configure the DMA mode (USB_GENERIC_RNDIS_MODE or USB_TRANSPARENT_MODE) and whether to take some other actions. I wonder how it can work with other gadget names, such as g_mass_storage or g_multi!

            pkt_len = rx_ch->pkt_size;
            mode = USB_GENERIC_RNDIS_MODE;
            if (!strcmp(gadget_driver->driver.name, "g_file_storage")) {
                if (cppi->inf_mode && length > pkt_len) {
                    pkt_len = 0;
                    length = length - rx_ch->pkt_size;
                    cppi41_rx_ch_set_maxbufcnt(&rx_ch->dma_ch_obj,
                        DMA_CH_RX_MAX_BUF_CNT_1);
                    rx_ch->inf_mode = 1;
                } else {
                    max_rx_transfer_size = rx_ch->pkt_size;
                    mode = USB_TRANSPARENT_MODE;
                }
            } else
                if (rx_ch->length < max_rx_transfer_size)
                    pkt_len = rx_ch->length;

            if (mode != USB_TRANSPARENT_MODE)
                cppi41_set_ep_size(rx_ch, pkt_len);
            cppi41_mode_update(rx_ch, mode);

    I also had a look at the AM335x SDK 5.3.2.0 kernel (3.1, PSP 4.6.0.3): after the patch, the cppi41 files in kernel 2.6.37 are almost the same as in the AM335x kernel 3.1. So my question is: can I use the kernel 3.1 MUSB driver in kernel 2.6.37?

    Moreover, can you have a look at the txdma_completion_work() implementations in the two kernels? In kernel 3.1, IRQs are saved and restored for each channel, and cond_resched() is called under some conditions. Which is the better implementation? Could this affect performance?

    Thanks for your support.

    Best regards, Max

  • Max,

    Good to see you made the changes to v2.6.37 and don't see the kernel panic. How much of a performance drop do you see, and in which USB class?

    I think you caught the bug in "if (!strcmp(gadget_driver->driver.name, "g_file_storage"))", and that seems to be why we have a problem with g_mass_storage. We should be using TRANSPARENT for both g_mass_storage and g_file_storage. Can you test g_mass_storage again with the change below?

      if (!strcmp(gadget_driver->driver.name, "g_file_storage") || !strcmp(gadget_driver->driver.name, "g_mass_storage"))

    You can use the v3.1 musb driver in v2.6.37. txdma_completion_work() can be further updated with an additional cond_resched().
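
    For illustration, the two-name check above could also be kept in one place as a small table lookup, so that adding another storage gadget name does not mean editing the condition again. This is only a minimal user-space sketch, not driver code: the table and the helper gadget_uses_transparent_mode() are hypothetical names, and only the two gadget names come from this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: gadget driver names that should get TRANSPARENT mode.
 * Only these two names are taken from the thread. */
static const char *const transparent_mode_gadgets[] = {
    "g_file_storage",
    "g_mass_storage",
};

/* Hypothetical helper: returns 1 if the name is in the table, 0 otherwise. */
static int gadget_uses_transparent_mode(const char *name)
{
    size_t count = sizeof(transparent_mode_gadgets) /
                   sizeof(transparent_mode_gadgets[0]);
    size_t i;

    if (name == NULL)
        return 0;

    for (i = 0; i < count; i++) {
        if (strcmp(name, transparent_mode_gadgets[i]) == 0)
            return 1;
    }
    return 0;
}
```

    With this, the condition would read if (gadget_uses_transparent_mode(gadget_driver->driver.name)). It is still keyed on the driver name, so a composite driver such as g_multi would not match.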

    Ajay

  • Max,

    Can you share the changes you made to the v2.6.37 kernel (as a patch)? We can pick up this patch when we start on this.

    Ajay

  • Ajay

    Some further good news: I've verified that when a write is done using g_file_storage, the DMA mode of the Rx channel changes continuously from TRANSPARENT to GENERIC_RNDIS, then back to TRANSPARENT and so on, because of the piece of code I pointed out a few posts ago. Of course this was not happening with g_mass_storage, because of the check on the gadget driver name.

    But now, after making the change you suggested (... || !strcmp(gadget_driver->driver.name, "g_mass_storage") ...), g_mass_storage works as well as g_file_storage! So the problem is exactly in those lines.

    Of course this does not solve my root problem: I need to use the g_multi gadget driver, so the DMA channel configuration cannot depend just on the gadget driver name. Some channels will be used to implement the RNDIS function, others for the Mass Storage Class or other USB functions. Besides, relying on the gadget driver name is an awful hard-coded approach, and I certainly need something different.
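
    To make the idea concrete, here is a rough sketch of what a per-endpoint selection could look like, instead of one decision for the whole gadget driver. Everything in it is an assumption for illustration: struct ep_dma_hint, mode_for_endpoint() and the endpoint numbers do not exist in the MUSB or CPPI 4.1 code, and the enum values are placeholders, not register values. Only the two mode names come from the driver code quoted earlier.

```c
#include <stddef.h>

/* Mode names as used in the thread; values are placeholders. */
enum usb_dma_mode {
    USB_TRANSPARENT_MODE,
    USB_GENERIC_RNDIS_MODE
};

/* Hypothetical hint: which DMA mode the function owning an endpoint wants. */
struct ep_dma_hint {
    unsigned char ep_num;
    enum usb_dma_mode mode;
};

/* Hypothetical g_multi layout: one RNDIS endpoint, one mass storage endpoint. */
static const struct ep_dma_hint example_hints[] = {
    { 1, USB_GENERIC_RNDIS_MODE },  /* assumed RNDIS bulk OUT endpoint */
    { 2, USB_TRANSPARENT_MODE },    /* assumed mass storage bulk OUT endpoint */
};
#define EXAMPLE_HINT_COUNT (sizeof(example_hints) / sizeof(example_hints[0]))

/* Look up the mode for one endpoint; use the fallback if no hint exists. */
static enum usb_dma_mode mode_for_endpoint(const struct ep_dma_hint *hints,
                                           size_t count,
                                           unsigned char ep_num,
                                           enum usb_dma_mode fallback)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (hints[i].ep_num == ep_num)
            return hints[i].mode;
    }
    return fallback;
}
```

    The point of the sketch is only that the key is the endpoint (or the function that claimed it), so RNDIS and mass storage channels in the same composite gadget can get different modes.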

    I will be happy to share the patches for the 2.6.37 kernel (for AM1808), and further patches for the g_mass_storage issue seen with kernel 3.1 (AM335x), after cleaning them up and running final tests. But please, I would ask you to share some details on USB DMA handling and configuration.

    I've read the microprocessor's Technical Reference Manual, but it is quite complex. There are four DMA modes, but the kernel driver code seems to use just two, TRANSPARENT and GENERIC_RNDIS: why do different modes need to be used for different USB functions? And why, in the case of the Mass Storage Class, does the DMA mode on the RX channel need to change continuously? What about USB functions other than MSC and RNDIS (CDC ACM, etc.): which is the correct DMA mode configuration for the channels used by these other functions?

    Thanks. Regards,
    Max

  • Max,

    Thanks for the update, and good to read that the g_mass_storage issue is solved as well. I agree that a hard-coded gadget driver name is not the right approach, but until we have a correct way of doing it we need to keep using it.

    Ideally, if the transfer size is larger than the packet size, GENERIC_RNDIS mode should be used, which makes sure the entire data is transferred and we then get a single interrupt at the end. In TRANSPARENT mode, by contrast, we get one interrupt per USB packet, so it is used for short packets. We had to use TRANSPARENT for the g_file_storage gadget because of an issue with GENERIC_RNDIS mode. We are using GENERIC_RNDIS for all other gadget drivers.
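
    As a rough illustration of the difference, the sketch below counts the completion interrupts one would expect for a single transfer under the description above: one per transfer in GENERIC_RNDIS mode, one per USB packet in TRANSPARENT mode. It is a simplified model, not driver code; expected_completion_irqs() is a hypothetical helper and the enum values are placeholders.

```c
#include <stddef.h>

/* Mode names as used in the thread; values are placeholders. */
enum usb_dma_mode {
    USB_TRANSPARENT_MODE,
    USB_GENERIC_RNDIS_MODE
};

/* Hypothetical model of the number of completion interrupts per transfer. */
static size_t expected_completion_irqs(enum usb_dma_mode mode,
                                       size_t length, size_t pkt_size)
{
    /* GENERIC_RNDIS: a single interrupt once the whole transfer is done. */
    if (mode == USB_GENERIC_RNDIS_MODE)
        return 1;

    /* A zero-length transfer is modelled as one packet; a zero packet
     * size is rejected here only to avoid dividing by zero. */
    if (length == 0 || pkt_size == 0)
        return 1;

    /* TRANSPARENT: one interrupt per USB packet, rounding up. */
    return (length + pkt_size - 1) / pkt_size;
}
```

    For a 4096-byte transfer with 512-byte packets, this model gives 8 interrupts in TRANSPARENT mode against 1 in GENERIC_RNDIS mode.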

    Ajay

  • Ajay said:

    How much performance drop do you see and in which usb class?

    The patch seems to have no relevant impact on g_file_storage performance (even if it's hard to measure).

    Regarding g_ether, these are the iperf results without the patch:

    iperf client @ USB Device -> iperf server @ USB Host : 25.2 Mbps
    iperf client @ USB Host -> iperf server @ USB Device : 38.0 Mbps

    And with the patch:

    iperf client @ USB Device -> iperf server @ USB Host : 21.9 Mbps
    iperf client @ USB Host -> iperf server @ USB Device : 38.2 Mbps

    When the client runs on the USB Device (AM1808), performance is clearly lower (the USB Host is a Linux PC).

    Both are far lower than kernel 2.6.33 performance: at least -40%!

    With the patch I also tried g_multi, and performance is even worse:

    iperf client @ USB Device -> iperf server @ USB Host : 19.5 Mbps
    iperf client @ USB Host -> iperf server @ USB Device : 23.5 Mbps

    But these last values could be due to the different USB classes: USB CDC with g_ether, RNDIS with g_multi.
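
    To put the figures above in relative terms, a relative change can be computed as (after - before) / before. The helper below is just a small sketch for that arithmetic; percent_change() is a hypothetical name, and the inputs are the iperf values listed in this post.

```c
/* Hypothetical helper: relative change from 'before' to 'after', in percent.
 * 'before' must be non-zero. */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

    For g_ether, percent_change(25.2, 21.9) is about -13% (Device -> Host) and percent_change(38.0, 38.2) is about +0.5% (Host -> Device). Going from patched g_ether to patched g_multi gives about -11% and about -38.5% respectively.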

    Max



  • Max,

    As I posted earlier, v2.6.37 has an issue affecting performance. I checked again internally, and no one has found a solution to it.

    Regards,

    Ajay