AM6548: Issue with AM6548 ICSSG Ethernet Driver

Part Number: AM6548


Dear TI Support Team,

I am writing to seek assistance regarding an issue I have encountered with the AM6548 platform. I have observed behavior similar to the problem discussed in the following forum thread:

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1158425/am6548-icssg-ethernet-driver-crash

Specifically, after system boot, one set of the network interfaces occasionally becomes unreachable (unable to ping) despite being correctly configured. Additionally, when executing “ifconfig ethX down” (where ethX refers to the affected interface), the system triggers a kernel BUG with an error message indicating:

  “kernel BUG at lib/genalloc.c:254!”

This issue appears sporadic and affects the stability of the network functionality on the platform. 

I appreciate your time and assistance in helping resolve this matter. Please let me know if you require any further details or log files.

Thank you in advance for your support.

Best regards,
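For readers unfamiliar with the BUG location: in the mainline Linux kernel, gen_pool_destroy() in lib/genalloc.c walks each chunk's allocation bitmap and hits BUG_ON() if any bit is still set, that is, if something allocated from the pool was never freed before the pool is torn down. Whether line 254 in the kernel tree used here is that exact check is an assumption. The sketch below is a toy Python model of the idea only. The class ToyGenPool and its methods are hypothetical names for illustration and are not kernel or driver code.

```python
class ToyGenPool:
    """Toy bitmap allocator: one flag per fixed-size slot.

    This only illustrates the "no outstanding allocations at destroy"
    rule; it is NOT how the kernel's genalloc is implemented.
    """

    def __init__(self, num_slots):
        # False = slot free, True = slot currently allocated.
        self.bits = [False] * num_slots

    def alloc(self):
        """Return the index of the first free slot, or None if the pool is full."""
        for index, used in enumerate(self.bits):
            if not used:
                self.bits[index] = True
                return index
        return None

    def free(self, slot):
        """Release a previously allocated slot."""
        if not self.bits[slot]:
            raise ValueError(f"slot {slot} is not allocated")
        self.bits[slot] = False

    def destroy(self):
        """Tear the pool down; refuse if any slot is still allocated."""
        leaked = [i for i, used in enumerate(self.bits) if used]
        if leaked:
            # Rough analogue of the kernel hitting BUG_ON() in gen_pool_destroy().
            raise RuntimeError(f"pool destroyed with slots still allocated: {leaked}")
        self.bits = []


if __name__ == "__main__":
    pool = ToyGenPool(4)
    first = pool.alloc()
    second = pool.alloc()
    pool.free(first)  # 'second' is never freed, so destroy() must fail
    try:
        pool.destroy()
    except RuntimeError as err:
        print("destroy refused:", err)
```

In this model, the failure means some user of the pool did not return everything it took before teardown. That would match a driver destroying a descriptor pool while descriptors are still outstanding, but the thread does not confirm the root cause.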

  • Hello Jack,

    What version of the SDK are you using?

    What board are you using, and what version is that board? Please take a photo of your board if you are not sure.

    Regards,

    Nick

  • Hello Saulnier

    Below is my error message and the SDK version. If you need any additional information, please let me know.

    SDK Ver: ti-processor-sdk-linux-rt-am65xx-evm-08_00_00_02

    error log:

    [ 208.301290] kernel BUG at lib/genalloc.c:254!
    [ 209.301521] Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
    [ 209.316203] Modules linked in: drv_misc(O) xhci_plat_hcd xhci_hcd usbcore rpmsg_char ti_am335x_adc kfifo_buf omap_rng rng_core dwc3 udc_core irq_pruss_intc roles usb_common crct10dif_ce icssg_prueth pru_rproc icss_iep ti_k3_r5_remoteproc virtio_rpmsg_bus m_can_platform m_can can_dev ti_am335x_tscadc pruss ti_cal videobuf2_dma_contig v4l2_fwnode videobuf2_memops videobuf2_v4l2 videobuf2_common pvrsrvkm(O) sa2ul sha512_generic authenc phy_omap_usb2 dwc3_keystone sch_fq_codel cryptodev(O) ipv6
    [ 209.359652] CPU: 1 PID: 948 Comm: ip Tainted: G W O 5.10.41-rt39-g0701a5b58c #19
    [ 209.368162] Hardware name: Texas Instruments AM654 Base Board (DT)
    [ 209.374327] pstate: 20000005 (nzCv daif -PAN -UAO -TCO BTYPE=--)
    [ 209.380322] pc : gen_pool_destroy+0xa8/0xe0
    [ 209.384505] lr : gen_pool_destroy+0xa0/0xe0
    [ 209.388679] sp : ffff800011ca3260
    [ 209.391983] x29: ffff800011ca3260 x28: ffff800011ca3780
    [ 209.397284] x27: ffff0008072b4fb0 x26: ffff800011a25000
    [ 209.402585] x25: ffff0008072b4f80 x24: dead000000000100
    [ 209.407886] x23: dead000000000122 x22: 0000000000000007
    [ 209.413187] x21: ffff0008072b4fb0 x20: ffff0008072b4fb0
    [ 209.418488] x19: 0000000000000200 x18: 0000000000000000
    [ 209.423790] x17: 0000000000000000 x16: 0000000000000000
    [ 209.429091] x15: ffff000816cba2c0 x14: ffffffffffffffff
    [ 209.434392] x13: ffff80001116714c x12: ffff800011167146
    [ 209.439693] x11: 0000000000000000 x10: ffff80001105d8c0
    [ 209.444993] x9 : ffff800011ca32c0 x8 : ffff800011051cc0
    [ 209.450294] x7 : 0000000000000000 x6 : ffffffffffffffff
    [ 209.455595] x5 : 0000000000000000 x4 : 0000000000000000
    [ 209.460895] x3 : 0000000000000000 x2 : 0000000000000000
    [ 209.466196] x1 : 0000000000000000 x0 : 0000000000000000
    [ 209.471498] Call trace:
    [ 209.473936] gen_pool_destroy+0xa8/0xe0
    [ 209.477763] k3_cppi_desc_pool_destroy+0x54/0x98 [icssg_prueth]
    [ 209.483686] prueth_cleanup_rx_chns.isra.0+0x1c/0x38 [icssg_prueth]
    [ 209.489948] emac_ndo_stop+0x20c/0x2c8 [icssg_prueth]
    [ 209.494995] __dev_close_many+0xac/0x138
    [ 209.498916] __dev_change_flags+0xb0/0x1c0
    [ 209.503006] dev_change_flags+0x24/0x68
    [ 209.506832] do_setlink+0x610/0xd58
    [ 209.510313] __rtnl_newlink+0x3f8/0x790
    [ 209.514140] rtnl_newlink+0x4c/0x78
    [ 209.517621] rtnetlink_rcv_msg+0x118/0x338
    [ 209.521707] netlink_rcv_skb+0x58/0x118
    [ 209.525537] rtnetlink_rcv+0x18/0x28
    [ 209.529107] netlink_unicast+0x1bc/0x278
    [ 209.533022] netlink_sendmsg+0x1a4/0x3b0
    [ 209.536937] ____sys_sendmsg+0x250/0x290
    [ 209.540853] ___sys_sendmsg+0x80/0xc8
    [ 209.544507] __sys_sendmsg+0x68/0xc0
    [ 209.548074] __arm64_sys_sendmsg+0x24/0x30
    [ 209.552163] el0_svc_common.constprop.0+0x78/0x1a0
    [ 209.556948] do_el0_svc+0x24/0x90
    [ 209.560256] el0_svc+0x14/0x20
    [ 209.563305] el0_sync_handler+0xb0/0xb8
    [ 209.567131] el0_sync+0x180/0x1c0
    [ 209.570445] Code: aa1303e1 97ffc4e9 eb00027f 54fffd89 (d4210000)
    [ 209.576527] ---[ end trace 0000000000000003 ]---
    [ 210.576721] printk: enabled sync mode
    [ 210.585062] note: ip[948] exited with preempt_count 1
    [ 210.592192] printk: console [ttyS2]: printing thread stopped
    [ 210.599536] ------------[ cut here ]------------
    [ 210.604149] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:626 rcu_eqs_enter.isra.0+0x84/0x90
    [ 210.612412] Modules linked in: drv_misc(O) xhci_plat_hcd xhci_hcd usbcore rpmsg_char ti_am335x_adc kfifo_buf omap_rng rng_core dwc3 udc_core irq_pruss_intc roles usb_common crct10dif_ce icssg_prueth pru_rproc icss_iep ti_k3_r5_remoteproc virtio_rpmsg_bus m_can_platform m_can can_dev ti_am335x_tscadc pruss ti_cal videobuf2_dma_contig v4l2_fwnode videobuf2_memops videobuf2_v4l2 videobuf2_common pvrsrvkm(O) sa2ul sha512_generic authenc phy_omap_usb2 dwc3_keystone sch_fq_codel cryptodev(O) ipv6
    [ 210.655864] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D W O 5.10.41-rt39-g0701a5b58c #19
    [ 210.664806] Hardware name: Texas Instruments AM654 Base Board (DT)
    [ 210.670972] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
    [ 210.676965] pc : rcu_eqs_enter.isra.0+0x84/0x90
    [ 210.681487] lr : rcu_idle_enter+0x10/0x20
    [ 210.685493] sp : ffff80001129bf50
    [ 210.688796] x29: ffff80001129bf50 x28: 0000000000000000
    [ 210.694098] x27: 0000000000000000 x26: 0000000000000000
    [ 210.699399] x25: 0000000000000000 x24: 0000000000000000
    [ 210.704700] x23: 0000000000000000 x22: ffff80001103da28
    [ 210.710001] x21: 0000000000000001 x20: ffff00080011d880
    [ 210.715302] x19: ffff80001103d948 x18: 000000000000000e
    [ 210.720603] x17: 0000000000000001 x16: 0000000000000019
    [ 210.725904] x15: 0000000000000004 x14: 0000000000000307
    [ 210.731204] x13: 0000000000000000 x12: 0000000000000001
    [ 210.736505] x11: 0000000000000000 x10: 0000000000000a50
    [ 210.741806] x9 : ffff80001129bed0 x8 : ffff00080011e330
    [ 210.747107] x7 : 0000000000000001 x6 : 00000073d8f41e76
    [ 210.752407] x5 : 00ffffffffffffff x4 : ffff80086e899000
    [ 210.757708] x3 : 4000000000000002 x2 : 4000000000000000
    [ 210.763009] x1 : ffff800010f07180 x0 : ffff00087f7a0180
    [ 210.768310] Call trace:
    [ 210.770748] rcu_eqs_enter.isra.0+0x84/0x90
    [ 210.774922] rcu_idle_enter+0x10/0x20
    [ 210.778578] default_idle_call+0x24/0x70
    [ 210.782496] do_idle+0xc8/0x138
    [ 210.785630] cpu_startup_entry+0x24/0x60
    [ 210.789543] secondary_start_kernel+0x15c/0x188
    [ 210.794067] ---[ end trace 0000000000000004 ]---
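When sharing a long log like the one above, it can help to pull out just the BUG location and the first call trace. The following is a minimal sketch, assuming the dmesg format shown in this post (a bracketed timestamp, then `function+0xoff/0xsize`, then an optional `[module]`). The function name summarize_oops and the regular expressions are illustrative, not part of any TI or kernel tool.

```python
import re

# Stack frame such as:
#   [ 209.477763] k3_cppi_desc_pool_destroy+0x54/0x98 [icssg_prueth]
FRAME_RE = re.compile(
    r"^\s*\[\s*\d+\.\d+\]\s+(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)
# BUG line such as: kernel BUG at lib/genalloc.c:254!
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def summarize_oops(log_text):
    """Return (bug_location, frames) for the first call trace in log_text.

    bug_location is (file, line) or None; frames is a list of
    (function, module_or_None) tuples in the order they were printed.
    """
    bug_location = None
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        bug = BUG_RE.search(line)
        if bug and bug_location is None:
            bug_location = (bug.group("file"), int(bug.group("line")))
        if not in_trace:
            in_trace = "Call trace:" in line
            continue
        frame = FRAME_RE.match(line)
        if frame is None:
            break  # first non-frame line ends the first trace
        frames.append((frame.group("func"), frame.group("module")))
    return bug_location, frames


if __name__ == "__main__":
    sample = """\
[ 208.301290] kernel BUG at lib/genalloc.c:254!
[ 209.471498] Call trace:
[ 209.473936] gen_pool_destroy+0xa8/0xe0
[ 209.477763] k3_cppi_desc_pool_destroy+0x54/0x98 [icssg_prueth]
[ 209.489948] emac_ndo_stop+0x20c/0x2c8 [icssg_prueth]
[ 209.570445] Code: aa1303e1 97ffc4e9 eb00027f 54fffd89 (d4210000)
"""
    location, trace = summarize_oops(sample)
    print(location)
    for func, module in trace:
        print(func, module or "(built-in)")
```

For the log in this post, such a summary shows that the path into gen_pool_destroy() runs through emac_ndo_stop() in the icssg_prueth module, which is the interface-down path.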

  • Hello Nick

    Do you have any thoughts or ideas regarding my question?

  • Nick, 

    What is the final software fix for the issue in the e2e post below?

    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1158425/am6548-icssg-ethernet-driver-crash

    Has the fix been checked into SDK 8.4?

    BR, Rich

  • Hello Jack & Rich,

    Apologies for the delayed responses here.

    Software version

    I do not see an RT Linux SDK version for 8.0 or 8.4:
    https://www.ti.com/tool/download/PROCESSOR-SDK-LINUX-RT-AM65X

    In general, I would suggest using the latest version of the SDK. The gigabit PRU Ethernet code has been in active development. SDK 8.6 has many bug fixes for behavior you might see in earlier SDK 8.x releases, but SDK 9.3 will have the latest version of the code.

    Please confirm that you are using AM65x SR2, not SR1 

    We are only supporting customers going to production on AM65x Silicon Revision (SR) 2, not SR1. What kind of silicon are you using for this testing?

    If you are using a TI EVM, you can attach a picture of your board and I can tell if the processor on that board is SR2 or SR1.

    The difference between SR1 and SR2 especially matters for PRU Ethernet, because the PRU subsystem changed a lot between these two revisions. As a result, the fixes that were added for SR2 cannot be backported to SR1.

    Regards,

    Nick

  • Hi Nick:

    Yes, we confirm that we are using SR2.

    If I don't want to recompile the SDK, what else can be done to fix this bug?

  • Hello Jack,

    Silicon version

    Ok, thank you for confirming the silicon version.

    What can you try?

    I am not sure what you mean by "recompile the SDK".

    We do not support backporting patches to previous SDK releases. You are welcome to do that on your own, but I will be unable to help.

    First off, I would suggest using an official SDK version. I would not suggest using SDK 8.0 or SDK 8.4, since we did not actually validate AM65x on those releases. You can find the official RT Linux SDK releases here:
    https://www.ti.com/tool/download/PROCESSOR-SDK-LINUX-RT-AM65X

    If you want to stick with Linux kernel 5.10, I would suggest testing SDK 8.6 and seeing what the behavior looks like. SDK 8.6 will have multiple bugfixes that were not in previous SDK 8.x releases.

    SDK 9.3 / Linux kernel 6.1 will have the most recent bugfixes.

    Regards,

    Nick
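The replies above associate Linux kernel 5.10 with the SDK 8.x releases and Linux kernel 6.1 with SDK 9.3. As a quick sanity check on a target, the kernel release string (for example the `5.10.41-rt39-g0701a5b58c` shown in the log) can be mapped to one of those series. This is a minimal sketch. The mapping table holds only what is stated in this thread, and the helper names kernel_series and describe_release are hypothetical. It cannot tell SDK 8.0 from SDK 8.6, since both use a 5.10 kernel.

```python
import platform
import re

# Kernel series mentioned in this thread and the SDK releases tied to them.
SERIES_TO_SDK = {
    (5, 10): "SDK 8.x (SDK 8.6 is the suggested 5.10-based release)",
    (6, 1): "SDK 9.3",
}


def kernel_series(release):
    """Return (major, minor) parsed from a release string like '5.10.41-rt39-...'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def describe_release(release):
    """Describe which SDK family from this thread a kernel release belongs to."""
    sdk = SERIES_TO_SDK.get(kernel_series(release), "not discussed in this thread")
    flavor = "RT" if "-rt" in release else "non-RT"
    return f"{release}: {flavor} kernel, {sdk}"


if __name__ == "__main__":
    # On the target, platform.release() returns the same string as `uname -r`.
    print(describe_release(platform.release()))
    print(describe_release("5.10.41-rt39-g0701a5b58c"))
```

The exact SDK build is better confirmed from the SDK installer or release notes. The kernel string only narrows it down to a series.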