Linux/XTCIEVMK2LX: Running ex02_messageq IPC demo crashes linux

Part Number: XTCIEVMK2LX

Tool/software: Linux

I have several TCIEVMk2X modules, all running the latest Processor SDK-RT for Linux v03.02 from http://software-dl.ti.com/processor-sdk-linux-rt/esd/K2HK/latest/exports/ti-processor-sdk-linux-rt-k2hk-evm-03.02.00.05-Linux-x86-Install.bin.

I am trying to run the ex02_messageq example from linux-devkit/sysroots/ti-ipc-tree/examples/TCI6636_linux_elf. To be honest, I gave up trying to modify the makefiles and products.mak for this example after spending far too much time tracking down each stumbling block, and simply created Code Composer Studio projects for the demo code.

However, when I run the code on the target (TCI6638K2K on a TCIEVMK2X evaluation module), it sometimes works and sometimes crashes Linux hard.

Since the forum repeats the opening message on every page, I will continue in a second message.

Steve Williams

Tarana Wireless

  • In making my Code Composer Studio project for the DSP side, I changed the *.cfg file to pass null as the first argument to MultiProc.setConfig, so that the resulting dsp.out works on all 8 cores:

    MultiProc.setConfig(null, [ "HOST",
                                "CORE0", "CORE1", "CORE2", "CORE3",
                                "CORE4", "CORE5", "CORE6", "CORE7" ]);

    I run the demo with the following ex02.sh script:

    root@k2hk-evm:~/ex02-messageq# cat ex02.sh
    #!/bin/sh

    targets=$(seq 0 7)

    for target in ${targets} ; do
        echo "Resetting dsp${target}"
        mpmcl reset dsp${target}
    done

    for target in ${targets} ; do
        image=dsp.out
        echo Loading ${image} to dsp${target}
        mpmcl load dsp${target} ${image}
    done

    for target in ${targets} ; do
        echo "Running dsp${target}"
        mpmcl run dsp${target}
    done

    The DSPs all seem to start correctly, judging by their /sys/kernel/debug/remoteproc/remoteproc*/trace0 output.
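A minimal sketch of how the trace check above could be scripted, so that every core's trace0 is dumped in one pass. The function name `dump_traces` and its optional base-directory argument are illustrative assumptions, not part of the SDK; the default path is the debugfs location quoted above.

```shell
#!/bin/sh
# Print the trace0 log of every remoteproc instance found under a base directory.
# Assumption: debugfs is mounted at /sys/kernel/debug, as in the paths quoted above.
# dump_traces and its argument are hypothetical names used only for this sketch.
dump_traces() {
    base=${1:-/sys/kernel/debug/remoteproc}
    for trace in "${base}"/remoteproc*/trace0 ; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -r "${trace}" ] || continue
        echo "==== ${trace} ===="
        cat "${trace}"
    done
}
```

On the target, calling `dump_traces` with no argument after ex02.sh has run would print each DSP's trace in turn, which makes it easier to spot a core that never reported "server is ready".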

    Usually, I can run "./host COREx" a few times and get correct output from both sides:

    root@k2hk-evm:~/ex02-messageq# ./host CORE0
    --> main:
    --> Main_main:
    --> App_create:
    App_create: Host is ready
    <-- App_create:
    --> App_exec:
    App_exec: sending message 1
    App_exec: sending message 2
    App_exec: sending message 3
    App_exec: message received, sending message 4
    App_exec: message received, sending message 5
    App_exec: message received, sending message 6
    App_exec: message received, sending message 7
    App_exec: message received, sending message 8
    App_exec: message received, sending message 9
    App_exec: message received, sending message 10
    App_exec: message received, sending message 11
    App_exec: message received, sending message 12
    App_exec: message received, sending message 13
    App_exec: message received, sending message 14
    App_exec: message received, sending message 15
    App_exec: message received
    App_exec: message received
    App_exec: message received
    <-- App_exec: 0
    --> App_delete:
    <-- App_delete:
    <-- Main_main:
    <-- main:

    root@k2hk-evm:~/ex02-messageq# cat /sys/kernel/debug/remoteproc/remoteproc0/trace0
    2 Resource entries at 0x800000
    [t=0x00009dc2] xdc.runtime.Main: --> main:
    registering rpmsg-proto:rpmsg-proto service on 61 with HOST
    [t=0x00708dac] xdc.runtime.Main: NameMap_sendMessage: HOST 53, port=61
    [t=0x0070e58f] xdc.runtime.Main: --> smain:
    [t=0x007141b4] Server: Server_create: server is ready
    [t=0x00717185] Server: <-- Server_create: 0
    [t=0x00719a12] Server: --> Server_exec:
    [t=0x00000005:21d8a8d4] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21d95502] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21d9b610] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21dc3859] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21dc9376] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21dd2ed1] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21de5b2e] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21deb681] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21df46fb] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21e0c872] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21e123c8] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21e1b3a9] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21e33352] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21e38eaa] Server: Server_exec: processed cmd=0x0
    [t=0x00000005:21e41e07] Server: Server_exec: processed cmd=0x2000000
    [t=0x00000005:21e469ae] Server: <-- Server_exec: 0
    [t=0x00000005:21e499e8] Server: --> Server_delete:
    [t=0x00000005:21e4db3b] Server: <-- Server_delete: 0
    [t=0x00000005:21e5377b] Server: Server_create: server is ready
    [t=0x00000005:21e57012] Server: <-- Server_create: 0
    [t=0x00000005:21e5a174] Server: --> Server_exec:

    But if I run the program repeatedly, either on the same core or rotating among all cores, it eventually crashes Linux, usually right after the "host" application prints:

    --> main:
    --> Main_main:
    --> App_create:
    App_create: Host
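A minimal sketch of a loop that rotates the host program among all cores, as described above. The function name `stress_host`, the `HOST_CMD` override and the default iteration count are illustrative assumptions; the banner format mirrors the `:::: iter=N core=M ::::` line visible in the console capture in this post, and the starting iteration number is a guess.

```shell
#!/bin/sh
# Hypothetical stress loop: run the host program against CORE0..CORE7 in turn.
# stress_host, HOST_CMD and the default of 10 iterations are assumptions for
# illustration; only "./host COREx" comes from the original post.
stress_host() {
    iterations=${1:-10}
    host_cmd=${HOST_CMD:-./host}
    iter=1
    while [ "${iter}" -le "${iterations}" ] ; do
        for core in $(seq 0 7) ; do
            echo ":::: iter=${iter} core=${core} ::::"
            # Stop at the first failure so the last banner identifies the failing run.
            "${host_cmd}" "CORE${core}" || return 1
        done
        iter=$((iter + 1))
    done
}
```

Running `stress_host 20` from the directory that holds `host` would make 160 runs in total. Printing the banner before each run means the last banner on the console names the iteration and core in use when the board stops responding.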

    Sometimes I also see a large burst of kernel debug messages, interleaved with the host output on the console:

    :::: iter=4 core=7 ::::
    --> main:
    --> Main_main:
    --> App_create:
    App_create: Host[ 3992.427631] BUG: scheduling while atomic: swapper/0/0/0x00010002
    is ready
    <-- A[ 3992.427690] Modules linked in: sha512_arm sha1_generic sha1_arm_neon sha1_arm md5 jitterentropy_rng sha256_generic sha256_arm hmac drbg des_generic cbc xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp ah4 af_key xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core extcon keystone_sa_driver ks_sa_rng authenc rng_core aes_arm debugss_kmodule(O) keystone_dsp_mem dwc3_keystone temperature_kmodule(O) davinci_wdt keystone_remoteproc sch_fq_codel uio_module_drv(O) uio rpmsg_proto virtio_rpmsg_bus remoteproc virtio virtio_ring ipsecmgr_mod(O) xfrm_algo hplibmod(O) gdbserverproxy(O) cryptodev(O) cmemk(O)
    pp_create:
    --> [ 3992.427696] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.4.32-rt41-ge26c84b0ac #1
    App_exec:
    App_e[ 3992.427697] Hardware name: Keystone
    xec: sending mes[ 3992.427700] Backtrace:
    sage 1
    App_exec[ 3992.427718] [<c0012e00>] (dump_backtrace) from [<c0012ffc>] (show_stack+0x18/0x1c)
    : sending messag[ 3992.427725] r7:c08e6000 r6:20070193 r5:00000000 r4:c08fd1c4
    e 2
    App_exec: s[ 3992.427735] [<c0012fe4>] (show_stack) from [<c02a2a80>] (dump_stack+0x8c/0xa0)
    ending message 3[ 3992.427742] [<c02a29f4>] (dump_stack) from [<c0043728>] (__schedule_bug+0x58/0x68)

    App_exec: mess[ 3992.427748] r7:c08e6000 r6:c08e4f80 r5:c08edfa8 r4:00000000
    age received, se[ 3992.427757] [<c00436d0>] (__schedule_bug) from [<c0659ea4>] (__schedule+0x4d8/0x538)
    nding message 4[ 3992.427761] r5:c08edfa8 r4:de587f80

    App_exec: messa[ 3992.427767] [<c06599cc>] (__schedule) from [<c0659f64>] (schedule+0x60/0xfc)
    ge received, sen[ 3992.427774] r10:c08e7dc0 r9:dd5a8e80 r8:c08e7dc0 r7:c08edfa8 r6:c08ee4e4 r5:c08edfa8
    ding message 5
    [ 3992.427776] r4:c08e6000
    App_exec: messag[ 3992.427782] [<c0659f04>] (schedule) from [<c065b888>] (rt_spin_lock_slowlock+0x1d4/0x310)
    e received, send[ 3992.427785] r5:c08e6000 r4:de212e00
    ing message 6
    A[ 3992.427790] [<c065b6b4>] (rt_spin_lock_slowlock) from [<c065cff0>] (rt_spin_lock+0x5c/0x60)
    pp_exec: message[ 3992.427797] r10:c08eb24c r9:de1c8e50 r8:de1c8e40 r7:c0801178 r6:c08e7e6c r5:000002a0
    received, sendi[ 3992.427799] r4:de212e00
    ng message 7
    Ap[ 3992.427806] [<c065cf94>] (rt_spin_lock) from [<c03f35a8>] (regmap_lock_spinlock+0x14/0x20)
    p_exec: message [ 3992.427809] r5:000002a0 r4:de212e00
    received, sendin[ 3992.427815] [<c03f3594>] (regmap_lock_spinlock) from [<c03f4e38>] (regmap_read+0x38/0x68)
    g message 8
    App[ 3992.427818] r5:000002a0 r4:de212e00
    _exec: message r[ 3992.427826] [<c03f4e00>] (regmap_read) from [<c02cdfe8>] (keystone_irq_handler+0x50/0x128)
    eceived, sending[ 3992.427831] r7:c0801178 r6:de2dea90 r5:de1c8e50 r4:00000000
    message 9
    [ 3992.427838] [<c02cdf98>] (keystone_irq_handler) from [<c006fc6c>] (generic_handle_irq+0x2c/0x3c)
    [ 3992.427844] r10:c08eb24c r9:f0805000 r8:de008000 r7:00000000 r6:00000000 r5:00000043
    [ 3992.427846] r4:c08e3828
    [ 3992.427851] [<c006fc40>] (generic_handle_irq) from [<c006ff4c>] (__handle_domain_irq+0x64/0xbc)
    [ 3992.427856] [<c006fee8>] (__handle_domain_irq) from [<c000940c>] (gic_handle_irq+0x50/0x94)
    [ 3992.427862] r9:f0805000 r8:f0804000 r7:c08e7f00 r6:f080400c r5:c08eb5f8 r4:c08fd300
    [ 3992.427866] [<c00093bc>] (gic_handle_irq) from [<c0013ac0>] (__irq_svc+0x40/0x88)
    [ 3992.427868] Exception stack(0xc08e7f00 to 0xc08e7f48)
    [ 3992.427872] 7f00: 00000000 de585550 00000000 c001fc40 c08e6000 c08eb1f0 c08e2524 c08e7f70
    [ 3992.427876] 7f20: c065f954 c0926428 c08eb24c c08e7f5c c08e7f60 c08e7f50 c0010290 c0010294
    [ 3992.427878] 7f40: 60070013 ffffffff
    [ 3992.427885] r9:c0926428 r8:c065f954 r7:c08e7f34 r6:ffffffff r5:60070013 r4:c0010294
    [ 3992.427893] [<c0010254>] (arch_cpu_idle) from [<c00625e8>] (default_idle_call+0x34/0x40)
    [ 3992.427898] [<c00625b4>] (default_idle_call) from [<c0062748>] (cpu_startup_entry+0x154/0x1c0)
    [ 3992.427904] [<c00625f4>] (cpu_startup_entry) from [<c0659078>] (rest_init+0x90/0x94)
    [ 3992.427906] r7:00000000

    [ 3992.427915] [<c0658fe8>] (rest_init) from [<c089fd90>] (start_kernel+0x408/0x414)

    [ 3992.427918] r5:00000001 r4:c0929040
    [ 3992.427923] [<c089f988>] (start_kernel) from [<80008090>] (0x80008090)
    [ 3992.427949] ------------[ cut here ]------------
    [ 3992.637619] ------------[ cut here ]------------
    [ 3992.637628] WARNING: CPU: 1 PID: 18 at /home/gtbldadm/processor-sdk-linux-rt-krogoth-build/build-CORTEX_1/arago-tmp-external-linaro-toolchain/work-shared/k2hk-evm/kernel-source/kernel/sched/core.c:3647 rt_mutex_setprio+0x414/0x418()
    [ 3992.637670] Modules linked in: sha512_arm sha1_generic sha1_arm_neon sha1_arm md5 jitterentropy_rng sha256_generic sha256_arm hmac drbg des_generic cbc xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp ah4 af_key xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core extcon keystone_sa_driver ks_sa_rng authenc rng_core aes_arm debugss_kmodule(O) keystone_dsp_mem dwc3_keystone temperature_kmodule(O) davinci_wdt keystone_remoteproc sch_fq_codel uio_module_drv(O) uio rpmsg_proto virtio_rpmsg_bus remoteproc virtio virtio_ring ipsecmgr_mod(O) xfrm_algo hplibmod(O) gdbserverproxy(O) cryptodev(O) cmemk(O)
    [ 3992.637674] CPU: 1 PID: 18 Comm: ktimersoftd/1 Tainted: G W O 4.4.32-rt41-ge26c84b0ac #1
    [ 3992.637676] Hardware name: Keystone
    [ 3992.637678] Backtrace:
    [ 3992.637686] [<c0012e00>] (dump_backtrace) from [<c0012ffc>] (show_stack+0x18/0x1c)
    [ 3992.637693] r7:c0049e9c r6:200e0093 r5:00000000 r4:c08fd1c4
    [ 3992.637700] [<c0012fe4>] (show_stack) from [<c02a2a80>] (dump_stack+0x8c/0xa0)
    [ 3992.637709] [<c02a29f4>] (dump_stack) from [<c0023f48>] (warn_slowpath_common+0x88/0xb8)
    [ 3992.637714] r7:c0049e9c r6:00000e3f r5:00000009 r4:00000000
    [ 3992.637721] [<c0023ec0>] (warn_slowpath_common) from [<c002401c>] (warn_slowpath_null+0x24/0x2c)
    [ 3992.637727] r8:c08e4f80 r7:00000001 r6:00000062 r5:de587f80 r4:c08edfa8
    [ 3992.637734] [<c0023ff8>] (warn_slowpath_null) from [<c0049e9c>] (rt_mutex_setprio+0x414/0x418)
    [ 3992.637741] [<c0049a88>] (rt_mutex_setprio) from [<c0069a20>] (__rt_mutex_adjust_prio+0x34/0x50)
    [ 3992.637747] r10:c0373b34 r9:de0b3dd8 r8:00000000 r7:de0b3dd8 r6:c0960a88 r5:c08edfa8
    [ 3992.637750] r4:de0a95c0
    [ 3992.637756] [<c00699ec>] (__rt_mutex_adjust_prio) from [<c006a398>] (task_blocks_on_rt_mutex+0x2c0/0x2dc)
    [ 3992.637762] [<c006a0d8>] (task_blocks_on_rt_mutex) from [<c065b7ac>] (rt_spin_lock_slowlock+0xf8/0x310)
    [ 3992.637768] r10:c0373b34 r9:a00e0013 r8:de0b3dd8 r7:de0a95c0 r6:de0a9afc r5:de0b2000
    [ 3992.637770] r4:c0960a88
    [ 3992.637775] [<c065b6b4>] (rt_spin_lock_slowlock) from [<c065cff0>] (rt_spin_lock+0x5c/0x60)
    [ 3992.637781] r10:c0373b34 r9:c0960a88 r8:00000200 r7:c0373b34 r6:00000000 r5:de590580
    [ 3992.637783] r4:c0960a88
    [ 3992.637794] [<c065cf94>] (rt_spin_lock) from [<c0373b48>] (serial8250_backup_timeout+0x14/0x138)
    [ 3992.637797] r5:de590580 r4:c0960a88
    [ 3992.637807] [<c0373b34>] (serial8250_backup_timeout) from [<c007fa68>] (call_timer_fn+0x30/0xa0)
    [ 3992.637812] r7:c0373b34 r6:00000000 r5:de590580 r4:ffffe000
    [ 3992.637818] [<c007fa38>] (call_timer_fn) from [<c007fc88>] (run_timer_softirq+0x1b0/0x23c)
    [ 3992.637823] r7:00000000 r6:00000000 r5:de590580 r4:c0960ba0
    [ 3992.637828] [<c007fad8>] (run_timer_softirq) from [<c0026e78>] (do_current_softirqs+0x1b8/0x254)
    [ 3992.637834] r10:00000001 r9:de0b3ed0 r8:00000000 r7:04208140 r6:de0b2000 r5:00000004
    [ 3992.637836] r4:c08e22b0
    [ 3992.637840] [<c0026cc0>] (do_current_softirqs) from [<c0027390>] (run_ksoftirqd+0x34/0x64)
    [ 3992.637846] r10:00000000 r9:00000000 r8:ffffe000 r7:c08efeb0 r6:00000001 r5:de035300
    [ 3992.637848] r4:ffffe000
    [ 3992.637855] [<c002735c>] (run_ksoftirqd) from [<c004200c>] (smpboot_thread_fn+0x164/0x2b8)
    [ 3992.637858] r5:de035300 r4:de0b2000
    [ 3992.637863] [<c0041ea8>] (smpboot_thread_fn) from [<c003ec24>] (kthread+0xe4/0xfc)
    [ 3992.637869] r10:00000000 r9:00000000 r8:00000000 r7:c0041ea8 r6:de035300 r5:de035380
    [ 3992.637872] r4:00000000 r3:de0a95c0
    [ 3992.637877] [<c003eb40>] (kthread) from [<c000f9d0>] (ret_from_fork+0x14/0x24)
    [ 3992.637881] r7:00000000 r6:00000000 r5:c003eb40 r4:de035380
    [ 3992.637883] ---[ end trace 0000000000000002 ]---
    [ 3992.847618] ------------[ cut here ]------------

     

    This smells to me like there is some disagreement between Linux and the DSP TI/BIOS code about how shared memory is being defined.

     

  • I am attaching my Code Composer Studio projects and the script I use to start the DSPs from Linux.

    ex02-messageq.zip

  • Hi,

    Sorry for the delayed reply.
    This has been forwarded to the TCI Linux experts. Their feedback will be posted here.

    Best Regards,
    Yordan
  • Any update? This problem continues to impede progress here. We have yet to find any working example of IPC in the Processor SDK 3.02.00.05 with ARM+Linux talking to DSP+BIOS that doesn't eventually reach a state where the board must be rebooted to recover.