RTOS: How to use IPC_Start for multiple applications running on Linux

Other Parts Discussed in Thread: OMAPL138

Tool/software: TI-RTOS

Hi there, I am using the OMAPL138 SoC and I have a few applications that use IPC MessageQ but are independent, in that each has its own main loop.

I have added a call to Ipc_start() before using any IPC APIs, and it seems to work fine. But I run into problems when I try to kill some of the apps...

I get a LAD_failed() error.

Instead of just killing the app from Linux, I would like to call Ipc_stop(). But when I try this I get errors (I will post the errors soon).

My main question: is it okay to call Ipc_stop() to shut down only one app when multiple apps that use IPC are running?

On the DSP core, I call "IpcMgr_ipcStartup" only once, in the main loop, for IPC init, but I am running multiple IPC apps (different TI-RTOS Tasks).

Regards,

Mitesh

  • Hi Mitesh,

    Could you give details about the SW release you are using?
    Attaching the logs would also be useful.

    BR
    Tsvetolin Shulev
  • Hello, on Linux we have two apps running, each with its own IPC_loop() function, shown below.

    The general flow is that each app runs appCreate() and then stays in appExecute() forever. It never reaches appDelete(), and so never reaches Ipc_stop() either.

    Whenever we stop an app from Linux, we currently use the Linux kill command, which just kills the app without calling Ipc_stop(). I would like to call Ipc_stop(). How can we call Ipc_stop() for each of the apps?

    Code::

    Int IPC_loop(
        const String processorName,
        const String bcAddr
    )
    {
        Int status;
        UInt16 remoteProcId;

    #ifdef DEBUG_IPC
        printf("--> main:\n");
    #endif

        /* Ipc initialization */
        status = Ipc_start();
        if ( status != Ipc_S_SUCCESS )
        {
            fprintf(stderr, "<-- Ipc_start failed\n");
            return status;
        }

        /* application create, exec, delete */
        remoteProcId = MultiProc_getId(processorName);

        /* application create phase */
        status = appCreate(remoteProcId, bcAddr);
        if (status != MessageQ_S_SUCCESS)
        {
            fprintf(stderr, "<-- appCreate Failed\n");
            return status;
        }

        /* application execute phase */
        status = appExecute();
        if (status != MessageQ_S_SUCCESS)
        {
            fprintf(stderr, "<-- appExecute Failed\n");
            return status;
        }

        /* application delete phase */
        status = appDelete();
        if (status != MessageQ_S_SUCCESS)
        {
            fprintf(stderr, "<-- appDelete Failed\n");
            return status;
        }

        /* Ipc finalization */
        (void)Ipc_stop();

    #ifdef DEBUG_IPC
        printf("<-- main:\n");
    #endif
        return (status < 0 ? status : 0);
    }

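One way to reach Ipc_stop() when an app is stopped from the shell is to catch SIGTERM (the signal a plain `kill` sends) and let the execute loop exit normally. The following is only a minimal sketch under stated assumptions, not code from this thread: `stop_requested`, `install_stop_handlers`, `run_until_stopped` and `poll_once` are hypothetical names, and `poll_once` merely stands in for one MessageQ_get call with a finite timeout plus the message handling. SIGKILL (`kill -9`) cannot be caught, so it would still bypass any cleanup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set from the signal handler, polled by the application loop. */
static volatile sig_atomic_t stop_requested = 0;

/* Handler: only sets a flag, which is async-signal-safe. */
static void on_stop_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Install the handler for SIGTERM (plain `kill`) and SIGINT (Ctrl-C).
 * Returns 0 on success, -1 on failure. */
static int install_stop_handlers(void)
{
    struct sigaction sa;

    sa.sa_handler = on_stop_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART, so blocking syscalls can return EINTR */

    if (sigaction(SIGTERM, &sa, NULL) != 0) {
        return -1;
    }
    if (sigaction(SIGINT, &sa, NULL) != 0) {
        return -1;
    }
    return 0;
}

/* Hypothetical placeholder for one bounded wait on the message queue
 * plus handling of whatever arrived. Returns <0 on error. */
static int poll_once(void)
{
    return 0;
}

/* Hypothetical execute phase: repeat `step` until a stop is requested
 * or a step reports an error. Returns the last step status. */
static int run_until_stopped(int (*step)(void))
{
    int status = 0;

    assert(step != NULL);
    while (!stop_requested && status >= 0) {
        status = step();
    }
    return status;
}
```

In IPC_loop() this would mean calling install_stop_handlers() after Ipc_start(), having appExecute() return once stop_requested is set, and then falling through to appDelete() and Ipc_stop(). It assumes the execute loop never blocks indefinitely in MessageQ_get.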

  • Hi Mitesh,

    You indicated you are getting errors when attempting Ipc_stop. Can you share what those errors are?

    Generally, Ipc_start and Ipc_stop only need to be called once per core. Can you check what the return value of Ipc_start is for both runs?

    Also, make sure you are calling Ipc_detach before Ipc_stop. See:

  • Hi Sahin, I shall upload the logs asap.

    However, the IPC User Guide (Section 2.2) says to call Ipc_start every time, not just once per core...

  • Hi Sahin, every Ipc_start call returns 0. But if I call Ipc_start() only in the first app started on Linux, the other two apps, which then run without Ipc_start(), fail with a MessageQ create error.
  • Hi Mitesh,

    That's interesting. I would have expected the second Ipc_start() call to return 1 to indicate that it has already been set up. I need to look into this more.

    In any case, I think the proper procedure would be to send a "shutdown" message between the ARM and DSP to let the ARM know that the DSP is closing the connection. Otherwise, the ARM may return a "no socket connection" error or timeout when trying to send a message.
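The "shutdown" message idea can be sketched as a command field in the message plus a small state check on the receiving side. This is only an illustration under assumptions: `AppMsg`, the `APP_CMD_*` values, `AppState` and `handle_msg` are hypothetical names, not taken from this thread, and a real IPC message would need a MessageQ_MsgHeader as its first member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command codes carried in every message. */
enum {
    APP_CMD_NOP      = 0,
    APP_CMD_DATA     = 1,
    APP_CMD_SHUTDOWN = 2
};

/* Hypothetical message layout. In a real MessageQ message the first
 * member would have to be a MessageQ_MsgHeader. */
typedef struct {
    uint32_t cmd;
    uint32_t payload;
} AppMsg;

typedef enum {
    APP_RUNNING,  /* normal operation */
    APP_CLOSING   /* peer announced shutdown, stop sending to it */
} AppState;

/* Process one received message and return the next state.
 * `processed` counts the data messages that were handled. */
static AppState handle_msg(AppState state, const AppMsg *msg,
                           uint32_t *processed)
{
    assert(msg != NULL && processed != NULL);

    if (state == APP_CLOSING) {
        return APP_CLOSING; /* ignore traffic after a shutdown notice */
    }

    switch (msg->cmd) {
    case APP_CMD_SHUTDOWN:
        return APP_CLOSING;
    case APP_CMD_DATA:
        (*processed)++;
        return APP_RUNNING;
    default:
        return APP_RUNNING;
    }
}
```

Once the receiver reaches APP_CLOSING it can leave its loop and run its own teardown, instead of discovering the closed connection through a send error or a timeout.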

    It would be good to see the logs to get more insight into what's happening.
  • Hi Team

    Mitesh and I work together

    I've some more info on Mitesh's (our!) problem, in the hope that it is of some use?

    Thanks in advance - any input much appreciated

    Neil


    Some version specifics ...
    IPC 3.46.00.02
    XDC tools 3.32.01.22
    www.ti.com/.../TMDSLCDK138 lcdk 04.00.00.04

    e2eOne is an app which ...
    - takes a JSON file
    - compares its contents to the current known state
    - sees that a "full reconfigure" is needed
    - knowing that a reset is needed it ...
    > stops e2eTwo, the app which outputs info from our device
    > stops e2eThree, the app which writes some output to disk
    > restarts the DSP
    > starts e2eTwo
    > starts e2eThree
    > sends some info down a MessageQ to the DSP
    > ends

    When run from a fresh startup, the command line test loop shown further down works well

    When run again, we see occasional errors
    recvfrom failed: Link has been severed (67)
    rpmsgThreadFxn: transportGet failed on fd 9, returned -20
    which do not seem to affect our apps

    Does anyone know where those errors are coming from ???



    Command line loop ...

    root@omapl138-lcdk:~# for i in 3 4; do date -u ; /home/root/bin/e2eOne -c/some/path/myFile$i.json; sleep 5; ps aux | grep e2e | grep -v grep; sleep 5; done
    Wed Aug 8 15:09:49 UTC 2018
    echo from e2eOne : have /some/path/myFile3.json (stopping e2eTwo and e2eThree, restarting the DSP, restarting e2eTwo & e2eThree)
    recvfrom failed: Link has been severed (67)
    rpmsgThreadFxn: transportGet failed on fd 10, returned -20
    <-- e2eTwo: MessageQ_get Failed
    <-- e2eTwo: appExecute Failed
    recvfrom failed: Link has been severed (67)
    rpmsgThreadFxn: transportGet failed on fd 9, returned -20
    <-- e2eThree: MessageQ_get Failed
    <-- e2eThree: appExecute Failed
    root 968 0.1 2.1 12676 2704 pts/2 Sl 15:09 0:00 /home/root/bin/e2eTwo argTwoA argTwoB
    root 969 0.1 2.1 12696 2660 pts/2 Sl 15:09 0:00 /home/root/bin/e2eThree argThree
    Wed Aug 8 15:11:00 UTC 2018
    echo from e2eOne : have /some/path/myFile4.json (stopping e2eTwo and e2eThree, restarting the DSP, restarting e2eTwo & e2eThree)
    root 1003 0.1 2.1 12676 2704 pts/2 Sl 15:11 0:00 /home/root/bin/F280/e2eTwo argTwoA argTwoB
    root 1004 0.1 2.1 12696 2660 pts/2 Sl 15:11 0:00 /home/root/bin/F280/e2eThree argThree


    /var/volatile/log/syslog ...

    Aug 8 15:09:49 omapl138-lcdk kernel: remoteproc remoteproc0: stopped remote processor dsp
    Aug 8 15:09:49 omapl138-lcdk kernel: remoteproc remoteproc0: releasing dsp
    Aug 8 15:09:51 omapl138-lcdk kernel: davinci-rproc davinci-rproc.0: assigned reserved memory node dsp_cma@c3000000
    Aug 8 15:09:51 omapl138-lcdk kernel: remoteproc remoteproc0: dsp is available
    Aug 8 15:09:51 omapl138-lcdk kernel: remoteproc remoteproc0: powering up dsp
    Aug 8 15:09:51 omapl138-lcdk kernel: remoteproc remoteproc0: Booting fw image rproc-dsp-fw, size 5803392
    Aug 8 15:09:52 omapl138-lcdk kernel: virtio_rpmsg_bus virtio0: rpmsg host is online
    Aug 8 15:09:52 omapl138-lcdk kernel: remoteproc remoteproc0: registered virtio0 (type 7)
    Aug 8 15:09:52 omapl138-lcdk kernel: remoteproc remoteproc0: remote processor dsp is now up
    Aug 8 15:09:52 omapl138-lcdk kernel: virtio_rpmsg_bus virtio0: creating channel rpmsg-proto addr 0x3d
    Aug 8 15:09:52 omapl138-lcdk kernel: virtio_rpmsg_bus virtio0: msg received with no recipient
    Aug 8 15:11:01 omapl138-lcdk kernel: remoteproc remoteproc0: stopped remote processor dsp
    Aug 8 15:11:01 omapl138-lcdk kernel: remoteproc remoteproc0: releasing dsp
    Aug 8 15:11:03 omapl138-lcdk kernel: davinci-rproc davinci-rproc.0: assigned reserved memory node dsp_cma@c3000000
    Aug 8 15:11:03 omapl138-lcdk kernel: remoteproc remoteproc0: dsp is available
    Aug 8 15:11:03 omapl138-lcdk kernel: remoteproc remoteproc0: powering up dsp
    Aug 8 15:11:03 omapl138-lcdk kernel: remoteproc remoteproc0: Booting fw image rproc-dsp-fw, size 5803392
    Aug 8 15:11:03 omapl138-lcdk kernel: virtio_rpmsg_bus virtio0: rpmsg host is online
    Aug 8 15:11:03 omapl138-lcdk kernel: remoteproc remoteproc0: registered virtio0 (type 7)
    Aug 8 15:11:03 omapl138-lcdk kernel: remoteproc remoteproc0: remote processor dsp is now up
    Aug 8 15:11:04 omapl138-lcdk kernel: virtio_rpmsg_bus virtio0: creating channel rpmsg-proto addr 0x3d
    Aug 8 15:11:04 omapl138-lcdk kernel: virtio_rpmsg_bus virtio0: msg received with no recipient




    Another problem we're seeing is with thttpd logging ... it can take up MASSES of space to report very little ...

    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: map cache - 0 allocated, 0 active (0 bytes), 0 free; hash size: 0; expire age: 1800
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: fdwatch - 0 polls (0/sec)
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: timers - 3 allocated, 3 active, 0 free
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: up 5441679 seconds, stats for 1 seconds:
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: thttpd - 0 connections (0/sec), 0 max simultaneous, 0 bytes (0/sec), 0 httpd_conns allocated
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: map cache - 0 allocated, 0 active (0 bytes), 0 free; hash size: 0; expire age: 1800
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: fdwatch - 1 polls (1/sec)
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: timers - 3 allocated, 3 active, 0 free
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: up 5441679 seconds, stats for 1 seconds:
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: thttpd - 0 connections (0/sec), 0 max simultaneous, 0 bytes (0/sec), 0 httpd_conns allocated
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: map cache - 0 allocated, 0 active (0 bytes), 0 free; hash size: 0; expire age: 1800
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: fdwatch - 1 polls (1/sec)
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: timers - 3 allocated, 3 active, 0 free
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: map cache - 0 allocated, 0 active (0 bytes), 0 free; hash size: 0; expire age: 1800
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: fdwatch - 1 polls (1/sec)
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: timers - 3 allocated, 3 active, 0 free
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: up 5441679 seconds, stats for 1 seconds:
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: thttpd - 0 connections (0/sec), 0 max simultaneous, 0 bytes (0/sec), 0 httpd_conns allocated
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: map cache - 0 allocated, 0 active (0 bytes), 0 free; hash size: 0; expire age: 1800
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: fdwatch - 1 polls (1/sec)
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: timers - 3 allocated, 3 active, 0 free
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: up 5441679 seconds, stats for 1 seconds:
    Aug 8 14:39:23 omapl138-lcdk thttpd[275]: thttpd - 0 connections (0/sec), 0 max simultaneous, 0 bytes (0/sec), 0 httpd_conns allocated


    Does anyone know how to make such logging go away and stay away?

    Anyone understand those "up 5441679 seconds" stats? The box has certainly not been up that long



    The broader question of when one calls Ipc_start() and Ipc_stop() is something I'd love some input on.
    Each of the 3 apps has its OWN start/stop pair. Sometimes all 3 apps run simultaneously, but usually
    it's just e2eTwo and e2eThree that run together (sounds like Dr Seuss!). Note that every call to Ipc_start
    is grabbing and checking the return value ...
    status = Ipc_start();
    if ( status != Ipc_S_SUCCESS )
    {
    fprintf(stderr, " <-- Ipc_start failed\n");
    return status;
    }
    This has never failed. Or certainly this does not fail if I run the loop many hundreds of times - I can't recall it ever failing

    Ipc_S_ALREADYSETUP is something I thought we might see here. We don't
  • This case has been marked "TI Thinks Resolved"

    Surely TI doesn't think that?

    Thanks in advance

    Neil

  • Please note that Mitesh can no longer contribute to this conversation, or any other, since some admin error seems to have locked him out of e2e (see below) ... could that be addressed, please ???

    Thanks in advance,

    Neil

  • Hi Neil,

    I'm sorry for the delayed response. I'm still looking into this.

    This error...

    Neil Gatenby said:

    recvfrom failed: Link has been severed (67)
    rpmsgThreadFxn: transportGet failed on fd 9, returned -20

    is Rpmsg complaining about a socket failure due to the DSP being reloaded. See: http://processors.wiki.ti.com/index.php/IPC_Slave_Error_Recovery#Host_Application_Recovery
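On the host side, recovering from a DSP reload generally means tearing the IPC session down and starting it again once the DSP is back up. The sketch below shows only that retry structure and makes no claim about the exact sequence the linked page prescribes: `Session`, `run_with_recovery` and the `demo_*` functions are hypothetical. In a real app, `start` might wrap Ipc_start() and appCreate(), `run` might wrap appExecute(), and `stop` might wrap appDelete() and Ipc_stop().

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical bundle of the three phases of one IPC session. */
typedef struct {
    int  (*start)(void); /* returns <0 on failure */
    int  (*run)(void);   /* returns <0 when the link is lost */
    void (*stop)(void);  /* always called after a successful start */
} Session;

/* Run the session, restarting it up to `max_restarts` times when
 * start or run fails. Waits `delay_s` seconds between attempts.
 * Returns the status of the last attempt. */
static int run_with_recovery(const Session *s, int max_restarts,
                             unsigned delay_s)
{
    int attempt;
    int status = -1;

    assert(s != NULL && s->start != NULL && s->run != NULL && s->stop != NULL);

    for (attempt = 0; attempt <= max_restarts; attempt++) {
        status = s->start();
        if (status >= 0) {
            status = s->run();
            s->stop(); /* tear down before any retry */
            if (status >= 0) {
                return status;
            }
        }
        if (attempt < max_restarts) {
            sleep(delay_s); /* give the remote core time to come back */
        }
    }
    return status;
}

/* Demo stand-ins: the first run fails as if the DSP had been
 * reloaded, the second run succeeds. */
static int demo_runs = 0;
static int demo_stops = 0;
static int  demo_start(void) { return 0; }
static int  demo_run(void)   { return (++demo_runs == 1) ? -20 : 0; }
static void demo_stop(void)  { demo_stops++; }
```

With the demo stand-ins, `run_with_recovery(&s, 3, 0)` makes two attempts: the first ends in the simulated link failure, the second completes normally, and `stop` runs after each.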

    For some reason, I received the same error as Mitesh after viewing this post. I was able to fix it by clearing my browser cache. Have him try that, or another browser if it still doesn't work.

    By the way, you can click "No" on the highlighted post to mark this thread as not resolved.

  • Thanks Sahin

    A very useful reply ... I'll follow up on those leads

    N