This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Getting DSP IPC Running on AM571x in Linux

I'm struggling with getting IPC working with the DSP. It seems like I'm on the right track, but I feel that I've hit sort of a roadblock with respect to finding more detailed documentation for troubleshooting. I am using SDK 02.00.02.11 and the unmodified DSP firmware and IPC libraries. Below is what I have been able to validate so far:

At startup, the DSP seems to be coming up and its firmware is loaded.

# dmesg | grep dsp
[    0.000000] Reserved memory: initialized node dsp1_cma@99000000, compatible id shared-dma-pool
[   20.689664] omap-rproc 40800000.dsp: assigned reserved memory node dsp1_cma@99000000
[   20.689692]  remoteproc2: 40800000.dsp is available
[   22.837667]  remoteproc2: powering up 40800000.dsp
[   22.837680]  remoteproc2: Booting fw image dra7-dsp1-fw.xe66, size 21365636
[   22.843174] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
[   22.943430]  remoteproc2: remote processor 40800000.dsp is now up

The message: "_wait_target_disable failed" is a little disconcerting; but the last line shows the DSP is up, so perhaps it is not an issue...

I have created CMA memory pools for the two IPU's and the DSP per the am57xx-idk DTS example:

# dmesg | grep CMA
[    0.000000] Reserved memory: created CMA memory pool at 0x0000000095800000, size 56 MiB
[    0.000000] Reserved memory: created CMA memory pool at 0x0000000099000000, size 64 MiB
[    0.000000] Reserved memory: created CMA memory pool at 0x000000009d000000, size 32 MiB
# dmesg | grep cma
[    0.000000] Reserved memory: initialized node ipu2_cma@95800000, compatible id shared-dma-pool
[    0.000000] Reserved memory: initialized node dsp1_cma@99000000, compatible id shared-dma-pool
[    0.000000] Reserved memory: initialized node ipu1_cma@9d000000, compatible id shared-dma-pool
[    0.000000] cma: Reserved 128 MiB at 0x00000000b7c00000
[    0.000000] Memory: 543704K/848896K available (6251K kernel code, 294K rwdata, 2212K rodata, 320K init, 254K bss, 18472K reserved, 286720K cma-reserved, 136192K highmem)
[   20.688756] omap-rproc 58820000.ipu: assigned reserved memory node ipu1_cma@9d000000
[   20.689495] omap-rproc 55020000.ipu: assigned reserved memory node ipu2_cma@95800000
[   20.689664] omap-rproc 40800000.dsp: assigned reserved memory node dsp1_cma@99000000

I am also making sure that the CMEM kernel module is loaded with:

# dmesg | grep cmem
[ 1661.071894] cmemk initialized

And am succeeding in creating a contiguous memory partitions for use with the DSP.
An 80MB CMEM partition is created per below:

# cat /proc/iomem | grep CMEM
a0000000-a4ffffff : CMEM

And it looks like that memory is available for use:

# cat /proc/cmem
Block 0: Pool 0: 1 bufs size 0x5000000 (0x5000000 requested)
Pool 0 busy bufs:
Pool 0 free bufs:
id 0: phys addr 0xa0000000

Below is a log trace from the LAD log after starting the LAD daemon with:

# lad_dra7xx -l lad.txt -p 777
Opened log file: lad.txt
Set LAD's directory permissions to '777'
numProcessors = 5 id = 0 baseId = 0
Spawned daemon: lad_dra7xx

Tail the LAD log:

# tail -f /tmp/LAD/lad.txt
[0.802920] 
Initializing LAD... [0.804143] 
    opening FIFO: /tmp/LAD/LADCMDS

And run one of the opencl examples such as:

# /usr/share/ti/examples/opencl/platforms/platforms

At this point the example program hangs until ^C, and in another terminal I have the following LAD log output. The -5 status code output is printed many times prior to the disconnect at the end of the trace.

Expand the source section below to view the LAD log trace:

[0.693855] 
Initializing LAD... [0.696416] NameServer_setup: entered, refCount=0
[0.696471] NameServer_setup: creating listener thread
[0.696543] NameServer_setup: exiting, refCount=1
[0.696629] NameServer_create(): 'GateMP'
[0.696647] 
    opening FIFO: /tmp/LAD/LADCMDS
[0.698442] listener_cb: Entered Listener thread.
[0.698462] NameServer: waiting for unblockFd: 0, and socks: maxfd: 0
[17015278.660230] Retrieving command...
[17015278.660586] 
LAD_CONNECT: 
[17015278.660600]     client FIFO name = /tmp/LAD/1585
[17015278.660607]     client PID = 1585
[17015278.660614]     assigned client handle = 0
[17015278.660873]     FIFO /tmp/LAD/1585 created
[17015278.661349]     FIFO /tmp/LAD/1585 opened for writing
[17015278.661381]     sent response
[17015278.661389] DONE
[17015278.661396] Retrieving command...
[17015278.661464] Sending response...
[17015278.661477] Retrieving command...
[17015278.661510] LAD_MULTIPROC_GETCONFIG: calling MultiProc_getConfig()...
[17015278.661522] MultiProc_getConfig() - 5 procs
[17015278.661530] # processors in cluster: 5
[17015278.661536] cluster baseId: 0
[17015278.661543]     ProcId 0 - "HOST"
[17015278.661550]     ProcId 1 - "IPU2"
[17015278.661557]     ProcId 2 - "IPU1"
[17015278.661563]     ProcId 3 - "DSP2"
[17015278.661570]     ProcId 4 - "DSP1"
[17015278.661577]     status = 0
[17015278.661583] DONE
[17015278.661589] Sending response...
[17015278.661598] Retrieving command...
[17015278.661637] LAD_NAMESERVER_SETUP: calling NameServer_setup()...
[17015278.661647] NameServer_setup: entered, refCount=1
[17015278.661654] NameServer_setup: already setup
[17015278.661660] NameServer_setup: exiting, refCount=2
[17015278.661666]     status = 1
[17015278.661673] DONE
[17015278.661679] Sending response...
[17015278.661688] Retrieving command...
[17015278.661716] LAD_MESSAGEQ_GETCONFIG: calling MessageQ_getConfig()...
[17015278.661725]     status = 0
[17015278.661732] DONE
[17015278.661738] Sending response...
[17015278.661746] Retrieving command...
[17015278.661773] LAD_MESSAGEQ_SETUP: calling MessageQ_setup()...
[17015278.661781] MessageQ_setup: entered, refCount=0
[17015278.661789] NameServer_create(): 'MessageQ'
[17015278.661800] MessageQ_setup: exiting, refCount=1
[17015278.661807]     status = 0
[17015278.661813] DONE
[17015278.661819] Sending response...
[17015278.661828] Retrieving command...
[17015278.661977] NameServer_attach: --> procId=1, refCount=0
[17015278.677036] NameServer_attach: created send socket: 6, procId 1
[17015278.677084] NameServer_attach: connect failed: procId=1, errno=22 (Invalid argument)
[17015278.677108]     closing send socket: 6
[17015278.677128] NameServer_attach: <-- refCount=0, status=-1
[17015278.677137] Sending response...
[17015278.677154] Retrieving command...
[17015278.677204] NameServer_attach: --> procId=2, refCount=0
[17015278.677220] NameServer_attach: created send socket: 6, procId 2
[17015278.677230] NameServer_attach: connect failed: procId=2, errno=22 (Invalid argument)
[17015278.677240]     closing send socket: 6
[17015278.677255] NameServer_attach: <-- refCount=0, status=-1
[17015278.677263] Sending response...
[17015278.677275] Retrieving command...
[17015278.677303] NameServer_attach: --> procId=3, refCount=0
[17015278.677317] NameServer_attach: created send socket: 6, procId 3
[17015278.677326] NameServer_attach: connect failed: procId=3, errno=22 (Invalid argument)
[17015278.677335]     closing send socket: 6
[17015278.677348] NameServer_attach: <-- refCount=0, status=-1
[17015278.677355] Sending response...
[17015278.677367] Retrieving command...
[17015278.677393] NameServer_attach: --> procId=4, refCount=0
[17015278.677407] NameServer_attach: created send socket: 6, procId 4
[17015278.677416] NameServer_attach: connect failed: procId=4, errno=22 (Invalid argument)
[17015278.677425]     closing send socket: 6
[17015278.677436] NameServer_attach: <-- refCount=0, status=-1
[17015278.677444] Sending response...
[17015278.677456] Retrieving command...
[17015278.677492] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
[17015278.677501]     status = 0
[17015278.677508] DONE
[17015278.677527] Sending response...
[17015278.677539] Retrieving command...
[17015278.677571] LAD_GATEHWSPINLOCK_GETCONFIG: calling GateHWSpinlock_getConfig()...
[17015278.677580] GateHWSpinlock_getConfig()    baseAddr = 0x4a0f6000
[17015278.677587]     size = 0x1000
    offset = 0x800
[17015278.677594]     status = 0
[17015278.677601] DONE
[17015278.677607] Sending response...
[17015278.677617] Retrieving command...
[17015278.677734] LAD_GATEMP_START: calling GateMP_start()...
[17015278.677746]     status = 0
[17015278.677753] DONE
[17015278.677759] Sending response...
[17015278.677770] Retrieving command...
[17015278.677796] LAD_GATEMP_GETNUMRESOURCES: calling GateMP_getNumResources()...
[17015278.677805]     status = 0
[17015278.677812] DONE
[17015278.677817] Sending response...
[17015278.677828] Retrieving command...
[17015278.677865] LAD_NAMESERVER_GET: calling NameServer_get(0x2f230, '_GateMP_TI_dGate'[17015278.677875] )...
[17015278.677885] NameServer_getLocal: entry key: '_GateMP_TI_dGate' not found!
[17015278.677893] NameServer_getRemote: no socket connection to processor 1
[17015278.677901] NameServer_getRemote: no socket connection to processor 2
[17015278.677908] NameServer_getRemote: no socket connection to processor 3
[17015278.677914] NameServer_getRemote: no socket connection to processor 4
[17015278.677921]     value = 0x10
[17015278.677927]     status = -5
[17015278.677934] DONE
[17015278.677939] Sending response...
[17015278.677950] Retrieving command...
[17015278.678171] LAD_MESSAGEQ_CREATE: calling MessageQ_create(0x2ead8, 0x2eaf8)...
[17015278.678185] MessageQ_create: creating 'OCL:MsgQ:1585'
[17015278.678198] NameServer_add: Entered key: 'OCL:MsgQ:1585', data: 0x80
[17015278.678206] MessageQ_create: returning obj=0x2f658, qid=0x80
[17015278.678214]     status = 0
[17015278.678220] DONE
[17015278.678226] Sending response...
[17015278.678239] Retrieving command...
[17015278.678285] LAD_NAMESERVER_GETUINT32: calling NameServer_getUInt32(0x2f570, 'OCL:DSP1:MsgQ')...
[17015278.678296] NameServer_getLocal: entry key: 'OCL:DSP1:MsgQ' not found!
[17015278.678304] NameServer_getRemote: no socket connection to processor 1
[17015278.678346] NameServer_getRemote: no socket connection to processor 2
[17015278.678353] NameServer_getRemote: no socket connection to processor 3
[17015278.678360] NameServer_getRemote: no socket connection to processor 4
[17015278.678366]     value = 0x80
[17015278.678372]     status = -5
[17015278.678378] DONE

...

[17243489.757855] Sending response...
[17243489.760522] Retrieving command...
[17243489.760624] LAD_MESSAGEQ_DESTROY: calling MessageQ_destroy()...
[17243489.760634] MessageQ_destroy: entered, refCount=1
[17243489.760643] MessageQ_delete: deleting 0x2f658
[17243489.760660] MessageQ_delete: returning 0
[17243489.760673] MessageQ_destroy: exiting, refCount=0
[17243489.760680]     status = 0
[17243489.760687] DONE
[17243489.760694] Sending response...
[17243489.760704] Retrieving command...
[17243489.763787]    EOF detected on FIFO, closing FIFO: /tmp/LAD/LADCMDS
[17243489.763839] 
    opening FIFO: /tmp/LAD/LADCMDS

I found the following IPC troubleshooting FAQ:

processors.wiki.ti.com/.../IPC_3.x_FAQ
This is because LAD is looking for the default GateMP instance, and did not find it. On the DRA7xx, verify

  1. SharedRegion 0 is defined on DSP1. GateMP implementation on the host requires use of SharedRegion
  2. DSP1 is the owner of SharedRegion 0
  3. Ipc_start() is called on DSP1. This ensures the default GateMP instance is created.


At the end of the log I am getting an error that is close to the one shown in the link above, but I don’t know how to proceed with the verification steps outlined in the FAQ.


These initial errors appear to be more like:
processors.wiki.ti.com/.../IPC_3.x_FAQ
Internally, LAD tries to connect to sockets created by the rpmsg_proto kernel driver. This connection can fail for several reasons, so this is a common "general" error code. Some examples that generate this error condition are below.

  • Failure to load the slave image. If the slave executable fails to load, the sockets to communicate with it are not created. Make sure the slaves you need to communicate with are already loaded.
  • Failure to provide the correct KERNEL_INSTALL_DIR when building the user-side IPC libraries. Some customers reported this error when using Linux 3.9+ kernels (where the AF_MAX value in socket.h has increased). This created a mismatch in what AF_RPMSG was set to in user libs and the kernel, and the connect call failed. The solution was to require users set KERNEL_INSTALL_DIR when building the user-space libraries so IPC can interrogate the kernel version and set AF_RPMSG appropriately.
  • Failure to configure the slave image correctly. Internally, the slave-side MessageQ transport (TransportRpmsg) broadcasts its availability to the Linux host. If this broadcast doesn't occur (e.g., b/c the slave wasn't configured with TransportRpmsg), the socket connection will fail with error 22.
  • General slave-to-host interrupt failure. Some devices (e.g. DRA7XX) have interrupt crossbars that must be configured correctly as part of system boot (e.g. uboot). If these crossbars aren't initialized correctly, the slave-side broadcast that TransportRpmsg is available won't be received, the underlying socket connection won't be available, and the first sign of failure will be LAD trying to connect to the socket.


Any help or advice that I could get to test the following functionality of the DSP would be much appreciated, e.g.:

  • If the slave executable fails to load, the sockets to communicate with it are not created,
    • How can I determine if the slave executable is loaded?
  • How can I verify that there is no "mismatch in what AF_RPMSG was set to in user libs and the kernel"?
  • How can I ensure the slave is configured with: TransportRpmsg?
  • And lastly, this seems to be the most plausible issue: how can I ensure that the interrupt crossbar is configured correctly?


Thanks,
Weston

  • Try setting an environment variable IPC_DEBUG=1 before you run. You can set it to 2 for even more details, but let's see 1 first. This info came from here:

    processors.wiki.ti.com/.../IPC_DEBUG
  • Hi Brad, Thanks for the response.

    The output I get from running the platforms example when setting IPC_DEBUG to 1 (using: export IPC_DEBUG=1, from the same shell) is:

    ./platforms 
    MultiProc_getConfig: got LAD response for client 0, status=0
    NameServer_setup: got LAD response for client 0, status=1
    MessageQ_getConfig: got LAD response for client 0, status=0
    MessageQ_setup: LAD response for client 0, status=0
    NameServer_attach: LAD response, status=-1
    NameServer_attach: LAD response, status=-1
    NameServer_attach: LAD response, status=-1
    NameServer_attach: LAD response, status=-1
    GateMP_isSetup: got LAD response for client 0, status=0
    GateHWSpinlock_getConfig: got LAD response for client 0, status=0
    GateMP_start: got LAD response for client 0, status=0
    GateMP_getNumResources: got LAD response for client 0, status=0
    NameServer_get: got LAD response for client 0
    GateMP_start: could not open default gate,                 status=-5
    GateMP_stop: entered
    Ipc_start: GateMP_start failed: not found -5
    MessageQ_create: got LAD response for client 0, status=0
    MessageQ_create: creating endpoints for 'OCL:MsgQ:1580' queueIndex 0
    NameServer_getUInt32: got LAD response for client 0
    NameServer_getUInt32: got LAD response for client 0
    NameServer_getUInt32: got LAD response for client 0
    ... (many hundreds of repetitions of the line above) 
    NameServer_getUInt32: got LAD response for client 0
    NameServer_getUInt32: got LAD response for client 0
    NameServer_getUInt32: got LAD response for client 0
    Ipc: Caught SIGINT, calling Ipc_stop...
    ^CMessageQ_destroy: got LAD response for client 0, status=-5
    Ipc_stop: MessageQ_destroy() failed: -5
    

    This is more or less the same information that is in the lad.txt log in the original post (collapsed source section).

    The first issue I see is the NameServer_attach response status, then I see "GateMP_start: could not open default gate, status=-5"; but in the initial portion of the trace it appears that GateMP is started successfully (below is from the more detailed lad.txt log in the original post):

    [0.696629] NameServer_create(): 'GateMP'
    ...
    [17015278.677492] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
    [17015278.677501]     status = 0
    ...
    [17015278.677734] LAD_GATEMP_START: calling GateMP_start()...
    [17015278.677746]     status = 0
    ...
    [17015278.677796] LAD_GATEMP_GETNUMRESOURCES: calling GateMP_getNumResources()...
    [17015278.677805]     status = 0

    An issue appears to be that the socket creation or connection for processor ID 4 (DSP1) is failing:

    [17015278.677393] NameServer_attach: --> procId=4, refCount=0
    [17015278.677407] NameServer_attach: created send socket: 6, procId 4
    [17015278.677416] NameServer_attach: connect failed: procId=4, errno=22 (Invalid argument)
    [17015278.677425]     closing send socket: 6
    [17015278.677436] NameServer_attach: <-- refCount=0, status=-1

    Then on subsequent connections to the processor we see failures:

    [17015278.677865] LAD_NAMESERVER_GET: calling NameServer_get(0x2f230, '_GateMP_TI_dGate'[17015278.677875] )...
    [17015278.677885] NameServer_getLocal: entry key: '_GateMP_TI_dGate' not found!
    [17015278.677893] NameServer_getRemote: no socket connection to processor 1
    [17015278.677901] NameServer_getRemote: no socket connection to processor 2
    [17015278.677908] NameServer_getRemote: no socket connection to processor 3
    [17015278.677914] NameServer_getRemote: no socket connection to processor 4
    [17015278.677921]     value = 0x10
    [17015278.677927]     status = -5
    [17015278.677934] DONE

    Something in the LAD log that doesn't seem quite right:

    [17015278.661510] LAD_MULTIPROC_GETCONFIG: calling MultiProc_getConfig()...
    [17015278.661522] MultiProc_getConfig() - 5 procs
    [17015278.661530] # processors in cluster: 5
    [17015278.661536] cluster baseId: 0
    [17015278.661543]     ProcId 0 - "HOST"
    [17015278.661550]     ProcId 1 - "IPU2"
    [17015278.661557]     ProcId 2 - "IPU1"
    [17015278.661563]     ProcId 3 - "DSP2"
    [17015278.661570]     ProcId 4 - "DSP1"

    The AM571x only has a single DSP, so the MultiProc_getConfig() call is not returning correct information about the platform. How does this API call get the platform information (oh, I see, from: MultiProcCfg_dra7xx.c in ./linux-devkit/sysroots/cortexa15hf-vfp-neon-linux-gnueabi/usr/share/ti/ti-ipc-tree/linux/src/cfg)? Is this information fixed in the LAD or the resources that the LAD is using (MultiProc module) at build time (it appears fixed), or configurable at run-time (not as far as I can tell, although the literature would indicate otherwise with: "Your application can query the MultiProc table using various runtime APIs." ... But this may only be available with BIOS)? In which case, do I need to rebuild/configure some things to re-target for the AM571x (it would appear so unless it is valid for the configuration to indicate there is an extra DSP when there is not)? I would imagine that the IPC for the dual core ARM and DSP SoC would be configured significantly differently than for the single core version.

    Also, I have read through this documentation:

    http://processors.wiki.ti.com/index.php/IPC_Users_Guide/MultiProc_Module

    But it does not seem applicable to the Linux environment. Is there a counterpart to this documentation for Linux? Or is it unnecessary to rebuild/reconfigure the IPC as is sort of indicated here:

    http://processors.wiki.ti.com/index.php/IPC_Install_Guide_Linux

    PS, below in the collapsed section, I've included the debug output for IPC_DEBUG=2:

    ./platforms 
    
    LAD_connect: PID = 1806, fifoName = /tmp/LAD/1806
    
    LAD_putCommand: cmd = 0
    LAD_putCommand: status = 0
    
    LAD_connect: got response
        status == LAD_SUCCESS, assignedId=0
    
    LAD_putCommand: cmd = 2
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    
    LAD_putCommand: cmd = 24
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    MultiProc_getConfig: got LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 3
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_setup: got LAD response for client 0, status=1
    
    LAD_putCommand: cmd = 18
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    MessageQ_getConfig: got LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 19
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    MessageQ_setup: LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 16
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_attach: LAD response, status=-1
    
    LAD_putCommand: cmd = 16
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_attach: LAD response, status=-1
    
    LAD_putCommand: cmd = 16
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_attach: LAD response, status=-1
    
    LAD_putCommand: cmd = 16
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_attach: LAD response, status=-1
    
    LAD_putCommand: cmd = 31
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    GateMP_isSetup: got LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 32
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    GateHWSpinlock_getConfig: got LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 27
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    GateMP_start: got LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 28
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    GateMP_getNumResources: got LAD response for client 0, status=0
    
    LAD_putCommand: cmd = 9
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_get: got LAD response for client 0
    GateMP_start: could not open default gate,                 status=-5
    GateMP_stop: entered
    Ipc_start: GateMP_start failed: not found -5
    
    LAD_putCommand: cmd = 21
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    MessageQ_create: got LAD response for client 0, status=0
    MessageQ_create: creating endpoints for 'OCL:MsgQ:1806' queueIndex 0
    
    LAD_putCommand: cmd = 11
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_getUInt32: got LAD response for client 0
    
    LAD_putCommand: cmd = 11
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_getUInt32: got LAD response for client 0
    
    LAD_putCommand: cmd = 11
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_getUInt32: got LAD response for client 0
    
    ... (many hundreds of repetitions of the lines above)
    
    LAD_putCommand: cmd = 11
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_getUInt32: got LAD response for client 0
    
    LAD_putCommand: cmd = 11
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    NameServer_getUInt32: got LAD response for client 0
    
    LAD_putCommand: cmd = 11
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    ^CIpc: Caught SIGINT, calling Ipc_stop...
    
    LAD_putCommand: cmd = 20
    LAD_putCommand: status = 0
    LAD_getResponse: client = 0
    LAD_getResponse: got response
    MessageQ_destroy: got LAD response for client 0, status=-5
    Ipc_stop: MessageQ_destroy() failed: -5

    Post postscript:

    Excerpting the location of the failure from the log trace in source file:
    ./linux-devkit/sysroots/cortexa15hf-vfp-neon-linux-gnueabi/usr/share/ti/ti-ipc-tree/linux/src/daemon/NameServer_daemon.c

        /* create socket for sending messages to remote processor */
        sock = socket(AF_RPMSG, SOCK_SEQPACKET, 0);
        if (sock < 0) {
            status = NameServer_E_FAIL;
            LOG2("NameServer_attach: socket failed: %d, %s\n", errno,
                    strerror(errno));
            goto done;
        }
        NameServer_module->comm[clId].sendSock = sock;
        LOG2("NameServer_attach: created send socket: %d, procId %d\n", sock,
                procId);
    
        err = ConnectSocket(sock, procId, MESSAGEQ_RPMSG_PORT);
        if (err < 0) {
            status = NameServer_E_FAIL;
            LOG3("NameServer_attach: connect failed: procId=%d, errno=%d (%s)\n",
                    procId, errno, strerror(errno));
            goto done;
        }

    As is evident due to the trace:

    [17015278.677393] NameServer_attach: --> procId=4, refCount=0
    [17015278.677407] NameServer_attach: created send socket: 6, procId 4

    The socket is created successfully. But ConnectSocket fails. What can cause ConnectSocket to fail?

    ConnectSocket in:
    ./linux-devkit/sysroots/cortexa15hf-vfp-neon-linux-gnueabi/usr/share/ti/ti-ipc-tree/linux/src/utils/SocketFxns.c

        err = connect(sock, (struct sockaddr *)&dstAddr, len);
        if (err < 0) {
            /* don't hard-printf since this is no longer fatal */
            PRINTVERBOSE2("connect failed: %s (%d)\n", strerror(errno), errno);
            return (-1);
        }

    EVM output from the platforms example, and from the LAD (for reference):

    # /usr/share/ti/examples/opencl/platforms/platforms
    PLATFORM: TI AM57x
      Version: OpenCL 1.1 TI product version 01.01.08.01 (Apr  5 2016 21:35:38)
      Vendor : Texas Instruments, Inc.
      Profile: FULL_PROFILE
        DEVICE: TI Multicore C66 DSP
          Type       : ACCELERATOR
          CompUnits  : 2
          Frequency  : 0.75 GHz
          Glb Mem    :  163840 KB
          GlbExt1 Mem:       0 KB
          GlbExt2 Mem:       0 KB
          Msmc Mem   :    1024 KB
          Loc Mem    :     128 KB
          Max Alloc  :  147456 KB
    
    
    # tail -f /tmp/LAD/lad.txt 
    [0.717181] 
    Initializing LAD... [0.717956] NameServer_setup: entered, refCount=0
    [0.717978] NameServer_setup: creating listener thread
    [0.718064] NameServer_setup: exiting, refCount=1
    [0.718153] NameServer_create(): 'GateMP'
    [0.718171] 
        opening FIFO: /tmp/LAD/LADCMDS
    [0.718261] listener_cb: Entered Listener thread.
    [0.718272] NameServer: waiting for unblockFd: 0, and socks: maxfd: 0
    [189.235834] Retrieving command...
    [189.236065] 
    LAD_CONNECT: 
    [189.236077]     client FIFO name = /tmp/LAD/1526
    [189.236086]     client PID = 1526
    [189.236095]     assigned client handle = 0
    [189.236149]     FIFO /tmp/LAD/1526 created
    [189.236285]     FIFO /tmp/LAD/1526 opened for writing
    [189.236321]     sent response
    [189.236330] DONE
    [189.236337] Retrieving command...
    [189.236422] Sending response...
    [189.236438] Retrieving command...
    [189.236485] LAD_MULTIPROC_GETCONFIG: calling MultiProc_getConfig()...
    [189.236499] MultiProc_getConfig() - 5 procs
    [189.236507] # processors in cluster: 5
    [189.236515] cluster baseId: 0
    [189.236522] 	ProcId 0 - "HOST"
    [189.236530] 	ProcId 1 - "IPU2"
    [189.236538] 	ProcId 2 - "IPU1"
    [189.236545] 	ProcId 3 - "DSP2"
    [189.236552] 	ProcId 4 - "DSP1"
    [189.236560]     status = 0
    [189.236567] DONE
    [189.236574] Sending response...
    [189.236585] Retrieving command...
    [189.236640] LAD_NAMESERVER_SETUP: calling NameServer_setup()...
    [189.236651] NameServer_setup: entered, refCount=1
    [189.236658] NameServer_setup: already setup
    [189.236665] NameServer_setup: exiting, refCount=2
    [189.236673]     status = 1
    [189.236680] DONE
    [189.236686] Sending response...
    [189.236698] Retrieving command...
    [189.236737] LAD_MESSAGEQ_GETCONFIG: calling MessageQ_getConfig()...
    [189.236746]     status = 0
    [189.236754] DONE
    [189.236760] Sending response...
    [189.236772] Retrieving command...
    [189.236810] LAD_MESSAGEQ_SETUP: calling MessageQ_setup()...
    [189.236820] MessageQ_setup: entered, refCount=0
    [189.236828] NameServer_create(): 'MessageQ'
    [189.236841] MessageQ_setup: exiting, refCount=1
    [189.236849]     status = 0
    [189.236856] DONE
    [189.236863] Sending response...
    [189.236874] Retrieving command...
    [189.237084] NameServer_attach: --> procId=1, refCount=0
    [189.237115] NameServer_attach: created send socket: 6, procId 1
    [189.237148] NameServer_attach: connect failed: procId=1, errno=22 (Invalid argument)
    [189.237175]     closing send socket: 6
    [189.237200] NameServer_attach: <-- refCount=0, status=-1
    [189.237209] Sending response...
    [189.237221] Retrieving command...
    [189.237263] NameServer_attach: --> procId=2, refCount=0
    [189.237282] NameServer_attach: created send socket: 6, procId 2
    [189.237293] NameServer_attach: connect failed: procId=2, errno=22 (Invalid argument)
    [189.237305]     closing send socket: 6
    [189.237324] NameServer_attach: <-- refCount=0, status=-1
    [189.237333] Sending response...
    [189.237344] Retrieving command...
    [189.237382] NameServer_attach: --> procId=3, refCount=0
    [189.237400] NameServer_attach: created send socket: 6, procId 3
    [189.237424] NameServer_attach: created receive socket: 7, procId 3
    [189.262248] NameServer: back from select()
    [189.262270] NameServer: listener thread, event: REFRESH
    [189.262283] NameServer: waiting for unblockFd: 0, and socks: maxfd: 7
    [189.262304] NameServer_attach: <-- refCount=1, status=0
    [189.262314] Sending response...
    [189.262434] Retrieving command...
    [189.262454] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
    [189.262462]     status = 0
    [189.262470] DONE
    [189.262478] Sending response...
    [189.262518] Retrieving command...
    [189.262533] LAD_GATEMP_ATTACH: calling GateMP_attach(3)...
    [189.262551] NameServer_getRemote: Sending request via sock: 6
    [189.262561] NameServer_getRemote: requesting from procId 3, GateMP: _GateMP_TI_dGate
    [189.262583] NameServer_getRemote: pending on waitFd: 1
    [189.262677] NameServer: back from select()
    [189.262688] NameServer: Listener got NameServer message from sock: 7!
    [189.262705] listener_cb: recvfrom socket: fd: 7
    [189.262714] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 3
    [189.262722] NameServer Reply: instanceName: GateMP, name: _GateMP_TI_dGate, value: 0x40197
    [189.262756] NameServer: waiting for unblockFd: 0, and socks: maxfd: 7
    [189.262842] NameServer_getRemote: value for GateMP:_GateMP_TI_dGate not found.
    [189.262854] GateMP_attach: failed to open default gate on procId 3
    [189.262862]     status = -5
    [189.262869] DONE
    [189.262876] Sending response...
    [189.262913] Retrieving command...
    [189.262928] NameServer_attach: --> procId=4, refCount=0
    [189.262945] NameServer_attach: created send socket: 8, procId 4
    [189.262965] NameServer_attach: created receive socket: 9, procId 4
    [189.263138] NameServer: back from select()
    [189.263152] NameServer: listener thread, event: REFRESH
    [189.263162] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.263182] NameServer_attach: <-- refCount=1, status=0
    [189.263192] Sending response...
    [189.263248] Retrieving command...
    [189.263264] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
    [189.263271]     status = 0
    [189.263278] DONE
    [189.263284] Sending response...
    [189.263319] Retrieving command...
    [189.263335] LAD_GATEMP_ATTACH: calling GateMP_attach(4)...
    [189.263408] NameServer_getRemote: Sending request via sock: 8
    [189.263416] NameServer_getRemote: requesting from procId 4, GateMP: _GateMP_TI_dGate
    [189.263435] NameServer_getRemote: pending on waitFd: 1
    [189.263527] NameServer: back from select()
    [189.263538] NameServer: Listener got NameServer message from sock: 9!
    [189.263551] listener_cb: recvfrom socket: fd: 9
    [189.263559] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 2
    [189.263566] NameServer Reply: instanceName: GateMP, name: _GateMP_TI_dGate, value: 0x404f0
    [189.263582] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.264911] NameServer_getRemote: Reply from: 4, GateMP:[189.264933] _GateMP_TI_dGate, value buffer content: 0x600...
    [189.264942] _GateMP_TI_dGate, value buffer content: 0x40001...
    [189.264949] _GateMP_TI_dGate, value buffer content: 0xa...
    [189.264957] _GateMP_TI_dGate, value buffer content: 0x102...
    [189.264968] NameServer_getRemote: Sending request via sock: 8
    [189.264976] NameServer_getRemote: requesting from procId 4, GateMP: _GateMP_TI_info
    [189.264997] NameServer_getRemote: pending on waitFd: 1
    [189.265085] NameServer: back from select()
    [189.265096] NameServer: Listener got NameServer message from sock: 9!
    [189.265110] listener_cb: recvfrom socket: fd: 9
    [189.265117] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 2
    [189.265125] NameServer Reply: instanceName: GateMP, name: _GateMP_TI_info, value: 0x40b08
    [189.265141] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.265164] NameServer_getRemote: Reply from: 4, GateMP:[189.265173] _GateMP_TI_info, value buffer content: 0x99500300...
    [189.265181] _GateMP_TI_info, value buffer content: 0x99500400...
    [189.265188] _GateMP_TI_info, value buffer content: 0x0...
    [189.265195] _GateMP_TI_info, value buffer content: 0x100...
    [189.265202] _GateMP_TI_info, value buffer content: 0x200...
    [189.265209] _GateMP_TI_info, value buffer content: 0x0...
    [189.265244] GateMP_get_sr0: UIO support not found
    [189.265300]     status = 2
    [189.265310] DONE
    [189.265317] Sending response...
    [189.265366] Retrieving command...
    [189.265384] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
    [189.265392]     status = 0
    [189.265399] DONE
    [189.265406] Sending response...
    [189.265618] Retrieving command...
    [189.265637] LAD_GATEHWSPINLOCK_GETCONFIG: calling GateHWSpinlock_getConfig()...
    [189.265646] GateHWSpinlock_getConfig()	baseAddr = 0x4a0f6000
    [189.265653] 	size = 0x1000
    	offset = 0x800
    [189.265661]     status = 0
    [189.265667] DONE
    [189.265674] Sending response...
    [189.265805] Retrieving command...
    [189.265822] LAD_GATEMP_START: calling GateMP_start()...
    [189.265830]     status = 0
    [189.265837] DONE
    [189.265843] Sending response...
    [189.265877] Retrieving command...
    [189.265892] LAD_GATEMP_GETNUMRESOURCES: calling GateMP_getNumResources()...
    [189.265900]     status = 0
    [189.265906] DONE
    [189.265913] Sending response...
    [189.265964] Retrieving command...
    [189.265979] LAD_NAMESERVER_GET: calling NameServer_get(0x2f230, '_GateMP_TI_dGate'[189.265998] )...
    [189.266010] NameServer_getLocal: entry key: '_GateMP_TI_dGate' not found!
    [189.266018] NameServer_getRemote: no socket connection to processor 1
    [189.266025] NameServer_getRemote: no socket connection to processor 2
    [189.266032] NameServer_getRemote: Sending request via sock: 6
    [189.266039] NameServer_getRemote: requesting from procId 3, GateMP: _GateMP_TI_dGate
    [189.266059] NameServer_getRemote: pending on waitFd: 1
    [189.266173] NameServer: back from select()
    [189.266188] NameServer: Listener got NameServer message from sock: 7!
    [189.266213] listener_cb: recvfrom socket: fd: 7
    [189.266221] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 3
    [189.266229] NameServer Reply: instanceName: GateMP, name: _GateMP_TI_dGate, value: 0x40f30
    [189.266244] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.266258] NameServer_getRemote: value for GateMP:_GateMP_TI_dGate not found.
    [189.266271] NameServer_getRemote: Sending request via sock: 8
    [189.266280] NameServer_getRemote: requesting from procId 4, GateMP: _GateMP_TI_dGate
    [189.266298] NameServer_getRemote: pending on waitFd: 1
    [189.266384] NameServer: back from select()
    [189.266395] NameServer: Listener got NameServer message from sock: 9!
    [189.266407] listener_cb: recvfrom socket: fd: 9
    [189.266416] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 2
    [189.266423] NameServer Reply: instanceName: GateMP, name: _GateMP_TI_dGate, value: 0x4101f
    [189.266437] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.266450] NameServer_getRemote: Reply from: 4, GateMP:[189.266460] _GateMP_TI_dGate, value buffer content: 0x600...
    [189.266468] _GateMP_TI_dGate, value buffer content: 0x40001...
    [189.266475] _GateMP_TI_dGate, value buffer content: 0xa...
    [189.266481] _GateMP_TI_dGate, value buffer content: 0x102...
    [189.266489]     value = 0x10
    [189.266495]     status = 0
    [189.266502] DONE
    [189.266508] Sending response...
    [189.266586] Retrieving command...
    [189.266602] LAD_MESSAGEQ_CREATE: calling MessageQ_create(0x2ead8, 0x2eaf8)...
    [189.266611] MessageQ_create: creating 'OCL:MsgQ:1526'
    [189.266623] NameServer_add: Entered key: 'OCL:MsgQ:1526', data: 0x80
    [189.266632] MessageQ_create: returning obj=0x2f6b8, qid=0x80
    [189.266640]     status = 0
    [189.266646] DONE
    [189.266652] Sending response...
    [189.267233] Retrieving command...
    [189.267428] LAD_NAMESERVER_GETUINT32: calling NameServer_getUInt32(0x2f570, 'OCL:DSP1:MsgQ')...
    [189.267442] NameServer_getLocal: entry key: 'OCL:DSP1:MsgQ' not found!
    [189.267450] NameServer_getRemote: no socket connection to processor 1
    [189.267457] NameServer_getRemote: no socket connection to processor 2
    [189.267465] NameServer_getRemote: Sending request via sock: 6
    [189.267472] NameServer_getRemote: requesting from procId 3, MessageQ: OCL:DSP1:MsgQ
    [189.267495] NameServer_getRemote: pending on waitFd: 1
    [189.267574] NameServer: back from select()
    [189.267586] NameServer: Listener got NameServer message from sock: 7!
    [189.267598] listener_cb: recvfrom socket: fd: 7
    [189.267606] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 3
    [189.267613] NameServer Reply: instanceName: MessageQ, name: OCL:DSP1:MsgQ, value: 0x414c9
    [189.267628] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.267687] NameServer_getRemote: value for MessageQ:OCL:DSP1:MsgQ not found.
    [189.267699] NameServer_getRemote: Sending request via sock: 8
    [189.267707] NameServer_getRemote: requesting from procId 4, MessageQ: OCL:DSP1:MsgQ
    [189.267740] NameServer_getRemote: pending on waitFd: 1
    [189.267819] NameServer: back from select()
    [189.267830] NameServer: Listener got NameServer message from sock: 9!
    [189.267842] listener_cb: recvfrom socket: fd: 9
    [189.267849] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 2
    [189.267857] NameServer Reply: instanceName: MessageQ, name: OCL:DSP1:MsgQ, value: 0x40080
    [189.267872] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.267895] NameServer_getRemote: Reply from: 4, MessageQ: name: OCL:DSP1:MsgQ, value: 0x40080
    [189.267906]     value = 0x40080
    [189.267913]     status = 0
    [189.267927] DONE
    [189.267935] Sending response...
    [189.267976] Retrieving command...
    [189.267992] LAD_NAMESERVER_GETUINT32: calling NameServer_getUInt32(0x2f570, 'OCL:DSP2:MsgQ')...
    [189.268002] NameServer_getLocal: entry key: 'OCL:DSP2:MsgQ' not found!
    [189.268010] NameServer_getRemote: no socket connection to processor 1
    [189.268018] NameServer_getRemote: no socket connection to processor 2
    [189.268025] NameServer_getRemote: Sending request via sock: 6
    [189.268032] NameServer_getRemote: requesting from procId 3, MessageQ: OCL:DSP2:MsgQ
    [189.268054] NameServer_getRemote: pending on waitFd: 1
    [189.268132] NameServer: back from select()
    [189.268143] NameServer: Listener got NameServer message from sock: 7!
    [189.268155] listener_cb: recvfrom socket: fd: 7
    [189.268163] 	Received ns msg: nbytes: 484, from addr: 61, from vproc: 3
    [189.268202] NameServer Reply: instanceName: MessageQ, name: OCL:DSP2:MsgQ, value: 0x30080
    [189.268220] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.268244] NameServer_getRemote: Reply from: 3, MessageQ: name: OCL:DSP2:MsgQ, value: 0x30080
    [189.268256]     value = 0x30080
    [189.268264]     status = 0
    [189.268272] DONE
    [189.268279] Sending response...
    [189.268407] Retrieving command...
    [189.269074] LAD_MESSAGEQ_DELETE: calling MessageQ_delete(0x2f6b8)...
    [189.269086] MessageQ_delete: deleting 0x2f6b8
    [189.269103] MessageQ_delete: returning 0
    [189.269114]     status = 0
    [189.269121] DONE
    [189.269131] Sending response...
    [189.269606] Retrieving command...
    [189.269626] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
    [189.269634]     status = 0
    [189.269641] DONE
    [189.269648] Sending response...
    [189.269688] Retrieving command...
    [189.269703] LAD_GATEMP_DETACH: calling GateMP_detach(3)...
    [189.269711]     status = -5
    [189.269718] DONE
    [189.269724] Sending response...
    [189.269794] Retrieving command...
    [189.269811] NameServer_detach: --> procId=3, refCount=1
    [189.269834] NameServer: back from select()
    [189.269847] NameServer: listener thread, event: REFRESH
    [189.269856] NameServer: waiting for unblockFd: 0, and socks: maxfd: 9
    [189.269875] NameServer_detach: closing socket: 6
    [189.269896] NameServer_detach: closing socket: 7
    [189.270014] NameServer_detach: <-- refCount=0, status=0
    [189.270026] Sending response...
    [189.270101] Retrieving command...
    [189.270118] LAD_GATEMP_ISSETUP: calling GateMP_isSetup()...
    [189.270126]     status = 0
    [189.270133] DONE
    [189.270140] Sending response...
    [189.270176] Retrieving command...
    [189.270191] LAD_GATEMP_DETACH: calling GateMP_detach(4)...
    [189.270205]     status = 0
    [189.270213] DONE
    [189.270230] Sending response...
    [189.270282] Retrieving command...
    [189.270298] NameServer_detach: --> procId=4, refCount=1
    [189.270318] NameServer: back from select()
    [189.270330] NameServer: listener thread, event: REFRESH
    [189.270339] NameServer: waiting for unblockFd: 0, and socks: maxfd: 0
    [189.270356] NameServer_detach: closing socket: 8
    [189.270377] NameServer_detach: closing socket: 9
    [189.270491] NameServer_detach: <-- refCount=0, status=0
    [189.270502] Sending response...
    [189.270671] Retrieving command...
    [189.270723] LAD_MESSAGEQ_DESTROY: calling MessageQ_destroy()...
    [189.270733] MessageQ_destroy: entered, refCount=1
    [189.270747] MessageQ_destroy: exiting, refCount=0
    [189.270755]     status = 0
    [189.270762] DONE
    [189.270769] Sending response...
    [189.270811] Retrieving command...
    [189.270827] LAD_NAMESERVER_DESTROY: calling NameServer_destroy()...
    [189.270835] NameServer_destroy: entered, refCount=2
    [189.270842] NameServer_destroy(): refCount(1) > 0, exiting
    [189.270850] NameServer_destroy: exiting, refCount=1
    [189.270868]     status = 0
    [189.270876] DONE
    [189.270883] Sending response...
    [189.270958] Retrieving command...
    [189.270975] 
    LAD_DISCONNECT: [189.270983] 
        client handle = 0[189.270990] 
        closing FIFO /tmp/LAD/1526 (filePtr=0x2f400)
    [189.271085]     done, unlinking /tmp/LAD/1526
    [189.271182] DONE
    [189.271193] Retrieving command...
    [189.284079]    EOF detected on FIFO, closing FIFO: /tmp/LAD/LADCMDS
    [189.284127] 
        opening FIFO: /tmp/LAD/LADCMDS
    

    Thanks,

    Weston

  • Weston,

    The Processor Linux SDK version referenced in the post - 2.0.2.11 (released in 2016 1Q) includes OpenCL v01.01.08.01 which only supports AM572x. OpenCL on AM571x is not supported in Processor SDK 2.0.2.11. The AM572x OpenCL runtime attempts to connect to DSP1 and DSP2 over IPC. DSP2 does not exist on AM571x, leading to the IPC errors you mention.

    OpenCL support for the AM571x will be in the next Processor SDK release (3.1.x, 3Q 2016).

    Ajay