Rookie question: Ipc_start() hangs on the DSP of OMAP-L138

Hi,

I am trying to run an application on the OMAP-L138 EVM board that uses SysLink (version 2.20.02.20) for IPC between the host (ARM) and the DSP (C674x).

On the Linux side, I am using the rootfs which came with tv-dvsdk_omapl138-evm_04_03_00_06 and a newer kernel which comes with DaVinci-PSP-SDL-03.22.00.02

uname -a shows

Linux arago 3.3.0 #1 PREEMPT Tue Oct 30 11:24:08 PDT 2012 armv5tejl unknown 

I have tried the examples that come with the SysLink release and they all work. I was particularly interested in the MessageQ example, as my application uses MessageQ (with the slight difference that the example uses HeapBufMP and my application uses HeapMemMP).

So, I copied the config.bld and relevant portions of the cfg file from that example to build the DSP side of my application. I have also implemented Ipc_start()/Ipc_attach() in a task on the DSP side.

On the Linux side, I make the host-side calls: SysLink_setup(), followed by Ipc_control(LOADCALLBACK) and then Ipc_control(STARTCALLBACK).

To run, I do the following steps:

- use slaveloader to load the DSP and start it

- start the Linux side application from a telnet shell

On the Linux side, the code hangs in the call to Ipc_control(STARTCALLBACK).

On the DSP side, I loaded the code using CCS + XDS510 (before trying the slaveloader steps above) and saw that the code was spinning in Ipc_start(). So I am guessing it loops in the same place when loaded via slaveloader as well, although I am not sure; that is one of the help requests below.

Questions:
---------------
1. In this mode, what is the best way to get logs from the DSP side? That is, slaveloader loads the DSP and starts it; if the DSP does a printf/Log_info, what is the best way to get at those logs? Can I connect CCS to running DSP code and get logs? If yes, can you please point me to some documentation on how to do it? More generally, what are good ways to debug applications involving both the DSP and the ARM?

2. Does anything obvious jump out that I could be doing wrong compared to the example, which would cause Ipc_start() to hang? I tried an experiment where I used the Linux code from my application and the DSP-side code from the MessageQ example. With this setup, the Linux code got past STARTCALLBACK. This seems to indicate that I am doing something wrong on the DSP side. I am pasting the relevant sections from the cfg file, the config.bld file and my source code below.

Please let me know if you need further information.

Greatly appreciate any help in this regard.

Cheers,
-raja.

cfg file
____________________________________________ 

/*
* ======== IPC Configuration ========
*/

/* required because SysLink is running on the host processor */
xdc.useModule('ti.syslink.ipc.rtos.Syslink');

/* configure processor names */
var MultiProc = xdc.useModule('ti.sdo.utils.MultiProc');
var procNameAry = MultiProc.getDeviceProcNames();
MultiProc.setConfig("DSP", procNameAry);

/* ipc configuration */
var Ipc = xdc.useModule('ti.sdo.ipc.Ipc');

/* ipc setup for SR0 Memory (host processor not running Sys/Bios) */
Ipc.sr0MemorySetup = false;

/* set ipc sync to pair, requiring Ipc_attach() call on all processors */
Ipc.procSync = Ipc.ProcSync_PAIR;

/* define host processor */
Ipc.hostProcId = MultiProc.getIdMeta("HOST");

/* shared region configuration */
var SharedRegion = xdc.useModule('ti.sdo.ipc.SharedRegion');

/* configure SharedRegion #0 (IPC) */
var SR0Mem = Program.cpu.memoryMap["SR_0"];

SharedRegion.setEntryMeta(0,
    new SharedRegion.Entry({
        name: "SR0",
        base: SR0Mem.base,
        len: SR0Mem.len,
        ownerProcId: MultiProc.getIdMeta("HOST"),
        cacheEnable: false,
        isValid: true
    })
);

/* configure SharedRegion #1 (MessageQ Buffers) */
var SR1Mem = Program.cpu.memoryMap["SR_1"];

SharedRegion.setEntryMeta(1,
    new SharedRegion.Entry({
        name: "MessageQ Buffers",
        base: SR1Mem.base,
        len: SR1Mem.len,
        ownerProcId: MultiProc.getIdMeta("HOST"),
        cacheEnable: false,
        isValid: true
    })
);

/* configure external memory cache property
 *
 * C000_0000 - C7FF_FFFF  800_0000  ( 128 MB)  Cache.MAR192_223
 * ----------------------------------------------------------------------------
 * C000_0000 - C1FF_FFFF  200_0000  (  32 MB)  --------    don't care
 * C200_0000 - C202_FFFF    3_0000  ( 192 KB)  SR_0, SR_1  no-cache      MAR194
 * C203_0000 - C2FF_FFFF   FD_0000  ( ~15 MB)  --------    no-cache      MAR194
 * C300_0000 - C37F_FFFF   80_0000  (   8 MB)  DSP_PROG    cache enable  MAR195
 * C380_0000 - C3FF_FFFF   80_0000  (   8 MB)  --------    cache enable  MAR195
 * C400_0000 - C7FF_FFFF  400_0000  (  64 MB)  --------    don't care
 */

var Cache = xdc.useModule('ti.sysbios.family.c64p.Cache');
Cache.MAR192_223 = 0x00000008; /* xxxx xxxx xxxx xxxx xxxx xxxx xxxx 10xx */
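
The mapping between the memory map in the comment and the 0x00000008 mask can be checked with a little arithmetic. The sketch below assumes, as the comment table implies, that each MAR register covers one 16 MB window, so the MAR index is the top byte of the address and bit 0 of Cache.MAR192_223 corresponds to MAR192. The function names are made up for illustration and are not part of SYS/BIOS.

```c
#include <stdint.h>

/* Each MAR register controls cacheability of one 16 MB window, so the
 * MAR index is the top 8 bits of the 32-bit address. */
unsigned mar_index(uint32_t addr)
{
    return addr >> 24;
}

/* Bit position of an address inside the Cache.MAR192_223 bitmask
 * (bit 0 = MAR192, bit 31 = MAR223).
 * Only meaningful for addresses 0xC0000000 - 0xDFFFFFFF. */
unsigned mar192_223_bit(uint32_t addr)
{
    return mar_index(addr) - 192u;
}

/* Returns 1 if the address is cacheable under the given bitmask, else 0. */
int is_cacheable(uint32_t mask, uint32_t addr)
{
    return (int)((mask >> mar192_223_bit(addr)) & 1u);
}
```

With the mask 0x00000008, is_cacheable() returns 0 for the SR_0 base 0xC2000000 (MAR194) and 1 for the DSP_PROG base 0xC3000000 (MAR195), which matches the table in the comment.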

config.bld
_______________________________________________________________

 

/*
* ======== config.bld ========
*
*/

var Build = xdc.useModule('xdc.bld.BuildEnvironment');

/* Memory Map for ti.platforms.evmOMAPL138
 *
 * C000_0000 - C7FF_FFFF  800_0000  ( 128 MB)  External Memory
 * ------------------------------------------------------------------------
 * C000_0000 - C1FF_FFFF  200_0000  (  32 MB)  Linux
 * C200_0000 - C200_FFFF    1_0000  (  64 KB)  SR_0 (ipc)
 * C201_0000 - C202_FFFF    2_0000  ( 128 KB)  SR_1 (MessageQ buffers)
 * C203_0000 - C2FF_FFFF   FD_0000  ( ~15 MB)  --------
 * C300_0000 - C3FF_FFFF  100_0000  (  16 MB)  DSP_PROG (code, data)
 * C400_0000 - C7FF_FFFF  400_0000  (  64 MB)  Linux
 */

var SR_0 = {
    name: "SR_0", space: "data", access: "RWX",
    base: 0xC2000000, len: 0x10000,
    comment: "SR#0 Memory (64 KB)"
};

var SR_1 = {
    name: "SR_1", space: "data", access: "RWX",
    base: 0xC2010000, len: 0x20000,
    comment: "SR#1 Memory (128 KB)"
};

Build.platformTable["ti.platforms.evmOMAPL138:dsp"] = {
    externalMemoryMap: [
        [ SR_0.name, SR_0 ],
        [ SR_1.name, SR_1 ],
        [ "DSP_PROG", {
            name: "DSP_PROG", space: "code/data", access: "RWX",
            base: 0xC3000000, len: 0x1000000,
            comment: "DSP Program Memory (16 MB)"
        }]
    ],
    codeMemory: "DSP_PROG",
    dataMemory: "DSP_PROG",
    stackMemory: "DSP_PROG",
    l1DMode: "32k",
    l1PMode: "32k",
    l2Mode: "32k"
};

/*
* ======== ti.targets.elf.C674 ========
*/
var C674 = xdc.useModule('ti.targets.elf.C674');
C674.ccOpts.suffix += " -mi10 -mo ";
Build.targets.$add(C674);

Relevant source code
____________________________________________________

do
{
    status = Ipc_start();
} while (status == Ipc_E_NOTREADY);

if (status < 0)
{
    Log_error0("Ipc_start() failed");
    goto leave;
}

/* attach to the remote processor */
remoteProcId = MultiProc_getId("HOST");

/* connect to remote processor */
do
{
    status = Ipc_attach(remoteProcId);

    if (status == Ipc_E_NOTREADY) {
        Task_sleep(100);
    }
} while (status == Ipc_E_NOTREADY);

if (status < 0)
{
    Log_error0("Ipc_attach() failed");
    goto leave;
}
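
The loops above retry for as long as the call reports "not ready", so if the handshake never completes the task spins forever and the failure shows up only as a hang. A minimal sketch of a bounded retry is shown below, written as plain C so it can be tried off-target. Everything in it is hypothetical: retry_bounded(), fake_attach() and the SKETCH_* codes are illustrative stand-ins, not IPC or SysLink APIs, and on the DSP the marked spot is where a call such as Task_sleep() would go.

```c
#include <stdio.h>

/* Hypothetical status codes for this sketch only; real code would use
 * the values from the IPC headers (for example Ipc_E_NOTREADY). */
enum { SKETCH_OK = 0, SKETCH_NOTREADY = -1, SKETCH_TIMEOUT = -2 };

typedef int (*try_fn)(void *arg);

/* Call fn until it stops returning SKETCH_NOTREADY or max_tries is hit. */
int retry_bounded(try_fn fn, void *arg, int max_tries)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        int status = fn(arg);

        if (status != SKETCH_NOTREADY) {
            return status;      /* success or a hard error */
        }
        /* On the target, yield here (e.g. Task_sleep) before retrying. */
    }

    printf("gave up after %d attempts\n", max_tries);
    return SKETCH_TIMEOUT;
}

/* Stand-in for Ipc_start()/Ipc_attach(): reports "not ready" until the
 * countdown pointed to by arg reaches zero. */
int fake_attach(void *arg)
{
    int *countdown = (int *)arg;

    if (*countdown > 0) {
        (*countdown)--;
        return SKETCH_NOTREADY;
    }
    return SKETCH_OK;
}
```

Returning a distinct timeout code lets the caller log the failure instead of hanging silently, which makes it easier to tell which side of the handshake is stuck.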

 

  • Wanted to add versions of components I am using

    SYSBIOS - bios_6_33_05_46

    IPC - ipc_1_24_03_32

    XDCTOOLS - xdctools_3_23_03_53

  • Is anybody able to help with this one? I am still stuck.

  • Hello Raja,

    Please see if you find this post useful: http://e2e.ti.com/support/development_tools/code_composer_studio/f/81/t/184194.aspx. This is about debugging ARM & DSP using CCS.

     

     

  • Thank you Varun. I will read through those.

    Cheers,

    -raja.

  • Hi Varun,

    I have read through those documents and tried what is mentioned in them.

    I have been partially successful.

    1. I can run gdbserver on the ARM and connect to it from the remote debugger in CCS running on Linux.

    2. But I cannot get the DSP side to show proper symbols/execution when connecting to the DSP via an XDS510 emulator using CCS on Windows.

    I tried two approaches for the DSP side

    1. Load the DSP code using slaveloader from the ARM side and then try connecting to the DSP from CCS. Is the code already running on the DSP at that time? That is my understanding, as slaveloader does the startup. But when I connect from CCS, the green play button is still active and I can click on it, which seems to run the code on the DSP. I don't understand what exactly is happening there. Although I have loaded the proper symbols corresponding to the DSP executable, when I pause the DSP using CCS, it seems to be in some unexpected location.

    2. I am not able to use the other method suggested in the Wiki page (i.e., adding an infinite loop to main() on the DSP side). When I try that, I get a kernel panic on the ARM side when trying to connect to the gdbserver from CCS on Linux. I am guessing this is due to the address of reset_vector not getting passed in LOADCALLBACK. Is that correct?

    So, not much progress.

    In some other debugging, I started cutting down my DSP program to a bare minimum and get that to connect to the ARM side. We were originally using 64000 bytes stack size for Tasks on the DSP side. With that stack size, even the simplified program would not connect. So, I started reducing the stack size and things connected when the stack size was set to 4096 bytes and it hung when I increased it to 8192 bytes. Is there some magic number there?

    After that, I went back to the full DSP application and changed the stack size (we have two tasks in there) to 4096 bytes each, but no luck in connecting with the ARM side. I reduced the stack size to 2048 bytes each (for a total of 4096 bytes thinking that there is some magic total somewhere) and it still did not connect.

    I am at a loss. Any ideas?

    Cheers,

    -raja

  • Hi Raja,

    The DSP is active and running the minute you run slaveloader with the 'startup' option. So you should be able to immediately connect to the DSP core in CCS, and load the symbols. The spin loop is not strictly necessary since the code should attempt to call Ipc_start in an infinite loop, which will not succeed until you run the ARM-side application. After that you should be able to step through the code and find out you are in Ipc_start(). You may need to add a path mapping to map your linux build path to the Windows path of your IPC source code under Window->Preferences->C/C++->Debug->'common source lookup path' if CCS keeps telling you it cannot find the source file associated with the code.

    Also are you trying this with the ex02_messageq example? If you are, this should have worked. Double-check you have loaded the proper symbols as that is the most common mistake. When you connect to the DSP, the core should halt if the connection is made. Get the debug procedure working before you attempt to look at your own code.

    As for why your code is not working, it could be for various reasons. If you see the ARM-side waiting in STARTCALLBACK, it means Ipc_start() on the slave never returned the proper handshake. Since you seem to be suspecting a stack issue, after you connect with CCS successfully, you can verify the stack usage for the system stack and also each task in your system using the ROV tool in CCS (http://rtsc.eclipse.org/docs-tip/Runtime_Object_Viewer).

    Best regards,

    Vincent

  • Thank you Vincent. I will give this a try.

    Because of my development setup, I am running two CCS instances: one on a Windows 7 host, which I use to connect to the DSP, and another inside a Linux VM (Ubuntu 10.04 running in VMware Player), which I use to connect to the GDB server on the ARM side.

    I am still compiling code on Windows for the DSP and on Linux for the ARM (although the ex02_messageq example is all compiled on Linux). The split setup makes paths and other environment settings tricky. Do you expect that to create issues?

    I will try the ex02_messageq example debug all from Linux side CCS and see how that goes.

    Cheers,
    -raja 

  • Hi Raja,

    Although theoretically you should be able to point CCS to the source code no matter which machine you built it on, for simplicity's sake it may be better to build everything under Linux. It is what we do on the development team and also what most of our customers do when they use the SDK. This way you avoid any inconsistencies introduced when there is a mismatch in the software versions used on the Windows vs. the Linux boxes. Typically I only use Windows for running CCS, but I don't use the gdbserver approach, so maybe in your case it makes sense to run CCS under Linux as well.

    Best regards,

    Vincent

  • Thanks for the clarification Vincent.

    I am experiencing difficulties launching a target configuration from my Linux CCS for the Spectrum Digital XDS510 emulator. Are there some Linux drivers for that emulator that I need to install?

    Cheers,
    -raja. 

  • Hello Raja,

    Can you do a quick check to see if the XDS510 SD emulator is detected? You can follow the steps mentioned in this wiki: http://linux-c6x.org/wiki/index.php/Setting_up_CCS_v5#Spectrum_Digital_XDS510USB.

     

  • Thank you Varun for the link.

    The device is recognized, but I am not able to reset it or launch the target configuration as CCS is not able to communicate with the emulator.

    Here is some output

    ************************

    root@ubuntu:/opt/ti/ccsv5/ccs_base/emulation/drivers# ./setup_sd.sh
    udev start/running, process 14559
    root@ubuntu:/opt/ti/ccsv5/ccs_base/emulation/drivers# ./sdjtag -X portsavailable
    ** Checking for available USB devices on valid ports.

    $$ XDS510USB PLUS connected on:
    $$ EmuPortAddr=0x510
    $$ EmuSerialNumber=S233421203122426
    root@ubuntu:/opt/ti/ccsv5/ccs_base/emulation/drivers# ./sdjtag -X reset
    ** Resetting Emulator
    ERROR -- XDS510USB Reset Failed
    ERROR -- Check power to your emulator/eZdsp
    ERROR -- Then check your port address
    root@ubuntu:/opt/ti/ccsv5/ccs_base/emulation/drivers# lsusb
    Bus 002 Device 003: ID 0e0f:0002 VMware, Inc. Virtual USB Hub
    Bus 002 Device 002: ID 0e0f:0003 VMware, Inc. Virtual Mouse
    Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
    Bus 001 Device 020: ID 0c55:0540 Spectrum Digital, Inc. SPI540
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    root@ubuntu:/opt/ti/ccsv5/ccs_base/emulation/drivers#

    ******************************************************

    I then disconnected the emulator from my VM, went back to CCS on Windows and made sure that I can still connect.

    Cheers,

    -raja.

  • Wanted to post an update.

    I have not been successful running CCS and connecting using the emulator from the Linux side. So, I have not pursued the debugging setup with the ex02_messageq example.

    But I was able to connect to the DSP side from Windows CCS using my application and step through. So, I am progressing with that now.

    I found out that the DSP side is failing in Ipc_attach(). More specifically, it finds that remote->startedKey is not set to 1. (Although I compile with -g and no -O option, the debugger jumps around a bit randomly. I don't know why, but I believe this is the failure point.)

    So, I wanted to go back and run GDB on the ARM and check what is going on on the other side (using gdbserver and GDB remote debugging from CCS on Linux), but that connection kept timing out. I have no idea why, as I was able to get that to work yesterday. Don't know what demons got into the machine overnight:-) So, I am not able to understand what is going on there.

    I went back to my simpler app which worked (i.e., the DSP and ARM are able to establish a link). In that code, I increased the stack size to 64000 bytes (which was failing before). But this does not hang any more. Again, I don't know the reason behind this.

    The next baby step I am planning is to add more code to the simpler app and see when things start breaking. Maybe the GDB side will start working magically after a bit of rest and I can debug that side:-)

  • A bit more progress. I have narrowed the problem down to having this piece of code in the cfg file.

    I am using McASP + EDMA3 for audio i/o in the application. This code sets up the Hwi corresponding to it. If I have this code in there, the Ipc_attach() does not succeed. Please note that I am doing Ipc_attach() before configuring the McASP and starting it.

    If I comment out this piece of code, Ipc_attach() succeeds.

    Any suggestions I can try?

    Thanks,
    -raja 

  • Sorry, I forgot to paste the offending piece of code:

    //var hwi_params_edma = new Hwi.Params;
    //hwi_params_edma.eventId = 8; // reference to EDMA3_0_CC0_INT1
    //hwi_params_edma.priority = 5;
    //Program.global.hwi_edma = Hwi.create(5, '&EDMA3CCComplIsr', hwi_params_edma);

  • Raja,

    Hwi 5 is conflicting with what IPC uses.  IPC also uses Hwi 5 by default.  Can you try using a different Hwi for your creation?

    Judah

  • Thank you very much Judah. That gets me one step further:-)

    Before I provide the latest status, wanted to find out if there is some programmatic way to get a free Hwi and use it to avoid collisions?

    Getting past the Hwi collision issue, I am hitting the next issue:-)

    Here is the code change for Hwi

    var hwi_params_edma = new Hwi.Params;
    hwi_params_edma.eventId = 8; // reference to EDMA3_0_CC0_INT1
    hwi_params_edma.priority = 5;
    Program.global.hwi_edma = Hwi.create(7, '&EDMA3CCComplIsr', hwi_params_edma);

    When I run with this, I get an exception like

    A0=0x276c02a A1=0x1
    A2=0x3a3d0adb A3=0xc39fcf2c
    A4=0xf00700 A5=0xf8
    A6=0x8027 A7=0x1
    A8=0x1 A9=0xc39fcfa4
    A10=0x0 A11=0x0
    A12=0x2fa54d41 A13=0x8a1936eb
    A14=0xd58c8f19 A15=0xd58c8f19
    A16=0x0 A17=0x0
    A18=0xc317c97c A19=0x10
    A20=0x0 A21=0x0
    A22=0xc305f958 A23=0xa
    A24=0x540 A25=0x80000000
    A26=0x400 A27=0x80000000
    A28=0x200 A29=0x90de024c
    A30=0x7f A31=0x0
    B0=0x1 B1=0x0
    B2=0x300 B3=0xd9dfb95e
    B4=0xc39fd0b8 B5=0xc39fd0b0
    B6=0xc39fda00 B7=0xc39e4b14
    B8=0x0 B9=0x1d02000
    B10=0xc39fe320 B11=0x0
    B12=0x0 B13=0x0
    B14=0xc39fe320 B15=0xc39f4e40
    B16=0x202 B17=0xc39fda60
    B18=0x0 B19=0xa
    B20=0xace2acac B21=0x3f36aed9
    B22=0xf B23=0x0
    B24=0x24ae77e6 B25=0x950a68da
    B26=0x0 B27=0xffffffff
    B28=0x0 B29=0x1
    B30=0xffffffff B31=0x0
    NTSR=0x1820e
    ITSR=0xf
    IRP=0xc3979304
    SSR=0x0
    AMR=0x0
    RILC=0x0
    ILC=0x0
    Exception at 0xd9dfb970
    EFR=0x2 NRP=0xd9dfb970
    Internal exception: IERR=0x8
    Opcode exception
    ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0xc3979304, sp = 0xc39f4e40.
    To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
    xdc.runtime.Error.raise: terminating execution

    Looking at the map file, the closest address to the exception PC is

    c3979300   ti_sysbios_family_c64p_TimestampProvider_getFreq__E

    In the register display, it also says Exception at some address which looks like a bogus address

    Exception at 0xd9dfb970

    Any suggestions?

    Thank you very much,

    -raja.

  • A couple more runs move things around:

    A0=0xc39721c4 A1=0x1
    A2=0x3b3d0adb A3=0xc39fcf2c
    A4=0xf00700 A5=0xf8
    A6=0x8027 A7=0x1
    A8=0x1 A9=0xc39fcfa4
    A10=0x0 A11=0x0
    A12=0x6da14541 A13=0x8a19366b
    A14=0xbae25555 A15=0x1a1f64bc
    A16=0x1 A17=0x0
    A18=0xc317c97c A19=0x10
    A20=0x0 A21=0x0
    A22=0xc305f958 A23=0xa
    A24=0x540 A25=0x80000000
    A26=0x400 A27=0x80000000
    A28=0x200 A29=0x90de024c
    A30=0x7f A31=0x0
    B0=0x1 B1=0x0
    B2=0x300 B3=0x958c8f19
    B4=0xc39fd0b8 B5=0xc39fd0b0
    B6=0xc39fda00 B7=0xc39e4b14
    B8=0x0 B9=0x1d02000
    B10=0xc39fe320 B11=0x0
    B12=0x0 B13=0x0
    B14=0xc39fe320 B15=0xc39f4e40
    B16=0x202 B17=0xc39fda60
    B18=0x0 B19=0xa
    B20=0x39c2711c B21=0x3f322ef1
    B22=0xf B23=0x0
    B24=0x24eed7e6 B25=0x950a68da
    B26=0x0 B27=0xffffffff
    B28=0x0 B29=0x1
    B30=0xffffffff B31=0x0
    NTSR=0x1020e
    ITSR=0xf
    IRP=0xc39665a8
    SSR=0x0
    AMR=0x0
    RILC=0x0
    ILC=0x0
    Exception at 0x958c8f18
    EFR=0x2 NRP=0x958c8f18
    Internal exception: IERR=0x1
    Instruction fetch exception
    ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0xc39665a8, sp = 0xc39f4e40.
    To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
    xdc.runtime.Error.raise: terminating execution

    Next attempt

    A0=0xc39721c4 A1=0x1
    A2=0x3b290adb A3=0xc39fcf2c
    A4=0xf00700 A5=0xf8
    A6=0x8027 A7=0x1
    A8=0x1 A9=0xc39fcfa4
    A10=0x0 A11=0x0
    A12=0xa2adfc00 A13=0x8a19366b
    A14=0xba6255d5 A15=0x1a1f64bd
    A16=0x1 A17=0x0
    A18=0xc317c97c A19=0x10
    A20=0x0 A21=0x0
    A22=0xc305f958 A23=0xa
    A24=0x540 A25=0x80000000
    A26=0x400 A27=0x80000000
    A28=0x200 A29=0x90de024c
    A30=0x7f A31=0x0
    B0=0x1 B1=0x0
    B2=0x300 B3=0x958caf19
    B4=0xc39fd0b8 B5=0xc39fd0b0
    B6=0xc39fda00 B7=0xc39e4b14
    B8=0x0 B9=0x1d02000
    B10=0xc39fe320 B11=0x0
    B12=0x0 B13=0x0
    B14=0xc39fe320 B15=0xc39f4e40
    B16=0x202 B17=0xc39fda60
    B18=0x0 B19=0xa
    B20=0xace2acac B21=0x3f36aed9
    B22=0xf B23=0x0
    B24=0x24ae57ee B25=0x970a68da
    B26=0x0 B27=0xffffffff
    B28=0x0 B29=0x1
    B30=0xffffffff B31=0x0
    NTSR=0x1020e
    ITSR=0xf
    IRP=0xc397a528
    SSR=0x0
    AMR=0x0
    RILC=0x0
    ILC=0x0
    Exception at 0x958caf18
    EFR=0x2 NRP=0x958caf18
    Internal exception: IERR=0x1
    Instruction fetch exception
    ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0xc397a528, sp = 0xc39f4e40.
    To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
    xdc.runtime.Error.raise: terminating execution

  • Raja,

    There is no method to determine whether a Hwi is currently in use.  This could be an enhancement for a future release.

    Can you check whether your Task stacks and ISR stacks are okay?

    I can't tell much from the output there, but exceptions at random addresses usually point to a blown stack.

    Judah
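
Since there is no API for querying which Hwi numbers are taken, one workaround is to keep that bookkeeping by hand. The sketch below is a hypothetical helper, not a SYS/BIOS or IPC facility: it tracks CPU interrupt vectors in a 16-bit mask and returns the lowest free one among vectors 4 to 15, the maskable interrupts on C64x+ class cores. It only knows about the vectors that are recorded in it, for example 5 for IPC as mentioned earlier in this thread.

```c
#include <stdint.h>

/* CPU interrupt vectors 4..15 are the maskable ones available to
 * applications; 0..3 are reset, NMI and reserved. */
#define FIRST_USABLE_VECTOR 4
#define LAST_USABLE_VECTOR  15

/* Mark a vector as taken in a 16-bit bitmask; returns the new mask. */
uint16_t hwi_mark_used(uint16_t used, int vector)
{
    return (uint16_t)(used | (1u << vector));
}

/* Return 1 if the vector is already recorded as taken, else 0. */
int hwi_is_used(uint16_t used, int vector)
{
    return (used >> vector) & 1;
}

/* Return the lowest free usable vector, or -1 if all are taken. */
int hwi_first_free(uint16_t used)
{
    int v;

    for (v = FIRST_USABLE_VECTOR; v <= LAST_USABLE_VECTOR; v++) {
        if (!hwi_is_used(used, v)) {
            return v;
        }
    }
    return -1;
}
```

Other modules in the system may claim vectors as well, so the mask has to be seeded with every vector known to be in use before the result can be trusted.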

  • Will do, thanks Judah.

  • Hi Judah,

    I have checked the stack sizes and they are well within limits.

    Hwi stack is at ~1300 bytes out of a total of 65536 bytes

    I have two tasks with stack sizes of 64000 bytes each and those stacks have not gone past the 3000 byte mark.

    It is surprising that the code seems to run sometimes even after hitting that exception. I have McASP receiving data from an A/D device. That data is processed on the DSP side and sent to the ARM side using MessageQ (with a backing HeapMemMP). It looks like several frames of data get sent to the ARM side. (Please note that the exception does not happen when I don't start the McASP Rx, but then there is no input data either.)

    The next problem is with MessageQ frees. Data goes from the DSP side to the ARM side, and the ARM side is able to read it. I have dumped the data of a few exchanges to confirm that it is correct. The ARM side frees the message, with no errors there. But after a little while the DSP side starts complaining about not having enough memory, and the ARM side (kernel module) outputs

    *** HeapMemMP_free: Entire buffer is not in the range of the heap!
    Error [0xffffffff] at Line no: 1390 in file ti/syslink/ipc/hlos/knl/HeapMemMP.c

    I don't know where the resources are disappearing.

    Will greatly appreciate any suggestions.

    Cheers,

    -raja.

  • Raja,

    Sorry, I can't think of anything else besides the stacks for now.

    The HeapMemMP running out seems like a problem.  Is there a way you can check ROV on the DSP side when you halt the program?  Could it be that you are freeing something that is not from the buffer?

    Judah

  • Hi Judah,

    I did check ROV on the DSP side and the HeapMemMP free list was NULL for the HeapMemMP I had associated with the MessageQ on the DSP side. I am well versed with all the information ROV is displaying. If there are any specific things I can check to ensure that the free is happening to the correct HeapMemMP, please let me know and I can check.

    I added a bunch of debug output on the ARM side as well to understand this better, but I am not able to get to the bottom of it.

    I use MessageQ_put() on the DSP side, which is supposed to wake up the synchronizer object. I have a thread on the ARM side waiting on MessageQ_get(). I put in some debug output to check the number of messages pending on the MessageQ on the ARM side. Sometimes it spiked up to 16 - 18 messages. I have no idea why. Neither CPU is fully loaded. The DSP is running at about 50% CPU load. The ARM is actually doing very little processing (it takes data from the DSP and sends it out to a network socket, that is all). Each of these messages is about 100 bytes (including the MessageQ header) and I allocated 4 KB for the HeapMemMP. Unless there is a lot of internal overhead to manage the buffers (I am guessing that the buffer round-up is 128 bytes), I should be able to get more than 18 messages before memory runs out. So, a few things are puzzling.
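
The capacity estimate in the paragraph above can be worked through directly. The sketch below assumes the 128-byte round-up guessed in the post and ignores any bookkeeping overhead inside the heap, so it gives an upper bound rather than the exact HeapMemMP behaviour. The function names are illustrative only.

```c
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of 2). */
size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* Upper bound on how many blocks of msg_size fit in heap_size when every
 * allocation is rounded up to align. Ignores per-heap bookkeeping. */
size_t max_blocks(size_t heap_size, size_t msg_size, size_t align)
{
    return heap_size / round_up(msg_size, align);
}
```

Under those assumptions max_blocks(4096, 100, 128) is 32, so a backlog of 16 to 18 messages alone should not exhaust a 4 KB heap. That is consistent with the suspicion that buffers are being lost on free rather than simply queued up.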

  • Hi Raja,

    Regarding your HeapMemMP_free error message, could you modify the code in ti/syslink/ipc/hlos/knl/HeapMemMP.c to print out the relevant quantities (e.g. newHeader, obj->buf, size, obj->bufSize) when the error message appears on line 1390? It'd be interesting to know what these values are as a first step. You can use GT or printk, whichever you are more comfortable with for debugging. Then recompile syslink and your application, and rerun to see the output.

    Best regards,

    Vincent

  • Thank you Vincent. I will try and do that today.

    I have a couple of questions about HeapMemMP

    1. In my application (actually more specifically in this failure case), ARM side creates the MessageQ and DSP side does a MessageQ_open() on the remote queue. DSP side allocates a message in a HeapMemMP and uses MessageQ_put() to send data to the ARM side. ARM side uses MessageQ_get() to read the data and frees the message after consuming contents. Which side should create the HeapMemMP and register that heap with MessageQ? I have tried both ways (create it on the DSP side and ARM side and both fail the same way). Is it okay to allocate one side and free on the other side (I would guess so, but the example uses the same buffer to reply and it gets allocated and freed on the ARM side)?

    2. How is the heap id determined when registering heap? Where is the configuration which determines what heap ids are available? Assuming both ARM side and DSP side create a heap, can they use the same heap id?

    Cheers,

    -raja.

  • Hi Raja,

    1. Either side should be able to create the HeapMemMP and register the heap. However, this must be done prior to calling MessageQ_alloc(). It needs to know where to allocate from. And yes, you should be able to allocate from one side and free on the other.

    2. The heap id is user-defined. You as the programmer get to decide which id to use. Both sides need to use the same id to refer to the same heap. On the other hand, if the intent is for the ARM and the DSP to create separate heaps, then different ids should be used.

    Best regards,

    Vincent

  • Thank you very much Vincent.

    Heap id collision might be the problem I am seeing.

    I had originally used different heap ids (as I create one heap from the ARM side and one from the DSP side), but that would fail during alloc with an invalid heap id.

    So, I ended up using the same heap id on both sides while registering their respective heaps, and that could have led to problems: adding printks as you suggested indicates that it is trying to free from a different heap (i.e., the ARM's view of that heap id is different from the DSP's view of that heap id).

    So, I am now puzzling through the code as to why I get an invalid heap id. The default configuration seems to have 8 heaps in the MessageQ module. I am not sure if that is overridden somewhere else so that the number of heaps is actually lower. I will look into that next. The documentation does not mention a maximum number of heaps.
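
The mismatch described above can be reproduced with a toy model. The sketch below is not SysLink code: proc_view, register_heap() and free_msg() are hypothetical names, the addresses are merely picked from inside the SR_1 range, and the limit of 8 heaps is taken from the default mentioned in the post. Each processor keeps its own table from heap id to memory range, and a free is rejected when the buffer does not lie inside the range that the freeing side associates with the id.

```c
#include <stdint.h>

#define MAX_HEAPS 8   /* assumed limit, taken from the default noted above */

typedef struct {
    uint32_t base;
    uint32_t size;
    int      valid;
} heap_entry;

/* One processor's private view of the heap id table. */
typedef struct {
    heap_entry heaps[MAX_HEAPS];
} proc_view;

/* Clear every slot so that no heap id is registered. */
void init_view(proc_view *p)
{
    unsigned i;

    for (i = 0; i < MAX_HEAPS; i++) {
        p->heaps[i].valid = 0;
    }
}

/* Bind heap_id to a memory range on this processor; -1 on a bad or
 * already-used id, 0 on success. */
int register_heap(proc_view *p, unsigned heap_id, uint32_t base, uint32_t size)
{
    if (heap_id >= MAX_HEAPS || p->heaps[heap_id].valid) {
        return -1;
    }
    p->heaps[heap_id].base = base;
    p->heaps[heap_id].size = size;
    p->heaps[heap_id].valid = 1;
    return 0;
}

/* A free succeeds only if the whole buffer lies inside the heap that
 * this processor associates with heap_id. */
int free_msg(const proc_view *p, unsigned heap_id, uint32_t addr, uint32_t len)
{
    const heap_entry *h;

    if (heap_id >= MAX_HEAPS || !p->heaps[heap_id].valid) {
        return -1;
    }
    h = &p->heaps[heap_id];
    if (addr < h->base || addr + len > h->base + h->size) {
        return -1;   /* buffer is not in the range of this heap */
    }
    return 0;
}
```

If the DSP registers id 0 for a heap at 0xC2010000 and the ARM registers id 0 for a separate heap at 0xC2011000, a message allocated by the DSP fails free_msg() on the ARM. Once both views bind id 0 to the same range, the free succeeds.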

  • Vincent,

    A couple of related, but not totally on topic questions, just in case you have the answers off the top of your head.

    1. In my testing, I am rebooting the OMAP-L138 and rebooting Linux before every run. I am not able to rmmod syslink even with the -f option. Obviously, this is very time consuming. Is there any way, without rebooting, to get back to a state where I can reload the DSP app using slaveloader and restart the ARM side application?

    2. I am using CCS to connect to the DSP side (using XDS510) and get console logs.  I usually do not relaunch the target configuration; I just reconnect to the target. But the console still shows old logs every time, which makes it confusing to tell which error comes from which run:-) I tried resetting the emulator, but I still get old logs. How can I prevent old logs?

    Cheers,

    -raja.

  • Hi Vincent,

    I got the heap memory to work properly after using open from the remote side.

    Thank you very much for all your help (to all who helped Varun, Judah, Vincent and I know I am forgetting somebody here)

    If you have some answers to the questions I posted above, great. If not, please feel free to close the issue.

    Cheers,

    -raja.

  • Hi Raja,

    Good to know you managed to get your application to work. Regarding your questions:

    1. rmmod should work. Make sure you have cleaned up after syslink properly in your application on both the host and slave sides. If you exit an application while it hasn't yet released all the syslink resources (e.g. you forgot to call Ipc_control(STOPCALLBACK) in the app or you stopped the app thru ctrl-c), SysLink might be in a bad state and you might not be able to launch another Syslink application or rmmod would fail, etc.

    2. Have you tried right-clicking on the console window and selecting 'Clear'? That should clear the window.

    Best regards,

    Vincent

  • Hi Vincent,

    Thank you for all your help.

    I have not had a chance to try clearing the console yet, as I have been focusing on some other things. Also, the exception on the DSP side at startup is back after I made some more changes to the code. I never figured out what caused it originally, but it went away and I just let it be. But it is back now, and hence I am not able to make much progress without debugging that.

    I guess I have to live with rebooting Linux while I develop applications, as getting to a clean shutdown is probably going to be hard until the application is fully debugged:-)

    I am going to mark this closed and open a new one if I have more issues.

    Cheers,

    -raja.