SYSLINK/IPC troubles on OMAPL138: DSP doesn't set Ipc_PROCSYNCFINISH

Hi all,

In the past week I've made a couple of posts (here and in the Linux forum) about my troubles with SYSLINK on OMAPL138. I've made some progress since then, and this is the current situation:

I have a custom board with the OMAPL138. The ARM core is running a custom Linux (kernel 3.0.23).

I am using: bios_6_35_01_29, ipc_1_25_02_12, syslink_2_21_01_05, TI_CGT_C6000_7.4.2, xdctools_3_25_00_48.

I can compile SYSLINK and the example code and load the DSP side on my board, but the Linux side hangs as soon as I start the application on the ARM side.

I have found out, through the use of a lot of printk on the Linux side, that the driver gets stuck in Ipc_procSyncStart(), (packages/ti/syslink/ipc/hlos/knl/Ipc.c) where it never sees the DSP setting remote->startedKey to Ipc_PROCSYNCFINISH.
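
When instrumenting a wait like this, a bounded poll can help keep a debug build from spinning forever. The following is a minimal sketch, not the SysLink implementation: the function name, the poll limit and the two key macros are hypothetical, with the values 1 and 2 taken from what is reported later in this thread.

```c
#include <stdint.h>

/* Hypothetical key values for illustration; the thread later reports
 * 1 (START) and 2 (FINISH) for this handshake. */
#define PROCSYNC_START  1u
#define PROCSYNC_FINISH 2u

/*
 * Poll a shared flag until it holds `expected`, giving up after
 * `max_polls` reads. Returns the number of reads performed on success,
 * or -1 if the expected value never appeared.
 */
static int poll_started_key(const volatile uint32_t *key,
                            uint32_t expected,
                            int max_polls)
{
    int i;

    for (i = 0; i < max_polls; i++) {
        if (*key == expected)
            return i + 1;
    }
    return -1;
}
```

A return value of -1 would then be the place to log the last value seen and bail out, instead of hanging inside the driver.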

I'm going to move the investigation to the DSP side, but does anyone have any suggestions to speed this up?

Thanks

  • It would be good to see what the DSP code might be doing.

    The Ipc_attach() call on the DSP core should set the bit needed for remote->startedKey to be set (e.g. to Ipc_PROCSYNCFINISH) and for Ipc_procSyncStart() to return successfully.  Make sure the cache settings are set up correctly in the DSP-side configuration.  It's possible that, due to caching, the bit never gets written back to memory, so the GPP core never sees it.

  • I've been trying to look at the DSP side by doing the following:

    1. On the ARM, ./slaveloader_debug startup DSP server_dsp.xe674 to load and start the DSP

    2. Connect CCS to the DSP, set a breakpoint in ti_sdo_ipc_Ipc_procSyncFinish and then let it resume running.

    3. Now run app_host DSP on the ARM side, which calls Ipc_control(Ipc_CONTROLCMD_STARTCALLBACK), which should get the DSP to run Ipc_attach() etc.

    I do indeed see the breakpoint being hit, but afterwards I can't follow what happens on the DSP from the CCS debugger, probably because the IPC code on the DSP is built with optimizations.

    How do I recompile IPC without optimizations?

  • If you are running any of the examples provided in SysLink (e.g. ex02_messageQ), you should be able to step into the application's C source code in CCS, provided you load the debug version in your step #1 (above).  Make sure you load the DSP application symbols (select Run->Load->Load Symbols from the toolbar in CCS).

    Once you connect to the device (Step #2 above), the application should be spinning on Ipc_start.

    Then proceed with Step #3 (above); this should break the application out of the Ipc_start() call.

    Subsequently, Ipc_attach() is called, which should set a few bits and then wait for the host-side application to set its own bits at the top of SharedRegion 0 (e.g. at address 0xc6000000 in this example) for handshaking.  The address of SR0 depends on your DSP-side application's configuration.

  • A bit more to add: Ipc_start() returns Ipc_E_NOTREADY until the host side toggles a bit, which can be viewed in memory through the variable Ipc_sr0MemorySetup.

    This bit should be toggled when the host-side calls Ipc_control(Ipc_CONTROLCMD_STARTCALLBACK).

  • Arnie,

    I am running the debug version of the example (in fact I am running ex01_helloworld) and I have no problem stepping through the code of the example itself.

    But at the breakpoint I want to follow what happens inside the IPC code. That code (IPC) is optimized, so it is not easy to follow the sequence of events.

  • Using printk in the driver on the Linux side and the CCS debugger on the DSP side, it seems that there is a mismatch of addresses between the two sides.

    The Linux driver, inside Ipc_procSyncStart has the following addresses:

    Ipc_procSyncStart: &(self->startedKey)  =0xc67a0080
    Ipc_procSyncStart: &(remote->startedKey)=0xc67a0000

    while on the DSP side, inside ti_sdo_ipc_Ipc_procSyncFinish, I see, using the CCS debugger:

    &(self->startedKey)  :0xc6000000
    &(remote->startedKey):0xc6000080

    It looks like there is a mismatch on the base address.

  • I think my previous post is not very useful. The addresses printed out in the driver are of course virtual.

    I am loading the driver with the trace enabled so I can indeed see that it prints out the following:

    [   67.730004] MemoryOS_map: pa=0xc6000000, va=0xc67a0000, sz=0x10000

    So there is no mismatch: 0xc6000000 on the DSP (physical address) is the same as 0xc67a0000 in the driver (virtual address).

    back to the chase...
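
The address arithmetic behind that conclusion can be sketched as follows. This is only an illustration: struct mem_map and va_to_pa are hypothetical names, and the mapping values are the ones printed in the MemoryOS_map trace line above.

```c
#include <stdint.h>

/* One physical-to-virtual mapping, as reported by a MemoryOS_map trace line. */
struct mem_map {
    uint32_t pa;   /* physical base (the address the DSP uses) */
    uint32_t va;   /* kernel virtual base (the address the driver prints) */
    uint32_t size; /* length of the mapping in bytes */
};

/*
 * Translate a kernel virtual address back to its physical address.
 * Returns 0 and stores the result in *pa_out on success, or -1 if
 * `va` lies outside the mapping.
 */
static int va_to_pa(const struct mem_map *m, uint32_t va, uint32_t *pa_out)
{
    if (va < m->va || va - m->va >= m->size)
        return -1;
    *pa_out = m->pa + (va - m->va);
    return 0;
}
```

With the mapping pa=0xc6000000, va=0xc67a0000, sz=0x10000, the driver's 0xc67a0080 translates to 0xc6000080, which is the address the DSP reports for its remote->startedKey.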

  • So now I've determined that the startedKey flag for the DSP resides at 0xC6000000 (physical address).

    This will be set to 1 (START) and then 2 (FINISH) by the DSP.

    If I look on the DSP side using the CCS debugger, I do indeed see these values being set.

    The Linux driver, however, does not seem to see the transition from 1 to 2. But if I look 'manually' on Linux, by building a small program that opens /dev/mem and mmaps the page at 0xC6000000, I DO SEE the value 2 there.

    It seems some kind of problem with caching inside the Linux driver (?)
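
A small user-space check of this kind can be sketched roughly as below. It is an assumption-laden illustration rather than the program actually used: the helper names are hypothetical, it needs root and a kernel that permits /dev/mem access to that range, and it reads a single aligned 32-bit word.

```c
#define _FILE_OFFSET_BITS 64 /* keep off_t 64-bit so high addresses fit on 32-bit ARM */

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a physical address down to the start of its page. */
static uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    return phys & ~(page_size - 1);
}

/* Offset of a physical address within its page. */
static uint64_t page_offset(uint64_t phys, uint64_t page_size)
{
    return phys & (page_size - 1);
}

/*
 * Read one 32-bit word at physical address `phys` through /dev/mem.
 * Returns 0 on success, -1 on any failure (unaligned address,
 * open or mmap error).
 */
static int read_phys_u32(uint64_t phys, uint32_t *out)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    void *map;
    int fd;

    if (phys & 3u)
        return -1;

    fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0)
        return -1;

    map = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd,
               (off_t)page_base(phys, page_size));
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    *out = *(volatile uint32_t *)((char *)map + page_offset(phys, page_size));
    munmap(map, (size_t)page_size);
    return 0;
}
```

Calling read_phys_u32() with the physical address of startedKey and printing the result gives a view of the flag that is independent of the driver's own mapping.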

  • I'm not aware of the Linux kernel setting certain memory regions as cacheable, but you can tell SysLink which regions are cached to see if that helps.

    By default, SysLink assumes that all SharedRegions are non-cached on the host side.  You can tell SysLink which SRs are cache enabled by setting the SysLink_params global before the call to SysLink_setup in the application.  This ensures that the proper cache writeback/invalidate calls are made on the host side.

    Add the following line to your application's host-side code before the call to SysLink_setup:

     #include <ti/syslink/SysLink.h>    

    /* Enable SharedRegion0 to be cache enabled */

    SysLink_params = "SharedRegion.entry[0].cacheEnable=TRUE;";  

    SysLink_setup();
  • Arnie,

    I found I was following a false trail. I had put my printk in the driver without taking into account the control flow.

    Now I see that the startedKey is indeed seen as 2 (Ipc_PROCSYNCFINISH).

    I've now proceeded (with greater caution, to avoid chasing a ghost again) and I have followed the flow up to Notify_attach. The driver seems to get stuck (taking the whole kernel with it) inside this function (possibly inside NotifyDriverShm_create, but I'm not yet sure of this).

    To speed up the debugging, once the board locks up, is there a way of looking at the printk buffer, from Code Composer?

  • It seems that the board hangs when 'request_irq' is called. I have added a couple of printk calls in OsalIsr_install [ utils/hlos/knl/osal/Linux/OsalIsr.c ], around the call to request_irq:

    printk(KERN_EMERG ">>>>>>>> Before request_irq\n");
    return -99;
    osStatus = request_irq (isrObj->irq,
                            (Void*) &OsalIsr_callback,
                            0,
                            "SYSLINK",
                            (Void *) isrObj);
    printk(KERN_EMERG ">>>>>>>> After request_irq\n");
    return -99;

    If I leave the first return -99;, the code fails of course, but the board doesn't hang and I can see the string ">>>>>>>> Before request_irq" in the printk buffer with dmesg, which proves that this point is reached.

    If I remove that first return -99;, the board hangs.

  • claudio potenza said:
    To speed up the debugging, once the board locks up, is there a way of looking at the printk buffer, from Code Composer?

    Unfortunately, I'm not aware of any.

  • My problems are connected to NOT clearing IRQ 28 (bit 0 of SYSCFG0/CHIPSIG, the first DSP->ARM interrupt).

    Using the debugger I verified that just before request_irq was called, the bit was set.

    If I hack the code to clear it (write 1 to SYSCFG0/CHIPSIG_CLR) before calling request_irq, the board doesn't hang anymore.

    I've tried hacking the clearing of the bit inside OsalIsr_callback as well, but I still cannot get the example to run correctly (but at least the board doesn't hang anymore!).

    Where should the interrupt be acknowledged? I guess it should be in the kernel code, not the SYSLINK code. If this is the case, where?

    Note that I am not running the kernel 3.1.10 provided inside the ti-sdk-omapl138-lcdk-01.00.00 archive but our custom kernel 3.0.23, so I want to see if our kernel is missing the ack of this interrupt.
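
The workaround described in this post (writing 1 to the CHIPSIG_CLR bit before request_irq is called) can be sketched as below. This is a hedged illustration only: the helper names are hypothetical, the register offsets are assumptions that should be checked against the OMAP-L138 TRM, and the helpers take a plain pointer so they can be exercised without hardware.

```c
#include <stdint.h>

/* Assumed register offsets within SYSCFG0; verify against the OMAP-L138 TRM. */
#define CHIPSIG_OFFSET      0x174u
#define CHIPSIG_CLR_OFFSET  0x178u

/* Bit 0 of CHIPSIG is the first DSP->ARM interrupt (IRQ 28 in this thread). */
#define CHIPSIG0_BIT        0u

/* Return nonzero if `bit` is pending in a value read from CHIPSIG. */
static int chipsig_pending(uint32_t chipsig_value, unsigned bit)
{
    return (int)((chipsig_value >> bit) & 1u);
}

/*
 * Write-1-to-clear: storing a 1 in bit `bit` of CHIPSIG_CLR clears the
 * matching bit in CHIPSIG. Only the requested bit is written.
 */
static void chipsig_clear(volatile uint32_t *chipsig_clr_reg, unsigned bit)
{
    *chipsig_clr_reg = 1u << bit;
}
```

In a kernel driver the pointer would come from an ioremap of the SYSCFG0 block, and the store would normally go through an accessor such as writel; the point here is only the ordering, i.e. clear any stale pending bit before the handler is installed.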

  • Correct, the IRQ acknowledgement is somewhere in the kernel code.  I'm not intimately familiar with the kernel IRQ code, but you can start by looking at the ARM-specific implementation at:  ./arch/arm/kernel/irq.c

    BTW:  This might be a good question for the experts of the Embedded Software Linux forum.