How do I debug DSP_EFAIL on OMAP 3530?

I am trying to run the DSPLINK samples on a Beagleboard (OMAP 3530), and I'm getting a DSP_EFAIL error that I don't know how to troubleshoot.  With SET_FAILURE_REASON on, I can see that it's failing in PROC_attach, in the DRV_Invoke call.  In the console window, I see a message about DSP_init status being 0x80008008, the DSP_EFAIL code.  I presume this all means that I'm not successfully communicating with the dsplinkk driver.  As best I can tell, the memory map settings agree in CFG_OMAP3530_SHMEM.c and in dsplink-omap3530-base.tci.  I did not change them from their defaults in dvsdk_3_00_02_44, which contains DSPLINK 1.61.03.  I have tried booting Linux on the ARM with mem=100M and mem=126M - the defaults seem to me to assume 128M of memory available (although the Beagle has 256M).
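
Because the mem= question comes down to where Linux's memory ends relative to the region reserved for DSPLINK, a small shell sketch of the arithmetic may help. It assumes SDRAM starts at physical address 0x80000000 (the OMAP3530 SDRAM base) and that the mem= value is in megabytes of 1024*1024 bytes. The function name linux_mem_end is made up for illustration and is not part of DSPLINK.

```shell
# Hypothetical helper: print the first physical address above the RAM
# that Linux owns, given the RAM base and the mem= value in megabytes.
linux_mem_end() {
    base=$1      # assumed SDRAM base, e.g. 0x80000000
    mbytes=$2    # e.g. 126 for mem=126M
    printf '0x%x\n' $(( $base + $mbytes * 1024 * 1024 ))
}

linux_mem_end 0x80000000 126   # prints 0x87e00000
linux_mem_end 0x80000000 100   # prints 0x86400000
```

The printed address is the first byte Linux will not manage. It can be compared by eye against the start addresses configured in CFG_OMAP3530_SHMEM.c and dsplink-omap3530-base.tci to confirm that the two regions do not overlap.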

My toolchain is Montavista 5, with the 2.6.29 kernel.  I am able to load the dsplinkk.ko and lpm_omap3530.ko modules, and the lpmON and lpmOFF pre-built executables output messages that seem normal.  I'm using DSP BIOS 5.33.06.

I am following the instructions in the DSPLINK installation guide and on the wiki, as best I can, but am stumped about how to proceed.  My real goal is to use Codec Engine, but it seemed like I ought to get DSPLINK working first.

Can you give me advice on what to check, or help with understanding what's going on?

Thanks,

Reid Rowlett
Zeta Associates, Inc.

  • Reid,

    The SET_FAILURE_REASON will give you the exact line of code in which the failure occurs. The first SET_FAILURE_REASON print usually indicates the original cause of failure. There should be more kernel prints which will tell you where the failure occurred. You need to go through the steps detailed at http://processors.wiki.ti.com/index.php/Debugging_DSPLink_using_SET_FAILURE_REASON_prints to see how to interpret the failure.

    For further debugging tips, you can also enable kernel-level trace to see which function is failing: http://processors.wiki.ti.com/index.php/Enabling_trace_in_DSPLink

    I am assuming that the insmod of the DSPLink kernel module succeeded and the memory maps are OK. There is a similar post at http://e2e.ti.com/support/arm174_microprocessors/omap_applications_processors/f/42/p/35987/126269.aspx#126269 . Are you also seeing the same failure?

    Deepali

  • Deepali,

    Thanks for the reply.  I have turned on SET_FAILURE_REASON and although not every failure code has generated a line number message, there is enough that I believe I know what source lines are responsible.  I don't see any messages from the kernel side, although my DSPLINK 1.61.03 code already had the PRINT_Printf line that the wiki page "Enabling Trace in DSPLink" says to add.  Here are the messages I see:

    root@beagle2:/software/ti/dsplink/examples2# insmod dsplinkk.ko
    root@beagle2:/software/ti/dsplink/examples2# insmod lpm_omap3530.ko
    root@beagle2:/software/ti/dsplink/examples2# ./loopgpp loop.out 1024 10000
    =============== Sample Application : LOOP ==========
    ==== Executing sample for DSP processor Id 0 ====
    Entered LOOP_Create ()
    Entered PROC_setup ()
            linkCfg [0x0]
    Entered DRV_Initialize ()
            drvObj  [0x2e7a8]
            arg     [0x0]
    Leaving DRV_Initialize ()       status [0x8000]
    Entered DRV_ProtectInit ()
            drvObj  [0x2f008]
    Leaving DRV_ProtectInit ()      status [0x8000]
    Entered DRV_ProtectEnter ()
            drvObj  [0x2f008]
    Leaving DRV_ProtectEnter ()     status [0x8000]
    Entered DRV_Invoke ()
            drvObj  [0x2f008]
            cmdId   [0x6c01]
            arg1    [0xbee4bb60]
            arg2    [0x0]
    Entered DRV_installCleanupRoutines ()
            linkCfgPtr      [0x2e774]
    Leaving DRV_installCleanupRoutines ()
    osStatus: 0
    Entered _POOL_init ()
    Leaving _POOL_init ()
    Status: 8000
    Leaving DRV_Invoke ()   status [0x8000]
    Entered _MEM_USR_init ()
    Leaving _MEM_USR_init ()        status [0x8000]
    Entered _IDM_USR_init ()
    Entered DRV_Invoke ()
            drvObj  [0x2f008]
            cmdId   [0x7351]
            arg1    [0xbee4bb34]
            arg2    [0x0]
    Status: 8000
    Leaving DRV_Invoke ()   status [0x8000]
    Leaving _IDM_USR_init ()        status [0x8000]
    Entered _SYNC_USR_init ()
    Entered _IDM_USR_create ()
            key     [0x10080]
            attrs   [0xbee4bb3c]
    Entered DRV_Invoke ()
            drvObj  [0x2f008]
            cmdId   [0x7353]
            arg1    [0xbee4bb14]
            arg2    [0x0]
    Status: 8000
    Leaving DRV_Invoke ()   status [0x8000]
    Leaving _IDM_USR_create ()      status [0x8000]
    Leaving _SYNC_USR_init ()       status [0x8000]
    Entered _SYNC_USR_createCS ()
            idKey   [0x2401c]
            csObj   [0x2e7b0]
    Entered _IDM_USR_acquireId ()
            key     [0x10080]
            idKey   [0x2401c]
            id      [0xbee4bb40]
    Entered DRV_Invoke ()
            drvObj  [0x2f008]
            cmdId   [0x7355]
            arg1    [0xbee4bb14]
            arg2    [0x0]
    Status: 8000
    Leaving DRV_Invoke ()   status [0x8000]
    Leaving _IDM_USR_acquireId ()   status [0x8000]
    Leaving _SYNC_USR_createCS ()   status [0x8000]
    Entered DRV_ProtectLeave ()
            drvObj  [0x2f008]
    Leaving DRV_ProtectLeave ()     status [0x8000]
    Entered PROC_resetCurStatus ()
    Leaving PROC_resetCurStatus ()
    Leaving PROC_setup ()   status [0x8000]
    Entered PROC_attach ()
            procId  [0x0]
            attr    [0x0]
    Entered DRV_Initialize ()
            drvObj  [0x2e7a8]
            arg     [0x0]
    Entered _SYNC_USR_enterCS ()
            csObj   [0x2f020]
    Leaving _SYNC_USR_enterCS ()    status [0x8000]
    Entered _SYNC_USR_leaveCS ()
            csObj   [0x2f020]
    Leaving _SYNC_USR_leaveCS ()    status [0x8000]
    Leaving DRV_Initialize ()       status [0x8000]
    Entered _SYNC_USR_enterCS ()
            csObj   [0x2f020]
    Leaving _SYNC_USR_enterCS ()    status [0x8000]
    Entered DRV_Invoke ()
            drvObj  [0x2f008]
            cmdId   [0x6c08]
            arg1    [0xbee4bb5c]
            arg2    [0x0]
    Status: 80008008
    Leaving DRV_Invoke ()   status [0x80008008]
    Entered _SYNC_USR_leaveCS ()
            csObj   [0x2f020]
    Leaving _SYNC_USR_leaveCS ()    status [0x8000]
    Leaving PROC_attach ()  status [0x80008008]
    PROC_attach failed . Status = [0x80008008]
    Leaving LOOP_Create ()
    ...

    There are more errors, but this is the first set.  My analysis is that the errors come from the ioctl() call in DRV_Invoke on line 927 of gpp/src/api/Linux/drv_api.c, which was called from PROC_attach, line 518 of gpp/src/api/proc.c, which was in turn called from LOOP_Create, line 231 of gpp/src/samples/loop/loop.c.  I'm not as sure about the kernel side, but I believe the console message comes from DSP_init, which calls an init method of DSP_State[dspId].interface at line 154 of gpp/src/arch/dsp.c, which was called by LDRV_PROC_init at line 365 of gpp/src/ldrv/ldrv_proc.c.  My best guess is that the init method being called is OMAP3530_phyShmemInit at line 141 of gpp/src/arch/OMAP3530/shmem/Linux/omap3530_phy_shmem.c.  It's not so clear to me where the DSP_EFAIL status code is coming from on this side.

    Back on the user side, I believe the ioctl itself returns a normal status.  The error is coming back through its args->apiStatus parameter.

    Despite the fact I can follow the code, I don't have much insight on where to look to track down the problem.

    Reid
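
Picking the first failing call out of a long user-side trace like the one above can be automated with a short filter. This is a minimal sketch, assuming 0x8000 is the only success value that matters (it is the only one that appears in this trace; any other success code would be reported as a false positive). The name first_failure is hypothetical.

```shell
# Hypothetical helper: read a DSPLink user-side trace on standard input and
# print the first "Leaving ..." line whose status is not 0x8000.
first_failure() {
    grep 'Leaving .*status \[0x' | grep -v -F 'status [0x8000]' | head -n 1
}

# Demonstration on three lines copied from the trace above.
printf '%s\n' \
    'Leaving DRV_Initialize ()       status [0x8000]' \
    'Leaving DRV_Invoke ()   status [0x80008008]' \
    'Leaving PROC_attach ()  status [0x80008008]' | first_failure
```

On the target, the same filter could be fed the sample's output directly, for example `./loopgpp loop.out 1024 10000 2>&1 | first_failure`. It only narrows down the user-side call; the underlying kernel-side cause still has to come from the kernel prints.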

     

  • Reid,

    It's great that you took the time to dig into the code. You traced it correctly all the way except for the last function. The init function is a device-specific function, OMAP3530_init, called in D:\Users$dsplink\gpp\src\arch\OMAP3530\omap3530_hal.c, which does the hardware initialization, including the call to OMAP3530_phyShmemInit.

    Ideally, you should not have to do all this. Enabling SET_FAILURE_REASON or enabling kernel side prints should tell you the exact line/function at which failure has occurred.

    The log that you have attached shows only user-side prints. The failure is happening in the kernel logic (somewhere in the functions embedded in DSP_init). Where are the kernel prints? Does dmesg show them? The kernel prints will give the line of the actual failure.

    I suspect either wrong mem args or a conflict in the interrupt number for DSP MMU faults. However, you will need a way to get the kernel prints, either by SET_FAILURE_REASON or by enabling kernel-side prints, so we can find out the actual failure.

    Deepali


  • Deepali,

    Thanks - you were correct that it was an interrupt conflict.  I realized after looking further at the code that on the kernel side, only DEBUG builds will produce SET_FAILURE_REASON prints.  When I loaded the DEBUG version of the dsplinkk.ko module, I saw prints from the kernel side, and was able to see that the initial error message was from the failure of request_irq.  I then checked interrupt 28, which was the issue in a previous link that you had cited for me, and found that DSPBridge was also using that interrupt.  DSPBridge had been built into the kernel that I was using.  Once I rebuilt my kernel without DSPBridge support, LOOPGPP ran to completion.

    Reid

  • Reid,

    Could you give some details (for my benefit as well as for other customers who might run into the same issue) on how you removed DSPBridge from the kernel you were using?

    Deepali

  • Deepali,

     

    I used "cat /proc/interrupts" to look at the assigned interrupts, and saw (among others):

     

    26:          0        INTC  DspBridge  mailbox
    28:          0        INTC  DspBridge  iommu fault

     

    You had provided a link where someone else had an interrupt 28 conflict with a camera device.  I knew that DSPBridge was in my setup, but I was not aware that it was configured into the kernel.  That link also mentioned that CONFIG_MPU_BRIDGE=y is the config entry that causes DspBridge interrupts to be assigned.  I verified that I had that config setting (zcat /proc/config.gz | grep BRIDGE), and then realized I needed to rebuild my kernel.
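
The /proc/interrupts check described above can be wrapped in a small function that reports who, if anyone, already holds a given IRQ line. This is a sketch only: irq_owner is a made-up name, and it assumes the usual /proc/interrupts layout in which the first field is the IRQ number followed by a colon.

```shell
# Hypothetical helper: read /proc/interrupts-style text on standard input and
# print the line for one IRQ number. Exits non-zero if the IRQ is unclaimed.
irq_owner() {
    awk -v irq="$1" '$1 == (irq ":") { print; found = 1 } END { exit !found }'
}

# Demonstration on the two lines quoted in this post.
sample='26:          0        INTC  DspBridge  mailbox
28:          0        INTC  DspBridge  iommu fault'

if printf '%s\n' "$sample" | irq_owner 28; then
    echo "IRQ 28 is already claimed"
fi
```

On the board itself the input would be the real file, for example `irq_owner 28 < /proc/interrupts`, run before loading dsplinkk.ko. A line naming another driver means the driver's request_irq for that interrupt is likely to fail.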

     

    In "make xconfig" for my kernel, there was a checkbox for "DSPBridge support" or something close to that (I'm not in front of my development machine right now).  I simply unchecked it, saved the config and rebuilt the kernel.
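
After rebuilding, the config check from earlier in this post can be repeated to confirm the option is really gone. The sketch below assumes the standard kernel config text format, where an enabled option reads `NAME=y` (or `NAME=m`) and a disabled one reads `# NAME is not set`. The function name config_enabled is hypothetical.

```shell
# Hypothetical helper: succeed if the named kernel config option is set to
# y or m in the config text supplied on standard input.
config_enabled() {
    grep -q -E "^$1=(y|m)\$"
}

# Demonstration on the two forms a config line can take.
if printf '%s\n' 'CONFIG_MPU_BRIDGE=y' | config_enabled CONFIG_MPU_BRIDGE; then
    echo "DSPBridge is built into this kernel"
fi
if ! printf '%s\n' '# CONFIG_MPU_BRIDGE is not set' | config_enabled CONFIG_MPU_BRIDGE; then
    echo "DSPBridge is not enabled"
fi
```

On a running target with /proc/config.gz available, the input would come from `zcat /proc/config.gz | config_enabled CONFIG_MPU_BRIDGE`.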

     

    There are a couple of other items I should also let you know about.  I had said before that SET_FAILURE_REASON prints were only available in the DEBUG build of dsplinkk.ko, but the DEBUG target had a compile error in DSPLINK version 1.61.03 that I was using.

     

    export DSPLINK=/home/beagle2/software/ti/dsplink
    cd $DSPLINK/gpp/src
    make debug
    ...
    /home/beagle2/software/ti/dsplink/gpp/src/../../gpp/src/osal/Linux/2.6.18/dpc.c: In function ‘DPC_Debug’:
    /home/beagle2/software/ti/dsplink/gpp/src/../../gpp/src/osal/Linux/2.6.18/dpc.c:561: error: ‘struct DpcObject_tag’ has no member named ‘exit’
    make[2]: *** [/home/beagle2/software/ti/dsplink/gpp/src/../../gpp/src/osal/Linux/2.6.18/dpc.o] Error 1

     

    The structure is defined in this same source file, and it doesn't contain an 'exit' member, so I just commented out the offending lines that reference it - they're in a routine that tries to print some debug information.  It then built OK.  I haven't checked if this is an issue in more recent DSPLINK versions, but if it is, it needs to be fixed.

    The fact that only the DEBUG target of the kernel module can do debug prints also ought to be part of the Wiki page that describes how to turn on trace ("Debugging DSPLink using SET FAILURE REASON prints").

     

    I also find myself wondering why DSP_EFAIL is used for all these various errors.  It would be really simple in the code to have returned a unique error code for this case, and saved having to dig nearly as deeply as I had to do.  DSPLINK has the infrastructure in place to pass back a meaningful error code, but for some reason DSP_EFAIL is being used as a catch-all for a lot of vastly different errors on both the user and kernel sides.

     

    Reid

     

  • Reid,

    I appreciate your detailed response and your inputs on DSPLink. I am sure the steps you have given will be useful for other customers.

    Regarding the DSPLink inputs:

    • DSPLink 1.61.03 had the build error that you have seen. This error is not in the debug build but it is seen when TRACE is enabled. This has been fixed in later releases.
    • I will update the wiki page as you have suggested.
    • We have tried to use specific error codes as much as possible. The generic codes have been used with the expectation that SET_FAILURE_REASON will give specifics. But I do understand the problem that customers face in this. For older releases, nothing can be done about it. I will see if I can take this input for upcoming releases.

    Deepali