firmware_loader error when attempting to stop DSP on DM814x

Other Parts Discussed in Thread: OMAP-L138

Hi,

I am trying to stop the DSP on the DM814x as part of implementing a low-power mode.  I find that the firmware_loader stop operation fails if both video and audio have been played simultaneously.  I was able to reproduce this issue with the stock EZSDK examples from ti-ezsdk_dm814x-evm_5_05_02_00, without modification.

Procedure:

1. Start EVM as you would normally.

2. Wait for TI's Qt matrix app to start.

3. Load DSP code using the following lines:

# cd /usr/share/ti/rpe
# firmware_loader 0 ./dm81xx_c6xdsp_debug.xe674 start

4. Start video decode+display as background process:

# /usr/bin/runDecodeDisplayHDMI &

5. Start audio decode while video is still being decoded:

# ./aacdec_a8host_debug.xv5T -i aacdectest.cfg

6. Wait for video display to finish.

7. Attempt to stop DSP firmware:

# firmware_loader 0 /usr/share/ti/rpe/dm81xx_c6xdsp_debug.xe674 stop
FIRMWARE: I2cInit will be done by M3
FIRMWARE: Memory map bin file not passed
Usage : firmware_loader <Processor Id> <Location of Firmware> <start|stop> [-mmap <memory_map_file>] [-i2c <0|1>]
===Mandatory arguments===
<Processor Id> 0: DSP, 1: Video-M3, 2: Vpss-M3
<Location of Firmware> firmware binary file
<start|stop> to start/stop the firmware
===Optional arguments===
-mmap input memory map bin file name
-i2c 0: i2c init not done by M3, 1(default): i2c init done by M3
FIRMWARE: isI2cInitRequiredOnM3: 1
FIRMWARE: Default memory configuration is used
Firmware Loader debugging not configured
Default FL_DEBUG: warning
AllowAssertion at Line no: 1799 in /swcoe/sdk/cm/netra/arago-tmp/work/dm814x-evm-none-linux-gnueabi/ti-syslink-2_20_02_20-r1j/syslink_2_20_02_20/packages/ti/syslink/utils/hlos/knl/Linux/../../../../../../ti/syslink/ipc/hlos/knl/MessageQ.c: (MessageQ_module->heaps [heapId] != NULL) : failed
ed FL_DEBUG levels: error, warning, info, debug, log
MemCfg: DCMM (Dynamically Configurable Memory Map) Version : 2.1.2.1

Hopefully, you can see the same error.  Do I need to do something specially to successfully stop the DSP?

FYI, I initially submitted this issue to the DM814x forum, but they could not solve it due to "lack of resources".

Thanks,

Dan -

  • Daniel70334 said:
    FIRMWARE: Memory map bin file not passed
    Usage : firmware_loader <Processor Id> <Location of Firmware> <start|stop> [-mmap <memory_map_file>] [-i2c <0|1>]
    ===Mandatory arguments===
    <Processor Id> 0: DSP, 1: Video-M3, 2: Vpss-M3
    <Location of Firmware> firmware binary file
    <start|stop> to start/stop the firmware
    ===Optional arguments===
    -mmap input memory map bin file name
    -i2c 0: i2c init not done by M3, 1(default): i2c init done by M3

    FYI, in case anyone was wondering, the "usage" printed above should not really be printed in this case.  It's a problem with firmware_loader, where it prints the "usage" when no -mmap option is passed.  But it keeps going even though it "thinks" there's a problem with the command usage (which, in this case, there isn't).

    Daniel70334 said:
    Assertion at Line no: 1799 in /swcoe/sdk/cm/netra/arago-tmp/work/dm814x-evm-none-linux-gnueabi/ti-syslink-2_20_02_20-r1j/syslink_2_20_02_20/packages/ti/syslink/utils/hlos/knl/Linux/../../../../../../ti/syslink/ipc/hlos/knl/MessageQ.c: (MessageQ_module->heaps [heapId] != NULL) : failed

    The firmware_loader program, when passed the "stop" command, attempts to do MessageQ_unregisterHeap(2), which results in your assertion.  Apparently MessageQ's heapId of 2 is not currently registered.  The firmware_loader program registers heapId 2 when passed the "start" command.  The heap being registered is the heap obtained from the following code in firmware_loader.c:
        heapHandle = SharedRegion_getHeap(0);
        MessageQ_registerHeap(heapHandle, 2);

    Daniel70334 said:
    I find that the firmware_loader stop operation fails if both video and audio have been played simultaneously

    We don't really know if heapHandle was properly registered, since the return value is not checked.  But, assuming it was successful, is it possible that either your audio app or video app (that you run between firmware start and stop) unregisters heapId 2?

    I ask because MessageQ heapIds are system-wide and can be managed by any application.

    As for failing to stop the DSP, firmware_loader "thinks" it has stopped the DSP (per another email I got forwarded, where it prints success w/ "FIRMWARE: 0 stop Successful"), but apparently your register snooping has shown that the DSP is not in fact powered down.  I know that w/ the OMAP-L138 the DSP won't power down until it executes the IDLE instruction, and if the (prebuilt) DSP firmware never terminates then it won't power down.  Do you know if there is a FORCE bit associated with the DSP powerdown?  I've used this bit on the OMAP-L138 in Linux kernel code, and it forces the power down of the DSP clock even in the case where the DSP doesn't execute IDLE.

    I'm not sure the EZSDK gives you the ability to rebuild the DSP firmware (since it's in the prebuilt-images area), but if you could rebuild it you could have the ARM application tell the DSP to "abort" (w/ SYS/BIOS's System_abort() API), which should cause it to go to an IDLE instruction.  Or, you could insert a SYS/BIOS Idle function into the Idle module, where this function just executes IDLE (perhaps with the asm("IDLE"); statement), assuming that the DSP executable enters the Idle loop at all.

    Regards,

    - Rob

     

  • Rob,

    First, thanks for responding.

    Robert Tivy said:

    FIRMWARE: Memory map bin file not passed
    Usage : firmware_loader <Processor Id> <Location of Firmware> <start|stop> [-mmap <memory_map_file>] [-i2c <0|1>]
    ===Mandatory arguments===
    <Processor Id> 0: DSP, 1: Video-M3, 2: Vpss-M3
    <Location of Firmware> firmware binary file
    <start|stop> to start/stop the firmware
    ===Optional arguments===
    -mmap input memory map bin file name
    -i2c 0: i2c init not done by M3, 1(default): i2c init done by M3

    FYI, in case anyone was wondering, the "usage" printed above should not really be printed in this case.  It's a problem with firmware_loader, where it prints the "usage" when no -mmap option is passed.  But it keeps going even though it "thinks" there's a problem with the command usage (which, in this case, there isn't).

    Indeed.  We often wonder why TI folks don't find this as annoying as we do, and remove the usage() invocation.

    Robert Tivy said:

    We don't really know if heapHandle was properly registered, since the return value is not checked.

    I added some logging to firmware_loader:

    ====> MessageQ_registerHeap(0x984f8,2) returned 0(S_SUCCESS )

    So it gives the appearance of a heap being successfully registered.

    Robert Tivy said:

    But, assuming it was successful, is it possible that either your audio app or video app (that you run between firmware start and stop) unregisters heapId 2?

    I ask because MessageQ heapIds are system-wide and can be managed by any application.

    I added some prints to component-sources/syslink_2_20_02_20/packages/ti/syslink/ipc/hlos/knl/MessageQ.c to log the MessageQ_registerHeap() and MessageQ_unregisterHeap() invocations.  I found that the heapId 2 is indeed being both registered and unregistered by the video sample!  More accurately, it is being registered by OMX_Init() and unregistered by OMX_Deinit().  Details:

    1. video app calls OMX_Init()
    2. ... which calls DomxInit()
    3. ... which calls DomxCore_procInit()
    4. ... which calls OmxRpc_moduleRegisterMsgqHeap()
    5. ... which calls MessageQ_registerHeap()

    OMX_Init() ends up registering heapId 4,5, and then 2.  Then, when the program is shut down, OMX_Deinit() will unregister all 3 heapIds.  So then I guess that the heap isn't around anymore for firmware_loader to stop.
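    The sequence described above can be replayed with a small host-side model.  This is only a sketch: register_heap(), unregister_heap() and the HEAP_* codes are hypothetical stand-ins for MessageQ_registerHeap(), MessageQ_unregisterHeap() and the SysLink status codes, and the table size is arbitrary.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the SysLink status codes (not the real values). */
enum { HEAP_OK = 0, HEAP_E_INVALID = -1, HEAP_E_ALREADYEXISTS = -2, HEAP_E_NOTFOUND = -3 };

#define MAX_HEAPS 8

/* One table for the whole host: every application shares the same heapIds. */
static const void *heaps[MAX_HEAPS];

static int register_heap(const void *heap, unsigned id)
{
    if (heap == NULL || id >= MAX_HEAPS) return HEAP_E_INVALID;
    if (heaps[id] != NULL) return HEAP_E_ALREADYEXISTS;
    heaps[id] = heap;
    return HEAP_OK;
}

static int unregister_heap(unsigned id)
{
    /* This is the condition the MessageQ.c assertion in the thread checks. */
    if (id >= MAX_HEAPS || heaps[id] == NULL) return HEAP_E_NOTFOUND;
    heaps[id] = NULL;
    return HEAP_OK;
}

/* Replays the thread's sequence; returns the status seen by "firmware_loader stop". */
static int replay_thread_sequence(void)
{
    static const int loader_heap = 0, omx_heap = 0;   /* dummy heap objects */

    for (unsigned i = 0; i < MAX_HEAPS; i++) heaps[i] = NULL;

    register_heap(&loader_heap, 2);   /* firmware_loader start: succeeds        */
    register_heap(&omx_heap, 4);      /* OMX_Init()                              */
    register_heap(&omx_heap, 5);
    register_heap(&omx_heap, 2);      /* already exists; result is ignored       */
    unregister_heap(4);               /* OMX_Deinit() unregisters all three ...  */
    unregister_heap(5);
    unregister_heap(2);               /* ... including the loader's heapId 2     */

    return unregister_heap(2);        /* firmware_loader stop: nothing left      */
}
```

    In this model the final call returns HEAP_E_NOTFOUND, which corresponds to the point where the real kernel module prints the assertion.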

    So ... is this an OMX bug?  Or, should the firmware_loader's DSP start/stop have been using a heapId that does not overlap other applications?  

    Does the firmware_loader's choice of heapId have to be changed in concert with the OMX's or the DSP code's choices?  I.e. do multiple places in the EZSDK have to be matched, or should they in fact all try to be unique, like a dynamically allocated resource?

    Robert Tivy said:

    Do you know if there is a FORCE bit associated with the DSP powerdown?  I've used this bit on the OMAP-L138 in Linux kernel code, and it forces the power down of the DSP clock even in the case where the DSP doesn't execute IDLE.

    Uh, there are a lot of instances of the word "force" in the DM814x TRM.  Do you have a register or param name from the OMAP-L138?

    Robert Tivy said:

    I'm not sure the EZSDK gives you the ability to rebuild the DSP firmware (since it's in the prebuilt-images area), but if you could rebuild it you could have the ARM application tell the DSP to "abort" (w/ SYS/BIOS's System_abort() API), which should cause it to go to an IDLE instruction.  Or, you could insert a SYS/BIOS Idle function into the Idle module, where this function just executes IDLE (perhaps with the asm("IDLE"); statement), assuming that the DSP executable enters the Idle loop at all.

    It's not clear to me how much of the DSP is object and how much is source.  The EZSDK does do a bunch of calls to cle674, which I've always imagined to be the compiler for the DSP, so maybe we do compile it.

    Dan -


  • Daniel70334 said:

    Indeed.  We often wonder why TI folks don't find this as annoying as we do, and remove the usage() invocation.

    I'll try to bring this problem to the attention of the EZSDK maintainers.

    Daniel70334 said:
    So ... is this an OMX bug?  Or, should the firmware_loader's DSP start/stop have been using a heapId that does not overlap other applications?

    Good question.  I believe it's a little of both.  OMX must not be checking the return value from its MessageQ_registerHeap(handle, 2) operation, since it should be getting MessageQ_E_ALREADYEXISTS back (assuming that syslink.ko was not built with SYSLINK_BUILD_OPTIMIZE, which would eliminate the check for an already-registered heap).
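    One way a caller could act on that return value is to remember whether its own registration succeeded and to unregister only in that case.  The sketch below models this on the host; struct heap_claim, claim_heap(), release_heap() and the HEAP_* codes are hypothetical names for illustration and are not part of SysLink or OMX.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SysLink status codes (not the real values). */
enum { HEAP_OK = 0, HEAP_E_INVALID = -1, HEAP_E_ALREADYEXISTS = -2, HEAP_E_NOTFOUND = -3 };

#define MAX_HEAPS 8
static const void *heaps[MAX_HEAPS];   /* models the host-wide heapId table */

static int register_heap(const void *heap, unsigned id)
{
    if (heap == NULL || id >= MAX_HEAPS) return HEAP_E_INVALID;
    if (heaps[id] != NULL) return HEAP_E_ALREADYEXISTS;
    heaps[id] = heap;
    return HEAP_OK;
}

static int unregister_heap(unsigned id)
{
    if (id >= MAX_HEAPS || heaps[id] == NULL) return HEAP_E_NOTFOUND;
    heaps[id] = NULL;
    return HEAP_OK;
}

/* Records whether this caller's registration actually took effect. */
struct heap_claim { unsigned id; bool owned; };

static int claim_heap(struct heap_claim *c, const void *heap, unsigned id)
{
    int status = register_heap(heap, id);
    c->id = id;
    c->owned = (status == HEAP_OK);
    return status;
}

static int release_heap(struct heap_claim *c)
{
    if (!c->owned) return HEAP_OK;     /* never registered it, so leave it alone */
    c->owned = false;
    return unregister_heap(c->id);
}

/* Same ordering as in the thread, but with both sides tracking ownership. */
static int demo_checked_sequence(void)
{
    static const int loader_heap = 0, omx_heap = 0;   /* dummy heap objects */
    struct heap_claim loader, omx;

    for (unsigned i = 0; i < MAX_HEAPS; i++) heaps[i] = NULL;

    claim_heap(&loader, &loader_heap, 2);   /* succeeds, loader owns heapId 2 */
    claim_heap(&omx, &omx_heap, 2);         /* already exists, omx.owned = false */
    release_heap(&omx);                     /* no-op */
    return release_heap(&loader);           /* heapId 2 is still registered */
}
```

    With this bookkeeping the second caller no longer removes a heap it did not register, so the first caller's unregister succeeds.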

    But I don't know why firmware_loader is even registering a heap w/ MessageQ.  It doesn't do any MessageQ_alloc()s internally, and I don't know if it's registering the heap on some other application's behalf.  But your stuff is running even though OMX unregisters the heap, and the assertion during the firmware_loader stop is innocuous and just "informational" in this case.

    Daniel70334 said:

    Does the firmware_loader's choice of heapId have to be changed in concert with the OMX's or the DSP code's choices?  I.e. do multiple places in the EZSDK have to be matched, or should they in fact all try to be unique, like a dynamically allocated resource?

    Another good question.

    First, if I were you (i.e., a customer) I would treat firmware_loader as something to be modified to your needs, as an example or starting point.  I suggest taking out the MessageQ operations entirely.  If MessageQ heap 2 is needed by some subsequent application then you will see a failure in that app during MessageQ_alloc(2, size).

    So, should MessageQ heaps be considered a system-wide resource?  The DSP (or any SYS/BIOS remote core) actually has its own heapId number space, completely separate from the host.  But on the host itself everyone shares the heapId number space.  I wouldn't exactly suggest doing what firmware_loader is doing, which seems to be registering a heap on some other application's behalf.  If an app (or a layer used by the app) needs to do MessageQ_alloc then it should register the heap for that alloc.

    This brings up the question of what heapId to choose, and I don't have a good answer for that.

    Should the app start from 0 and keep trying with a higher ID while the MessageQ_registerHeap() API returns MessageQ_E_ALREADYEXISTS?  (this won't work if syslink.ko was built with SYSLINK_BUILD_OPTIMIZE)

    Should all apps agree on who uses what IDs? (this won't work if independently-built apps need to be run simultaneously)
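    For illustration, the first option (probing upward from 0) could look roughly like the sketch below.  It runs against a host-side mock; register_heap(), register_first_free() and the HEAP_* codes are hypothetical stand-ins, and as noted above the approach depends on the real API reporting an already-registered heapId.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the SysLink status codes (not the real values). */
enum { HEAP_OK = 0, HEAP_E_INVALID = -1, HEAP_E_ALREADYEXISTS = -2 };

#define MAX_HEAPS 8
static const void *heaps[MAX_HEAPS];   /* models the host-wide heapId table */

static int register_heap(const void *heap, unsigned id)
{
    if (heap == NULL || id >= MAX_HEAPS) return HEAP_E_INVALID;
    if (heaps[id] != NULL) return HEAP_E_ALREADYEXISTS;
    heaps[id] = heap;
    return HEAP_OK;
}

/* Tries heapIds 0, 1, 2, ... until one registers; reports the ID through id_out. */
static int register_first_free(const void *heap, unsigned *id_out)
{
    for (unsigned id = 0; id < MAX_HEAPS; id++) {
        int status = register_heap(heap, id);
        if (status == HEAP_OK) {
            *id_out = id;
            return HEAP_OK;
        }
        if (status != HEAP_E_ALREADYEXISTS)
            return status;             /* a different failure: stop probing */
    }
    return HEAP_E_ALREADYEXISTS;       /* every slot is taken */
}

/* IDs 0 and 1 are occupied by other users, so the probe should land on 2. */
static int demo_probe(void)
{
    static const int a = 0, b = 0, c = 0;   /* dummy heap objects */
    unsigned id = MAX_HEAPS;

    for (unsigned i = 0; i < MAX_HEAPS; i++) heaps[i] = NULL;
    register_heap(&a, 0);
    register_heap(&b, 1);

    if (register_first_free(&c, &id) != HEAP_OK) return -1;
    return (int)id;
}
```

    Whatever ID the probe returns would still have to be handed to the code that later calls MessageQ_alloc() with it.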

    Daniel70334 said:

    Uh, there are a lot of instances of the word "force" in the DM814x TRM.  Do you have a register or param name from the OMAP-L138?

    The OMAP-L138 TRM lists the register with the FORCE bit in question as PSC[0|1] MDCTLn.  I believe the DSP's clock is controlled by PSC0's MDCTL15.  Here's a link to a page with the OMAP-L138 TRM: http://www.ti.com/general/docs/litabsmultiplefilelist.tsp?literatureNumber=spruh77a.  Take a look at the "10.7.4.1 C674x Megamodule Clock OFF" section, it talks about what the DSP must do to be put into a low-power state.

    I don't know how much of this applies to the DSP on your device.
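    As a rough illustration of what "setting FORCE" means on the OMAP-L138, the helper below computes a new MDCTLn value that requests the Disable state with FORCE set.  The bit positions and state encodings are assumptions written from memory of the OMAP-L138 PSC description, so please check them against the TRM; the register write itself and the transition command that follows it are left out, and none of this is DM814x-specific.

```c
#include <stdint.h>

/* ASSUMED OMAP-L138 PSC MDCTLn layout; verify against the TRM before use. */
#define PSC_MDCTL_NEXT_MASK  0x1Fu          /* NEXT state field (low bits) */
#define PSC_MDCTL_FORCE      (1u << 31)     /* FORCE bit                   */
#define PSC_STATE_DISABLE    0x2u           /* module clock gated          */
#define PSC_STATE_ENABLE     0x3u           /* module clock running        */

/* Returns the MDCTLn value that requests Disable with FORCE, keeping other bits. */
static uint32_t mdctl_force_disable(uint32_t mdctl)
{
    mdctl &= ~PSC_MDCTL_NEXT_MASK;                       /* clear the old NEXT state */
    return mdctl | PSC_STATE_DISABLE | PSC_MDCTL_FORCE;  /* request Disable + FORCE  */
}
```

    For the DSP this would be applied to the value read from PSC0's MDCTL15, per the register named above.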

    Daniel70334 said:

    It's not clear to me how much of the DSP is object and how much is source.  The EZSDK does do a bunch of calls to cle674, which I've always imagined to be the compiler for the DSP, so maybe we do compile it.

    I think cle674 is an alias of some sort for the DSP compiler.  My C6x compiler tools contain cl6x and not cle674, but I assume that where you see cle674 is where DSP code is being compiled.

    Regards,

    - Rob

  • Rob,

    Again, thanks for the reply.

    I added more logging to the OMX code - MessageQ_registerHeap() does indeed return E_ALREADYEXISTS to the OMX library - but nothing in OMX is checking the error return.

    So, I think the answers to the questions that you designated as good are important for knowing how to proceed:

    • Why is firmware_loader registering a message queue?
    • If firmware_loader is registering a message queue for somebody else, then who?  The only app in my test scenario that is using it is OMX, and it is doing its own registration.  (The answer goes to whether we can just delete those lines, or if we change the 2 to something else, then what needs to be changed to match.)
    • Is heap 2 a magic number for RPE and OMX, or can the apps keep trying to register for heaps by checking one by one starting at 0 as you suggested until they get S_SUCCESS?

    Do you have some avenue to try to get answers to these questions?

    Robert Tivy said:

    I suggest taking out the MessageQ operations entirely.  If MessageQ heap 2 is needed by some subsequent application then you will see a failure in that app during MessageQ_alloc(2, size).

    Yup, suggestion noted.  Will consider as the fallback if deterministic answers to these questions remain elusive.

    In a way, I'm disappointed - I had hoped that solving the message queue problem would lead to proper DSP shutdown, but that's clearly not the case.  Even if I avoid the assertion by not starting the video app at all, but just loading and then immediately unloading the DSP, it still never comes back down to idle.

    Dan -

  • Daniel70334 said:

    Do you have some avenue to try to get answers to these questions?

    I have forwarded this thread to the maintainer of the EZSDK, and if he doesn't reply directly to this thread then I will pass his response along.

    Daniel70334 said:

    In a way, I'm disappointed - I had hoped that solving the message queue problem would lead to proper DSP shutdown, but that's clearly not the case.  Even if I avoid the assertion by not starting the video app at all, but just loading and then immediately unloading the DSP, it still never comes back down to idle.

    I suspect that the DSP executable that you're loading is sitting there waiting for some sort of notification from the host to proceed.  It's most likely in SYS/BIOS's Idle loop, which is the loop that runs when there's no threads (Tasks) to run.  You might expect an Idle loop to enter the DSP's IDLE state, but that's not the case.  If you can gain access to building the executable, you could install an Idle function that just calls the asm instruction IDLE.
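    A minimal sketch of such an Idle function is shown below.  The name dsp_idle_fxn is hypothetical, the host branch exists only so the file also builds off-target, and it assumes the TI C6000 compiler's predefined _TMS320C6X macro to select the DSP branch.

```c
/* Counts calls on a host build so the function can be exercised off-target. */
static unsigned idle_calls;

/* Candidate SYS/BIOS Idle function: executes IDLE each time the Idle loop runs. */
void dsp_idle_fxn(void)
{
#if defined(_TMS320C6X)
    asm(" IDLE");     /* leading space: TI asm text must not start in column 1 */
#else
    idle_calls++;     /* host stand-in: there is no IDLE instruction here */
#endif
}
```

    On the DSP side the function would still need to be added to the Idle module in the application's SYS/BIOS configuration (for example with Idle.addFunc('&dsp_idle_fxn') in the .cfg file), which requires being able to rebuild the firmware.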

    It's still an assumption that doing the above will allow the DSP clock powerdown.  The other suggestion I made, to FORCE the clock off, would require changing the Linux source code that handles the clocks.

    Regards,

    - Rob

     

     

  • Hi Dan,

    I received a response from someone on the EZSDK team, which I cut-and-paste here.  I'm also attaching the 2 patches mentioned in the response.

    Hi, 

       The code:

                    heapHandle = SharedRegion_getHeap(0);
                    MessageQ_registerHeap(heapHandle, 2); 

    is used for registering Heap ID that is used by RPE for the MessageQ allocations.

    In the e2e thread, I saw that the user is running the OMX decode as well as RPE audio simultaneously, so they would need this setting to use RPE. 

    In the e2e thread, I saw that the user has pointed out that heapid 2 is also used by the OMX, so this might be conflicting with the registration done by the DSP. To solve this problem, either the heap ID of the OMX or RPE needs to change so that it is not conflicting. I have tried on my side by changing the HEAP ID for RPE in the firmware loader. This would also require a change in the RPE source code to point to the new Heap ID to use for the MessageQ allocations. This seems to solve the assertion problem.

    I have attached patches for the media-controller-utils and RPE changes for your reference. In these I am using HEAP ID 3 instead of 2; you could use any other value, but make sure it doesn't conflict. 

    I have just sanity tested these patches on my side, kindly do any regression testing that is required  on your side to validate the patch. Also we would not be creating any new release for the media-controller-utils or remote processor execute for this issue, I could only provide a patch set.

    Kindly try this and see if it solves this problem.

    Thanks & Regards,

    Amar

    Here are the patch attachments (w/ .txt extension added to foil the forum file attacher):

    1185.0001-Change-the-RPE-MessageQ-heap-ID-to-3-to-match-with-t.patch.txt

    8270.0001-firmware-loader-change-the-RPE-MessageQ-heap-ID.patch.txt

    Regards,

    - Rob

     

  • Hi Rob,

    Sorry, I was on some other stuff the last few days.  I applied the two patches last night with some skepticism, as I know parts of the EZSDK are delivered as object code.  But it seems like moving the heap ID to 3 resolved the cause of the assertion, and still lets me decode video and audio - so it looks good, as far as I can tell.  That part of the DSP code must truly be delivered in source form, I guess.

    I'm going to defer the DSP power down work for a while - other things have come up.

    Thanks very much for helping with this.  I found this exchange very helpful.

    Dan -