Reattach DSP to ARM using IPC on an OMAP-L138

Other Parts Discussed in Thread: OMAP-L138, SYSBIOS

I am using SYSBIOS 6.33.03.33 and IPC 1.24.02.27 with an OMAP-L138 on custom hardware.

I have a very simple communications system where the ARM must notify the DSP periodically.   I had this working in bare metal using interrupts directly, but now I am trying to get equivalent functionality using IPC and the Notify mechanism.

The problem is this:  I would like the DSP to be able to reset itself and reattach to the ARM, but the IPC mechanisms I've tried have not worked.   The notification system works when I run both the ARM and then the DSP, but if I restart the DSP code, the ARM never notices and, as far as it is concerned, stays "attached," so the DSP attach calls never succeed when the DSP is restarted.

I've played around with moving from Ipc.procSync=Ipc.ProcSync_ALL to Ipc.ProcSync_PAIR and doing the attach/detach loops explicitly instead of letting it be taken care of by Ipc_start, and I can again get the system to run once, but no matter what I do on the DSP side I cannot restart it without restarting the ARM side.

The only way I can think of doing this is through an end-run around the IPC system by having an element of shared DDR that the ARM is watching, and if it sees that element go from 0 to 1 it detaches the DSP and can then respond to attach requests.
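A minimal sketch of the "watched flag" idea described above, written as plain C so it can be tried on a host. The names `RestartWatcher`, `watcher_init`, and `watcher_poll` are hypothetical and not part of IPC or SYS/BIOS. It assumes the DSP writes 1 to a word in shared DDR at startup and that the ARM reads that word through a non-cached mapping (or invalidates its cache before each poll).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watcher state kept on the ARM side. */
typedef struct {
    uint32_t last_seen;   /* flag value observed on the previous poll */
} RestartWatcher;

/* Record the current flag value so only later transitions are reported. */
void watcher_init(RestartWatcher *w, const volatile uint32_t *flag)
{
    assert(w != NULL && flag != NULL);
    w->last_seen = *flag;
}

/* Returns true exactly once per 0 -> 1 transition of the shared flag. */
bool watcher_poll(RestartWatcher *w, volatile uint32_t *flag)
{
    assert(w != NULL && flag != NULL);

    uint32_t now = *flag;
    bool restarted = (w->last_seen == 0u && now == 1u);

    if (restarted) {
        *flag = 0u;   /* acknowledge, so the next DSP restart is visible */
        now = 0u;
    }
    w->last_seen = now;
    return restarted;
}
```

This only covers detection. What the ARM does after `watcher_poll` returns true (detach, then wait for a new attach) is the part in question here.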

Will that work?   Is there a better way?

  • I just spent some time on the "end run" approach and got nowhere.   After the DSP reboots, there seems to be nothing the ARM can do to get to a state where it can "attach" the DSP again short of rebooting. 

    I've tried doing an Ipc_detach on the DSP proc ID, but the ARM detach gives up because the DSP's Ipc_Reserved->startedKey field is set to PROCSYNCFINISH.

    I've tried Ipc_stop and then another Ipc_start on the ARM side, but that appears to fail as well, i.e., the second Ipc_start never starts.

    So, is what I'm trying to do just impossible?   Is the proper response to the DSP rebooting to reboot the whole chip?   Should I just give up on IPC and go back to bare metal signal interrupts?

  • Jay,

    Which processor is the owner of SharedRegion 0?

    I'm curious what would happen if you made the other core the master since the master core will clear away the reserved portion of shared memory when you do an Ipc_start().

    Judah

  • I believe the ARM is the owner of Shared Region 0, at least according to the common .xs I include in both the DSP and ARM projects:

    var SharedRegion = xdc.useModule('ti.sdo.ipc.SharedRegion');
    SharedRegion.setEntryMeta(0,
        { base: SHAREDMEM,
          len:  SHAREDMEMSIZE,
          ownerProcId: 0,
          isValid: true,
          name: "shared_mem",
        });

     The ARM brings up my DSP and starts it, so it makes sense (to me) that the ARM is in charge of the IPC as well.  Perhaps going the other way (since I expect the DSP to be going up and down much more often in my development cycle than the ARM) is a good thing to try.   I would just change the ownerProcId to 1 and leave everything else the same, correct?

     

  • Yes, that is correct.

  • I tried switching the SR0 owner to be the DSP, and I still can't get the result I want.   I see that the DSP's Ipc_start resets the shared memory region to 0 when the DSP restarts, which is an improvement, but there still seems to be too much "state" lying around on the ARM to re-establish communications, even though the ARM knows that the DSP has gone down and come back up again through an external flag in DDR.

    For example, as far as I can tell, the "attached" reference count for the DSP cannot be reduced to 0 via Ipc_detach because the "startedKey" in the internal module state is not in the right condition.   I tried just poking in a value of 0 for the attached reference and reattaching the ARM, but that failed as well.

    Doing an Ipc_stop seems catastrophic:  it sets Ipc_sr0MemorySetup to 0 and nothing works after that.

    So, am I barking up the wrong tree here?  Is there a known method for handling a processor going down and coming back up again besides resetting the whole system? 

    It seems that the _pe9.c file generated for my application sets up module state variables in very particular ways.   Is there any way to get back to this state short of a reload of the program?   It looks like it would be very difficult for my program to access the structures set up in the _pe9.c file. 

  • Jay,

    I'm assuming you are running BIOS on both ARM and DSP, correct?  IPC wasn't designed to handle this sort of catastrophic going down of a processor.

    When the DSP goes down, do you know that it is going down?  Are you able to do anything before it goes down, or does it simply reset at some point?

    I think at this point you are probably fighting a losing battle.  I think you've exhausted the possibilities.  It's something we might try to support in the future, but we haven't had requests for this sort of scenario.

    Judah

  • Yes, I am running BIOS (6) on both processors.  

    What I'm trying to address is my group's normal workflow where we get our image acquisition/replay system going, load and start the ARM via CCS5, which enables the DSP, and then work on the DSP where the algorithms are (also via CCS5).  In this mode we bring the DSP up and down, sometimes loading new code on, sometimes just restarting, debugging, repeating that many, many times without ever touching the ARM again as it happily acquires images and notifies the DSP.   In the running code booting from flash I do not expect to see the DSP go down without also taking down the ARM.

    Having to change the work flow to actually do a complete boot process, i.e., reload code onto the ARM, whenever I want to restart the DSP, significantly slows down my group's development.   To the point that I will probably back off of IPC and go back to bare metal signal interrupts, since I know that works the way I want it to.    I had high hopes that I could build on IPC beyond the simple notification to get more sophisticated message passing and make more effective use of the ARM, but that looks unlikely now without significantly slowing down our debug cycle.

    I guess I have a feature request:  I want Ipc_stop to bring the IPC system to EXACTLY the same place it was in before the first Ipc_start, so that Ipc_start brings the system up with a "blank" slate.   If I had that, then everything would be fine, because I have a method to let the ARM core know that the DSP has restarted, I just can't get the ARM back to its starting point in regards to IPC.   A simple test would be the ability to do an Ipc_start -> Ipc_stop -> Ipc_start cycle successfully.

  • Jay,

    Are you using SysLink on the ARM? Reading through the post, I see you are using the ARM to initially load and run the DSP (with BIOS on both cores). How did you initially load and run the DSP, with SysLink or manually with CCS? If you are using SysLink, then there is considerable state on the ARM which would need to be reset.

    In general, our fault tolerance is very limited. We have started adding fault tolerance support to SysLink (not yet available) but I don't believe IPC has any FT support.

    From reading your post above, I believe your development flow is to have the ARM up and running, then be able to rebuild a new DSP executable and load/run it on the DSP "on-the-fly" without restarting the ARM. This would be difficult to support with IPC because there are data structures in memory used by both sides. When you relink an executable, the address of this data structure might change. When SysLink loads/runs the DSP, it extracts this address from the DSP executable (during the loading phase). In your situation, if the ARM does not reload the DSP, there would be a mismatch of the base address and no hope of successfully re-attaching.

    One last idea, would it be possible to modify your development flow to have the ARM stop/unload the current DSP executable and then load/run the new DSP executable through a flag mechanism (use CCS to set a flag in memory). This would keep the ARM up and running. Reloading the DSP is supported by our software stack.

    ~Ramsey

  • Ramsey,

    Thanks for your response.

    We are not currently using SysLink on either the ARM or the DSP.    The documentation led me to believe that if both sides are SysBIOS, then the "proper" approach was to use the IPC library.   Was that the wrong impression?   With SysBIOS IPC the "shared memory" location is determined by the common configuration files, so at least that is fixed.

    Your idea of a "flag" is an interesting one.   I'm not sure what the sequence would be, as it appears that Ipc_detach has to be a two sided thing, i.e., both sides have to call detach simultaneously or the connection is not broken.   Is that an incorrect statement?   I only believe this because when I try to do an Ipc_detach of the DSP on the ARM when the DSP is suspended, it never returns success, since there appears to be a check in the Ipc_detach code to make sure the DSP is in a "detaching" state, which it never gets to.    Is there a tested sequence of operations with SysLink that would allow this?

    While fault tolerance is good, all I really need is the ability to "restart" the communications from scratch.   Can I do this with a released version of SysLink, i.e., do you have a unit test that shows a sequence on the "host" processor of

      start communications
      make connections (block until client core connects)
      stop communications (breaking connections, assume client core has been reset, and will be restarted, perhaps even with the restriction of using the same code)
      start communications (block until client core connects)
      make connections (block until client core connects)
        .
        .

    It appears to me that the version of IPC I have (1.24.02.27) cannot do this, and my impression from the previous discussion is that no version of IPC can.   If SysLink can, then I would happily switch from IPC to SysLink to get this ability.

       Jay

  • Jay,

    I'm afraid we don't have good news for you. Judah and I discussed your situation and came to realize that it will not work. There is state information in the .bss section of the ARM's executable which will not be reset by any of the IPC API's. This means that when attempting a second IPC startup sequence, there will always be a failure due to the old state in the ARM's .bss section. We simply do not yet have fault tolerant support in IPC.

    I want to point out that we do support multiple attach/detach sequences but this requires a coordinated effort on both sides to release all resources (i.e. MessageQ_close(), MessageQ_delete(), Ipc_detach(), Ipc_stop(), etc.). This is probably a significant effort. If your deployed solution does not require this, it does not make sense to write all that code just to support your development flow.

    One suggestion we have is to halt the ARM in CCS, reset the ARM, and run the ARM when you want to restart the DSP. You would not need to reload the ARM. You indicated above that reloading the ARM is cumbersome, I'm curious why this is so. We typically reset the device and reload everything when a test crashes. Maybe there is something we could help with making this process easier for you.

    I'll try to answer your questions from your previous post just to tie up any loose ends.

    Jay Gowdy81418 said:
    We are not currently using SysLink on either the ARM or the DSP.

    You do not need to use SysLink. From what I can see, using IPC and SYS/BIOS on both cores is the correct approach for your application. FYI, SysLink implements the same IPC API's but is typically used on an HLOS running on the ARM with IPC and SYS/BIOS running on the DSP. SysLink does support an all SYS/BIOS configuration but this is only useful if you have a file system on the ARM and want to dynamically load the DSP executable.

    You are correct, the IPC shared memory address is determined by configuration and will not change when relinking your executable. When using SysLink, there are a couple of symbols which do move around, but this does not apply for you.

    Jay Gowdy81418 said:
    Your idea of a "flag" is an interesting one.

    If you did want to support a coordinated shutdown sequence, it would look something like this.

    1. ARM: Send a message to DSP to start a shutdown sequence. Wait for ack.

    2. ARM & DSP: Release all remote resources. For example, close all message queue handles opened on remote core (i.e. MessageQ_close()). Do this for all shared resources (MessageQ, HeapMemMP, etc.).

    3. ARM & DSP: Handshake both sides that all remote resources have been released.

    4. ARM & DSP: Release all local resources. For example, delete all message queues (i.e. MessageQ_delete()).

    5. ARM & DSP: Send disconnect event (typically done with Notify module). Wait for ack. Then unregister notify callback.

    6. ARM & DSP: Call Ipc_detach() and Ipc_stop().
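The six steps above have to happen in order on both cores. As a rough sketch, the ordering can be tracked with a small state machine like the one below. The type and function names (`ShutdownSeq`, `shutdown_seq_advance`, and so on) are hypothetical and not part of IPC or SysLink; the real calls (MessageQ_close(), MessageQ_delete(), Ipc_detach(), Ipc_stop()) appear only as comments marking where they would go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One stage per step of the coordinated shutdown described above. */
typedef enum {
    SD_RUNNING = 0,
    SD_SHUTDOWN_ACKED,      /* step 1: shutdown request sent and acked      */
    SD_REMOTE_RELEASED,     /* step 2: e.g. MessageQ_close() on remote qs   */
    SD_RELEASE_HANDSHAKED,  /* step 3: both sides confirmed the release     */
    SD_LOCAL_RELEASED,      /* step 4: e.g. MessageQ_delete() on local qs   */
    SD_DISCONNECTED,        /* step 5: disconnect event acked, cb removed   */
    SD_DETACHED             /* step 6: Ipc_detach() and Ipc_stop() called   */
} ShutdownStage;

typedef struct {
    ShutdownStage stage;
} ShutdownSeq;

void shutdown_seq_init(ShutdownSeq *s)
{
    assert(s != NULL);
    s->stage = SD_RUNNING;
}

/* Move to 'next' only if it is the immediate successor of the current
 * stage; returns false (and changes nothing) on an out-of-order request. */
bool shutdown_seq_advance(ShutdownSeq *s, ShutdownStage next)
{
    assert(s != NULL);
    if ((int)next != (int)s->stage + 1) {
        return false;
    }
    s->stage = next;
    return true;
}

/* True once step 6 has completed and a new Ipc_start() could be tried. */
bool shutdown_seq_done(const ShutdownSeq *s)
{
    assert(s != NULL);
    return s->stage == SD_DETACHED;
}
```

Each core would call `shutdown_seq_advance` after the matching step succeeds, which makes a skipped handshake show up as a failed transition instead of a hang later in Ipc_detach().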

    I've attached a SysLink example which illustrates this flow. Look at host/App.c (App_start(), App_stop() for the ARM side code and dsp/Server.c (Server_setup() and Server_finish()) for the DSP side code. The dsp side code will apply directly for you, but the host side code is a little different because it's a SysLink application. Still, about 85% applies.

    Jay Gowdy81418 said:
    I would happily switch from IPC to SysLink to get this ability.

    The current version of SysLink does not have any fault tolerant support, that is coming in a future release. In addition, the support we have is for ARM side terminate, not DSP side terminate. I don't see any reason to switch.

    Good luck. I'm sorry for the troubles, I hope you don't give up on Texas Instruments :-)

    ~Ramsey

    ex02_messageq.zip
  • I haven't given up on you guys yet.   For now I'm going to fall back to just raising the CHIPSIG2 interrupt via registers.   If we move to more sophisticated need for DSP/ARM communications, I'll check it again.
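For reference, a minimal sketch of the register-level fallback mentioned above. The register addresses and the write-1-to-set / write-1-to-clear behaviour are assumptions based on my reading of the OMAP-L138 SYSCFG description and should be checked against the TRM. The helper names are hypothetical. The register pointer is passed in so the logic can be exercised on a host against an ordinary variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed SYSCFG0 register addresses; verify against the device TRM. */
#define SYSCFG0_CHIPSIG      0x01C14174u
#define SYSCFG0_CHIPSIG_CLR  0x01C14178u

/* Raise chip signal n by writing a 1 to bit n of CHIPSIG.
 * CHIPSIG2 (n == 2) is the one used here to interrupt the DSP. */
static void chipsig_raise(volatile uint32_t *chipsig, unsigned n)
{
    assert(chipsig != NULL && n <= 4u);
    *chipsig = (uint32_t)1u << n;
}

/* Clear chip signal n by writing a 1 to bit n of CHIPSIG_CLR
 * (typically done by the receiving core in its ISR). */
static void chipsig_clear(volatile uint32_t *chipsig_clr, unsigned n)
{
    assert(chipsig_clr != NULL && n <= 4u);
    *chipsig_clr = (uint32_t)1u << n;
}
```

On the target, the ARM side would call something like `chipsig_raise((volatile uint32_t *)SYSCFG0_CHIPSIG, 2u)`, assuming the SYSCFG registers are unlocked and mapped at that address.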

    One thing that may be an alternative approach in the future:  instead of the Ipc_start/stop/start ability, how about an Ipc_detachForce call?   Right now, Ipc_detach just doesn't work at all if the other side has disappeared, i.e., it has to be part of a two sided negotiated dance.

    How about you support the situation where Ipc_detachForce does this:

    1.  is the ID of the module being detached > the current ID (i.e., on an OMAP am I the ARM disconnecting the DSP)?
    2. If so, reset the connection to detached (set the attached reference count to 0, clear out any bookkeeping in the internal tables, etc.), regardless of the state of the connection, because we expect that the "other" side has gone away and will be doing a complete reset from scratch.

    The idea is that this would reset the result of Ipc_isAttached to false and allow a subsequent call to Ipc_attach(DSP) to act like the first call to Ipc_attach(DSP).  
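To make the proposal concrete, here is a toy model of the bookkeeping in plain C. None of this is real IPC code: `IpcModel` and the `model_*` functions are hypothetical stand-ins for the internal attach reference counts, and Ipc_detachForce itself does not exist in IPC 1.24.02.27.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PROCS 2u   /* ARM = 0, DSP = 1 in this model */

/* Hypothetical per-core view of who is attached. */
typedef struct {
    uint16_t self_id;
    uint32_t attached[MAX_PROCS];   /* attach reference count per remote */
} IpcModel;

void model_init(IpcModel *m, uint16_t self_id)
{
    assert(m != NULL && self_id < MAX_PROCS);
    m->self_id = self_id;
    for (size_t i = 0; i < MAX_PROCS; i++) {
        m->attached[i] = 0u;
    }
}

bool model_is_attached(const IpcModel *m, uint16_t remote)
{
    assert(m != NULL && remote < MAX_PROCS);
    return m->attached[remote] > 0u;
}

void model_attach(IpcModel *m, uint16_t remote)
{
    assert(m != NULL && remote < MAX_PROCS);
    m->attached[remote]++;
}

/* Proposed behaviour: only the lower-ID core may force-detach a
 * higher-ID remote, and doing so drops the reference count to 0
 * regardless of the remote's state. */
bool model_detach_force(IpcModel *m, uint16_t remote)
{
    assert(m != NULL);
    if (remote >= MAX_PROCS || remote <= m->self_id) {
        return false;
    }
    m->attached[remote] = 0u;
    return true;
}
```

After `model_detach_force` succeeds, `model_is_attached` returns false and the next `model_attach` behaves like the first one, which is the property I'm after.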

    With such a call (or equivalent), I believe I could get the functionality I want.  

       Jay

  • Jay,

    I'm glad to hear you have not given up. Thanks for the feedback. We have already discussed an Ipc_hangup() API similar to your Ipc_detachForce(). We recognize the lack of fault tolerance support and are working to fix this. Your use-case will be considered in our designs. Good luck.

    ~Ramsey