IPC Problem

I have a set of programs for the C6678 that use IPC to pass data between a few cores. It runs on the single DSP evaluation board using a Windows host. Now I'm trying to get it to run on one of the DSPs on the Advantech quad board with a Linux host. This will be expanded to more cores and more DSPs once I get the basic code working.

I'm having trouble on the quad board and I'm not sure of the right approach to isolate the problem. Can someone give me some hints of what to try or where to look?

Here's the problem. The HeapBufMP_create()/HeapBufMP_open() and MessageQ_create()/MessageQ_open() functions seem to work correctly when MultiProc.numProcessors = 1. When more than one processor is involved, however, the following things break:

  1. The XXX_open() function hangs (it never returns) instead of returning a negative value if the XXX_create() function hasn't yet been called. I can work around this by hacking in Task_sleep() calls in the right places, which makes sure that the create functions are executed before the open functions.
  2. The XXX_open() function always hangs if it is executed on a processor other than the one that executed the XXX_create() function. I don't have a workaround for this. It's difficult to inspect the internal state of the IPC code to figure out why it is hanging. I have looked briefly at the ROV reports, but nothing obvious stood out. I probably need to examine these more closely, or build IPC in Debug mode, or do something else that I haven't figured out yet.
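
For reference, once the open calls return a negative status as expected, the fixed Task_sleep() delays from item 1 could be replaced by a bounded retry loop that reports a timeout instead of waiting forever. The following is only a host-side sketch in C under that assumption. The names open_with_retry, try_open_fn, backoff_fn, and the fake_* stand-ins are hypothetical and not part of IPC; on the target, the open callback would wrap a call such as MessageQ_open() and the backoff callback would wrap Task_sleep(). It cannot help with a call that blocks internally and never returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns >= 0 on success, < 0 if the named
 * object is not available yet (the status convention described above). */
typedef int (*try_open_fn)(void *ctx);

/* Hypothetical callback run between attempts, e.g. a wrapper around
 * Task_sleep() on the target. May be NULL. */
typedef void (*backoff_fn)(void *ctx);

/* Calls try_open until it succeeds or max_attempts is used up.
 * Returns the 1-based attempt number that succeeded, or -1 on timeout. */
int open_with_retry(try_open_fn try_open, backoff_fn backoff,
                    void *ctx, int max_attempts)
{
    int attempt;

    assert(try_open != NULL);
    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_open(ctx) >= 0) {
            return attempt;
        }
        if (backoff != NULL) {
            backoff(ctx);
        }
    }
    return -1;
}

/* Host-side stand-in for a remote object that only becomes available
 * after a given number of failed open attempts. */
typedef struct {
    int failures_left; /* attempts that will still fail */
    int sleeps;        /* how often the backoff was invoked */
} fake_remote;

int fake_open(void *ctx)
{
    fake_remote *r = (fake_remote *)ctx;
    if (r->failures_left > 0) {
        r->failures_left--;
        return -1;
    }
    return 0;
}

void fake_sleep(void *ctx)
{
    ((fake_remote *)ctx)->sleeps++;
}
```

With a limit on the number of attempts, a core that never sees the created object can print a diagnostic with System_printf() rather than hanging silently.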

Everything is internal to a single DSP because there is no external I/O other than System_printf(). So it's unlikely to be a problem with the Advantech board itself.

The Linux CCS install pulled in the newest versions of all the packages. These are newer than the ones I used on the Windows host. This might be a problem, but I haven't had time to test that yet. I'm in the process of rebuilding the Windows environment on a new laptop, so I'll try that when the laptop is ready.

Advantech supplied an old version (1.0.0.15) of the C6678 PDK patched to use with their board. It's possible that the old version is not compatible with the newer version of the IPC package, but I haven't had a chance to test that or ask Advantech about it. It doesn't generate any compile or link errors so I don't know if it's worth trying to figure out a patch for a newer PDK.

This seems like there's some kind of signalling problem between the processors. IPC works on the evaluation board with a newer PDK, and it works with the demo programs supplied by Advantech. Maybe I should build the Advantech demo programs using the new versions of the packages and see if they still work.

There are a lot of different things I could try. Which is most likely to pay off first?

Thanks,
Fred

  • Fred,

    I’m not familiar with the setup, so I hope others on the forum can provide some useful comments.  But a couple of questions…

    You say “Everything is internal to a single DSP because there is no external I/O other than System_printf(). So it's unlikely to be a problem with the Advantech board itself.”  But it seems from the rest of the description that the trouble is actually occurring when you’re using more than one processor. And, that it may indeed be “some kind of signalling problem between the processors”.  Can you please clarify?

    Do you know what Advantech’s earlier patch did?  It seems an updated patch may be needed, and it would be good to check with them about this.

    Scott

  • I should have said "cores" instead of "processors." Everything is happening inside of a single C6678 (the chip has 8 cores) except for the emulator connection stdio.

    The Advantech patch switches the identity of the two Ethernet ports and adjusts DDR3 parameters. The program doesn't use the Ethernet, and the DDR3 access works, so their patch isn't the problem by itself.

    There are a lot of differences between the old PDK they supplied and the most recent one, so the problem could be missing or rearranged functionality in the old one. One of my choices is to try to apply the same patches to a newer PDK. It would take some time to figure out how to patch a new version.

    Advantech has promised a new version "Real Soon Now". I can't afford to wait, though.

    Fred

  • Fred,

    I don't have any knowledge of what the Advantech patch contains, other than what you mentioned.  But I would suspect that, unless your Heap is being linked into DDR3 or there are additional changes in the patch that you didn't mention, it shouldn't have any effect on a simple message passing scenario.  If you check your .map file, and there is no memory allocated above 0x80000000, then everything is internal.
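
    The .map check described above can be sketched as a small address classifier. This is a minimal C sketch, assuming the usual C6678 memory map (512 KB of local L2 at 0x00800000, 4 MB of MSMC SRAM at 0x0C000000, DDR3 from 0x80000000). Only the DDR3 base comes from this thread; the other constants should be checked against the device data manual. The type and function names are hypothetical, and the origin and used values would be copied by hand from the map file.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Assumed C6678 memory map constants; verify against the data manual. */
    #define L2_LOCAL_BASE 0x00800000u
    #define L2_LOCAL_SIZE 0x00080000u /* 512 KB per core */
    #define MSMC_BASE     0x0C000000u
    #define MSMC_SIZE     0x00400000u /* 4 MB shared */
    #define DDR3_BASE     0x80000000u /* from the thread */

    typedef enum {
        REGION_LOCAL_L2,
        REGION_MSMC,
        REGION_DDR3,
        REGION_OTHER
    } mem_region;

    /* Hypothetical record for one memory range listed in the map file. */
    typedef struct {
        uint32_t origin; /* start address */
        uint32_t used;   /* bytes actually allocated */
    } map_section;

    /* Classifies a single address into one of the assumed regions. */
    mem_region classify_address(uint32_t addr)
    {
        if (addr >= DDR3_BASE) {
            return REGION_DDR3;
        }
        if (addr >= MSMC_BASE && addr - MSMC_BASE < MSMC_SIZE) {
            return REGION_MSMC;
        }
        if (addr >= L2_LOCAL_BASE && addr - L2_LOCAL_BASE < L2_LOCAL_SIZE) {
            return REGION_LOCAL_L2;
        }
        return REGION_OTHER;
    }

    /* Returns 1 if any range with allocated bytes starts in DDR3, else 0. */
    int uses_external_memory(const map_section *sections, size_t count)
    {
        size_t i;

        assert(sections != NULL || count == 0);
        for (i = 0; i < count; i++) {
            if (sections[i].used > 0 &&
                classify_address(sections[i].origin) == REGION_DDR3) {
                return 1;
            }
        }
        return 0;
    }
    ```

    A result of 0 for every core's map would support the idea that external memory, and therefore the DDR3 part of the patch, is not involved.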

    Can you take the .out file that runs on multiple cores of the single 6678 EVM and load that exact .out on one of the 6678s on your quad board?  If it works, then the problem is in software.  If it doesn't, then the problem is in hardware.

    What about GEL files?  Could there be something in the Linux .gel that initializes something differently than on the 6678 EVM?  Again, I don't know what _should_ be different between these two boards, but it might be worth a shot trying to initialize the Quad EVM using the GEL file from the single EVM.  The functions initializing the MSMC might be one place to look, as the shared memory is likely where your heap is allocated.

    I don't know the details of the differences between the two PDKs that you mentioned.  My first thought is similar to yours, that there is some sort of deadlock between the processors.  The first thing you might try is putting a significant delay just after the HeapBufMP_create() call on your master core, and after the HeapBufMP_open() calls on your slave cores.  The goal is to ensure that all of these complete before anyone registers the heap or tries to create a MessageQ.   You could also check this by setting a breakpoint after the heap buffer calls on all cores and seeing whether each core reaches it.

    I hope this gives you a bit of a starting point.

    Regards,
    Dan

     

  • Dan,

    Core 0 has most everything linked into DDR3 because IPC, NDK, and SYS/BIOS are too big for any place else. The program does not start the NDK in the test configuration so the difference in Ethernet devices shouldn't have any effect. Here's the map from the core 0 config file:

    Program.sectMap[".vecs"] = {loadSegment: "L2SRAM", loadAlign:8}; /* CSL per core data structures */
    Program.sectMap[".switch"] = {loadSegment: "L2SRAM", loadAlign:8}; /* CSL per core data structures */
    Program.sectMap[".cio"] = {loadSegment: "L2SRAM", loadAlign:8}; /* per core data structures */
    Program.sectMap[".args"] = {loadSegment: "L2SRAM", loadAlign:8}; /* per core data structures */
    Program.sectMap[".cppi"] = {loadSegment: "L2SRAM", loadAlign:16}; /* per core data structures */
    Program.sectMap[".far:NDK_OBJMEM"] = {loadSegment: "L2SRAM", loadAlign:16}; /* NDK structures */
    Program.sectMap[".nimu_eth_ll2"] = {loadSegment: "L2SRAM", loadAlign:16}; /* per core data structures */
    Program.sectMap[".qmss"] = {loadSegment: "L2SRAM", loadAlign:16}; /* per core data structures */
    Program.sectMap[".resmgr_memregion"] = {loadSegment: "L2SRAM", loadAlign:128}; /* QMSS descriptors region */
    Program.sectMap[".resmgr_handles"] = {loadSegment: "L2SRAM", loadAlign:16}; /* CPPI/QMSS/PA Handles */
    Program.sectMap[".resmgr_pa"] = {loadSegment: "L2SRAM", loadAlign:8}; /* PA Memory */
    Program.sectMap[".stack"] = "L2SRAM";
    Program.sectMap[".bss"] = "DDR3"; /* BSS. .neardata and .rodata are GROUPED */
    Program.sectMap[".neardata"] = "DDR3";
    Program.sectMap[".rodata"] = "DDR3";
    Program.sectMap["systemHeap"] = {loadSegment: "DDR3", loadAlign:128}; /* XDC Heap .. eg Memory_alloc () */
    Program.sectMap[".far"] = "DDR3";
    Program.sectMap[".cinit"] = "DDR3";
    Program.sectMap[".const"] = "DDR3";
    Program.sectMap[".text"] = "DDR3";
    Program.sectMap[".code"] = "DDR3";
    Program.sectMap[".data"] = "DDR3";
    Program.sectMap[".sysmem"] = "DDR3"; /* Malloc memory area */
    Program.sectMap["platform_lib"] = "DDR3"; /* Platform Library data structures */
    Program.sectMap[".gBuffer"] = {loadSegment: "DDR3", loadAlign:32}; /* Upload buffer used by the Web Server */
    Program.sectMap[".far:WEBDATA"] = {loadSegment: "DDR3", loadAlign: 32}; /* Web Pages and web server structures */
    Program.sectMap[".far:taskStackSection"] = "L3SRAM";
    Program.sectMap[".far:NDK_PACKETMEM"] = {loadSegment: "L3SRAM", loadAlign: 128}; /* NDK Buffer Pool */

    The other cores load everything into MSMC (called L3SRAM in the map above). That fits because they use only the IPC package along with the core algorithm. I use RTSC platforms to define separate address regions in MSMC and DDR3 for each of the cores.
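
    One thing that can be verified independently of the PDK version is that the per-core regions defined in the RTSC platforms really are disjoint. The following is a minimal host-side sketch in C, assuming a 4 MB MSMC at 0x0C000000 that is split evenly between cores; the real base addresses and lengths would come from the platform definitions, and all type and function names here are hypothetical.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical description of one core's private address range. */
    typedef struct {
        uint32_t base;
        uint32_t len;
    } region;

    /* Splits [base, base + total) into `count` equal slices whose length is
     * rounded down to a multiple of `align` (a power of two). `base` itself
     * is assumed to be aligned already. Returns 0 on success, -1 if the
     * slices would be empty. */
    int partition_region(uint32_t base, uint32_t total, int count,
                         uint32_t align, region *out)
    {
        uint32_t slice;
        int i;

        assert(out != NULL);
        assert(count > 0);
        assert(align != 0 && (align & (align - 1u)) == 0);

        slice = (total / (uint32_t)count) & ~(align - 1u);
        if (slice == 0) {
            return -1;
        }
        for (i = 0; i < count; i++) {
            out[i].base = base + (uint32_t)i * slice;
            out[i].len = slice;
        }
        return 0;
    }

    /* Returns nonzero if the two half-open ranges share at least one byte.
     * End addresses are computed in 64 bits so that ranges ending at the
     * top of the 32-bit address space do not wrap around. */
    int regions_overlap(const region *a, const region *b)
    {
        uint64_t a_end = (uint64_t)a->base + a->len;
        uint64_t b_end = (uint64_t)b->base + b->len;

        return (uint64_t)a->base < b_end && (uint64_t)b->base < a_end;
    }

    /* Returns the index of the first region that overlaps a later one,
     * or -1 if all regions are disjoint. */
    int first_overlap(const region *r, int count)
    {
        int i, j;

        for (i = 0; i < count; i++) {
            for (j = i + 1; j < count; j++) {
                if (regions_overlap(&r[i], &r[j])) {
                    return i;
                }
            }
        }
        return -1;
    }
    ```

    Feeding the actual per-core MSMC and DDR3 ranges into first_overlap() would rule out one core's image or heap overwriting another's.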

    I don't know much about GEL files. The code in core 1 (loaded into MSMC) does run since I get the startup printf() from it.

    There already is a significant delay between the create and open calls. That was the workaround for #1 above.

    I'll try running the .out files from the EVM on the quad board. It might be a little easier to just compile the EVM configuration on the Linux box and run that on the quad board.

    Best Regards,
    Fred

  • The problem is the old version of the PDK. Here is how I reached that conclusion using the eval board.

    • I built my test program using two cores for the eval board. The program worked.
    • Then I downloaded MCSDK 2.0.3.15 and built with the PDK 1.0.0.15. The program failed.
    • To make sure that there wasn't a problem with incompatible versions, I used the older versions of all the packages in the old MCSDK. The program still failed.

    So I guess that TI fixed some bugs between MCSDK 2.0.3.15 and 2.0.9.21 that my program tickled.

    Now I have to figure out the details of what Advantech did so I can patch a newer version of the PDK. Maybe Advantech will send out a new release before I'm done.

    Fred