IPC between 2 cores gives invalid region

Other Parts Discussed in Thread: TMS320C6678, SYSBIOS

I am trying to do bidirectional IPC between core 0 and core 1 using MessageQ.  To do this I am creating two IPC channels at the same time, one that is 0->1 and one that is 1->0.  (NOTE: I will eventually want about four simultaneous channels, but need to get two working first!)

Both the 0->1 and 1->0 channels allocate correctly, but the first call to MessageQ_put() on core 0 aborts with the error

ti.sdo.ipc.transports.TransportShm: line 388: assertion failure: A_regionInvalid: Region is invalid

My CFG file is very similar to the Image Processing demo's.  I create a different .OUT for each core.  The MSMC is divided into three partitions: one for core 0, one for core 1, and one for IPC.  Both CFGs have sections like

/* Shared Memory base address and length */
var SHAREDMEM = 0x0c200000;
var SHAREDMEMSIZE = 0x00200000;

SharedRegion.setEntryMeta(0,
{ base: SHAREDMEM,
len: SHAREDMEMSIZE,
ownerProcId: 0,
isValid: true,
name: "MSMCSRAM_IPC",
});

Both MessageQ_alloc() correctly return an address in the 0x0c200000 region.

The Heap is created on core 0 as

/*
* Create the heap that will be used to allocate messages.
*/
HeapBufMP_Params_init(&heapBufParams);
heapBufParams.regionId = 0;
heapBufParams.name = "MSMCSRAM_IPC";
heapBufParams.numBlocks = 20;
heapBufParams.blockSize = sizeof(MessageQ_MsgHeader)+64;
heapHandle = HeapBufMP_create(&heapBufParams);
if (heapHandle == NULL) {
    System_abort("HeapBufMP_create failed\n");
} else {
    printf("%i: Heap %s created\n", 0, heapBufParams.name);
}

/* Register this heap with MessageQ */
MessageQ_registerHeap((IHeap_Handle)heapHandle, 0);
printf("%i: Heap registered\n",0);
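(As an aside, the blockSize above is the 32-byte header plus 64 payload bytes, which is not a multiple of the C66x 128-byte L2 cache line; a standalone sketch of rounding a block up to whole lines follows. MSG_HDR_SIZE and CACHE_LINE here are illustrative stand-ins, not the real IPC constants:)

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; the real values come from the IPC headers
 * and the device documentation. */
#define MSG_HDR_SIZE 32u   /* "header is 32 bytes" per the code below */
#define CACHE_LINE   128u  /* C66x L2 cache line size */

/* Round header + payload up to a whole number of cache lines so one
 * message never shares a line with its neighbour in the heap. */
static size_t aligned_block_size(size_t payload)
{
    size_t raw = MSG_HDR_SIZE + payload;
    return (raw + CACHE_LINE - 1u) & ~(size_t)(CACHE_LINE - 1u);
}
```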

Core 0's functions to open and write to the port are

#define MAX_MSG 64

static void *openControlPort(void *pdv, int size) {
  ProcHandle pd = pdv;
  Port p = calloc(1,sizeof(PortRec));
  int status;
  char port_name[32];
  if (!size) size = MAX_MSG;
  if (size > MAX_MSG) size = MAX_MSG;
  sprintf(port_name,"controls%d",pd->unique);
  printf("%i: MessageQ_open %s\n",0,port_name);
  do {
    status = MessageQ_open(port_name,&p->sqid);
  } while (status < 0);
  printf("%i: MessageQ_alloc %s\n",0,port_name);
  p->smsg = (MessageQ_Msg *)MessageQ_alloc(0,32+size); /* header is 32 bytes */
  sprintf(port_name,"controlr%d",pd->unique);
  printf("%i: MessageQ_create %s %p\n",0,port_name,p->smsg);
  p->rhandle = MessageQ_create(port_name,NULL);
  p->rmsg = malloc(32+size);
  printf("%i: done openControlPort\n",0);
  return p;
}

static void writeAll(void *pp, void *vbuf, int bytes) {
  Port p = pp;
  int offset = 0;
  while (bytes > 0) {
    int wbytes = bytes > MAX_MSG ? MAX_MSG : bytes;
    printf("writeAll %d\n",wbytes);
    memcpy((char *)p->smsg + 32,(char *)vbuf+offset,wbytes);
    while (MessageQ_put(p->sqid,*p->smsg) != MessageQ_S_SUCCESS) printf("ack!\n");
    bytes -= wbytes;
    offset += wbytes;
  }
}
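(Minus the IPC calls, the chunking in writeAll can be checked standalone; this sketch mirrors the loop above with illustrative names:)

```c
#include <assert.h>

#define MAX_MSG 64

static int chunk_sizes[16];   /* records each chunk size, for inspection */

/* Mirrors the writeAll loop: split 'bytes' into MAX_MSG-sized pieces
 * and return how many puts would be issued. */
static int chunk_transfer(int bytes)
{
    int n = 0;
    while (bytes > 0) {
        int wbytes = bytes > MAX_MSG ? MAX_MSG : bytes;
        chunk_sizes[n++] = wbytes;
        bytes -= wbytes;
    }
    return n;
}
```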

Core 1's function to open the port is

void *embOpenControlPort(int size) {
  HostConnection p = calloc( 1, sizeof(HostConnectionRec));
  int status;
  char port_name[32];
  if (!size) size = MAX_MSG;
  if (size > MAX_MSG) size = MAX_MSG;
  sprintf(port_name,"controls%d",embId());
  printf("%i: MessageQ_create %s %d\n",(int)DNUM,port_name,size);
  p->shandle = MessageQ_create(port_name,NULL);
  p->smsg = malloc(32+size);
  sprintf(port_name,"controlr%d",embId());
  printf("%i: MessageQ_open %s\n",(int)DNUM,port_name);
  do {
    status = MessageQ_open(port_name,&p->rqid);
  } while (status < 0);
  printf("%i: MessageQ_alloc %s\n",(int)DNUM,port_name);
  p->rmsg = (MessageQ_Msg *)MessageQ_alloc(0,32+size); /* header is 32 bytes */
  printf("%i: done embOpenControlPort %p\n",(int)DNUM,p->rmsg);
  return p;
}

  • Jim:

    I noticed you don't check the return of MessageQ_registerHeap, nor pass the name as the second parameter.  Can you check what the return value is and consider passing the name?

    Can you check which queue it is trying to put it in?  (p->sqid)?

    Thanks,

    John Demery

  • Hey John

    I re-checked the registerHeap documentation and it says the 2nd parameter is an ID number.  Is that incorrect?

    I checked the return code on core 0 and it said success.

    I printed (int)p->sqid and it gave me 65539.  Not sure what number to expect?

    Thanks

    Jim

  • Jim:

    p->sqid should be a queue number, typically between 700-900 depending on how your queues are set up.

    It looks to me like you are in fact opening the queue before it is created on core 0, as indicated by calling MessageQ_open before MessageQ_create, so it is not actually opening the queue you create for it.  However, you seem to do it correctly on core 1.

    Try fixing that and see if it makes a difference.
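    (Incidentally, a MessageQ_QueueId normally packs the owning core's MultiProc procId in the upper 16 bits and a local queue index in the lower 16, so 65539 (0x10003) would decode as procId 1, queue index 3; a quick sketch of that decode, assuming this packing:)

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MessageQ_QueueId layout: [31:16] owner procId, [15:0] local index. */
static uint16_t queue_proc(uint32_t qid)  { return (uint16_t)(qid >> 16); }
static uint16_t queue_index(uint32_t qid) { return (uint16_t)(qid & 0xFFFFu); }
```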

     

    Sincerely,

    John Demery

  • Hi John

    Let me explain the code a little better. I think it is right. 

    As I said, I am trying to do bidirectional communication, so I am creating two MessageQs, "controlr1" and "controls1".  The "r" is receive and the "s" is send.

    The code on core 0 calls open and alloc on "s".   Then it calls create on "r".

    The code on core 1 calls create on "s".  Then it calls open and alloc on "r".

    So the way the code should work is: core 0 polls on open() for "s", waiting for core 1 to create() "s".  Then core 1 polls on open() for "r", waiting for core 0 to create() "r".  With that in mind, does the code look right?

    Are you suggesting that core 0 can't call open() on "s" before core 1 create()s "s"?  I don't think that's what you're suggesting, but I want to make sure.  If it is, I'm confused and would need more specific help on how to remedy the situation.

    Thanks

    Jim

  • Jim,

    I think you're creating two heaps, both with region ID zero.  This is causing the invalid ID error.

    When you initialize the SharedRegion in the .cfg you're creating a default heap with region ID 0.  Then in your source you attempt to reuse the default shared region heap to create a heapBufMP, again with ID 0.  When you register the heap with MessageQ it fails during allocate because it doesn't know which heap to use.

    If you look at some of the PDK multi-core examples using IPC you'll notice that in some you can register the SharedRegion default heap with MessageQ directly in the following manner:

    #define HEAP_ID 0

    MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);

    I'm not sure which PDK you're using but a good example is in pdk/packages/ti/transport/ipc/examples/qmssIpcBenchmark

    in bench_qmss.cfg the SharedRegion is configured:

    var SharedRegion = xdc.useModule('ti.sdo.ipc.SharedRegion');
    SharedRegion.translate = false;
    SharedRegion.setEntryMeta(0,
        { base: Program.global.shmBase,
          len: Program.global.shmSize,
          ownerProcId: 0,
          isValid: true,
          cacheEnable: cacheEnabled,
          cacheLineSize: cacheLineSize,  /* Aligns allocated messages to a cache line */
          name: "internal_shared_mem",
        });


    In the source, bench_qmss.c, the following process is performed:

    In main():

    1. attachAll() - Ipc_start() called and all cores attached via Ipc_attach()
    2. Every core creates its local queue via MessageQ_create()
    3. BIOS_start() -> tsk0 runs

    In tsk0:

    1. SharedRegion default heap registered with MessageQ -> MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);
    2. Each core opens a remote MessageQ
    3. Test cases executed (includes calls to MessageQ_alloc/put/get/free())

    So again, try removing the HeapBufMP init code and registering the SharedRegion default heap with MessageQ directly.

    Justin


  • Hi Justin

    I am using: pdk_C6678_1_1_2_6

    1) I removed the HeapBufMP_Params_init() and HeapBufMP_create() and altered registerHeap as you suggested, but it still fails in the same place.

    2) I re-checked the image processing demo and it does both: it calls HeapBufMP_create and defines the SharedRegion in the .cfg file.  Not sure why this works in image processing but is a potential problem for me.

    3) Does MessageQ_create have to happen before BIOS_start?

    My order is

    main

    1) Ipc_start

    2) BIOS_start, start master_main()

    master_main

    1) NC_NetStart -> NetworkOpen

    a) registerHeap

    b) MessageQ_create/alloc

    c) Put/Get

    Is there a problem with this order?

    It is not an easy change for me to move the MessageQ_create calls to main().  It means I cannot create MessageQs dynamically, which hurts my project.  (That is, I would have to decide how many MessageQs I want before the program reaches my actual application code.)  Would you please verify this limitation on where MessageQ_create can run before I move forward?

    Thanks

    Jim

  • Jim,

    You can call MessageQ_create at any time.  The problem is centered around the MessageQ_alloc -> MessageQ_put flow.

    I looked closer at your code and I see a pointer discrepancy.

      p->smsg = (MessageQ_Msg *)MessageQ_alloc(0,32+size); /* header is 32 bytes */

    ...


        while (MessageQ_put(p->sqid,*p->smsg) != MessageQ_S_SUCCESS) printf("ack!\n");


    Why are you dereferencing p->smsg when passing it to MessageQ_put?  MessageQ_alloc returns a MessageQ_Msg and MessageQ_put takes a MessageQ_Msg as an argument.  That extra dereference is causing MessageQ_put and all underlying IPC transport layers to see completely wrong MessageQ_Msg packet data.  That's probably the source of your problem, not the heap creation.

    Justin

  • You're illegally (not from a compiler standpoint) casting the MessageQ_Msg returned from MessageQ_alloc to a pointer to a MessageQ_Msg.  Since the return type of MessageQ_alloc is a MessageQ_Msg, your eventual call to MessageQ_put with the "invalid" dereference of the MessageQ_Msg will cause a failure.  Dereferencing something that was never meant to be dereferenced causes MessageQ to validate the provided MessageQ_Msg against the wrong data in memory.

    If you look at the MessageQ APIs you'll see that MessageQ_Msg is actually a typedef'd pointer to a MessageQ_MsgHeader.  Therefore, your "p" structure need only have an element of type MessageQ_Msg, not MessageQ_Msg *.
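    (The pitfall generalizes to any typedef'd pointer; a minimal standalone sketch with a mock message type — no IPC involved, all names illustrative:)

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the IPC pattern: Msg is a typedef'd POINTER, like MessageQ_Msg. */
typedef struct { uint32_t magic; } MsgHeader;
typedef MsgHeader *Msg;

#define MSG_MAGIC 0x1234u

/* Stands in for MessageQ_put's validation: it expects a Msg, i.e. a
 * pointer to the header, and inspects the header through it. */
static int put_ok(Msg m) { return m->magic == MSG_MAGIC; }

static MsgHeader pool;     /* stands in for the shared-memory block from alloc */

static int demo_correct(void)
{
    pool.magic = MSG_MAGIC;
    Msg smsg = &pool;      /* store the Msg itself, not a Msg*              */
    return put_ok(smsg);   /* pass it straight through; no extra dereference */
}
/* Declaring 'Msg *smsg' and calling put_ok(*smsg) would instead hand the
 * validator whatever bytes sit at the start of the header, reinterpreted
 * as a pointer -- exactly the "wrong data in memory" failure described. */
```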

    Justin

  • Hi Justin

    Thanks.  That does get rid of the invalid section error.  However, it still doesn't work.

    As I said above I have a bidirectional IPC.  The 0->1 is fully initialized on both sides.  Core 0 starts putting data on the MessageQ, then hangs waiting to get its first response on the other MessageQ (this is sort of what I expect). 

    The 1->0 is not initialized, however.  Core 0 create()s it.  But core 1 is hanging in MessageQ_open().

    I stepped through an execution with the debugger but it doesn't make any sense to me.  As best I can tell, it gets lost in ti_sdo_ipc_nsremote_NameServerRemoteNotify_get__F() while doing NameServer_getUInt32().  It never makes it out of NameServer_getUInt32().

  • You need to create a MessageQ on both core 0 and core 1, not both on core 0.  Core 0 needs to open Core 1's queue and vice versa.  Core 0 will _put to Core 1's queue and _get from its own.  Core 1 will _put to Core 0's queue and _get from its own.

    MessageQ transaction pseudo code flow I've used many times...

    /* Queue names */
    char core0QName[] = "core0";
    char core1QName[] = "core1";

    void core0pseudo(void)
    {
        MessageQ_Handle core0Q;
        MessageQ_QueueId core1QId;

        Ipc_Start() + attach to core 1...
        
        /* Create a message queue. */
        core0Q = MessageQ_create(core0QName, NULL);    
        if (core0Q == NULL) {
            System_abort("MessageQ_create failed\n" );
        }

        /* Register this heap with MessageQ */
        MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);

        /* Open the 'next' remote message queue. Spin until it is ready. */
        do {
            status = MessageQ_open(core1QName, &core1QId);
        }
        while (status < 0);

        /* send message to core 1 */
        msg = MessageQ_alloc(HEAP_ID, ...);

        MessageQ_put(core1QId, msg);

        /* receive message from core 1 */
        MessageQ_get(core0Q, &msg, MessageQ_FOREVER);

        do stuff...
    }

    void core1pseudo(void)
    {
        MessageQ_Handle  core1Q;
        MessageQ_QueueId core0QId;

        Ipc_Start() + attach to core 0...
        
        /* Create a message queue. */
        core1Q = MessageQ_create(core1QName, NULL);    
        if (core1Q == NULL) {
            System_abort("MessageQ_create failed\n" );
        }

        /* Register this heap with MessageQ */
        MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);

        /* Open the 'next' remote message queue. Spin until it is ready. */
        do {
            status = MessageQ_open(core0QName, &core0QId);
        }
        while (status < 0);

        /* send message to core 0 */
        msg = MessageQ_alloc(HEAP_ID, ...);

        MessageQ_put(core0QId, msg);

        /* receive message from core 0 */
        MessageQ_get(core1Q, &msg, MessageQ_FOREVER);

        do stuff...    
    }

  • Hi Justin

    Thanks for responding, but I think my code fits that.  You can see my code at the top.  

    The order is

    Core 0

    open "r"

    alloc "r"

    create "s"

    Core 1

    create "r"

    open "s"

    alloc "s"

    Core 1 is hanging in "open 's'".  And not hanging in the while loop polling, but hanging in the open function.

    A little backstory: it has taken a lot of effort to get to this point due to the number of TI features I need to work together.  If you work with Arun Mani or John Demery you might get a recap from them about what we're trying to do and what has been done.  One issue: we had to disable cache just to get around a mystery of the program stack getting clobbered.

    Given the vagueness of the error, and that the code is pretty simple/noncontroversial (let me know if you think it is wrong), I think there is something more sinister going on here.

    Is there any reason for open() not to return?  It is built to be nonblocking so this doesn't make any sense.

    Thanks

    Jim

  • Jim,

    MessageQ_open is getting stuck polling the other cores to see if they have the queue that is named "port_name".  MessageQ calls the NameServer layer which sends messages to each core asking if they have the queue with the name provided in the _open call.  The communication is made via the IPC registers and a small region in shared memory managed by SharedRegion.  I'm wondering if disabling caching is throwing off the NameServer transactions between the cores.  I'd expect NameServer to work even if cache is disabled but I've actually never tried this myself.  The fact that you had to disable cache to work around a stack corruption points to something bigger (and more sinister ;)) at work here. 

    You might want to take a step back and pursue the solution to the stack corruption.  A stack corruption points to a larger problem somewhere else that can have any number of effects on your code.  Having to disable cache to work around it suggests you may have a global variable somewhere that isn't aligned and padded to the 128-byte cache line size.  On writeback, the data surrounding the variable will be wiped if the variable isn't aligned and padded.
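    (For reference, one common guard against that failure mode is to force each shared global onto its own cache line; a C11 sketch, assuming the 128-byte line size mentioned above:)

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 128

/* A flag written by one core and read by another: align it to a line and
 * pad the type to a full line so a writeback can't clobber its neighbours. */
typedef struct {
    alignas(CACHE_LINE) volatile uint32_t value;
    uint8_t pad[CACHE_LINE - sizeof(uint32_t)];
} SharedFlag;

static SharedFlag g_flag;
```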

    Going back to your code, I think the flow is okay.  It seems weird to me that you're creating core 0's queue after core 1 starts to try to open it, but that's just personal preference.  If MessageQ were working correctly, MessageQ_open would simply keep returning a failure status until core 0 creates the queue.  The corruption issues, cache-disable workaround, and the NameServer layer not returning point to a bigger problem than your program flow.  I'd work to resolve the corruption so that cache can be re-enabled.

    Justin

  • For what it's worth, the original problem occurred not in "my" code, but in NDK initialization.  So you'd assume any global variables were set up correctly.  The problems happened entirely before my code ran.

    The explanation I was given was that I was using half of L2 for SRAM and half for cache, and sometimes program memory crept into the cache side of L2; and that I could go back and give L2 1/4 or 1/8 for cache and it should be no problem.

    I am not sure what the next step is. Perhaps give L2 a small cache and see.

    Jim

  • Is there any reason why you're not using the entire L2 as SRAM?

  • That's what I meant by disabling cache.  Here's my custom platform

    /*!
     * File generated by platform wizard. DO NOT MODIFY
     */

    metaonly module Platform inherits xdc.platform.IPlatform {

        config ti.platforms.generic.Platform.Instance CPU =
            ti.platforms.generic.Platform.create("CPU", {
                clockRate: 1000,
                catalogName: "ti.catalog.c6000",
                deviceName: "TMS320C6678",
                customMemoryMap:
                [
                    ["MSMCSRAM_MASTER", {
                        name: "MSMCSRAM_MASTER",
                        base: 0x0c000000,
                        len:  0x00180000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["L2SRAM", {
                        name: "L2SRAM",
                        base: 0x00800000,
                        len:  0x00080000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["MSMCSRAM_SLAVE", {
                        name: "MSMCSRAM_SLAVE",
                        base: 0x0c180000,
                        len:  0x00080000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["MSMCSRAM_IPC", {
                        name: "MSMCSRAM_IPC",
                        base: 0x0c200000,
                        len:  0x00200000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["DDR3", {
                        name: "DDR3",
                        base: 0x80000000,
                        len:  0x10000000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["DDR3_IPC", {
                        name: "DDR3_IPC",
                        base: 0x90000000,
                        len:  0x10000000,
                        space: "code/data",
                        access: "RWX",
                    }],
                ],
                l2Mode:  "0k",
                l1PMode: "32k",
                l1DMode: "32k",
            });

    instance :

        override config string codeMemory  = "L2SRAM";
        override config string dataMemory  = "L2SRAM";
        override config string stackMemory = "L2SRAM";

    }

  • Arun reviewed my code and suggested I might be missing code like this:

     

    memset(slave_queue_name, 0, MAX_SLICES*16);

    for (i = 0; i < MAX_SLICES; i++) {
        GET_SLAVE_QUEUE_NAME(slave_queue_name[i], i);
    }

    return 0;

     

    Does the name server need to know this?

  • NameServer is under the hood of IPC.  You shouldn't have to configure anything regarding it.  I have no idea what the code Arun posted relates to.

    Why were you splitting the L2 into cache and SRAM in the first place?  Was that what the image processing demo was doing?  I'm not extensively familiar with the internals of the demo.

    I conferred with another developer and if you're concerned about the L2 configuration we suggest you do the opposite of my original suggestion and set the entire L2 to cache.  Especially if you're placing code and data into DDR, which I think the image processing demo does.

    I was reviewing some of your other threads regarding your app and it seems you had something working at some point in time.  What was that something and at what point did things start going awry? 

    I'm still holding to my statement that you should be pursuing the stack corruption bug before taking on this IPC bug.  Even if it's in NDK somewhere, there's still something inherently wrong with the app, especially if you were still seeing the system crash due to an instruction fetch with the cache enabled (referenced from http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/261296/915541.aspx#915541)

  • The Image processing demo splits L2 in half and I was just copying that.  It maps some program components to L2, and I believe some of that mapping is required.  It doesn't use DDR except for the NDK heap.

    I got one core working a long while ago (with NDK comm to host) but have not had any luck getting NDK and IPC to work together.  Things went awry when I tried to use NDK and IPC together.

  • I know this isn't what you specifically asked me to do but thought I'd update anyway.  I enabled 128 KB of cache in L2 to see if it would make a difference, and it didn't.  The behavior seems to be the same: it hangs sometime after calling MessageQ_alloc and doesn't return.

    FWIW, if I just break into core 1 the call stack is:

    block$56(void *, unsigned int, unsigned short, unsigned int *)() at Cache.c:718 0x0C1B0AD0
    ti_sysbios_family_c66_Cache_inv__E(void *, unsigned int, unsigned short, unsigned short)() at Cache.c:654 0x0C1C7114
    ti_sysbios_hal_Cache_CacheProxy_inv__E(void *, unsigned int, unsigned short, unsigned short)() at image_processing_evmc6678l_slave_pe66.c:31,303 0x0C1C4F00
    ListMP_getHead(struct ListMP_Object *)() at ListMP.c:339 0x0C198C54
    0x0000FFFF  (no symbols are defined for 0x0000FFFF)

    So there are some caching functions getting called; not sure what to make of it.  I guess one takeaway is that it didn't fail earlier: it still passed NDK initialization, etc., with the smaller cache.
  • Have you verified that the IPC components shared between cores have all been placed in the following memory region, according to your linker command file?

    ["MSMCSRAM_IPC",
    {
    name: "MSMCSRAM_IPC",
    base: 0x0c200000,
    len: 0x00200000,
    space: "code/data",
    access: "RWX",
    }

    In the stack trace you provided above, ListMP_getHead (and all sub-functions) are trying to operate on pointers outside this range.

    I'll admit these are all just shots in the dark.  I'm not that familiar with the image processing demo or NDK.

    With all this partitioning going on within L2, MSMC, and DDR it's pretty clear this is a case of NDK and IPC stepping on one another.  I'll try to get some more information on how NDK is configured and what resources it requires (memory, and other) in order to operate.  At that point I'll try to evaluate where the conflicts may be occurring with IPC.

    In the meantime, if you can revert to that older version you had working, I'd suggest verifying NDK to host still works.  Then try adding a very simple local IPC (for example, just starting IPC and creating a queue).  Approach the problem piecemeal, adding small bits of IPC until you see a problem.  That might help us narrow down what's causing the issue.

    Justin

  • Justin,

    Thanks so much for helping Jim.  I am not familiar with IPC, but I have been talking with Jim for a while now.  I want to give the full background of the issue.

    Jim can see that the data are sent via IPC from core 0 to core 1, but the first message we allocated on core 1 threw up an error.

    Are you saying that all the messages we create should be in MSMCSRAM_IPC only?  Jim has two other sections in the MSMC, one for the slave (core 1) and one for the master (core 0).  Should they not use those?  Thanks,

    Arun 

  • Perhaps this is the problem.  (But how do I fix it)

    After core 0 sets up 0->1 and 1->0, it starts writing (MessageQ_put) to 0->1.

    It does a sequence of about 10 writes before it starts waiting on a read (MessageQ_get) from 1->0.

    When I comment out the puts, and have core 0 just wait there forever, then core 1 doesn't get lost in neverland.

    It is easy to add handshaking (so for each 0->1 I also wait for a small return receipt from 1->0) but is this the right approach for MessageQ API?

  • Hi jim,

    I think the problem is that you need to sync between the cores before you start sending messages.  You can use Notify to make sure the handshake happens and then send the messages.

    Is it possible to test one thing?  Guard the send loop with a variable, say while (test == 1).  Set test = 1 before the loop, and after you see core 1 start to wait, change test to 2 in the debugger and see if the messages go in both directions.

    thanks,

    arun.

  • I tried a similar test (that I understood how to implement more easily).  Since core 1 finishes last, I put a 1->0 write at the end of core 1 so core 0 would wait for core 1.  Unfortunately I get another "region is invalid" error in this put, running on core 1:

    void embWriteHostControl(void *handle, void *buf, int n) {
    #if USE_MESSAGE_Q
      HostConnection p = handle;
      int offset = 0;
      while (n > 0) {
        int wbytes = n > MAX_MSG ? MAX_MSG : n;
        printf("%i: writeHostControl %d\n",(int)DNUM,wbytes);
        memcpy((char *)p->smsg+32,(char *)buf+offset,wbytes);
        while (MessageQ_put(p->sqid,p->smsg) != MessageQ_S_SUCCESS);
        n -= wbytes;
        offset += wbytes;
      }
      printf("done embWriteHostControl\n");
    #endif
    }

    embWriteHostControl(p,&id,sizeof(int));

    And here is the log

    Connected!
    0: MessageQ_open controls1
    0: MessageQ_alloc controls1 65539
    0: MessageQ_create controlr1 c204a80
    0: done openControlPort 80032720
    readAll 4
    [C66xx_1] 1: MessageQ_alloc controlr1 5
    1: done embOpenControlPort c204b00
    1: writeHostControl 4
    ti.sdo.ipc.transports.TransportShm: line 388: assertion failure: A_regionInvalid: Region is invalid
    xdc.runtime.Error.raise: terminating execution

    So now it gets through create/open/alloc for both 0->1 and 1->0, but it fails on the first 1->0 put.  (No 0->1 traffic tried yet.)

    According to the log, both the 0->1 and 1->0 messages are correctly in MSMCSRAM_IPC, at 0xc204a80 and 0xc204b00.

  • The sync points should be the MessageQ_open() calls.  If a remote core can successfully open a MessageQ via the MessageQ_open() API, that remote core should be able to send a message via the returned queue ID without a problem.

    Jim, can you try this flow:

    core 0:

    create "control_rec"

    open "control_send" ==> ret control_send_id

    alloc smsg put (control_send_id)

    core 1:

    create "control_send"

    open "control_rec" ==> ret control_rec_id

    alloc rmsg

  • ti.sdo.ipc.transports.TransportShm: line 388: assertion failure: A_regionInvalid: Region is invalid
    xdc.runtime.Error.raise: terminating execution

    This error means the message was allocated from an invalid SharedRegion.  Is it possible to put a breakpoint in the TransportShm_put function to see where the message is allocated from?  Or could it be that the SharedRegion on this core isn't correctly set up?
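    (Conceptually, the transport is asking SharedRegion which region owns the message's address and asserting when the answer is "none"; a simplified standalone sketch of that lookup, hard-coding the region 0 entry from the .cfg above:)

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the .cfg entry for region 0 (MSMCSRAM_IPC). */
#define SR0_BASE 0x0c200000u
#define SR0_LEN  0x00200000u

/* Return the id of the region containing 'addr', or -1 when no region
 * matches -- roughly what triggers A_regionInvalid in the transport. */
static int region_id_of(uint32_t addr)
{
    if (addr >= SR0_BASE && addr < SR0_BASE + SR0_LEN)
        return 0;
    return -1;
}
```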

    Judah

     

  • Jim,

    Make sure your p->smsg points to a valid MessageQ_Msg handle.  In your code you're copying something from buf to p->smsg+32 but then passing p->smsg to MessageQ_put.  If buf contains the entire MessageQ_Msg, header plus additional data, it will fail because the header data will be invalid.  Also, p->smsg originates from a handle you've passed into the function.  Could this handle be stale?  If you freed the MessageQ_Msg at any point, any subsequent _put()s of that message will fail.

    Justin

  • Hi Justin,

    I've gotten a bit further since Friday, so both sides now get past initialization without any region errors.

    However, I'm not getting reliable transfers of large buffers.  I made the MessageQ_alloc big enough for any data I want.  The small transfers are okay, but in the big ones some of the data is different (corrupted during the sending process).  It's reliably wrong, as well (the same values are always wrong in the same way).  To verify, I'm casting to int and printing all the ints.  I'll paste a log below.

    Additionally it seems if I use printf a lot, eventually I get a Heap error

    ti.sysbios.heaps.HeapMem: line 294: out of memory: handle=0x827cd0, size=200008

    Is there anything to look out for when using MessageQ_alloc to create the buffer and the header together?

    Jim

    write (4): 53309 
    [C66xx_1] read (4): 53309 
    [C66xx_0] write (20): 2 6 0 1 128 
    write (64): 0 200000 0 0 0 0 0 0 0 22104 371 187 0 48 98 0 
    [C66xx_1] read (20): 2 6 0 1 128 
    read (64): 0 200000 0 0 0 0 0 0 0 22104 371 187 0 48 98 0 
    [C66xx_0] write (4): 8 
    write (8): 1634100580 7629941 
    [C66xx_1] read (4): 8 
    read (8): 1634100580 7629941 
    read (0): 
    [C66xx_0] write (748): -2147483636 -2147483636 -2147483641 1610612744 14 -2147483646 1610612742 7 -2147483645 1610612745 12 1610612761 5 4 4 1610612753 6 4 -2147483642 5 1610612752 6 1610612742 11 13 7 -2147483647 6 11 2 -2147483646 6 -2147483646 4 -2147483647 2 2 -2147483645 1610612741 16 1610612751 9 -2147483642 5 -2147483646 1610612746 11 7 -2147483647 1610612756 6 6 3 1610612741 21 -2147483645 9 15 7 -2147483645 1610612746 4 4 -2147483646 3 3 3 2 1610612737 1073741830 1352 -32 1610612738 40 -2147483646 1610612744 10 536870921 16777216 -2147483643 1073741830 5 -1 -2147483643 1610612738 4 -2147483646 -2147483646 -2147483647 2 -2147483645 -2147483648 1 5720 1024 0 1 5720 1024 0 1 5720 8192 13912 1 5720 13912 1624 1024 0 1 1624 1024 5720 1024 0 1024 1452 1448 1488 136 1624 1024 1 1 1464 1456 1434 1 568 6 1072 2 1424 10000 1472 572 1 1072693248 1416 584 0 0 1192 1192 1 128 1434 1432 20480 1 1 1 1 1 1436 1440 3 1444 1096 96 5 0 2 16 1 96 72 32 0 16 48 0 0 0 0 3211264 0 128 3 1633906540 108 1634100580 7629941 1651470960 1949266789 7368037 
    write (4): 8 
    [C66xx_1] read (748): -2147483636 -2147483636 -2147483641 1610612744 14 -2147483646 1610612742 7 0 22104 371 187 0 48 98 0 -164466298 -369318196 1299506112 -1915922051 -1745438154 -1641442173 -1357592620 -247661771 13 7 -2147483647 6 11 2 -2147483646 6 -2147483646 4 -2147483647 2 2 -2147483645 1610612741 16 1610612751 9 -2147483642 5 -2147483646 1610612746 11 7 -2147483647 1610612756 6 6 3 1610612741 21 -2147483645 9 15 7 -2147483645 1610612746 4 4 -2147483646 3 3 3 2 1610612737 1073741830 1352 -32 1610612738 40 -2147483646 1610612744 10 536870921 16777216 -2147483643 1073741830 5 -1 -2147483643 1610612738 4 -2147483646 -2147483646 -2147483647 2 -2147483645 -2147483648 1 5720 1024 0 1 5720 1024 0 1 5720 8192 13912 1 5720 13912 1624 1024 0 1 1624 1024 5720 1024 0 1024 1452 1448 1488 136 1624 1024 1 1 1464 1456 1434 1 568 6 1072 2 1424 10000 1472 572 1 1072693248 1416 584 0 0 1192 1192 1 128 1434 1432 20480 1 1 1 1 1 1436 1440 3 1444 1096 96 5 0 2 16 1 96 72 32 0 16 48 0 0 0 0 3211264 0 128 3 1633906540 108 1634100580 7629941 1651470960 1949266789 7368037 
    read (4): 8 
    [C66xx_0] write (8): 1634100580 7629941 
    write (392): -2147483647 1610612742 536870936 1 536870915 5720 -2147483647 536870921 1 -2147483609 536870926 1 -2147483625 1610612739 1073741830 1072 4 1073741830 1352 -32 1073741833 1100 32 -2147483639 -2147483648 4 20 5720 36 44 13912 52 56 13912 60 1624 76 1624 84 5720 100 1452 104 1448 108 1488 116 1624 140 1464 144 1456 148 1434 196 568 204 1072 216 1424 300 0 0 376 1472 400 572 504 1416 568 584 588 1192 592 1192 604 128 608 1434 612 1432 900 1436 904 1440 936 1444 940 1096 96 0 16 96 72 32 0 16 48 
    [C66xx_1] read (8): 1634100580 7629941 
    read (392): -2147483647 1610612742 536870936 1 536870915 5720 -2147483647 536870921 0 22104 371 187 0 48 98 0 -164466298 -369318196 1299506112 -1915922051 -1745438154 -1641442173 -1357592620 -247661771 13 7 -2147483647 6 11 2 -2147483646 6 -2147483646 4 -2147483647 2 2 -2147483645 1610612741 16 1610612751 9 -2147483642 5 -2147483646 1610612746 11 7 -2147483647 1610612756 6 6 3 1610612741 21 -2147483645 9 15 7 -2147483645 1610612746 4 4 -2147483646 3 3 3 2 1610612737 1073741830 1352 -32 1610612738 40 -2147483646 1610612744 10 536870921 16777216 -2147483643 1073741830 5 -1 -2147483643 1610612738 4 -2147483646 -2147483646 -2147483647 2 -2147483645 -2147483648 1 5720 1024 0 1 5720 
    [C66xx_0] write (24): 0 4 3 1 5 2 
    write (4): 0 
    [C66xx_1] read (24): 0 4 3 1 5 2

  • Could this be a cache issue? Do I need to invalidate or flush for large data in MSMC?

  • For the sake of consistency, have you tried defining your MessageQ_Msg messages with a struct typedef instead of pointer offsets?

    #define MAX_PAYLOAD_SIZE_BYTES N

    typedef struct {
      MessageQ_MsgHeader header;      /* 32 bytes */
      uint8_t            payload[MAX_PAYLOAD_SIZE_BYTES];
      uint8_t            pad[128 - MAX_PAYLOAD_SIZE_BYTES - sizeof(MessageQ_MsgHeader)]; /* pad to the 128-byte cache line size */
    } TstMsg;

    Messages can now be allocated with sizeof(TstMsg):

        MessageQ_Msg     msg;

        /* MessageQ_alloc will return a buffer that is cache line-aligned */
        msg = MessageQ_alloc(HEAP_ID, sizeof(TstMsg));

        MessageQ_put(remoteQueueId, msg);

    On reception, the payload data can be accessed by casting back to TstMsg:

        TstMsg *rcvMsg;

        MessageQ_get(messageQ, (MessageQ_Msg *) &rcvMsg, MessageQ_FOREVER);

        if (rcvMsg->payload[0] == ...) {

            ...

        }

    This might be cleaner than tracking the MessageQ_MsgHeader and payload separately through offsets, and it may help you track down the incoherent data issue.

    Justin

  • Thanks for the response but perhaps I'm not describing the code enough. 

    At this point all my messages are size 1024.  I use one messageQ for all IPC in the current test.  If it's 4 bytes I only fill in 4+32 bytes.  If it's 748 bytes then I fill in 748+32 bytes.

    The same message Q is used for all the items in the log.

    I am copying data into and out of the Message Q buffer on both sides.  Before I put() I copy data into it.  After I get(), I copy data out of it. I put in extra handshaking to make sure I don't do the next put() before get() is done.

    The behavior is that the first 64 bytes are always right but after that they are reliably unreliable (wrong in the same way each time).

    To verify: I should not have to worry about the cache when using MessageQ? The fact that the correctness comes in 64-byte chunks has me worried about this, so I want to make sure. There are platforms where I have to invalidate the cache before sending, etc., so I wouldn't be too surprised, despite not seeing it in the TI docs/examples.

    If you stare at the data above, the yellow side is send and the green side is receive. If the cache line is 64 bytes, the first line is good, but after that it isn't. Note that the green parts actually match in the two long transfers. My reading is that there is probably still stale data in MSMC when core 1 tries to read, and that it is a cache-flushing issue.

    Is my platform_osal.c wrong? I copied this from some template project, but see now that it has cache stuff in it that is largely not filled in.

    Thanks

    Jim

  • FWIW I also tried moving the MessageQ to DDR instead of MSMC, and got similar results. So I don't think the messages are getting clobbered.

    Unfortunately when I moved the MessageQ to DDR, printf from core1 stopped working so I don't know how they were wrong, just that they were wrong. I assume they were wrong in the same way, but that's an assumption.

  • Jim,

    You don't have to do any cache operations on the buffers you give to MessageQ_put() because MessageQ will internally perform the right cache operations for you.

    From your description, you are actually copying data into the MessageQ buffer.  Now, where are you copying the data from?  Is it from another buffer that is only ever used by a single core?  If so, I don't see a need for a cache invalidate of that buffer before you receive the next data.  If you are getting the data from a DMA or another peripheral, then you should invalidate the buffer first.

    Judah

  • Jim,

    It wasn't clear to me whether you do multiple MessageQ_alloc() calls, or a single call and just reuse the same message.

    If you did a single MessageQ_alloc() and reuse the buffer, you need to realize that MessageQ will think the message size is whatever the original allocation size was... that is, unless you actually change the message's msgSize field every time you reuse the message.
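    To make that concrete, here is a minimal on-target sketch (not runnable off the DSP; HEAP_ID is a placeholder). MessageQ_alloc() records the allocation size in the message header, and the transport later uses that size for its cache writeback, so allocating once at the maximum size and reusing the message covers the whole buffer:

```c
/* Sketch only: allocate once at the maximum size and reuse.
 * The msgSize recorded by MessageQ_alloc() is what the transport's
 * Cache_wbInv() call uses, so the whole 1024 bytes are written back
 * on every MessageQ_put() even if only a few bytes were filled in. */
MessageQ_Msg msg = MessageQ_alloc(HEAP_ID, 1024);
System_printf("msgSize = %d\n", MessageQ_getMsgSize(msg)); /* should report 1024 */
```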

    Judah

  • The data is coming from NDK.  I have sockets Intel->core0. Then IPC core0->core1.

    Is it possible that core 0 has stale data, and the NDK has not flushed yet?

    If NDK is stale it might explain part of it, but it doesn't explain why the "read" and "write" printf's differ.

    Jim

  • I reuse the same message. It is 1024 bytes

    I tried to simplify the code by just making the buffer bigger than any size I will ever send.

    I am not telling it how many bytes of data are valid at any point.  It should only know the size based on the MessageQ_alloc, which is the biggest message I will send.

    Jim

  • Jim,

    Sounds to me like you need to invalidate the cache for the buffer that receives the data from "Intel" before it receives the data.  If you don't invalidate that buffer, I think you would be getting old data, because core 0 will have it in the cache and won't see the new data.
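    A sketch of that suggestion, assuming the SYS/BIOS Cache API with the same signature family as the Cache_wbInv() call quoted elsewhere in this thread (rxBuf and RX_BUF_SIZE are placeholders):

```c
#include <ti/sysbios/hal/Cache.h>

/* Sketch: invalidate the landing buffer before new data arrives so the
 * CPU re-reads fresh data from memory instead of stale cache lines */
Cache_inv(rxBuf, RX_BUF_SIZE, Cache_Type_ALL, TRUE);
```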

    Judah

  • Hi Judah,

    I gave you bad info. The data is not coming from the socket, it is generated in core 0.  So if there is a cache problem it pretty much has to be in IPC.

    Jim

  • Jim,

    Sorry, I didn't quite understand what you mean.  If I understand you correctly, it's like this:

    You do a MessageQ_alloc(), then you populate the message, then you call MessageQ_put() on the message?  As long as the SharedRegion from which the message is allocated has 'cacheEnable' set to 'TRUE', then when you do a MessageQ_put() the transport should do a cache writeback-invalidate of the message.  At least that is true for all the MessageQ transports that were originally written.

    Do you know which MessageQ Transport you are using?  Is it TransportShm or TransportShmCirc or some other one?

    Judah

  • Yes, my code is basically

    form big array local to core 0

    MessageQ_open / MessageQ_alloc

    copy data from big array into MSMC ptr

    MessageQ_put

    There's some fluff around it to do handshaking and set stuff up but that's the gist of it.
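    The gist above, as a rough on-target sketch (queue name, HEAP_ID, and sizes are placeholders; error handling and the handshaking are omitted):

```c
uint8_t bigArray[992];                          /* local to core 0 */
/* ... fill bigArray ... */

/* open the remote queue and allocate one message from the shared-region heap */
MessageQ_QueueId remoteQueueId;
MessageQ_open("CORE1_MSGQ", &remoteQueueId);    /* placeholder queue name */
MessageQ_Msg msg = MessageQ_alloc(HEAP_ID, 1024);

/* copy the payload into the MSMC message just past the 32-byte header */
memcpy((uint8_t *)msg + sizeof(MessageQ_MsgHeader), bigArray, sizeof(bigArray));

/* the transport performs the cache writeback-invalidate inside the put */
MessageQ_put(remoteQueueId, msg);
```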

    My SharedRegion is

    /* Shared Memory base address and length */
    var SHAREDMEM = 0x0c200000;
    var SHAREDMEMSIZE = 0x00200000;

    SharedRegion.setEntryMeta(0,
    { base: SHAREDMEM,
    len: SHAREDMEMSIZE,
    ownerProcId: 0,
    isValid: true,
    name: "MSMCSRAM_IPC",
    });

    I welcome any suggestion on what to put here.  Is the default for cacheEnable false?  If it is false, am I not guaranteed correct/current data?
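    For what it's worth, the entry can set the flag explicitly; a sketch of the same entry with cacheEnable spelled out (assuming the standard SharedRegion entry fields; true is also the documented default):

```javascript
SharedRegion.setEntryMeta(0,
{ base: SHAREDMEM,
  len: SHAREDMEMSIZE,
  ownerProcId: 0,
  isValid: true,
  cacheEnable: true,   /* explicit, though true is the default */
  name: "MSMCSRAM_IPC",
});
```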

    I don't know which transport I am using, how do I find out?

    Thanks

    Jim

  • Are you using CCS as your debugger?  If so, you should be able to see the message in a memory browser and also see whether it is being cached or not (I'm pretty sure it is being cached).  You should also be able to see that after you do a MessageQ_put() the message is no longer in the cache.

    One of the earlier posts stated that you had an issue in TransportShm, so that's likely the MessageQ transport you are using.  The SharedRegion default for 'cacheEnable' is true, so you should be okay there.  If it were 'false', your data could be old, stale data.  If the cache operations are not happening, then core 1 would not see the correct data, which is in core 0's cache; that is why core 0 needs to write back and then invalidate the message in the cache.

    TransportShm_put() does this as the very first thing:

        /* writeback invalidate the message */
        if (SharedRegion_isCacheEnabled(id)) {
            Cache_wbInv(msg, ((MessageQ_Msg)(msg))->msgSize, Cache_Type_ALL,
                TRUE);
        }

    Furthermore, I don't know if you are aware, but if you are using the CCS debugger there is a tool called ROV which can help with debugging.  You can run it under 'Tools->ROV'.

    Judah

  • Stepping through isn't working. I put a breakpoint on core 0 in my function that calls MessageQ_put.

    Start running

    • small transfer, press F8
    • small transfer, press F8
    • small transfer, press F8
    • lost in the ListMP
    Press Pause and I get this stack

    block$56(void *, unsigned int, unsigned short, unsigned int *)() at Cache.c:711 0x0C074ECA
    ti_sysbios_family_c66_Cache_inv__E(void *, unsigned int, unsigned short, unsigned short)() at Cache.c:654 0x0C0A3BB4
    ti_sysbios_hal_Cache_CacheProxy_inv__E(void *, unsigned int, unsigned short, unsigned short)() at image_processing_evmc6678l_master_pe66.c:32,827 0x0C0A0F00
    ListMP_getHead(struct ListMP_Object *)() at ListMP.c:361 0x0C04AD48
    0x0000FFFF (no symbols are defined for 0x0000FFFF)

    I'm going to try a few other things but thought I should give an update in case it made any sense.

  • Jim,

    Try this:

    1.  Put a breakpoint on MessageQ_put

    2.  Run to the breakpoint.

    3.  Open the Memory Browser and set it to the msg.  There should be an indication via different colors whether it's in cache.

    4.  Put a breakpoint on the address in register B3.

    5.  Run til you get to the breakpoint (address in B3).  Verify that your msg is no longer in cache.

    Judah

  • Hi Judah

    Steps 4 and 5 didn't work for me (didn't break at the addr) but another issue:

    When I break at step 3, it shows me 64 bytes are in L1D. I want to know if the whole 1024 bytes are in L1D. Is this telling me they're not, that only 64 are? (My problem has been that only the first 64 bytes are guaranteed to be right.)

    Address 0x0c204ac0 is the first white address after the msg starting at 0x0c204a80 = 64 bytes.

    Perhaps if I step it will show me each line getting read in line-by-line, but I need to rerun to see that, so I'll post this first.

    Jim

    I single-stepped a bit and the only bytes highlighted were the first 64 (eventually it decides it doesn't want to single-step anymore and just starts running).

    Is there some MessageQ function I can break at inside MessageQ_put to see when bytes 65-1024 get put in L1D?

    Thanks

    Jim

  • My version numbers if it matters

    opt/ti/ipc_1_24_03_32/packages;

    /opt/ti/bios_6_33_06_50/packages;

    /opt/ti/mcsdk_2_01_02_06;

    /opt/ti/imglib_c66x_3_1_1_0/packages;

    /opt/ti/uia_1_01_01_14/packages;

    /opt/ti/pdk_C6678_1_1_2_6/packages

  • One small point is that I have a lot of different messages going through this same IPC buffer. Early ones are 4-20 bytes. Some are closer to the full buffer size of 992. 

    The information above on page 3 is for the 1st transfer, which is only 4 bytes.  That is, the memcpy before the put() is size 4.

    I should try to break on the big one if I can figure out how to. (It won't let me step to the big one.)

    I mention this because perhaps only the 1st 64bytes are lit in blue because I am only updating the memory in that space. I need to force an example where I can break on where I write to the whole buffer.

  • I zero-filled the buffer just to make sure, and I get the same behavior that only the first 64 bytes are highlighted as in L1D.

  • Jim,

    I would have thought zero-filling the whole buffer would have brought the whole buffer into the cache.  Can you try filling the whole buffer with a different value, like 0x1234?  If the buffer memory is cacheable, then when you fill it you should see it in the cache before any MessageQ_put().  You should see that the MessageQ_put() removes it from the cache.  So it's not quite making sense why only 64 bytes of the buffer are cached.

    Why aren't steps 4 & 5 working for you?  When you get to the MessageQ_put() function, B3 contains the return address.  At the end of the MessageQ_put() function you will see a "B  B3" or equivalent, meaning branch to the address contained in B3.  You can put a breakpoint on this "B  B3" if that's easier.

    4.  Put a breakpoint on the address in register B3.

    5.  Run til you get to the breakpoint (address in B3).  Verify that your msg is no longer in cache.

    Judah 

  • Hi Judah,

    I didn't interpret what I was seeing correctly. I was looking at the address for 1->0 instead of 0->1. 

    In order to handshake/synchronize I do both 1->0 and 0->1 sequentially. Simple/naive way to prevent the next 0->1 from happening when there is still live data in MSMC not copied out yet.

    The 1->0 buffer was in L1D, presumably because the 1->0 message was received right before the memcpy into the 0->1 buffer.

    If I look at the actual address for 0->1, the data is never in L1D. It is always white.

    For the transfers I am able to see, the data is all correct in MSMC, including my constant fill of 43211234. However, I don't get to the big transfer, the one that is wrong. It gets lost in other ListMP code and doesn't make it back to the next breakpoint.

    I need to add more checking to core 1 to see if the padded sends are right but if you have any thoughts at this point let me know.

    Thanks
    Jim