IPC between 2 cores gives invalid region

Other Parts Discussed in Thread: TMS320C6678, SYSBIOS

I am trying to do bidirectional IPC between core 0 and core 1 using MessageQ.  To do this I am creating two IPC channels at the same time, one that is 0->1 and one that is 1->0.  (NOTE: I will eventually want about four simultaneous channels, but need to get two working first!)

Both the 0->1 and 1->0 channels allocate correctly, but the first call to MessageQ_put() on core 0 aborts with the error

ti.sdo.ipc.transports.TransportShm: line 388: assertion failure: A_regionInvalid: Region is invalid

My CFG file is very similar to the Image Processing demo's.  I create a different .OUT for each core.  The MSMC is divided into three partitions: one for core 0, one for core 1, and one for IPC.  Both CFGs have sections like

/* Shared Memory base address and length */
var SHAREDMEM = 0x0c200000;
var SHAREDMEMSIZE = 0x00200000;

SharedRegion.setEntryMeta(0,
{ base: SHAREDMEM,
len: SHAREDMEMSIZE,
ownerProcId: 0,
isValid: true,
name: "MSMCSRAM_IPC",
});

Both MessageQ_alloc() correctly return an address in the 0x0c200000 region.

The Heap is created on core 0 as

/*
* Create the heap that will be used to allocate messages.
*/
HeapBufMP_Params_init(&heapBufParams);
heapBufParams.regionId = 0;
heapBufParams.name = "MSMCSRAM_IPC";
heapBufParams.numBlocks = 20;
heapBufParams.blockSize = sizeof(MessageQ_MsgHeader)+64;
heapHandle = HeapBufMP_create(&heapBufParams);
if (heapHandle == NULL) {
    System_abort("HeapBufMP_create failed\n");
} else {
    printf("%i: Heap %s created\n", 0, heapBufParams.name);
}

/* Register this heap with MessageQ */
MessageQ_registerHeap((IHeap_Handle)heapHandle, 0);
printf("%i: Heap registered\n",0);
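(As an aside, the blockSize above is the 32-byte header plus 64 payload bytes, which is not a multiple of the C66x 128-byte L2 cache line; a standalone sketch of rounding a block up to whole lines follows. MSG_HDR_SIZE and CACHE_LINE here are illustrative stand-ins, not the real IPC constants:)

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; the real values come from the IPC headers
 * and the device documentation. */
#define MSG_HDR_SIZE 32u   /* "header is 32 bytes" per the code below */
#define CACHE_LINE   128u  /* C66x L2 cache line size */

/* Round header + payload up to a whole number of cache lines so one
 * message never shares a line with its neighbour in the heap. */
static size_t aligned_block_size(size_t payload)
{
    size_t raw = MSG_HDR_SIZE + payload;
    return (raw + CACHE_LINE - 1u) & ~(size_t)(CACHE_LINE - 1u);
}
```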

Core 0's functions to open and write to the port are

#define MAX_MSG 64

static void *openControlPort(void *pdv, int size) {
  ProcHandle pd = pdv;
  Port p = calloc(1,sizeof(PortRec));
  int status;
  char port_name[32];
  if (!size) size = MAX_MSG;
  if (size > MAX_MSG) size = MAX_MSG;
  sprintf(port_name,"controls%d",pd->unique);
  printf("%i: MessageQ_open %s\n",0,port_name);
  do {
    status = MessageQ_open(port_name,&p->sqid);
  } while (status < 0);
  printf("%i: MessageQ_alloc %s\n",0,port_name);
  p->smsg = (MessageQ_Msg *)MessageQ_alloc(0,32+size); /* header is 32 bytes */
  sprintf(port_name,"controlr%d",pd->unique);
  printf("%i: MessageQ_create %s %p\n",0,port_name,p->smsg);
  p->rhandle = MessageQ_create(port_name,NULL);
  p->rmsg = malloc(32+size);
  printf("%i: done openControlPort\n",0);
  return p;
}

static void writeAll(void *pp, void *vbuf, int bytes) {
  Port p = pp;
  int offset = 0;
  while (bytes > 0) {
    int wbytes = bytes > MAX_MSG ? MAX_MSG : bytes;
    printf("writeAll %d\n",wbytes);
    memcpy((char *)p->smsg + 32,(char *)vbuf+offset,wbytes);
    while (MessageQ_put(p->sqid,*p->smsg) != MessageQ_S_SUCCESS) printf("ack!\n");
    bytes -= wbytes;
    offset += wbytes;
  }
}
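(Minus the IPC calls, the chunking in writeAll can be checked standalone; this sketch mirrors the loop above with illustrative names:)

```c
#include <assert.h>

#define MAX_MSG 64

static int chunk_sizes[16];   /* records each chunk size, for inspection */

/* Mirrors the writeAll loop: split 'bytes' into MAX_MSG-sized pieces
 * and return how many puts would be issued. */
static int chunk_transfer(int bytes)
{
    int n = 0;
    while (bytes > 0) {
        int wbytes = bytes > MAX_MSG ? MAX_MSG : bytes;
        chunk_sizes[n++] = wbytes;
        bytes -= wbytes;
    }
    return n;
}
```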

Core 1's function to open the port is

void *embOpenControlPort(int size) {
  HostConnection p = calloc( 1, sizeof(HostConnectionRec));
  int status;
  char port_name[32];
  if (!size) size = MAX_MSG;
  if (size > MAX_MSG) size = MAX_MSG;
  sprintf(port_name,"controls%d",embId());
  printf("%i: MessageQ_create %s %d\n",(int)DNUM,port_name,size);
  p->shandle = MessageQ_create(port_name,NULL);
  p->smsg = malloc(32+size);
  sprintf(port_name,"controlr%d",embId());
  printf("%i: MessageQ_open %s\n",(int)DNUM,port_name);
  do {
    status = MessageQ_open(port_name,&p->rqid);
  } while (status < 0);
  printf("%i: MessageQ_alloc %s\n",(int)DNUM,port_name);
  p->rmsg = (MessageQ_Msg *)MessageQ_alloc(0,32+size); /* header is 32 bytes */
  printf("%i: done embOpenControlPort %p\n",(int)DNUM,p->rmsg);
  return p;
}

  • Jim:

    I noticed you don't check the return of MessageQ_registerHeap, nor pass the name as the second parameter.  Can you check what the return value is and consider passing the name?

    Can you check which queue it is trying to put it in?  (p->sqid)?

    Thanks,

    John Demery

  • Hey John

    I re-checked the registerHeap documentation and it says the 2nd parameter is an ID number.  Is that incorrect?

    I checked the return code on core 0 and it said success.

    I printed (int)p->sqid and it gave me 65539.  Not sure what number to expect?

    Thanks

    Jim

  • Jim:

    p->sqid should be a queue number, typically between 700-900 depending on how your queues are set up.

    It looks to me like you are in fact opening the queue before it is created on core 0, as indicated by calling MessageQ_open before MessageQ_create, so it is not actually opening the queue you create for it.  However, you seem to do it correctly on core 1.

    Try fixing that and see if it makes a difference.
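    (Incidentally, a MessageQ_QueueId normally packs the owning core's MultiProc procId in the upper 16 bits and a local queue index in the lower 16, so 65539 (0x10003) would decode as procId 1, queue index 3; a quick sketch of that decode, assuming this packing:)

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MessageQ_QueueId layout: [31:16] owner procId, [15:0] local index. */
static uint16_t queue_proc(uint32_t qid)  { return (uint16_t)(qid >> 16); }
static uint16_t queue_index(uint32_t qid) { return (uint16_t)(qid & 0xFFFFu); }
```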

     

    Sincerely,

    John Demery

  • Hi John

    Let me explain the code a little better. I think it is right. 

    As I said, I am trying to do bidirectional communication, so I am creating two MessageQs, "controlr1" and "controls1".  The "r" is receive and the "s" is send.

    The code on core 0 calls open and alloc on "s".   Then it calls create on "r".

    The code on core 1 calls create on "s".  Then it calls open and alloc on "r".

    So the way the code should work is: core 0 polls on open() for "s", waiting for core 1 to create() "s".  Then core 1 polls on open() for "r", waiting for core 0 to create() "r".  With that in mind, does the code look right?

    Are you suggesting that core 0 can't call open() on "s" before core 1 create()s "s"?  I don't think that's what you're suggesting, but I want to make sure.  If it is, I'm confused and would need more specific help on how to remedy the situation.

    Thanks

    Jim

  • Jim,

    I think you're creating two heaps, both with region ID zero.  This is causing the invalid ID error.

    When you initialize the SharedRegion in the .cfg you're creating a default heap with region ID 0.  Then in your source you attempt to reuse the default shared region heap to create a heapBufMP, again with ID 0.  When you register the heap with MessageQ it fails during allocate because it doesn't know which heap to use.

    If you look at some of the PDK multi-core examples using IPC you'll notice that in some you can register the SharedRegion default heap with MessageQ directly in the following manner:

    #define HEAP_ID 0

    MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);

    I'm not sure which PDK you're using but a good example is in pdk/packages/ti/transport/ipc/examples/qmssIpcBenchmark

    in bench_qmss.cfg the SharedRegion is configured:

    var SharedRegion = xdc.useModule('ti.sdo.ipc.SharedRegion');
    SharedRegion.translate = false;
    SharedRegion.setEntryMeta(0,
        { base: Program.global.shmBase,
          len: Program.global.shmSize,
          ownerProcId: 0,
          isValid: true,
          cacheEnable: cacheEnabled,
          cacheLineSize: cacheLineSize,  /* Aligns allocated messages to a cache line */
          name: "internal_shared_mem",
        });


    In the source, bench_qmss.c, the following process is performed:

    In main():

    1. attachAll() - Ipc_start() called and all cores attached via Ipc_attach()
    2. Every core creates its local queue via MessageQ_create()
    3. BIOS_start() -> tsk0 runs

    In tsk0:

    1. SharedRegion default heap registered with MessageQ -> MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);
    2. Each core opens a remote MessageQ
    3. Test cases executed (includes calls to MessageQ_alloc/put/get/free())

    So again, try removing the HeapBufMP init code and registering the SharedRegion default heap with MessageQ directly.

    Justin


  • Hi Justin

    I am using: pdk_C6678_1_1_2_6

    1) I removed the HeapBufMP_Params_init() and HeapBufMP_create() and altered registerHeap as you suggested, but it still fails in the same place.

    2) I re-checked the image processing demo and it does both: it calls HeapBufMP_create and defines the SharedRegion in the .cfg file.  Not sure why this works in image processing but is a potential problem for me.

    3) Does MessageQ_create have to happen before BIOS_start?

    My order is

    main

    1) Ipc_start

    2) BIOS_start, start master_main()

    master_main

    1) NC_NetStart -> NetworkOpen

    a) registerHeap

    b) MessageQ_create/alloc

    c) Put/Get

    Is there a problem with this order?

    It is not an easy change for me to move the MessageQ_create calls to main().  It means I cannot create MessageQs dynamically, which hurts my project.  (That is, I would have to decide how many MessageQs I want before the program reaches my actual application code.)  Would you please verify this limitation on where MessageQ_create can run before I move forward?

    Thanks

    Jim

  • Jim,

    You can call MessageQ_create at any time.  The problem is centered around the MessageQ_alloc -> MessageQ_put flow.

    I looked closer at your code and I see a pointer discrepancy.

      p->smsg = (MessageQ_Msg *)MessageQ_alloc(0,32+size); /* header is 32 bytes */

    ...


        while (MessageQ_put(p->sqid,*p->smsg) != MessageQ_S_SUCCESS) printf("ack!\n");


    Why are you dereferencing p->smsg when passing it to MessageQ_put?  MessageQ_alloc returns a MessageQ_Msg and MessageQ_put takes a MessageQ_Msg as an argument.  That extra dereference is causing MessageQ_put and all underlying IPC transport layers to see completely wrong MessageQ_Msg packet data.  That's probably the source of your problem, not the heap creation.

    Justin

  • You're illegally (not from a compiler standpoint) casting the MessageQ_Msg returned from MessageQ_alloc to a pointer to a MessageQ_Msg.  Since the return type of MessageQ_alloc is a MessageQ_Msg, your eventual call to MessageQ_put with the "invalid" dereference of the MessageQ_Msg will cause a failure.  Dereferencing something that was never meant to be dereferenced causes MessageQ to validate the provided MessageQ_Msg against the wrong data in memory.

    If you look at the MessageQ APIs you'll see that MessageQ_Msg is actually a typedef'd pointer to a MessageQ_MsgHeader.  Therefore, your "p" structure need only have an element of type MessageQ_Msg, not MessageQ_Msg *.
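    (The pitfall generalizes to any typedef'd pointer; a minimal standalone sketch with a mock message type — no IPC involved, all names illustrative:)

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the IPC pattern: Msg is a typedef'd POINTER, like MessageQ_Msg. */
typedef struct { uint32_t magic; } MsgHeader;
typedef MsgHeader *Msg;

#define MSG_MAGIC 0x1234u

/* Stands in for MessageQ_put's validation: it expects a Msg, i.e. a
 * pointer to the header, and inspects the header through it. */
static int put_ok(Msg m) { return m->magic == MSG_MAGIC; }

static MsgHeader pool;     /* stands in for the shared-memory block from alloc */

static int demo_correct(void)
{
    pool.magic = MSG_MAGIC;
    Msg smsg = &pool;      /* store the Msg itself, not a Msg*              */
    return put_ok(smsg);   /* pass it straight through; no extra dereference */
}
/* Declaring 'Msg *smsg' and calling put_ok(*smsg) would instead hand the
 * validator whatever bytes sit at the start of the header, reinterpreted
 * as a pointer -- exactly the "wrong data in memory" failure described. */
```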

    Justin

  • Hi Justin

    Thanks.  That does get rid of the invalid section error.  However, it still doesn't work.

    As I said above I have a bidirectional IPC.  The 0->1 is fully initialized on both sides.  Core 0 starts putting data on the MessageQ, then hangs waiting to get its first response on the other MessageQ (this is sort of what I expect). 

    The 1->0 is not initialized, however.  Core 0 create()s it.  But core 1 is hanging in MessageQ_open().

    I stepped through an execution with the debugger but it doesn't make any sense to me.  As best I can tell, it gets lost in ti_sdo_ipc_nsremote_NameServerRemoteNotify_get__F() while doing NameServer_getUInt32().  It never makes it out of NameServer_getUInt32().

  • You need to create a MessageQ on both core 0 and core 1, not both on core 0.  Core 0 needs to open Core 1's queue and vice versa.  Core 0 will _put to Core 1's queue and _get from its own.  Core 1 will _put to Core 0's queue and _get from its own.

    MessageQ transaction pseudo code flow I've used many times...

    /* Queue names */
    char core0QName[] = "core0";
    char core1QName[] = "core1";

    void core0pseudo(void)
    {
        MessageQ_Handle core0Q;
        MessageQ_QueueId core1QId;

        Ipc_Start() + attach to core 1...
        
        /* Create a message queue. */
        core0Q = MessageQ_create(core0QName, NULL);    
        if (core0Q == NULL) {
            System_abort("MessageQ_create failed\n" );
        }

        /* Register this heap with MessageQ */
        MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);

        /* Open the 'next' remote message queue. Spin until it is ready. */
        do {
            status = MessageQ_open(core1QName, &core1QId);
        }
        while (status < 0);

        /* send message to core 1 */
        msg = MessageQ_alloc(HEAP_ID, ...);

        MessageQ_put(core1QId, msg);

        /* receive message from core 1 */
        MessageQ_get(core0Q, &msg, MessageQ_FOREVER);

        do stuff...
    }

    void core1pseudo(void)
    {
        MessageQ_Handle  core1Q;
        MessageQ_QueueId core0QId;

        Ipc_Start() + attach to core 0...
        
        /* Create a message queue. */
        core1Q = MessageQ_create(core1QName, NULL);    
        if (core1Q == NULL) {
            System_abort("MessageQ_create failed\n" );
        }

        /* Register this heap with MessageQ */
        MessageQ_registerHeap((IHeap_Handle)SharedRegion_getHeap(0), HEAP_ID);

        /* Open the 'next' remote message queue. Spin until it is ready. */
        do {
            status = MessageQ_open(core0QName, &core0QId);
        }
        while (status < 0);

        /* send message to core 0 */
        msg = MessageQ_alloc(HEAP_ID, ...);

        MessageQ_put(core0QId, msg);

        /* receive message from core 0 */
        MessageQ_get(core1Q, &msg, MessageQ_FOREVER);

        do stuff...    
    }

  • Hi Justin

    Thanks for responding, but I think my code fits that.  You can see my code at the top.  

    The order is

    Core 0

    open "r"

    alloc "r"

    create "s"

    Core 1

    create "r"

    open "s"

    alloc "s"

    Core 1 is hanging in "open 's'".  And not hanging in the while loop polling, but hanging in the open function.

    A little backstory: it has taken a lot of effort to get to this point due to the number of TI features I need to work together.  If you work with Arun Mani or John Demery you might get a recap from them about what we're trying to do and what has been done.  One issue: we had to disable cache just to get around a mystery of the program stack getting clobbered.

    Given the vagueness of the error, and that the code is pretty simple/noncontroversial (let me know if you think it is wrong), I think there is something more sinister going on here.

    Is there any reason for open() not to return?  It is built to be nonblocking so this doesn't make any sense.

    Thanks

    Jim

  • Jim,

    MessageQ_open is getting stuck polling the other cores to see if they have the queue that is named "port_name".  MessageQ calls the NameServer layer which sends messages to each core asking if they have the queue with the name provided in the _open call.  The communication is made via the IPC registers and a small region in shared memory managed by SharedRegion.  I'm wondering if disabling caching is throwing off the NameServer transactions between the cores.  I'd expect NameServer to work even if cache is disabled but I've actually never tried this myself.  The fact that you had to disable cache to work around a stack corruption points to something bigger (and more sinister ;)) at work here. 

    You might want to take a step back and pursue the solution to the stack corruption.  A stack corruption points to a larger problem somewhere else that can have any number of effects on your code.  Having to disable cache to work around it suggests you may have a global variable somewhere that isn't aligned and padded to the 128-byte cache line size.  On writeback, the data surrounding the variable will be wiped if the variable isn't aligned and padded.
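    (For reference, one common guard against that failure mode is to force each shared global onto its own cache line; a C11 sketch, assuming the 128-byte line size mentioned above:)

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 128

/* A flag written by one core and read by another: align it to a line and
 * pad the type to a full line so a writeback can't clobber its neighbours. */
typedef struct {
    alignas(CACHE_LINE) volatile uint32_t value;
    uint8_t pad[CACHE_LINE - sizeof(uint32_t)];
} SharedFlag;

static SharedFlag g_flag;
```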

    Going back to your code, I think the flow is okay.  It seems weird to me that you're creating core 0's queue after core 1 starts to try to open it, but that's just personal preference.  If MessageQ were working correctly, MessageQ_open would simply keep returning a failure status until core 0 creates the queue.  The corruption issues, cache-disable workaround, and the NameServer layer not returning point to a bigger problem than your program flow.  I'd work to resolve the corruption so that cache can be re-enabled.

    Justin

  • For what it's worth, the original problem occurred not in "my" code, but in NDK initialization.  So you'd assume any global variables were set up correctly.  The problems happened entirely before my code ran.

    The explanation I was given was that I was using half of L2 for SRAM and half for cache, and sometimes program memory crept into the cache side of L2; and that I could go back and give L2 1/4 or 1/8 for cache and it should be no problem.

    I am not sure what the next step is. Perhaps give L2 a small cache and see.

    Jim

  • Is there any reason why you're not using the entire L2 as SRAM?

  • That's what I meant by disabling cache.  Here's my custom platform

    /*!
     * File generated by platform wizard. DO NOT MODIFY
     */

    metaonly module Platform inherits xdc.platform.IPlatform {

        config ti.platforms.generic.Platform.Instance CPU =
            ti.platforms.generic.Platform.create("CPU", {
                clockRate: 1000,
                catalogName: "ti.catalog.c6000",
                deviceName: "TMS320C6678",
                customMemoryMap:
                [
                    ["MSMCSRAM_MASTER", {
                        name: "MSMCSRAM_MASTER",
                        base: 0x0c000000,
                        len:  0x00180000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["L2SRAM", {
                        name: "L2SRAM",
                        base: 0x00800000,
                        len:  0x00080000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["MSMCSRAM_SLAVE", {
                        name: "MSMCSRAM_SLAVE",
                        base: 0x0c180000,
                        len:  0x00080000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["MSMCSRAM_IPC", {
                        name: "MSMCSRAM_IPC",
                        base: 0x0c200000,
                        len:  0x00200000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["DDR3", {
                        name: "DDR3",
                        base: 0x80000000,
                        len:  0x10000000,
                        space: "code/data",
                        access: "RWX",
                    }],
                    ["DDR3_IPC", {
                        name: "DDR3_IPC",
                        base: 0x90000000,
                        len:  0x10000000,
                        space: "code/data",
                        access: "RWX",
                    }],
                ],
                l2Mode:  "0k",
                l1PMode: "32k",
                l1DMode: "32k",
            });

    instance :

        override config string codeMemory  = "L2SRAM";
        override config string dataMemory  = "L2SRAM";
        override config string stackMemory = "L2SRAM";

    }

  • Arun reviewed my code and suggested I might be missing code like this:

     

    memset(slave_queue_name, 0, MAX_SLICES*16);

    for (i = 0; i < MAX_SLICES; i++) {
        GET_SLAVE_QUEUE_NAME(slave_queue_name[i], i);
    }

    return 0;

     

    Does the name server need to know this?

  • NameServer is under the hood of IPC.  You shouldn't have to configure anything regarding it.  I have no idea what the code Arun posted relates to.

    Why were you splitting the L2 into cache and SRAM in the first place?  Was that what the image processing demo was doing?  I'm not extensively familiar with the internals of the demo.

    I conferred with another developer and if you're concerned about the L2 configuration we suggest you do the opposite of my original suggestion and set the entire L2 to cache.  Especially if you're placing code and data into DDR, which I think the image processing demo does.

    I was reviewing some of your other threads regarding your app and it seems you had something working at some point in time.  What was that something and at what point did things start going awry? 

    I'm still holding to my statement that you should be pursuing the stack corruption bug before taking on this IPC bug.  Even if it's in NDK somewhere, there's still something inherently wrong with the app, especially if you were still seeing the system crash due to an instruction fetch with the cache enabled (referenced from http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/261296/915541.aspx#915541)

  • The Image processing demo splits L2 in half and I was just copying that.  It maps some program components to L2, and I believe some of that mapping is required.  It doesn't use DDR except for the NDK heap.

    I got one core working a long while ago (with NDK comm to host) but have not had any luck getting NDK and IPC to work together.  Things went awry when I tried to use NDK and IPC together.

  • I know this isn't what you specifically asked me to do but thought I'd update anyway.  I enabled 128 KB of cache in L2 to see if it would make a difference, and it didn't.  The behavior seems to be the same: it hangs sometime after calling MessageQ_alloc and doesn't return.

    FWIW, if I just break into core 1 the call stack is:

    block$56(void *, unsigned int, unsigned short, unsigned int *)() at Cache.c:718 0x0C1B0AD0
    ti_sysbios_family_c66_Cache_inv__E(void *, unsigned int, unsigned short, unsigned short)() at Cache.c:654 0x0C1C7114
    ti_sysbios_hal_Cache_CacheProxy_inv__E(void *, unsigned int, unsigned short, unsigned short)() at image_processing_evmc6678l_slave_pe66.c:31,303 0x0C1C4F00
    ListMP_getHead(struct ListMP_Object *)() at ListMP.c:339 0x0C198C54
    0x0000FFFF  (no symbols are defined for 0x0000FFFF)

    So there are some caching functions getting called; not sure what to make of it.  I guess one takeaway is that it didn't fail earlier: it still passed NDK initialization, etc., with the smaller cache.
  • Have you verified that the IPC components shared between cores have all been placed in the following memory region, according to your linker command file?

    ["MSMCSRAM_IPC",
    {
    name: "MSMCSRAM_IPC",
    base: 0x0c200000,
    len: 0x00200000,
    space: "code/data",
    access: "RWX",
    }

    In the stack trace you provided above, ListMP_getHead (and all sub-functions) are trying to operate on pointers outside this range.

    I'll admit these are all just shots in the dark.  I'm not that familiar with the image processing demo or NDK.

    With all this partitioning going on within L2, MSMC, and DDR it's pretty clear this is a case of NDK and IPC stepping on one another.  I'll try to get some more information on how NDK is configured and what resources it requires (memory, and other) in order to operate.  At that point I'll try to evaluate where the conflicts may be occurring with IPC.

    In the meantime, if you can revert to that older version you had working, I'd suggest verifying NDK to host still works.  Then try adding a very simple local IPC (for example, just starting IPC and creating a queue).  Approach the problem piecemeal, adding small bits of IPC until you see a problem.  That might help us narrow down what's causing the issue.

    Justin

  • Justin,

    Thanks so much for helping Jim.  I am not familiar with IPC, but I have been talking with Jim for a while now.  I want to give the full background of the issue.

    Jim can see that the data are sent via IPC from core 0 to core 1, but the first message we allocated on core 1 threw up an error.

    Are you saying that all the messages we create should be in MSMCSRAM_IPC only?  Jim has two other sections in the MSMC, one for the slave (core 1) and one for the master (core 0).  Should they not use those?  Thanks,

    Arun 

  • Perhaps this is the problem.  (But how do I fix it)

    After core 0 sets up 0->1 and 1->0, it starts writing (MessageQ_put) to 0->1.

    It does a sequence of about 10 writes before it starts waiting on a read (MessageQ_get) from 1->0.

    When I comment out the puts, and have core 0 just wait there forever, then core 1 doesn't get lost in neverland.

    It is easy to add handshaking (so for each 0->1 I also wait for a small return receipt from 1->0) but is this the right approach for MessageQ API?

  • Hi jim,

    I think the problem is that you need to sync between the cores before you start sending messages.  You can use Notify to make sure the handshake happens and then send the messages.

    Is it possible to test one thing?  Guard the send loop with a variable, say while (test == 1).  Set test = 1 before the loop, and after you see core 1 start to wait, change test to 2 in the debugger and see if the messages go in both directions.

    thanks,

    arun.

  • I tried a similar test (that I understood how to implement more easily).  Since core 1 finishes last, I put a 1->0 write at the end of core 1 so core 0 would wait for core 1.  Unfortunately I get another "region is invalid" error in this put, running on core 1:

    void embWriteHostControl(void *handle, void *buf, int n) {
    #if USE_MESSAGE_Q
      HostConnection p = handle;
      int offset = 0;
      while (n > 0) {
        int wbytes = n > MAX_MSG ? MAX_MSG : n;
        printf("%i: writeHostControl %d\n",(int)DNUM,wbytes);
        memcpy((char *)p->smsg+32,(char *)buf+offset,wbytes);
        while (MessageQ_put(p->sqid,p->smsg) != MessageQ_S_SUCCESS);
        n -= wbytes;
        offset += wbytes;
      }
      printf("done embWriteHostControl\n");
    #endif
    }

    embWriteHostControl(p,&id,sizeof(int));

    And here is the log

    Connected!
    0: MessageQ_open controls1
    0: MessageQ_alloc controls1 65539
    0: MessageQ_create controlr1 c204a80
    0: done openControlPort 80032720
    readAll 4
    [C66xx_1] 1: MessageQ_alloc controlr1 5
    1: done embOpenControlPort c204b00
    1: writeHostControl 4
    ti.sdo.ipc.transports.TransportShm: line 388: assertion failure: A_regionInvalid: Region is invalid
    xdc.runtime.Error.raise: terminating execution

    So now it gets through create/open/alloc for both 0->1 and 1->0, but it fails on the first 1->0 put.  (No 0->1 traffic tried yet.)

    According to the log, both the 0->1 and 1->0 messages are correctly in MSMCSRAM_IPC, at 0xc204a80 and 0xc204b00.

  • The sync points should be the MessageQ_open() calls.  If a remote core can successfully open a MessageQ via the MessageQ_open() API, that remote core should be able to send a message via the returned queue ID without a problem.

    Jim, can you try this flow:

    core 0:

    create "control_rec"

    open "control_send" ==> ret control_send_id

    alloc smsg put (control_send_id)

    core 1:

    create "control_send"

    open "control_rec" ==> ret control_rec_id

    alloc rmsg

  • ti.sdo.ipc.transports.TransportShm: line 388: assertion failure: A_regionInvalid: Region is invalid
    xdc.runtime.Error.raise: terminating execution

    This error means the message was allocated from an invalid SharedRegion.  Is it possible to put a breakpoint in the TransportShm_put function to see where the message is allocated from?  Or could it be that the SharedRegion on this core isn't correctly set up?
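    (Conceptually, the transport is asking SharedRegion which region owns the message's address and asserting when the answer is "none"; a simplified standalone sketch of that lookup, hard-coding the region 0 entry from the .cfg above:)

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the .cfg entry for region 0 (MSMCSRAM_IPC). */
#define SR0_BASE 0x0c200000u
#define SR0_LEN  0x00200000u

/* Return the id of the region containing 'addr', or -1 when no region
 * matches -- roughly what triggers A_regionInvalid in the transport. */
static int region_id_of(uint32_t addr)
{
    if (addr >= SR0_BASE && addr < SR0_BASE + SR0_LEN)
        return 0;
    return -1;
}
```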

    Judah

     

  • Jim,

    Make sure your p->smsg points to a valid MessageQ_Msg handle.  In your code you're copying something from buf to p->smsg+32 but then passing p->smsg to MessageQ_put.  If buf contains the entire MessageQ_Msg, header plus additional data, it will fail because the header data will be invalid.  Also, p->smsg originates from a handle you've passed into the function.  Could this handle be stale?  If you freed the MessageQ_Msg at any point, any subsequent _put()s of that message will fail.

    Justin

  • Hi Justin,

    I've gotten a bit further since Friday, so both sides now get past initialization without any region errors.

    However, I'm not getting reliable transfers of large buffers.  I made the MessageQ_alloc big enough for any data I want.  The small transfers are okay, but in the big ones some of the data is different (corrupted during the sending process).  It's reliably wrong, as well (the same values are always wrong in the same way).  To verify, I'm casting to int and printing all the ints.  I'll paste a log below.

    Additionally it seems if I use printf a lot, eventually I get a Heap error

    ti.sysbios.heaps.HeapMem: line 294: out of memory: handle=0x827cd0, size=200008

    Is there anything to look out for when using MessageQ_alloc to create the buffer and the header together?

    Jim

    write (4): 53309 
    [C66xx_1] read (4): 53309 
    [C66xx_0] write (20): 2 6 0 1 128 
    write (64): 0 200000 0 0 0 0 0 0 0 22104 371 187 0 48 98 0 
    [C66xx_1] read (20): 2 6 0 1 128 
    read (64): 0 200000 0 0 0 0 0 0 0 22104 371 187 0 48 98 0 
    [C66xx_0] write (4): 8 
    write (8): 1634100580 7629941 
    [C66xx_1] read (4): 8 
    read (8): 1634100580 7629941 
    read (0): 
    [C66xx_0] write (748): -2147483636 -2147483636 -2147483641 1610612744 14 -2147483646 1610612742 7 -2147483645 1610612745 12 1610612761 5 4 4 1610612753 6 4 -2147483642 5 1610612752 6 1610612742 11 13 7 -2147483647 6 11 2 -2147483646 6 -2147483646 4 -2147483647 2 2 -2147483645 1610612741 16 1610612751 9 -2147483642 5 -2147483646 1610612746 11 7 -2147483647 1610612756 6 6 3 1610612741 21 -2147483645 9 15 7 -2147483645 1610612746 4 4 -2147483646 3 3 3 2 1610612737 1073741830 1352 -32 1610612738 40 -2147483646 1610612744 10 536870921 16777216 -2147483643 1073741830 5 -1 -2147483643 1610612738 4 -2147483646 -2147483646 -2147483647 2 -2147483645 -2147483648 1 5720 1024 0 1 5720 1024 0 1 5720 8192 13912 1 5720 13912 1624 1024 0 1 1624 1024 5720 1024 0 1024 1452 1448 1488 136 1624 1024 1 1 1464 1456 1434 1 568 6 1072 2 1424 10000 1472 572 1 1072693248 1416 584 0 0 1192 1192 1 128 1434 1432 20480 1 1 1 1 1 1436 1440 3 1444 1096 96 5 0 2 16 1 96 72 32 0 16 48 0 0 0 0 3211264 0 128 3 1633906540 108 1634100580 7629941 1651470960 1949266789 7368037 
    write (4): 8 
    [C66xx_1] read (748): -2147483636 -2147483636 -2147483641 1610612744 14 -2147483646 1610612742 7 0 22104 371 187 0 48 98 0 -164466298 -369318196 1299506112 -1915922051 -1745438154 -1641442173 -1357592620 -247661771 13 7 -2147483647 6 11 2 -2147483646 6 -2147483646 4 -2147483647 2 2 -2147483645 1610612741 16 1610612751 9 -2147483642 5 -2147483646 1610612746 11 7 -2147483647 1610612756 6 6 3 1610612741 21 -2147483645 9 15 7 -2147483645 1610612746 4 4 -2147483646 3 3 3 2 1610612737 1073741830 1352 -32 1610612738 40 -2147483646 1610612744 10 536870921 16777216 -2147483643 1073741830 5 -1 -2147483643 1610612738 4 -2147483646 -2147483646 -2147483647 2 -2147483645 -2147483648 1 5720 1024 0 1 5720 1024 0 1 5720 8192 13912 1 5720 13912 1624 1024 0 1 1624 1024 5720 1024 0 1024 1452 1448 1488 136 1624 1024 1 1 1464 1456 1434 1 568 6 1072 2 1424 10000 1472 572 1 1072693248 1416 584 0 0 1192 1192 1 128 1434 1432 20480 1 1 1 1 1 1436 1440 3 1444 1096 96 5 0 2 16 1 96 72 32 0 16 48 0 0 0 0 3211264 0 128 3 1633906540 108 1634100580 7629941 1651470960 1949266789 7368037 
    read (4): 8 
    [C66xx_0] write (8): 1634100580 7629941 
    write (392): -2147483647 1610612742 536870936 1 536870915 5720 -2147483647 536870921 1 -2147483609 536870926 1 -2147483625 1610612739 1073741830 1072 4 1073741830 1352 -32 1073741833 1100 32 -2147483639 -2147483648 4 20 5720 36 44 13912 52 56 13912 60 1624 76 1624 84 5720 100 1452 104 1448 108 1488 116 1624 140 1464 144 1456 148 1434 196 568 204 1072 216 1424 300 0 0 376 1472 400 572 504 1416 568 584 588 1192 592 1192 604 128 608 1434 612 1432 900 1436 904 1440 936 1444 940 1096 96 0 16 96 72 32 0 16 48 
    [C66xx_1] read (8): 1634100580 7629941 
    read (392): -2147483647 1610612742 536870936 1 536870915 5720 -2147483647 536870921 0 22104 371 187 0 48 98 0 -164466298 -369318196 1299506112 -1915922051 -1745438154 -1641442173 -1357592620 -247661771 13 7 -2147483647 6 11 2 -2147483646 6 -2147483646 4 -2147483647 2 2 -2147483645 1610612741 16 1610612751 9 -2147483642 5 -2147483646 1610612746 11 7 -2147483647 1610612756 6 6 3 1610612741 21 -2147483645 9 15 7 -2147483645 1610612746 4 4 -2147483646 3 3 3 2 1610612737 1073741830 1352 -32 1610612738 40 -2147483646 1610612744 10 536870921 16777216 -2147483643 1073741830 5 -1 -2147483643 1610612738 4 -2147483646 -2147483646 -2147483647 2 -2147483645 -2147483648 1 5720 1024 0 1 5720 
    [C66xx_0] write (24): 0 4 3 1 5 2 
    write (4): 0 
    [C66xx_1] read (24): 0 4 3 1 5 2

  • Could this be a cache issue? Do I need to invalidate or flush for large data in MSMC?

  • For the sake of consistency, have you tried defining your MessageQ_Msg messages with a struct typedef instead of pointer offsets?

    #define MAX_PAYLOAD_SIZE_BYTES N

    typedef struct {
      MessageQ_MsgHeader header;      /* 32 bytes */
      uint8_t            payload[MAX_PAYLOAD_SIZE_BYTES];
      uint8_t            pad[128 - MAX_PAYLOAD_SIZE_BYTES - sizeof(MessageQ_MsgHeader)]; /* pad to the 128-byte cache line size */
    } TstMsg;

    Messages can now be allocated with sizeof(TstMsg):

        MessageQ_Msg     msg;

        /* MessageQ_alloc will return a buffer that is cache line-aligned */
        msg = MessageQ_alloc(HEAP_ID, sizeof(TstMsg));

        MessageQ_put(remoteQueueId, msg);

    On reception, the payload data can be accessed by casting back to TstMsg:

        TstMsg *rcvMsg;

        MessageQ_get(messageQ, (MessageQ_Msg *) &rcvMsg, MessageQ_FOREVER);

        if (rcvMsg->payload[0] == ...) {

            ...

        }

    This might be cleaner than tracking the MessageQ_MsgHeader and payload separately through offsets, and it may help you track down the incoherent data issue.

    Justin

  • Thanks for the response but perhaps I'm not describing the code enough. 

    At this point all my messages are size 1024.  I use one messageQ for all IPC in the current test.  If it's 4 bytes I only fill in 4+32 bytes.  If it's 748 bytes then I fill in 748+32 bytes.

    The same message Q is used for all the items in the log.

    I am copying data into and out of the Message Q buffer on both sides.  Before I put() I copy data into it.  After I get(), I copy data out of it. I put in extra handshaking to make sure I don't do the next put() before get() is done.

    The behavior is that the first 64 bytes are always right but after that they are reliably unreliable (wrong in the same way each time).

    To verify: I should not have to worry about the cache when using MessageQ? The fact that the correctness comes in 64-byte chunks has me worried about this, so I want to make sure. There are platforms where I have to invalidate the cache before sending, etc., so I wouldn't be too surprised, despite not seeing it in the TI docs/examples.

    If you stare at the data above, the yellow side is send and the green side is receive. If the cache line is 64 bytes, the first line is good, but after that it isn't. Note that the green parts actually match in the two long transfers. My reading is that there is probably still stale data in MSMC when core 1 tries to read, and that it is a cache-flushing issue.

    Is my platform_osal.c wrong? I copied this from some template project, but see now that it has cache stuff in it that is largely not filled in.

    Thanks

    Jim

  • FWIW I also tried moving the MessageQ to DDR instead of MSMC, and got similar results. So I don't think the messages are getting clobbered.

    Unfortunately when I moved the MessageQ to DDR, printf from core1 stopped working so I don't know how they were wrong, just that they were wrong. I assume they were wrong in the same way, but that's an assumption.

  • Jim,

    You don't have to do any cache operations on the buffers you give to MessageQ_put() because MessageQ will internally perform the right cache operations for you.

    From your description, you are actually copying data into the MessageQ buffer.  Now, where are you copying the data from?  Is it from another buffer that is only ever used by a single core?  If so, I don't see a need for a cache invalidate of that buffer before you receive the next data.  If you are getting the data from a DMA or another peripheral, then you should invalidate the buffer first.

    Judah

  • Jim,

    It wasn't clear to me whether you do multiple MessageQ_alloc() calls, or a single call and just reuse the same message.

    If you did a single MessageQ_alloc() and reuse the buffer, you need to realize that MessageQ will think the message size is whatever the original allocation size was... that is, unless you actually change the message's msgSize field every time you reuse the message.
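    To make that concrete, here is a minimal on-target sketch (not runnable off the DSP; HEAP_ID is a placeholder). MessageQ_alloc() records the allocation size in the message header, and the transport later uses that size for its cache writeback, so allocating once at the maximum size and reusing the message covers the whole buffer:

```c
/* Sketch only: allocate once at the maximum size and reuse.
 * The msgSize recorded by MessageQ_alloc() is what the transport's
 * Cache_wbInv() call uses, so the whole 1024 bytes are written back
 * on every MessageQ_put() even if only a few bytes were filled in. */
MessageQ_Msg msg = MessageQ_alloc(HEAP_ID, 1024);
System_printf("msgSize = %d\n", MessageQ_getMsgSize(msg)); /* should report 1024 */
```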

    Judah

  • The data is coming from NDK.  I have sockets Intel->core0. Then IPC core0->core1.

    Is it possible that core 0 has stale data, and the NDK has not flushed yet?

    If NDK is stale it might explain part of it, but it doesn't explain why the "read" and "write" printf's differ.

    Jim

  • I reuse the same message. It is 1024 bytes

    I tried to simplify the code by just making the buffer bigger than any size I will ever send.

    I am not telling it how many bytes of data are valid at any point.  It should only know the size based on the MessageQ_alloc, which is the biggest message I will send.

    Jim

  • Jim,

    Sounds to me like you need to invalidate the cache for the buffer that receives the data from "Intel" before it receives the data.  If you don't invalidate that buffer, I think you would be getting old data, because core 0 will have it in the cache and won't see the new data.
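    A sketch of that suggestion, assuming the SYS/BIOS Cache API with the same signature family as the Cache_wbInv() call quoted elsewhere in this thread (rxBuf and RX_BUF_SIZE are placeholders):

```c
#include <ti/sysbios/hal/Cache.h>

/* Sketch: invalidate the landing buffer before new data arrives so the
 * CPU re-reads fresh data from memory instead of stale cache lines */
Cache_inv(rxBuf, RX_BUF_SIZE, Cache_Type_ALL, TRUE);
```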

    Judah

  • Hi Judah,

    I gave you bad info. The data is not coming from the socket, it is generated in core 0.  So if there is a cache problem it pretty much has to be in IPC.

    Jim

  • Jim,

    Sorry, I didn't quite understand what you mean.  If I understand you correctly, it's like this:

    You do a MessageQ_alloc(), then you populate the message, then you call MessageQ_put() on the message?  As long as the SharedRegion from which the message is allocated has 'cacheEnable' set to 'TRUE', then when you do a MessageQ_put() the transport should do a cache writeback-invalidate of the message.  At least that is true for all the MessageQ transports that were originally written.

    Do you know which MessageQ Transport you are using?  Is it TransportShm or TransportShmCirc or some other one?

    Judah

  • Yes, my code is basically

    form big array local to core 0

    MessageQ_open / MessageQ_alloc

    copy data from big array into MSMC ptr

    MessageQ_put

    There's some fluff around it to do handshaking and set stuff up but that's the gist of it.
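    The gist above, as a rough on-target sketch (queue name, HEAP_ID, and sizes are placeholders; error handling and the handshaking are omitted):

```c
uint8_t bigArray[992];                          /* local to core 0 */
/* ... fill bigArray ... */

/* open the remote queue and allocate one message from the shared-region heap */
MessageQ_QueueId remoteQueueId;
MessageQ_open("CORE1_MSGQ", &remoteQueueId);    /* placeholder queue name */
MessageQ_Msg msg = MessageQ_alloc(HEAP_ID, 1024);

/* copy the payload into the MSMC message just past the 32-byte header */
memcpy((uint8_t *)msg + sizeof(MessageQ_MsgHeader), bigArray, sizeof(bigArray));

/* the transport performs the cache writeback-invalidate inside the put */
MessageQ_put(remoteQueueId, msg);
```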

    My SharedRegion is

    /* Shared Memory base address and length */
    var SHAREDMEM = 0x0c200000;
    var SHAREDMEMSIZE = 0x00200000;

    SharedRegion.setEntryMeta(0,
    { base: SHAREDMEM,
    len: SHAREDMEMSIZE,
    ownerProcId: 0,
    isValid: true,
    name: "MSMCSRAM_IPC",
    });

    I welcome any suggestion on what to put here.  Is the default for cacheEnable false?  If it is false, am I not guaranteed correct/current data?
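    For what it's worth, the entry can set the flag explicitly; a sketch of the same entry with cacheEnable spelled out (assuming the standard SharedRegion entry fields; true is also the documented default):

```javascript
SharedRegion.setEntryMeta(0,
{ base: SHAREDMEM,
  len: SHAREDMEMSIZE,
  ownerProcId: 0,
  isValid: true,
  cacheEnable: true,   /* explicit, though true is the default */
  name: "MSMCSRAM_IPC",
});
```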

    I don't know which transport I am using, how do I find out?

    Thanks

    Jim

  • Are you using CCS as your debugger?  If so, you should be able to see the message in a memory browser and also see whether it is being cached or not (I'm pretty sure it is being cached).  You should also be able to see that after you do a MessageQ_put() the message is no longer in the cache.

    One of the earlier posts stated that you had an issue in TransportShm, so that's likely the MessageQ transport you are using.  The SharedRegion default for 'cacheEnable' is true, so you should be okay there.  If it were 'false', your data could be old, stale data.  If the cache operations are not happening, then core 1 would not see the correct data, which is in core 0's cache; that is why core 0 needs to write back and then invalidate the message in the cache.

    TransportShm_put() does this as the very first thing:

        /* writeback invalidate the message */
        if (SharedRegion_isCacheEnabled(id)) {
            Cache_wbInv(msg, ((MessageQ_Msg)(msg))->msgSize, Cache_Type_ALL,
                TRUE);
        }

    Furthermore, I don't know if you are aware, but if you are using the CCS debugger there is a tool called ROV which can help with debugging.  You can run it under 'Tools->ROV'.

    Judah

  • Stepping through isn't working. I put a breakpoint on core 0 in my function that calls MessageQ_put.

    Start running

    • small transfer, press F8
    • small transfer, press F8
    • small transfer, press F8
    • lost in the ListMP
    Press Pause and I get this stack

    block$56(void *, unsigned int, unsigned short, unsigned int *)() at Cache.c:711 0x0C074ECA
    ti_sysbios_family_c66_Cache_inv__E(void *, unsigned int, unsigned short, unsigned short)() at Cache.c:654 0x0C0A3BB4
    ti_sysbios_hal_Cache_CacheProxy_inv__E(void *, unsigned int, unsigned short, unsigned short)() at image_processing_evmc6678l_master_pe66.c:32,827 0x0C0A0F00
    ListMP_getHead(struct ListMP_Object *)() at ListMP.c:361 0x0C04AD48
    0x0000FFFF (no symbols are defined for 0x0000FFFF)

    I'm going to try a few other things but thought I should give an update in case it made any sense.

  • Jim,

    Try this:

    1.  Put a breakpoint on MessageQ_put

    2.  Run to the breakpoint.

    3.  Open the Memory Browser and set it to the msg.  There should be an indication via different colors whether it's in cache.

    4.  Put a breakpoint on the address in register B3.

    5.  Run til you get to the breakpoint (address in B3).  Verify that your msg is no longer in cache.

    Judah

  • Hi Judah

    Steps 4 and 5 didn't work for me (didn't break at the addr) but another issue:

    When I break at step 3, it shows me 64 bytes are in L1D. I want to know if the whole 1024 bytes are in L1D. Is this telling me they're not, that only 64 are? (My problem has been that only the first 64 bytes are guaranteed to be right.)

    Address 0x0c204ac0 is the first white address after the msg starting at 0x0c204a80 = 64 bytes.

    Perhaps if I step it will show me each line getting read in line-by-line, but I need to rerun to see that, so I'll post this first.

    Jim

    I single-stepped a bit and the only bytes highlighted were the first 64 (eventually it decides it doesn't want to single-step anymore and just starts running).

    Is there some MessageQ function I can break at inside MessageQ_put to see when bytes 65-1024 get put in L1D?

    Thanks

    Jim

  • My version numbers if it matters

    opt/ti/ipc_1_24_03_32/packages;

    /opt/ti/bios_6_33_06_50/packages;

    /opt/ti/mcsdk_2_01_02_06;

    /opt/ti/imglib_c66x_3_1_1_0/packages;

    /opt/ti/uia_1_01_01_14/packages;

    /opt/ti/pdk_C6678_1_1_2_6/packages

  • One small point is that I have a lot of different messages going through this same IPC buffer. Early ones are 4-20 bytes. Some are closer to the full buffer size of 992. 

    The information above on page 3 is for the 1st transfer, which is only 4 bytes.  That is, the memcpy before the put() is size 4.

    I should try to break on the big one if I can figure out how to. (It won't let me step to the big one.)

    I mention this because perhaps only the 1st 64bytes are lit in blue because I am only updating the memory in that space. I need to force an example where I can break on where I write to the whole buffer.

  • I zero-filled the buffer just to make sure, and I get the same behavior that only the first 64 bytes are highlighted as in L1D.

  • Jim,

    I would have thought zero-filling the whole buffer would have brought the whole buffer into the cache.  Can you try filling the whole buffer with a different value, like 0x1234?  If the buffer memory is cacheable, then when you fill it you should see it in the cache before any MessageQ_put().  You should see that the MessageQ_put() removes it from the cache.  So it's not quite making sense why only 64 bytes of the buffer are cached.

    Why aren't steps 4 & 5 working for you?  When you get to the MessageQ_put() function, B3 contains the return address.  At the end of the MessageQ_put() function you will see a "B  B3" or equivalent, meaning branch to the address contained in B3.  You can put a breakpoint on this "B  B3" if that's easier.

    4.  Put a breakpoint on the address in register B3.

    5.  Run til you get to the breakpoint (address in B3).  Verify that your msg is no longer in cache.

    Judah 

  • Hi Judah,

    I didn't interpret what I was seeing correctly. I was looking at the address for 1->0 instead of 0->1. 

    In order to handshake/synchronize I do both 1->0 and 0->1 sequentially. Simple/naive way to prevent the next 0->1 from happening when there is still live data in MSMC not copied out yet.

    The 1->0 buffer was in L1D, presumably because the 1->0 message was received right before the memcpy into the 0->1 buffer.

    If I look at the actual address for 0->1, the data is never in L1D. It is always white.

    For the transfers I am able to see, the data is all correct in MSMC, including my constant fill of 43211234. However, I don't get to the big transfer, the one that is wrong. It gets lost in other ListMP code and doesn't make it back to the next breakpoint.

    I need to add more checking to core 1 to see if the padded sends are right but if you have any thoughts at this point let me know.

    Thanks
    Jim