Concerto shared memory mutual exclusion

Tim11828

We are using a segment of shared memory with the M3 writing and the C28 reading. We want to wrap the reads and writes in a mutual exclusion access method to prevent the C28 from reading while the M3 is writing. We are aware of the IPC registers and the IPC API for message passing, but we do not see an actual mutual exclusion mechanism. The C28 will read many times more often than the M3 will write, so we do not want or need the overhead of a message acknowledgement cycle around each C28 read. Please recommend a method for implementing mutually exclusive access for this application. Thank you.

over 14 years ago

0 Janos Szeman over 10 years ago

Intellectual 995 points

I'd also be very interested in an answer for this question. I am trying to implement something even more complicated: read/write access to a shared memory segment from both cores.

I would like to have a gainSHMem() and a leaveSHMem() function for both cores. Any ideas how this could be implemented without explicit acknowledgement from the remote core?

0 Vivek Singh over 10 years ago

TI__Guru** 116041 points

Hi Tim,

One needs to use IPC messages to implement mutually exclusive access to the same memory address from both masters. Basically, the M3 needs to send a message to the C28 after its writes are done and then wait for a message from the C28x before starting the next write. Similarly, the C28x needs to wait for the message from the M3 before starting its READs and then send a message back to the M3 once it completes all the READs. You could also split the memory segment into two and implement it like a ping-pong buffer to improve performance.
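The ping-pong idea could look roughly like the sketch below. It is a host-side illustration only, assuming the segment is split into two halves and that a single index word is written by the M3 alone; `Params`, `PingPong`, `pp_publish` and `pp_read` are hypothetical names, not part of any TI API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter block; the real struct is application specific. */
typedef struct { int32_t kp; int32_t ki; } Params;

/* Two halves of the shared segment plus an index word that only the
 * writer (M3) modifies. On the target these would sit in shared RAM. */
typedef struct {
    Params            half[2];
    volatile uint16_t active;   /* index of the half the reader may use */
} PingPong;

/* Writer side (M3): fill the inactive half, then flip the index. */
static void pp_publish(PingPong *pp, const Params *p)
{
    uint16_t next = (uint16_t)(pp->active ^ 1u);
    pp->half[next] = *p;        /* the reader is not using this half */
    pp->active = next;          /* one single-word store publishes it */
}

/* Reader side (C28): copy out whichever half is currently active. */
static Params pp_read(const PingPong *pp)
{
    return pp->half[pp->active];
}

/* Usage: the reader keeps seeing the old set until the index flips. */
static void pp_example(void)
{
    PingPong pp = { { {1, 2}, {0, 0} }, 0 };
    Params next = { 10, 20 };

    assert(pp_read(&pp).kp == 1);
    pp_publish(&pp, &next);
    assert(pp_read(&pp).kp == 10 && pp_read(&pp).ki == 20);
}
```

This only holds up if the reader finishes its copy before the writer publishes twice in a row, which fits the case where writes are far rarer than reads.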

Regards,

Vivek Singh

0 Janos Szeman over 10 years ago in reply to Vivek Singh

Intellectual 995 points

Hi Vivek,

That is of course a safe and straightforward solution, but we want something like a bit that signals active shared memory access to the remote core, so the cores can just simply wait until the other one finishes the memory access.

This approach would save a lot of communication and also shorten access time when the other core is not using the shared memory anyway.

Within the same core, this can easily be implemented with a mutex or semaphore, but we are looking for an interprocessor mutex.

BR,
Janos

0 Noah:Berhanu over 10 years ago in reply to Janos Szeman

TI__Intellectual 2795 points

The IPC doesn't provide dedicated registers to help with mutexes, but it may be possible to use the IPC message registers to signal the availability of resources. You can use a bit in the IPC registers to give the status of resource availability. Either core would read that bit before accessing a shared resource. As an alternative, you can also use the interrupt capability. That way the cores get notified only when data is available.

You may also want to check the TI RTOS/BIOS as it has a different implementation of IPC.

Thanks

Noah

0 Janos Szeman over 10 years ago in reply to Noah:Berhanu

Intellectual 995 points

Hello Noah,

Simply using one bit will not work. What happens when both cores read the bit at the same time? Both will think that they can use the memory. The M3 will set itself as master, and the C28 writes will result in a silent error...
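To make the race concrete, here is a small host-side replay of the bad interleaving. The "bit" is simulated with a plain variable and the function names are hypothetical; nothing here touches real IPC registers.

```c
#include <assert.h>
#include <stdbool.h>

/* One shared "busy" bit, simulated as a plain variable. */
static bool busy = false;

/* Step 1 of a naive lock: look at the bit. */
static bool looks_free(void) { return !busy; }

/* Step 2 of a naive lock: claim it. Nothing ties this to step 1. */
static void claim(void) { busy = true; }

/* Replays the interleaving where both cores test before either one sets.
 * Returns how many cores believe they own the memory. */
static int naive_race_example(void)
{
    int owners = 0;

    busy = false;
    bool m3_saw_free  = looks_free();   /* M3 reads the bit */
    bool c28_saw_free = looks_free();   /* C28 reads it before M3 writes */

    if (m3_saw_free)  { claim(); owners++; }
    if (c28_saw_free) { claim(); owners++; }

    assert(owners == 2);                /* both think they own the memory */
    return owners;
}
```

The check and the claim are two separate steps, so a single flag cannot give mutual exclusion unless the hardware offers an atomic test-and-set that both cores can use.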

Let me tell you more about my application, so you can see the problem I am facing right now:
The C28 core is controlling 1 to 3 motors in parallel. All settings and information (PID parameters, feedback, current and voltage measurements, etc.) regarding the motors are condensed in a C struct that is placed in shared memory. This means a lot of processing and frequent access to the shared memory (20 kHz PWM interrupt).
The M3 core is simply used as a bridge between the motor control application and the CAN and USB interfaces. Monitoring software running on a PC periodically (10-20 Hz) reads a bunch of parameters (20-30) from the shared C struct and occasionally writes new values such as PID parameters, target position, etc.

Since the C28 is running the real time application it is the default master of the memory. Unfortunately only the M3 can change the master for a memory section, so there must be an agreement on who is using the memory. The C28 will be the master for the whole time the PWM ISR is running, while the M3 will just take the memory, read one variable then give it back immediately so the C28 will not have to wait for a long time.

Given the above, I think that acknowledging each transfer would not be very efficient, but I am ready to be convinced otherwise.

I checked (although a bit superficially) what TI RTOS/BIOS offers in terms of IPC but was very disappointed that exactly the "interesting" modules are not supported on Concerto (GateMP, HeapMP, ListMP and SharedRegion).

0 Vivek Singh over 10 years ago in reply to Janos Szeman

TI__Guru** 116041 points

Hi Janos,

Thank you for explaining the application use case.

On following point -

The C28 will be the master for the whole time the PWM ISR is running, while the M3 will just take the memory, read one variable then give it back immediately so the C28 will not have to wait for a long time.

Please note that, irrespective of which core is master for the Shared RAM, READs from all masters are allowed. So if the M3 is only going to read some variable at some point, it need not take ownership of the Shared RAM block. Ownership is needed only when it needs to write into Shared RAM.

Regards,

Vivek Singh

0 Janos Szeman over 10 years ago in reply to Vivek Singh

Intellectual 995 points

Hello Vivek,

I am using libraries on the C28 that are not designed with a multicore environment in mind. I am afraid that there are accesses to variables that are not atomic. I want to avoid reading half updated values.
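A half-updated value can be illustrated like this. The sketch assumes a quantity that is stored as two separate 16-bit words; it is not a statement about which accesses are atomic on the C28, and `Word32` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit quantity held as two 16-bit words, the way any multi-word
 * update (struct, 64-bit value, pair of related fields) would be. */
typedef struct { uint16_t lo; uint16_t hi; } Word32;

static uint32_t w32_read(const Word32 *w)
{
    return ((uint32_t)w->hi << 16) | w->lo;
}

/* Replays a read that lands between the writer's two stores. */
static uint32_t torn_read_example(void)
{
    Word32 shared = { 0xFFFFu, 0x0000u };     /* old value: 0x0000FFFF */

    shared.lo = 0x0000u;                      /* writer: first store   */
    uint32_t seen = w32_read(&shared);        /* reader runs right here */
    shared.hi = 0x0001u;                      /* writer: second store  */

    assert(seen == 0x00000000u);              /* neither old nor new   */
    assert(w32_read(&shared) == 0x00010000u); /* new value, once done  */
    return seen;
}
```

The reader sees 0, which is neither the old 0x0000FFFF nor the new 0x00010000, and this is the kind of read the lock is meant to rule out.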

BR,

Janos

0 Janos Szeman over 10 years ago

Intellectual 995 points

Looks like I have finally found a nice solution: Lamport's bakery algorithm. It uses two variables on both cores, that are read-only by the remote core, so I can link them into the IPC message RAM.
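For readers who have not met the algorithm, a two-party version can be sketched as follows. This is a host-side illustration with hypothetical names (`Slot`, `bakery_take_ticket`, `bakery_may_enter`, `bakery_release`) and core ids 0 and 1 chosen for the example; it ignores memory ordering and interrupts, which a real target implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-core state: each core writes only its own slot and reads the other's. */
typedef struct {
    volatile uint32_t number;    /* 0 means "not interested" */
    volatile uint16_t choosing;  /* 1 while a ticket is being picked */
} Slot;

static Slot slot[2];             /* index = core id (0 and 1, assumed) */

/* Doorway: pick a ticket one larger than the other core's. */
static void bakery_take_ticket(int me)
{
    int other = 1 - me;
    slot[me].choosing = 1;
    slot[me].number   = slot[other].number + 1u;
    slot[me].choosing = 0;
}

/* True when `me` may enter; a real lock would spin until this holds. */
static bool bakery_may_enter(int me)
{
    int other = 1 - me;
    if (slot[other].choosing) return false;        /* other is mid-doorway */

    uint32_t theirs = slot[other].number;
    uint32_t mine   = slot[me].number;
    if (theirs == 0u)   return true;               /* other not competing  */
    if (theirs != mine) return mine < theirs;      /* smaller ticket first */
    return me < other;                             /* tie: lower id wins   */
}

static void bakery_release(int me) { slot[me].number = 0u; }

/* Usage: core 0 arrives first, so core 1 has to wait for it. */
static void bakery_example(void)
{
    bakery_take_ticket(0);                 /* core 0 gets ticket 1 */
    bakery_take_ticket(1);                 /* core 1 gets ticket 2 */
    assert(bakery_may_enter(0));
    assert(!bakery_may_enter(1));
    bakery_release(0);
    assert(bakery_may_enter(1));
    bakery_release(1);
}
```

Each core only ever writes its own slot, which is what makes the scheme fit a pair of one-directional message RAMs.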

I had to add counters to the entry and exit functions that allow nested calls to them. The counters must be incremented and decremented atomically (__inc, __dec on the C28 and __ldrex, __strex on the M3).
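The nesting counter can be pictured with the sketch below. The inner lock is a stand-in that only counts calls, all names are hypothetical, and the plain `depth++` is fine only because this host sketch is single-threaded; on the target the update would need the atomic operations mentioned above.

```c
#include <assert.h>
#include <stdint.h>

static int inner_acquires;   /* calls into the stand-in for the real lock */
static int inner_releases;

static void inner_lock(void)   { inner_acquires++; }  /* placeholder for the bakery wait */
static void inner_unlock(void) { inner_releases++; }  /* placeholder for the release     */

static int16_t depth;        /* nesting depth on this core */

static void nested_lock(void)
{
    if (depth++ == 0) inner_lock();     /* only the outermost call locks */
}

static void nested_unlock(void)
{
    assert(depth > 0);                  /* unbalanced unlock is a bug */
    if (--depth == 0) inner_unlock();   /* only the outermost call unlocks */
}

/* Usage: two nested lock calls take the real lock exactly once. */
static void nested_example(void)
{
    nested_lock();
    nested_lock();
    nested_unlock();
    assert(inner_acquires == 1 && inner_releases == 0);
    nested_unlock();
    assert(inner_releases == 1);
}
```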

(If the unbounded numbers are problematic, one may use the modified Black-white bakery algorithm described here: www.cs.tau.ac.il/~afek/gadi.pdf)

I am testing the code right now, will come back if I get the permission to post it here.

0 Noah:Berhanu over 10 years ago in reply to Janos Szeman

TI__Intellectual 2795 points

Janos,

A couple of suggestions that may help.

1. Can you use the M3->C28 and C28->M3 message RAMs? Only the M3 can write to the M3->C28 message RAM; the C28 can read from it but can't write to it. Similarly, only the C28 can write to the C28->M3 message RAM; the M3 can read from it but can't write to it.

2. Regarding the single bit usage:

Assuming you are using the shared RAM for your PID loop variables and you don't copy them to local C28x memory, you can designate any bits of the M3.MTOCIPCSET and C28.CTOMIPCSET registers to signal that values are about to change and that values have actually changed.

Let's use two bits for the handshake: bit 5 and bit 6. (Setting bit 5 of M3.MTOCIPCSET will result in bit 5 of C28.MTOCIPCSTS being set.) You would check C28.MTOCIPCSTS.bit.5 in your control loop. If that bit is set, it means the M3 is about to write new control loop parameters. The C28 would stop reading the parameters and signal to the M3 that it has stopped reading and is ready for new ones. It can do so by setting C28.CTOMIPCSET, which will set M3.CTOMIPCSTS (the M3 would also need to monitor this). The M3 can now write the new control loop parameters and set bit 6 to signal that new values are available, etc.

You may need one or two if statements to complete the state machine.
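The handshake described above might be sketched like this. It is a host-side model only: the flag words are plain variables standing in for what each core would observe, the acknowledge bit on the C28 side is an assumed choice, and the real registers have their own set/clear/acknowledge semantics that are not modelled here.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_REQ (1u << 5)   /* M3 -> C28: about to write (bit 5 in the post)  */
#define FLAG_NEW (1u << 6)   /* M3 -> C28: new values written (bit 6)          */
#define FLAG_ACK (1u << 5)   /* C28 -> M3: stopped reading (bit is an assumption) */

static uint32_t mtoc_flags;      /* raised by M3, observed by C28 */
static uint32_t ctom_flags;      /* raised by C28, observed by M3 */
static int32_t  shared_kp = 1;   /* a parameter living in shared RAM */
static int32_t  c28_kp    = 1;   /* value the C28 control loop last read */

/* M3 polling step; returns 1 once the new value has been written. */
static int m3_step(int32_t new_kp)
{
    if (!(mtoc_flags & FLAG_REQ)) { mtoc_flags |= FLAG_REQ; return 0; }
    if (!(ctom_flags & FLAG_ACK)) return 0;           /* C28 still reading */
    shared_kp  = new_kp;
    mtoc_flags = (mtoc_flags & ~FLAG_REQ) | FLAG_NEW; /* done, announce it */
    return 1;
}

/* C28 control-loop step. */
static void c28_step(void)
{
    if (mtoc_flags & FLAG_NEW) {          /* new values are ready */
        c28_kp = shared_kp;
        mtoc_flags &= ~FLAG_NEW;
        ctom_flags &= ~FLAG_ACK;
    } else if (mtoc_flags & FLAG_REQ) {   /* M3 wants to write */
        ctom_flags |= FLAG_ACK;           /* stop reading and say so */
    } else {
        c28_kp = shared_kp;               /* normal read */
    }
}

/* Usage: one complete update cycle. */
static void handshake_example(void)
{
    assert(m3_step(42) == 0);   /* M3 raises the request          */
    c28_step();                 /* C28 sees it and acknowledges   */
    assert(m3_step(42) == 1);   /* M3 writes and flags new data   */
    assert(c28_kp == 1);        /* C28 has not picked it up yet   */
    c28_step();
    assert(c28_kp == 42 && mtoc_flags == 0u && ctom_flags == 0u);
}
```

Note that the M3 has to poll until the C28 answers, so the M3 side can be held up by the control loop.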

It would also be nice to see that the implementation you posted works. I will look into it and see if it can be integrated into the current examples.

Thanks

Noah

If, say, you designate bit 5 of the M3.MTOCIPCSET register, this will cause bit 5 of C28.MTOCIPCSTS to be set. You can use bits 6 and 7 as well. Bit 5 can mean new parameters for Motor 1, bit 6 for Motor 2 and bit 7 for Motor 3.

When the M3 has new values for the C28, it would overwrite the values in the M3->C28 message RAM and set the appropriate MTOCIPCSET bits. On the C28x side, in your control loop, you just need to check whether the appropriate bits of C28.MTOCIPCSTS are set. If so, new parameters are available and the C28 reads them from the M3->C28 message RAM. Once the variables on

0 Janos Szeman over 10 years ago in reply to Noah:Berhanu

Intellectual 995 points

Hi Noah,

Thanks for the suggestions.

1. I am using the IPC message RAM mainly for printf and to stream arbitrary data from the C28 PWM interrupt to the M3 core and then to the PC via USB CDC. It works like a scope...

2. I previously used a similar method for synchronization, but it is not symmetric and may (at least theoretically) lead to "starvation" of the M3 core. This is not such a huge problem now, but the CAN protocol I am using requires very strict timing, and I was afraid this could cause problems later, when the C28 is driving 3 motors. This was the reason I started looking for a more symmetric solution and found this thread.

Regards,

Janos

0 Noah:Berhanu over 10 years ago in reply to Janos Szeman

TI__Intellectual 2795 points

Janos,

Have you looked into using the interrupt generation capability of IPC?

Thanks

Noah

0 Janos Szeman over 10 years ago in reply to Noah:Berhanu

Intellectual 995 points

Hi Noah,

Yes, I did. I am using it to stop the M3 core when the C28 exit hook catches a non-master access violation exception (CNMAVFLG) so I can preserve the state of both cores.

But as I said, I wanted a solution that works without any contribution from the remote core. It seems to be working fine now.

Regards,

Janos

0 Janos Szeman over 10 years ago

Intellectual 995 points

Hi!

This is the code that I am using.

Common structure definition:

#include <stdint.h>

#define MUTEX_COUNT 1

typedef struct ipcMutex_t *ipcMutex_h;

struct ipcMutex_t
{
	uint32_t number;            /* Ticket number used in Lamport's bakery algorithm */
	uint16_t choosing;          /* Set while this core is picking its ticket */
	int16_t  accessCount;       /* Counter to handle nested locking */
	int16_t  inCriticalSection; /* Boolean: this core is inside the critical section */

	volatile struct ipcMutex_t *remote; /* Pointer to the remote mutex placed in IPC message RAM */
};

M3 code:

inline void ipcMutex_wait(ipcMutex_h mtxH)
{
	uint32_t num_remote;
	uint32_t num_local;

	mtxH->choosing = 1;
	mtxH->number = mtxH->remote->number + 1;
	mtxH->choosing = 0;

	while(mtxH->remote->choosing);

	do {
		num_remote = mtxH->remote->number;
		num_local = mtxH->number;
	} while(num_remote && (num_remote <= num_local));
}

void ipcMutex_lock(ipcMutex_h mtxH)
{
	disableInterrupts();

	atomic_inc(&mtxH->accessCount);

	if (mtxH->inCriticalSection == 0)
	{
		ipcMutex_wait(mtxH);
	}

	mtxH->inCriticalSection = 1;

	enableInterrupts();
}

void ipcMutex_unlock(ipcMutex_h mtxH)
{
	disableInterrupts();

	atomic_dec(&mtxH->accessCount);

	if (mtxH->accessCount == 0)
	{
		mtxH->inCriticalSection = 0;

		mtxH->number = 0;
	}

	enableInterrupts();
}

void ipcMutex_forceUnlock(ipcMutex_h mtxH)
{
	mtxH->accessCount = 0;
	mtxH->choosing = 0;
	mtxH->inCriticalSection = 0;
	mtxH->number = 0;
}

#pragma DATA_SECTION(mutex, "mutex_m3")
struct ipcMutex_t mutex[MUTEX_COUNT];

#pragma DATA_SECTION(mutex_remote, "mutex_c28")
volatile struct ipcMutex_t mutex_remote[MUTEX_COUNT];

void ipcMutex_init(void)
{
	uint16_t i;

	for (i=0; i<MUTEX_COUNT; i++)
	{
		memset(mutex+i, 0, sizeof(struct ipcMutex_t));
		mutex[i].remote = &mutex_remote[i];
	}

	HWREG(SYSCTL_MWRALLOW) = 0xA5A5A5A5;
	HWREG(RAM_CONFIG_BASE + RAM_O_MSXMSEL) |= SHARED_MEM_SEGMENT;
	HWREG(SYSCTL_MWRALLOW) = 0;
}

void gainSharedMem(void)
{
	ipcMutex_lock(mutex+0);

	HWREG(SYSCTL_MWRALLOW) = 0xA5A5A5A5;
	HWREG(RAM_CONFIG_BASE + RAM_O_MSXMSEL) &= ~SHARED_MEM_SEGMENT;
	HWREG(SYSCTL_MWRALLOW) = 0;
}


void leaveSharedMem(void)
{
	HWREG(SYSCTL_MWRALLOW) = 0xA5A5A5A5;
	HWREG(RAM_CONFIG_BASE + RAM_O_MSXMSEL) |= SHARED_MEM_SEGMENT;
	HWREG(SYSCTL_MWRALLOW) = 0;

	ipcMutex_unlock(mutex+0);
}

C28 code:

inline void ipcMutex_wait(ipcMutex_h mtxH)
{
	uint32_t num_remote;
	uint32_t num_local;

	mtxH->choosing = 1;
	mtxH->number = mtxH->remote->number + 1;
	mtxH->choosing = 0;

	while(mtxH->remote->choosing);

	do {
		num_remote = mtxH->remote->number;
		num_local = mtxH->number;
	} while(num_remote && (num_remote < num_local));
}

void ipcMutex_lock(ipcMutex_h mtxH)
{
	disableInterrupts();

	__inc(&mtxH->accessCount);

	if (mtxH->inCriticalSection == 0)
	{
		ipcMutex_wait(mtxH);
	}

	mtxH->inCriticalSection = 1;

	enableInterrupts();
}

void ipcMutex_unlock(ipcMutex_h mtxH)
{
	disableInterrupts();

	__dec(&mtxH->accessCount);

	if (mtxH->accessCount == 0)
	{
		mtxH->inCriticalSection = 0;

		mtxH->number = 0;
	}

	enableInterrupts();
}

void ipcMutex_forceUnlock(ipcMutex_h mtxH)
{
	mtxH->accessCount = 0;
	mtxH->choosing = 0;
	mtxH->inCriticalSection = 0;
	mtxH->number = 0;
}

#pragma DATA_SECTION(mutex, "mutex_c28")
struct ipcMutex_t mutex[MUTEX_COUNT];

#pragma DATA_SECTION(mutex_remote, "mutex_m3")
volatile struct ipcMutex_t mutex_remote[MUTEX_COUNT];

void ipcMutex_init(void)
{
	uint16_t i;

	for (i=0; i<MUTEX_COUNT; i++)
	{
		memset(mutex+i, 0, sizeof(struct ipcMutex_t));
		mutex[i].remote = &mutex_remote[i];
	}
}

void gainSharedMem(void)
{
	ipcMutex_lock(mutex+0);

	while(RAMRegs.CSxMSEL.bit.S7MSEL == 0);
}


void leaveSharedMem(void)
{
	ipcMutex_unlock(mutex+0);
}

Note: With this code, the C28 has inherent priority over the M3. If both cores try to enter the critical section at the same time and pick the same number, the C28 will win. This is due to the simplification of the original algorithm. (See the while condition in the wait functions.)
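The link between the two while conditions and the original tie-break rule can be checked with a short host-side sketch. In the original algorithm a process waits while the other's (number, id) pair is lexicographically smaller than its own; the ids used here (C28 = 0, M3 = 1) are an assumption made for the illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original rule: wait while the other ticket is active and
 * (their_number, their_id) < (my_number, my_id) lexicographically. */
static bool must_wait(uint32_t theirs, int their_id, uint32_t mine, int my_id)
{
    if (theirs == 0u)   return false;
    if (theirs != mine) return theirs < mine;
    return their_id < my_id;
}

/* The two loop conditions from the wait functions above. */
static bool m3_waits(uint32_t remote, uint32_t local)
{
    return remote && (remote <= local);
}

static bool c28_waits(uint32_t remote, uint32_t local)
{
    return remote && (remote < local);
}

/* Checks that the simplified conditions match the original rule
 * when the C28 is given id 0 and the M3 id 1. */
static void tie_break_example(void)
{
    for (uint32_t r = 0; r < 4; r++) {
        for (uint32_t l = 1; l < 4; l++) {
            assert(m3_waits(r, l)  == must_wait(r, 0, l, 1));
            assert(c28_waits(r, l) == must_wait(r, 1, l, 0));
        }
    }
    /* Equal tickets: the M3 waits, the C28 does not. */
    assert(m3_waits(2, 2) && !c28_waits(2, 2));
}
```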

BR,

Janos
