
[TDA4] Shared memory between A72 <-> c66, A72 <-> r5f, A72 <-> c71

Other Parts Discussed in Thread: SYSBIOS

Hi,

I want to use shared memory between the A72, C66, and R5F cores.

 (1) How do I change the shared memory size? I need 1 GB.

 (2) Can you point me to a demo of shared memory?

 (3) Is the shared memory cached or not?

 (4) I ran a test between the A72 and C71:

    On the C71 SYSBIOS side:

diff --git a/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1/main.c b/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demo
index 594c1c3..edc6690 100755
--- a/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1/main.c
+++ b/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1/main.c
@@ -73,6 +73,19 @@
#include <app_ipc_rsctable.h>
#include <ti/csl/soc.h>
#include <ti/csl/csl_clec.h>
+#include <ti/sysbios/hal/Cache.h>
+
+static Void shared_memory_test(Void)
+{
+ volatile uint32_t* testPtr = (uint32_t*)0xBC000000;
+ uint32_t counter = 0U;
+ while(1)
+ {
+ *testPtr = counter++;
+ Cache_wb((Ptr)testPtr, (SizeT)4, (Bits16)Cache_Type_ALL, (Bool)TRUE);
+ appLogWaitMsecs(1000u); // 1 sec
+ }
+}

static Void appMain(UArg arg0, UArg arg1)
{
@@ -82,6 +95,7 @@ static Void appMain(UArg arg0, UArg arg1)
while(1)
{
appLogWaitMsecs(100u);
+ shared_memory_test();
}
#else
appDeInit();

 On the A72 Linux side:

root@j7-evm:/opt/vision_apps# devmem2 0xbc000000
/dev/mem opened.
Error at line 90, file devmem2.c (1) [Operation not permitted]

Why does devmem2 fail to read the data?

Thanks & Regards,

Lei


  • Hi,

      Adding a question:

       (5) Is the shared memory on the R5F core cached or not?

      Please update.

    Thanks & Regards,

    Lei

  • <1>

    See this page for information about memory map and how to change it

    ${PSDKRA_INSTALL_PATH}/psdk_rtos_auto/docs/user_guide/developer_notes_memory_map.html

    <2> <4>

    A demo of shared memory including cache ops can be found here

    vision_apps/apps/basic_demos/app_linux_arm_ipc

    See also vision_apps/utils/ipc

    <3> <5>

    ION Shared memory, typically used for exchanging large pixel buffers, is cached.

    Some other IPC memory used for passing smaller messages is non-cached.

    See above developer note for more details.

    NOTE: My recommendation is not to change the memory map unless you really need to. Most use-cases can run with the default memory map; I recommend you try that first and change it only later if needed. Also note that the R5F and C6x are 32-bit cores, so if memory spills beyond the 32-bit address space, things may not work.

    regards
    Kedar

  • Hi Kedar,

     1. ION shared memory is virtual memory.

         What should I do if I want the physical address of the shared memory?

     2. gen_linker_mem_map.py auto-generates the file k3-j721e-vision_apps.dts.

         How do I apply this file?

      Do I need to modify the files manually, following these instructions?

    /*
    * IMPORTANT NOTE: Follow below instructions to apply the updated memory map to linux dts/dtso/dtsi files,
    *
    * 1. Copy the memory sections, from the generated dts file, to the file shown below under reserved_memory: reserved-memory { ... }
    * ${LINUX_KERNEL_PATH}/arch/arm64/boot/dts/ti/k3-j721e-som-p0.dtsi
    *
    * 2. In file ${LINUX_KERNEL_PATH}/arch/arm64/boot/dts/ti/k3-j721e-auto-common.dtso
    * - Remove the fragment@xyz { ... } entries for xyz = 101 to 119
    *
    * 3. In file ${LINUX_KERNEL_PATH}/arch/arm64/boot/dts/ti/k3-j721e-vision-apps.dtso
    * - Remove the &reserved_memory { ... } entry
    *
    * 4. The entries are removed since the same are updated in k3-j721e-som-p0.dtsi, hence we should not have duplicate outdated ones
    *
    * 5. Rebuild the dtb, dtbo and use the updated dtb, dtbo files
    * - In PSDKLA install directory, doing below should build the dtb and dtbo
    * make linux-dtbs
    * - Copy the below updated dtb, dtbo files to "boot/" folder in your target linux SD card filesystem
    * arch/arm64/boot/dts/ti/k3-j721e-common-proc-board.dtb
    * arch/arm64/boot/dts/ti/k3-j721e-auto-common.dtbo
    * arch/arm64/boot/dts/ti/k3-j721e-vision-apps.dtbo
    */

     3. vision_apps/apps/basic_demos/app_tirtos/tirtos_linux shares memory between the c66x_1, c66x_2, c7x_1, mcu2_0, mcu2_1, and mpu1 cores.

      How do I add the mcu1_0, mcu1_1, mcu3_0, and mcu3_1 cores?

    Thanks & Regards,

    Lei

  • Hi Kedar,

    4. If the DDR memory is changed from 4 GB to 8 GB, how do I port for it?

    5. Can you tell me the API for synchronizing shared memory operations?

    Please update, 

    Thanks & Regards,

    Lei

  • Hi Kedar,

    6. How do I set shared memory as memreserve?

      We don't want to use ION memory on the A72 side, but if the shared memory is set as no-map reserved-memory,

     we get a bus error on memset.

    Please update, 

    Thanks & Regards,

    Lei

  • 1.

    See vision_apps/utils/mem/include/app_mem.h

    appMemGetVirt2PhyBufPtr()

    See vision_apps/utils/mem/src/app_mem_ion.c for implementation of the same.

    2.

    Applying the dtsi file has to be done manually.

    3.

    One can add mcu1_0, mcu1_1, mcu3_0, and mcu3_1 similar to the mcu2_1 core. Unfortunately we don't have a ready example for the same. We plan to have an example showing this in future releases.

    4.

    Firstly we need to check whether 8 GB is supported on the EVM. I think 4 GB is the DDR size on the TI EVM.

    Assuming 8 GB is available on a board, one needs to change the DDR init sequence in SPL/U-Boot to support it.

    5. "API of synchronizing shared memory" - I am not sure what you mean by this. If you mean cache operations, then again see app_mem.h for the cache operation APIs.

    6. To reserve memory you need to set it in the dts/dtsi file. See the ION memory segment for an example.

    However, if you try to simply mmap this via /dev/mem, there are a few things to note.

    a. Any memory mapped to user space via /dev/mem is mapped as strongly ordered, non-cached memory. If you want to use this for sharing large pixel data between, say, the A72 and some other CPU/HW, this will be inefficient from the A72 point of view since the data is not cached.

    b. Memory mapped via /dev/mem MUST be accessed in an aligned manner, e.g. a 32-bit access MUST be 32-bit aligned and a 64-bit access MUST be 64-bit aligned, else one gets a bus error. Optimized functions like memcpy can issue unaligned accesses, and hence you will see a bus error.

    It is strongly recommended to use a memory allocator like ION, or to develop a similar kernel module, for shared memory access.

    Neither of the above constraints applies when using the ION memory allocator.

    regards
    Kedar

    ps, sorry for delay in responding (it was due to year end holidays)

  • Hi Kedar,

      Thanks for your help.

      The following code leads to a system hang. Why?

    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int lock_mem(void *x){
        /* load-acquire exclusive on the mapped address */
        asm("ldaxr w2,[x0]");
        return 0;
    }

    int main(int argc, char **argv){
        long long addr = 0xbc000000;
        long long size = 1*1024*1024;
        int fd = open("/dev/mem", O_RDWR|O_SYNC);
        if(fd < 0){
            perror("open");
            return 0;
        }
        char *ret = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_LOCKED, fd, addr);
        if((intptr_t)ret < 0){
            perror("mmap");
            return 0;
        }
        lock_mem(ret);
        return 0;
    }

    Thanks & Regards,

    Lei

  • Hi Kedar,

    system hang (A72) log:

    ERROR: Unhandled External Abort received on 0x80000001 at EL3!
    ERROR: exception reason=0 syndrome=0xbf000002
    PANIC in EL3 at x30 = 0x000000007000442c
    x0 = 0x0000000000000000
    x1 = 0x0000000000000060
    x2 = 0x0000000000000060
    x3 = 0x000000000000000b
    x4 = 0x0000000000000062
    x5 = 0x0000000000000008
    x6 = 0x000000000000003b
    x7 = 0x0000000000000000
    x8 = 0x0000000000000062
    x9 = 0x0000000000000000
    x10 = 0x0101010101010101
    x11 = 0x0000000000000028
    x12 = 0x00000000000000d1
    x13 = 0x0000ffffdd52d92d
    x14 = 0x0000ffffb03e0f60
    x15 = 0x00000000000000d5
    x16 = 0x0000ffffb03e9ad8
    x17 = 0x0000ffffb0443308
    x18 = 0x0000ffffdd52d9d0
    x19 = 0x0000000000000000
    x20 = 0x00000000bf000002
    x21 = 0x0000600000000048
    x22 = 0x0000000000000000
    x23 = 0x0000ffffb0443350
    x24 = 0x0000000000000000
    x25 = 0x0000000000000000
    x26 = 0x0000000000000000
    x27 = 0x0000000000000000
    x28 = 0x0000000000000000
    x29 = 0x000000007000a160
    scr_el3 = 0x000000000000073d
    sctlr_el3 = 0x0000000030cd183f
    cptr_el3 = 0x0000000000000000
    tcr_el3 = 0x0000000080803520
    daif = 0x00000000000002c0
    mair_el3 = 0x00000000004404ff
    spsr_el3 = 0x0000000080000000
    elr_el3 = 0x0000ffffb03e96f4
    ttbr0_el3 = 0x000000007000e420
    esr_el3 = 0x00000000bf000002
    far_el3 = 0x0000000000000000
    spsr_el1 = 0x0000000060000000
    elr_el1 = 0x0000ffffb03e9450
    spsr_abt = 0x0000000000000000
    spsr_und = 0x0000000000000000
    spsr_irq = 0x0000000000000000
    spsr_fiq = 0x0000000000000000
    sctlr_el1 = 0x0000000034d5d91d
    actlr_el1 = 0x0000000000000000
    cpacr_el1 = 0x0000000000300000
    csselr_el1 = 0x0000000000000000
    sp_el1 = 0xffff000011130000
    esr_el1 = 0x0000000056000000
    ttbr0_el1 = 0x00000008c16d6c00
    ttbr1_el1 = 0x108c000080e20000
    mair_el1 = 0x0000bbff440c0400
    amair_el1 = 0x0000000000000000
    tcr_el1 = 0x00000034f5507510
    tpidr_el1 = 0x0000800876e60000
    tpidr_el0 = 0x0000ffffb0509430
    tpidrro_el0 = 0x0000000000000000
    dacr32_el2 = 0x0000000000000000
    ifsr32_el2 = 0x0000000000000000
    par_el1 = 0x0000000000000000
    mpidr_el1 = 0x0000000080000001
    afsr0_el1 = 0x0000000000000000
    afsr1_el1 = 0x0000000000000000
    contextidr_el1 = 0x0000000000000000
    vbar_el1 = 0xffff000008081800
    cntp_ctl_el0 = 0x0000000000000005
    cntp_cval_el0 = 0x0000009bc2b0160c
    cntv_ctl_el0 = 0x0000000000000000
    cntv_cval_el0 = 0x0000000000000000
    cntkctl_el1 = 0x00000000000000e6
    sp_el0 = 0x000000007000a160
    isr_el1 = 0x0000000000000040
    cpuectlr_el1 = 0x0000001b00000040
    cpumerrsr_el1 = 0x0000000000000000
    l2merrsr_el1 = 0x0000000000000000

    Thanks & Regards,

    Lei

  • Lei,

    Why are you trying to explicitly use the "ldaxr" instruction? Why are you not using synchronization tools like a mutex?

    - Subhajit

  • Subhajit,

     Because we need locks shared by multiple processes in Linux.

    Thanks & Regards,

    Lei

  • If that is the case, you can put your mutex in a shared memory that is accessible by multiple processes. You can make the mutex shared among processes by calling this:

    int pthread_mutexattr_setpshared(pthread_mutexattr_t *attr,
        int pshared);

  • You can use a POSIX semaphore as a multi-process mutex.

  • Subhajit,

    There will be the following problems with multi-core shared memory, while other memory ranges do not show them:
      1. A multi-thread shared lock cannot guarantee atomicity of the critical section.
      2. The lock priority-inheritance feature cannot be turned on, otherwise unlocking fails:

      pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);


       log:
      Expected:
      value of atomic add var is 40000000
      Abnormal:
      value of atomic add var is 37021808

    Related source code below:

    Thanks & Regards,

    Lei

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <pthread.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    
    
    pthread_mutex_t *lock;
    
    void* thread_fun(void*arg){
        volatile int *p = arg;
        int i = 0;
        for(i = 0; i < 10000000;i++){
            int ret = pthread_mutex_lock(lock);
            if(ret != 0){
                printf("lock ret = %d,errno = %d\n",ret,errno);
            }
            (*p)++;
           // usleep(1);
            ret = pthread_mutex_unlock(lock);
            if(ret != 0){
                printf("unlock ret = %d,errno = %d\n",ret,errno);
            }
        }
        return NULL;
    }
    
    int main(int argc,char**argv){
        long long addr = 0xC0000000;
        long long size = 0x100000;
        int fd = open("/dev/mem",O_RDWR);//|O_SYNC);
        if(fd < 0){
            perror("open");
            return 0;
        }
        char *ret = mmap(NULL,size,PROT_READ|PROT_WRITE,MAP_SHARED|MAP_LOCKED,fd,addr);
        if((intptr_t)ret < 0){
            perror("mmap");
            return 0;
        }
        *(volatile int*)ret = 0;
        lock = (void*)((char*)ret+4);
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        int pret = pthread_mutexattr_setpshared(&attr,PTHREAD_PROCESS_SHARED);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        //pret = pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        pret = pthread_mutex_init(lock,&attr);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
    
        int fork_ret = fork();
        pthread_t tid[2];
        int pthread_ret = pthread_create(&tid[0],NULL,&thread_fun,ret);
        pthread_ret = pthread_create(&tid[1],NULL,&thread_fun,ret);
        pthread_join(tid[0],NULL);
        pthread_join(tid[1],NULL);
        if(fork_ret){
            printf("value of atomic add var is %d\n",*(volatile int*)ret);
        }
        return 0;
    }
    
    /* On multi-core shared memory the following problems occur, while other memory ranges do not show them:
     * 1. A multi-thread shared lock cannot guarantee atomicity of the critical section
     * 2. The lock priority-inheritance feature cannot be enabled, otherwise unlocking fails
     * log:
     * Expected:
     *    value of atomic add var is 40000000
     * Abnormal:
     *    value of atomic add var is 37021808
     * */
    

  • I can see that you have made two pthread_join calls to make sure the threads exit. But I do not see a waitpid call to ensure that the forked process has returned.

    - Subhajit

  • Subhajit,

    Thank you for your help.

      If this attribute is enabled:

      pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);

     then a multi-thread shared lock cannot guarantee atomicity of the critical section.

       log:
      Expected:
      value of atomic add var is 40000000
      Abnormal:
      value of atomic add var is 36976716

    Related source code below:

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <pthread.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    
    
    pthread_mutex_t *lock;
    
    void* thread_fun(void*arg){
        volatile int *p = arg;
        int i = 0;
        for(i = 0; i < 10000000;i++){
            int ret = pthread_mutex_lock(lock);
            if(ret != 0){
                printf("lock ret = %d,errno = %d\n",ret,errno);
            }
            (*p)++;
           // usleep(1);
            ret = pthread_mutex_unlock(lock);
            if(ret != 0){
        //        printf("unlock ret = %d,errno = %d\n",ret,errno);
            }
        }
        return NULL;
    }
    
    int main(int argc,char**argv){
        long long addr = 0xC0000000;
        long long size = 0x100000;
        int fd = open("/dev/mem",O_RDWR);//|O_SYNC);
        if(fd < 0){
            perror("open");
            return 0;
        }
        char *ret = mmap(NULL,size,PROT_READ|PROT_WRITE,MAP_SHARED|MAP_LOCKED,fd,addr);
        if((intptr_t)ret < 0){
            perror("mmap");
            return 0;
        }
        *(volatile int*)ret = 0;
        lock = (void*)((char*)ret+4);
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        int pret = pthread_mutexattr_setpshared(&attr,PTHREAD_PROCESS_SHARED);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        pret = pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        pret = pthread_mutex_init(lock,&attr);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
    
        int fork_ret = fork();
        pthread_t tid[2];
        int pthread_ret = pthread_create(&tid[0],NULL,&thread_fun,ret);
        pthread_ret = pthread_create(&tid[1],NULL,&thread_fun,ret);
        pthread_join(tid[0],NULL);
        pthread_join(tid[1],NULL);
        if (fork_ret) {
    		int s;
    		waitpid(-1, &s, 0);
            printf("value of atomic add var is %d\n",*(volatile int*)ret);
        }
        return 0;
    }
    
    /* On multi-core shared memory the following problems occur, while other memory ranges do not show them:
     * 1. A multi-thread shared lock cannot guarantee atomicity of the critical section
     * 2. The lock priority-inheritance feature cannot be enabled, otherwise unlocking fails
     * log:
     * Expected:
     *    value of atomic add var is 40000000
     * Abnormal:
     *    value of atomic add var is 36976716
     * */
    

    Thanks & Regards,

  • Lei,

    Somehow I do not agree with you.

    Attached is source code that uses a shared mutex (over shared memory) and successfully prints the desired value.

    Compile as <gcc -o app main.c -lpthread -lrt>

    Run on two terminals

    Terminal 1:

    $ ./app 

    Creating ... success.
    mapped as 0x7f9d2a54f000
    Deleting.
    v = 80000000
    Deleting.

    Terminal 2:

    $ ./app 
    Creating ... failed (File exists). Opening ... success.
    mapped as 0x7fc6b720c000
    Deleting.
    Deleting.

    #include <stdio.h>
    #include <stdbool.h>
    #include <stdatomic.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <errno.h>
    #include <signal.h>
    #include <pthread.h>
    
    struct shared_mutex {
    	atomic_int latch;
    	pthread_mutex_t mutex;
    	int test;
    };
    
    
    void *add(void *data)
    {
    	int i = 0;
    	struct shared_mutex *smutex = data;
    
    	/*
    	 * Remember: PTHREAD_PRIO_INHERIT is very, very slow on my machine
    	 *
    	 * Loop till 10000000 took ~90 seconds
    	 *
     * Without PTHREAD_PRIO_INHERIT, it took less than 5 secs
    	 *
    	 * In both the cases atomicity was ensured, though!
    	 */
    	while(i++ < 10000000) {
    		pthread_mutex_lock(&smutex->mutex);
    		smutex->test++;
    		pthread_mutex_unlock(&smutex->mutex);
    	}
    	return NULL;
    }
    
    int main()
    {
    	int fd, ret;
    	void *vaddr;
    	struct shared_mutex *smutex = NULL;
    
    	/* try to create a "named" shared memory */
    	printf("Creating ... ");
    	fd = shm_open("/mutex-mem", O_RDWR | O_CREAT | O_EXCL, 0664);
    	if(fd < 0) {
    		/* create failed because I am not the first command line app
    		 * try to just open */
    		printf("failed (%s). Opening ... ", strerror(errno));
    		fd = shm_open("/mutex-mem", O_RDWR, 0664);
    		if(fd < 0) {
    			printf("failed (%s). Bailing out\n", strerror(errno));
    			exit(0);
    		} else {
    			printf("success.\n");
    
    			/* I am the user, mmap the shmem ... */
    			vaddr = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    			if(vaddr == MAP_FAILED)
    				printf("mmap failed (%s)\n", strerror(errno));
    			else
    				printf("mapped as %p\n", vaddr);
    
    			smutex = vaddr;
    			/* ... wait for latch to go to 1 ... */
    			while(atomic_load(&smutex->latch) != 1);
    			/* .. and set it back to zero */
    			atomic_store(&smutex->latch, 0);
    		}
    	} else {
    		printf("success.\n");
    
    		/* I am the creator, so resize shmem to 4K and mmap ... */
    		ftruncate(fd, getpagesize());
    
    		vaddr = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    		if(vaddr == MAP_FAILED)
    			printf("mmap failed (%s)\n", strerror(errno));
    		else
    			printf("mapped as %p\n", vaddr);
    
    		smutex = vaddr;
    
    		/* ... create the mutex ... */
    		pthread_mutexattr_t attr;
    		pthread_mutexattr_init(&attr);
    		pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    
    		/* Note: The line below will kill your performance */
    		pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);
    
    		pthread_mutex_init(&smutex->mutex, &attr);
    
    		/* ... set latch to 1 ... */
    		atomic_init(&smutex->latch, 1);
    		/* ... and then wait for it to go to 0 */
    		while(atomic_load(&smutex->latch) != 0);
    
    		/*
    		 * Note:
    		 * This latch mechanism is required so that only one command
    		 * line app creates the shmem, resizes it, and inits the mutex
    		 */
    	}
    
    	/*
    	 * We will run 8 instances of the loop
    	 * 2 command line apps
    	 * 2 forks per command line app
    	 * 2 threads per forked process
    	 *
    	 * 2 threads x 2 forks x 2 apps = 8 loops
    	 */
    
    	/* Run a forked process */
    	pid_t pid = fork();
    	pthread_t th;
    
    	/* Run a thread ... */
    	pthread_create(&th, NULL, add, smutex);
    	/* ... And run in main thread */
    	add(smutex);
    	/* Wait for thread to finish */
    	pthread_join(th, NULL);
    
    	/* If parent, wait for forked process to finish */
    	if(pid) {
    		wait(NULL);
    
    		/* At this point, latch is 0 (set previously)
    		 * The command-line app that reaches here first
    		 * sets it to 1 (if it is set to 0)
    		 *
    		 * So evidently the other guy would see it set to
    		 * 1 when it reaches here.
    		 *
    		 * The loser of the race prints the final value.
    		 */
    		int zero = 0;
    		if(!atomic_compare_exchange_strong(&smutex->latch, &zero, 1))
    			printf("v = %d\n", smutex->test);
    	}
    
    	/* Don't worry, it won't actually delete unless all users "close" it */
    	printf("Deleting.");
    	shm_unlink("/mutex-mem");
    
    }
    

    I would like to point out that the performance is MUCH MUCH faster if  PTHREAD_PRIO_INHERIT is not used. Attaching screenshots

  • Subhajit,

    Thank you for your help.

    1. You said that PTHREAD_PRIO_INHERIT is very, very slow on your machine.

      Why is that?

    2. We want to allocate shared memory with the TI ION allocator, so we cannot use shm_open("/mutex-mem"); but the issue is the same as with open("/dev/mem"), so we used open("/dev/mem") in the test code.

    3. So please help fix the test_mmap_mt_inherit_2.c issue.

    Thanks & Regards,

  • 1. I do not have a clear answer for this. Unless this is important, we should not spend time on answering this

    2 and 3. Please refer to my attached application and make appropriate changes to your code.

    - Subhajit

  • Subhajit,

    Unfortunately shm_open("/mutex-mem") does not meet our needs, because we want to use shared memory between all the cores (A72, C66, C71).

    Thanks & Regards,

    lei

  • I did not mention anything about shmem in particular. You can use mmap'ed memory just like I used shmem.

    I was suggesting that you understand the logic in my application and modify your source code accordingly.

    Unfortunately I do not have a setup to test devmem-based direct memory sharing, and therefore I can only provide reference code that adheres to Linux/POSIX standard APIs.

    - Subhajit

  • Subhajit,

    1727.main.c uses the shm_open("/mutex-mem") function, so the physical address of the allocated shared memory is not in the 0xC0000000 (multi-core shared physical address) area.

    test_mmap_mt_inherit.c uses the open("/dev/mem") function precisely to ensure that the physical address of the allocated shared memory is in the 0xC0000000 (multi-core shared physical address) area.

    So it is not an issue with the shm_open("/mutex-mem") or open("/dev/mem") function; it is an issue with using pthread_mutexattr_setpshared(&attr,PTHREAD_PROCESS_SHARED) in the 0xC0000000 (multi-core shared physical address) area.

    Thanks & Regards,

    lei

  • Hi, experts

     Any update?

    Thanks & Regards,

    lei

  • Lei,

    Let's reset the discussion a bit.

    Can you state once more what you want to do?

    Do you want to implement:

    1: A multi-process mutex, i.e. mutual exclusion between Linux user space processes

    2: A multi-CPU mutex, i.e. mutual exclusion between Linux user space processes on the A72 and BIOS tasks on the C6x, C7x, R5F, and so on.

    I will elaborate below how to do 2. As part of 2, 1 also gets implemented.

    On Linux A72,

    1. Take a Posix semaphore. This will do mutual exclusion between Linux user space processes

    2. Take a hardware spinlock. This will do mutual exclusion between CPUs on the SoC

    On BIOS side (C6x, C7x, R5F)

    1. Take a BIOS binary semaphore. This will do mutual exclusion between tasks on the same CPU.

    2. Take a hardware spinlock. This will do mutual exclusion between CPUs on the SoC. All CPUs MUST take the same spinlock.

    Note, it is important to first do "CPU local" mutual exclusion using posix semaphore or BIOS semaphore.

    And then do mutual exclusion across CPU.

    When taking the spinlock, since the other CPU is "spinning", it is important not to hold the spinlock for a long duration.

    Use it just for the critical-section update and release it ASAP.

    Sample code is shown below,

    Linux side

    Posix semaphore init (one time per process)

    #include <semaphore.h>

    sem_t *g_semaphore;

            /* create a named semaphore that is used by all TIOVX processes
             * to serialize access to critical resources shared between processes
             * example, obj desc shared memory
             * mode/permissions = 00700 octal = 0x01C0
             */
            g_semaphore = sem_open("/mysem", (O_CREAT), (0x01C0), 1);

    HW spinlock init (one time per process)

    appIpcHwLockInit()

    See file vision_apps/utils/ipc/src/app_ipc_linux_hw_spinlock.c for appIpcHwLockInit, appIpcHwLockAcquire, appIpcHwLockRelease

    See file vision_apps/utils/ipc/src/app_ipc_linux.c for appMemMap

    Take a lock to enter critical section

    sem_wait(g_semaphore);

    appIpcHwLockAcquire(255, APP_IPC_WAIT_FOREVER); /* 255 is the spinlock instance; there are 256 spinlocks from 0..255 */

    Release a lock to leave critical section

    appIpcHwLockRelease(255);

    sem_post(g_semaphore);

    BIOS side

    Init BIOS binary semaphore and HW spinlock (one time per CPU)

    #include <ti/osal/SemaphoreP.h>

        SemaphoreP_Handle handle;
        SemaphoreP_Params semParams;

            /* Default parameter initialization */
            SemaphoreP_Params_init(&semParams);

            semParams.mode = SemaphoreP_Mode_BINARY;

            handle = SemaphoreP_create(1U, &semParams);

            if (NULL == handle)
            {
                status = (vx_status)VX_FAILURE;
            }

    See file vision_apps/utils/ipc/src/app_ipc_sysbios.c for appIpcHwLockAcquire, appIpcHwLockRelease

    There is no appIpcHwLockInit on BIOS side.

    Take a lock

            retVal = SemaphoreP_pend((SemaphoreP_Handle)handle,
                SemaphoreP_WAIT_FOREVER);

            if (SemaphoreP_OK != retVal)
            {
                status = (vx_status)VX_FAILURE;
            }

            appIpcHwLockAcquire(255, APP_IPC_WAIT_FOREVER); /* 255 is the spinlock instance; there are 256 spinlocks from 0..255. Make sure to take the same HW spinlock as the Linux side */

    Release a lock

    appIpcHwLockRelease(255);

    SemaphoreP_post((SemaphoreP_Handle)handle);

    Hope this helps

    regards
    Kedar