
[TDA4] Shared memory between A72 <-> c66, A72 <-> r5f, A72 <-> c71

Other Parts Discussed in Thread: SYSBIOS

Hi,

I want to use shared memory between the A72, C66, and R5F cores.

 (1) How do I change the shared memory size? I need 1 GB.

 (2) Can you point me to a demo of shared memory?

 (3) Is the shared memory cached or not?

 (4) I ran a test between the A72 and C71:

    On the C71 SYSBIOS side:

diff --git a/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1/main.c b/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demo
index 594c1c3..edc6690 100755
--- a/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1/main.c
+++ b/rtos_automotive_06_01_00_05/dji/psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1/main.c
@@ -73,6 +73,19 @@
#include <app_ipc_rsctable.h>
#include <ti/csl/soc.h>
#include <ti/csl/csl_clec.h>
+#include <ti/sysbios/hal/Cache.h>
+
+static Void shared_memory_test(Void)
+{
+ volatile uint32_t* testPtr = (uint32_t*)0xBC000000;
+ uint32_t counter = 0U;
+ while(1)
+ {
+ *testPtr = counter++;
+ Cache_wb((Ptr)testPtr, (SizeT)4, (Bits16)Cache_Type_ALL, (Bool)TRUE);
+ appLogWaitMsecs(1000u); // 1 sec
+ }
+}

static Void appMain(UArg arg0, UArg arg1)
{
@@ -82,6 +95,7 @@ static Void appMain(UArg arg0, UArg arg1)
while(1)
{
appLogWaitMsecs(100u);
+ shared_memory_test();
}
#else
appDeInit();

 On the A72 Linux side:

root@j7-evm:/opt/vision_apps# devmem2 0xbc000000
/dev/mem opened.
Error at line 90, file devmem2.c (1) [Operation not permitted]

Why does devmem2 fail to read the data?

Thanks & Regards,

Lei


  • Hi,

      Adding a question:

       (5) Is the shared memory on the R5F core cached or not?

      Please update.

    Thanks & Regards,

    Lei

  • <1>

    See this page for information about memory map and how to change it

    ${PSDKRA_INSTALL_PATH}/psdk_rtos_auto/docs/user_guide/developer_notes_memory_map.html

    <2> <4>

    A demo of shared memory including cache ops can be found here

    vision_apps/apps/basic_demos/app_linux_arm_ipc

    See also vision_apps/utils/ipc

    <3> <5>

    ION Shared memory, typically used for exchanging large pixel buffers, is cached.

    Some other IPC memory used for passing smaller messages is non-cached.

    See above developer note for more details.

    NOTE: My recommendation is not to change the memory map unless you really need to. Most use-cases can run with the default memory map; I recommend you try that first and change it only later if needed. Also note that the R5F and C6x are 32-bit cores, so if memory spills beyond the 32-bit address space, things may not work.

    regards
    Kedar

  • Hi Kedar,

     1. ION shared memory is virtual memory.

         What should I do if I want the physical address of the shared memory?

     2. gen_linker_mem_map.py auto-generates the file k3-j721e-vision_apps.dts.

         How do I apply this file?

      Do I need to modify the files manually, following these instructions?

    /*
    * IMPORTANT NOTE: Follow below instructions to apply the updated memory map to linux dts/dtso/dtsi files,
    *
    * 1. Copy the memory sections, from the generated dts file, to the file shown below under reserved_memory: reserved-memory { ... }
    * ${LINUX_KERNEL_PATH}/arch/arm64/boot/dts/ti/k3-j721e-som-p0.dtsi
    *
    * 2. In file ${LINUX_KERNEL_PATH}/arch/arm64/boot/dts/ti/k3-j721e-auto-common.dtso
    * - Remove the fragment@xyz { ... } entries for xyz = 101 to 119
    *
    * 3. In file ${LINUX_KERNEL_PATH}/arch/arm64/boot/dts/ti/k3-j721e-vision-apps.dtso
    * - Remove the &reserved_memory { ... } entry
    *
    * 4. The entries are removed since the same are updated in k3-j721e-som-p0.dtsi, hence we should not have duplicate outdated ones
    *
    * 5. Rebuild the dtb, dtbo and use the updated dtb, dtbo files
    * - In PSDKLA install directory, doing below should build the dtb and dtbo
    * make linux-dtbs
    * - Copy the below updated dtb, dtbo files to "boot/" folder in your target linux SD card filesystem
    * arch/arm64/boot/dts/ti/k3-j721e-common-proc-board.dtb
    * arch/arm64/boot/dts/ti/k3-j721e-auto-common.dtbo
    * arch/arm64/boot/dts/ti/k3-j721e-vision-apps.dtbo
    */

     3. vision_apps/apps/basic_demos/app_tirtos/tirtos_linux shares memory between the c66x_1, c66x_2, c7x_1, mcu2_0, mcu2_1, and mpu1 cores.

      How do I add the mcu1_0, mcu1_1, mcu3_0, and mcu3_1 cores?

    Thanks & Regards,

    Lei

  • Hi Kedar,

    4. If the DDR memory is changed from 4 GB to 8 GB, how do I port for it?

    5. Can you tell me the API for synchronizing shared memory operations?

    Please update, 

    Thanks & Regards,

    Lei

  • Hi Kedar,

    6. How do I set shared memory as memreserve?

      We don't want to use ION memory on the A72 side, but if the shared memory is set as no-map reserved-memory,

     we get a bus error on memset.

    Please update, 

    Thanks & Regards,

    Lei

  • 1.

    See vision_apps/utils/mem/include/app_mem.h

    appMemGetVirt2PhyBufPtr()

    See vision_apps/utils/mem/src/app_mem_ion.c for implementation of the same.

    2.

    Applying the dtsi file has to be done manually.

    3.

    One can add mcu1_0, mcu1_1, mcu3_0, and mcu3_1 similar to the mcu2_1 core. Unfortunately we don't have a ready example for the same. We plan to have an example showing this in future releases.

    4.

    Firstly we need to check whether 8 GB is supported on the EVM. I think 4 GB is the DDR size on the TI EVM.

    Assuming 8 GB is available on a board, one needs to change the DDR init sequence in SPL/U-Boot to support it.

    5. "API of synchronizing shared memory" - I am not sure what you mean by this. If you mean cache operations, then again see app_mem.h for the cache operation APIs.

    6. To reserve memory you need to set it in the dts/dtsi file. See the ION memory segment for an example.

    However, if you try to simply mmap this via /dev/mem, there are a few things to note.

    a. Any memory mapped to user space via /dev/mem is mapped as strongly ordered, non-cached memory. If you want to use this for sharing large pixel data between, say, the A72 and some other CPU/HW, this will be inefficient from the A72 point of view since the data is not cached.

    b. Memory mapped via /dev/mem MUST be accessed in an aligned manner, e.g. a 32-bit access MUST be 32-bit aligned and a 64-bit access MUST be 64-bit aligned, else one gets a bus error. Optimized functions like memcpy can issue unaligned accesses, and hence you will see a bus error.

    It is strongly recommended to use a memory allocator like ION, or to develop a similar kernel module, for shared memory access.

    Neither of the above constraints applies when using the ION memory allocator.

    regards
    Kedar

    ps, sorry for delay in responding (it was due to year end holidays)

  • Hi Kedar,

      Thanks for your help.

      The following code leads to a system hang. Why?

    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int lock_mem(void *x){
        /* load-acquire exclusive on the mapped address */
        asm("ldaxr w2,[x0]");
        return 0;
    }

    int main(int argc, char **argv){
        long long addr = 0xbc000000;
        long long size = 1*1024*1024;
        int fd = open("/dev/mem", O_RDWR|O_SYNC);
        if(fd < 0){
            perror("open");
            return 0;
        }
        char *ret = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_LOCKED, fd, addr);
        if((intptr_t)ret < 0){
            perror("mmap");
            return 0;
        }
        lock_mem(ret);
        return 0;
    }

    Thanks & Regards,

    Lei

  • Hi Kedar,

    system hang (A72) log:

    ERROR: Unhandled External Abort received on 0x80000001 at EL3!
    ERROR: exception reason=0 syndrome=0xbf000002
    PANIC in EL3 at x30 = 0x000000007000442c
    x0 = 0x0000000000000000
    x1 = 0x0000000000000060
    x2 = 0x0000000000000060
    x3 = 0x000000000000000b
    x4 = 0x0000000000000062
    x5 = 0x0000000000000008
    x6 = 0x000000000000003b
    x7 = 0x0000000000000000
    x8 = 0x0000000000000062
    x9 = 0x0000000000000000
    x10 = 0x0101010101010101
    x11 = 0x0000000000000028
    x12 = 0x00000000000000d1
    x13 = 0x0000ffffdd52d92d
    x14 = 0x0000ffffb03e0f60
    x15 = 0x00000000000000d5
    x16 = 0x0000ffffb03e9ad8
    x17 = 0x0000ffffb0443308
    x18 = 0x0000ffffdd52d9d0
    x19 = 0x0000000000000000
    x20 = 0x00000000bf000002
    x21 = 0x0000600000000048
    x22 = 0x0000000000000000
    x23 = 0x0000ffffb0443350
    x24 = 0x0000000000000000
    x25 = 0x0000000000000000
    x26 = 0x0000000000000000
    x27 = 0x0000000000000000
    x28 = 0x0000000000000000
    x29 = 0x000000007000a160
    scr_el3 = 0x000000000000073d
    sctlr_el3 = 0x0000000030cd183f
    cptr_el3 = 0x0000000000000000
    tcr_el3 = 0x0000000080803520
    daif = 0x00000000000002c0
    mair_el3 = 0x00000000004404ff
    spsr_el3 = 0x0000000080000000
    elr_el3 = 0x0000ffffb03e96f4
    ttbr0_el3 = 0x000000007000e420
    esr_el3 = 0x00000000bf000002
    far_el3 = 0x0000000000000000
    spsr_el1 = 0x0000000060000000
    elr_el1 = 0x0000ffffb03e9450
    spsr_abt = 0x0000000000000000
    spsr_und = 0x0000000000000000
    spsr_irq = 0x0000000000000000
    spsr_fiq = 0x0000000000000000
    sctlr_el1 = 0x0000000034d5d91d
    actlr_el1 = 0x0000000000000000
    cpacr_el1 = 0x0000000000300000
    csselr_el1 = 0x0000000000000000
    sp_el1 = 0xffff000011130000
    esr_el1 = 0x0000000056000000
    ttbr0_el1 = 0x00000008c16d6c00
    ttbr1_el1 = 0x108c000080e20000
    mair_el1 = 0x0000bbff440c0400
    amair_el1 = 0x0000000000000000
    tcr_el1 = 0x00000034f5507510
    tpidr_el1 = 0x0000800876e60000
    tpidr_el0 = 0x0000ffffb0509430
    tpidrro_el0 = 0x0000000000000000
    dacr32_el2 = 0x0000000000000000
    ifsr32_el2 = 0x0000000000000000
    par_el1 = 0x0000000000000000
    mpidr_el1 = 0x0000000080000001
    afsr0_el1 = 0x0000000000000000
    afsr1_el1 = 0x0000000000000000
    contextidr_el1 = 0x0000000000000000
    vbar_el1 = 0xffff000008081800
    cntp_ctl_el0 = 0x0000000000000005
    cntp_cval_el0 = 0x0000009bc2b0160c
    cntv_ctl_el0 = 0x0000000000000000
    cntv_cval_el0 = 0x0000000000000000
    cntkctl_el1 = 0x00000000000000e6
    sp_el0 = 0x000000007000a160
    isr_el1 = 0x0000000000000040
    cpuectlr_el1 = 0x0000001b00000040
    cpumerrsr_el1 = 0x0000000000000000
    l2merrsr_el1 = 0x0000000000000000

    Thanks & Regards,

    Lei

  • Lei,

    Why are you trying to explicitly use the "ldaxr" instruction? Why are you not using synchronization tools like a mutex?

    - Subhajit

  • Subhajit,

     Because we need locks shared by multiple processes in Linux.

    Thanks & Regards,

    Lei

  • If that is the case, you can put your mutex in a shared memory that is accessible by multiple processes. You can make the mutex shared among processes by calling this:

    int pthread_mutexattr_setpshared(pthread_mutexattr_t *attr,
        int pshared);

  • You can use a POSIX semaphore as a multi-process mutex.

  • Subhajit,

    There will be the following problems with multi-core shared memory, while other memory ranges do not show them:
      1. A multi-thread shared lock cannot guarantee atomicity of the critical section.
      2. The lock priority-inheritance feature cannot be turned on, otherwise unlocking fails:

      pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);


       log:
      Expected:
      value of atomic add var is 40000000
      Abnormal:
      value of atomic add var is 37021808

    Related source code below:

    Thanks & Regards,

    Lei

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <pthread.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    
    
    pthread_mutex_t *lock;
    
    void* thread_fun(void*arg){
        volatile int *p = arg;
        int i = 0;
        for(i = 0; i < 10000000;i++){
            int ret = pthread_mutex_lock(lock);
            if(ret != 0){
                printf("lock ret = %d,errno = %d\n",ret,errno);
            }
            (*p)++;
           // usleep(1);
            ret = pthread_mutex_unlock(lock);
            if(ret != 0){
                printf("unlock ret = %d,errno = %d\n",ret,errno);
            }
        }
        return NULL;
    }
    
    int main(int argc,char**argv){
        long long addr = 0xC0000000;
        long long size = 0x100000;
        int fd = open("/dev/mem",O_RDWR);//|O_SYNC);
        if(fd < 0){
            perror("open");
            return 0;
        }
        char *ret = mmap(NULL,size,PROT_READ|PROT_WRITE,MAP_SHARED|MAP_LOCKED,fd,addr);
        if((intptr_t)ret < 0){
            perror("mmap");
            return 0;
        }
        *(volatile int*)ret = 0;
        lock = (void*)((char*)ret+4);
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        int pret = pthread_mutexattr_setpshared(&attr,PTHREAD_PROCESS_SHARED);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        //pret = pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        pret = pthread_mutex_init(lock,&attr);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
    
        int fork_ret = fork();
        pthread_t tid[2];
        int pthread_ret = pthread_create(&tid[0],NULL,&thread_fun,ret);
        pthread_ret = pthread_create(&tid[1],NULL,&thread_fun,ret);
        pthread_join(tid[0],NULL);
        pthread_join(tid[1],NULL);
        if(fork_ret){
            printf("value of atomic add var is %d\n",*(volatile int*)ret);
        }
        return 0;
    }
    
    /* On multi-core shared memory the following problems occur, while other memory ranges do not show them:
     * 1. A multi-thread shared lock cannot guarantee atomicity of the critical section
     * 2. The lock priority-inheritance feature cannot be enabled, otherwise unlocking fails
     * log:
     * Expected:
     *    value of atomic add var is 40000000
     * Abnormal:
     *    value of atomic add var is 37021808
     * */
    

  • I can see that you have made two pthread_join calls to make sure the threads exit. But I do not see a waitpid call to ensure that the forked process has returned.

    - Subhajit

  • Subhajit,

    Thank you for your help.

      If this attribute is enabled:

      pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);

     then a multi-thread shared lock cannot guarantee atomicity of the critical section.

       log:
      Expected:
      value of atomic add var is 40000000
      Abnormal:
      value of atomic add var is 36976716

    Related source code below:

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <pthread.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    
    
    pthread_mutex_t *lock;
    
    void* thread_fun(void*arg){
        volatile int *p = arg;
        int i = 0;
        for(i = 0; i < 10000000;i++){
            int ret = pthread_mutex_lock(lock);
            if(ret != 0){
                printf("lock ret = %d,errno = %d\n",ret,errno);
            }
            (*p)++;
           // usleep(1);
            ret = pthread_mutex_unlock(lock);
            if(ret != 0){
        //        printf("unlock ret = %d,errno = %d\n",ret,errno);
            }
        }
        return NULL;
    }
    
    int main(int argc,char**argv){
        long long addr = 0xC0000000;
        long long size = 0x100000;
        int fd = open("/dev/mem",O_RDWR);//|O_SYNC);
        if(fd < 0){
            perror("open");
            return 0;
        }
        char *ret = mmap(NULL,size,PROT_READ|PROT_WRITE,MAP_SHARED|MAP_LOCKED,fd,addr);
        if((intptr_t)ret < 0){
            perror("mmap");
            return 0;
        }
        *(volatile int*)ret = 0;
        lock = (void*)((char*)ret+4);
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        int pret = pthread_mutexattr_setpshared(&attr,PTHREAD_PROCESS_SHARED);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        pret = pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
        pret = pthread_mutex_init(lock,&attr);
        if(pret != 0){
            printf("pret is %d@%d\n",pret,__LINE__);
            return 0;
        }
    
        int fork_ret = fork();
        pthread_t tid[2];
        int pthread_ret = pthread_create(&tid[0],NULL,&thread_fun,ret);
        pthread_ret = pthread_create(&tid[1],NULL,&thread_fun,ret);
        pthread_join(tid[0],NULL);
        pthread_join(tid[1],NULL);
        if (fork_ret) {
    		int s;
    		waitpid(-1, &s, 0);
            printf("value of atomic add var is %d\n",*(volatile int*)ret);
        }
        return 0;
    }
    
    /* On multi-core shared memory the following problems occur, while other memory ranges do not show them:
     * 1. A multi-thread shared lock cannot guarantee atomicity of the critical section
     * 2. The lock priority-inheritance feature cannot be enabled, otherwise unlocking fails
     * log:
     * Expected:
     *    value of atomic add var is 40000000
     * Abnormal:
     *    value of atomic add var is 36976716
     * */
    

    Thanks & Regards,

  • Lei,

    Somehow I do not agree with you.

    Attached is source code that uses a shared mutex (over shared memory) and successfully prints the desired value.

    Compile as <gcc -o app main.c -lpthread -lrt>

    Run on two terminals

    Terminal 1:

    $ ./app 

    Creating ... success.
    mapped as 0x7f9d2a54f000
    Deleting.
    v = 80000000
    Deleting.

    Terminal 2:

    $ ./app 
    Creating ... failed (File exists). Opening ... success.
    mapped as 0x7fc6b720c000
    Deleting.
    Deleting.

    #include <stdio.h>
    #include <stdbool.h>
    #include <stdatomic.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <errno.h>
    #include <signal.h>
    #include <pthread.h>
    
    struct shared_mutex {
    	atomic_int latch;
    	pthread_mutex_t mutex;
    	int test;
    };
    
    
    void *add(void *data)
    {
    	int i = 0;
    	struct shared_mutex *smutex = data;
    
    	/*
    	 * Remember: PTHREAD_PRIO_INHERIT is very, very slow on my machine
    	 *
    	 * Loop till 10000000 took ~90 seconds
    	 *
     * Without PTHREAD_PRIO_INHERIT, it took less than 5 secs
    	 *
    	 * In both the cases atomicity was ensured, though!
    	 */
    	while(i++ < 10000000) {
    		pthread_mutex_lock(&smutex->mutex);
    		smutex->test++;
    		pthread_mutex_unlock(&smutex->mutex);
    	}
    	return NULL;
    }
    
    int main()
    {
    	int fd, ret;
    	void *vaddr;
    	struct shared_mutex *smutex = NULL;
    
    	/* try to create a "named" shared memory */
    	printf("Creating ... ");
    	fd = shm_open("/mutex-mem", O_RDWR | O_CREAT | O_EXCL, 0664);
    	if(fd < 0) {
    		/* create failed because I am not the first command line app
    		 * try to just open */
    		printf("failed (%s). Opening ... ", strerror(errno));
    		fd = shm_open("/mutex-mem", O_RDWR, 0664);
    		if(fd < 0) {
    			printf("failed (%s). Bailing out\n", strerror(errno));
    			exit(0);
    		} else {
    			printf("success.\n");
    
    			/* I am the user, mmap the shmem ... */
    			vaddr = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    			if(vaddr == MAP_FAILED)
    				printf("mmap failed (%s)\n", strerror(errno));
    			else
    				printf("mapped as %p\n", vaddr);
    
    			smutex = vaddr;
    			/* ... wait for latch to go to 1 ... */
    			while(atomic_load(&smutex->latch) != 1);
    			/* .. and set it back to zero */
    			atomic_store(&smutex->latch, 0);
    		}
    	} else {
    		printf("success.\n");
    
    		/* I am the creator, so resize shmem to 4K and mmap ... */
    		ftruncate(fd, getpagesize());
    
    		vaddr = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    		if(vaddr == MAP_FAILED)
    			printf("mmap failed (%s)\n", strerror(errno));
    		else
    			printf("mapped as %p\n", vaddr);
    
    		smutex = vaddr;
    
    		/* ... create the mutex ... */
    		pthread_mutexattr_t attr;
    		pthread_mutexattr_init(&attr);
    		pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    
    		/* Note: The line below will kill your performance */
    		pthread_mutexattr_setprotocol(&attr,PTHREAD_PRIO_INHERIT);
    
    		pthread_mutex_init(&smutex->mutex, &attr);
    
    		/* ... set latch to 1 ... */
    		atomic_init(&smutex->latch, 1);
    		/* ... and then wait for it to go to 0 */
    		while(atomic_load(&smutex->latch) != 0);
    
    		/*
    		 * Note:
    		 * This latch mechanism is required so that only one command
    		 * line app creates the shmem, resizes it, and inits the mutex
    		 */
    	}
    
    	/*
    	 * We will run 8 instances of the loop
    	 * 2 command line apps
    	 * 2 forks per command line app
    	 * 2 threads per forked process
    	 *
    	 * 2 threads x 2 forks x 2 apps = 8 loops
    	 */
    
    	/* Run a forked process */
    	pid_t pid = fork();
    	pthread_t th;
    
    	/* Run a thread ... */
    	pthread_create(&th, NULL, add, smutex);
    	/* ... And run in main thread */
    	add(smutex);
    	/* Wait for thread to finish */
    	pthread_join(th, NULL);
    
    	/* If parent, wait for forked process to finish */
    	if(pid) {
    		wait(NULL);
    
    		/* At this point, latch is 0 (set previously)
    		 * The command-line app that reaches here first
    		 * sets it to 1 (if it is set to 0)
    		 *
    		 * So evidently the other guy would see it set to
    		 * 1 when it reaches here.
    		 *
    		 * The loser of the race prints the final value.
    		 */
    		int zero = 0;
    		if(!atomic_compare_exchange_strong(&smutex->latch, &zero, 1))
    			printf("v = %d\n", smutex->test);
    	}
    
    	/* Don't worry, it won't actually delete unless all users "close" it */
    	printf("Deleting.");
    	shm_unlink("/mutex-mem");
    
    }
    

    I would like to point out that the performance is MUCH MUCH faster if  PTHREAD_PRIO_INHERIT is not used. Attaching screenshots

  • Subhajit,

    Thank you for your help.

    1. You said that PTHREAD_PRIO_INHERIT is very, very slow on your machine.

      Why is that?

    2. We want to allocate shared memory with the TI ION allocator, so we cannot use shm_open("/mutex-mem"); but the issue is the same as with open("/dev/mem"), so we used open("/dev/mem") in the test code.

    3. So please help fix the test_mmap_mt_inherit_2.c issue.

    Thanks & Regards,

  • 1. I do not have a clear answer for this. Unless this is important, we should not spend time on answering this

    2 and 3. Please refer to my attached application and make appropriate changes to your code.

    - Subhajit

  • Subhajit,

    Unfortunately shm_open("/mutex-mem") does not meet our needs, because we want to use shared memory between all the cores (A72, C66, C71).

    Thanks & Regards,

    lei

  • I did not mention anything about shmem in particular. You can use mmap'ed memory just like I used shmem.

    I was suggesting that you understand the logic in my application and modify your source code accordingly.

    Unfortunately I do not have a setup to test devmem-based direct memory sharing, and therefore I can only provide reference code that adheres to Linux/POSIX standard APIs.

    - Subhajit

  • Subhajit,

    1727.main.c uses the shm_open("/mutex-mem") function, so the physical address of the allocated shared memory is not in the 0xC0000000 (multi-core shared physical address) area.

    test_mmap_mt_inherit.c uses the open("/dev/mem") function precisely to ensure that the physical address of the allocated shared memory is in the 0xC0000000 (multi-core shared physical address) area.

    So it is not an issue with the shm_open("/mutex-mem") or open("/dev/mem") function; it is an issue with using pthread_mutexattr_setpshared(&attr,PTHREAD_PROCESS_SHARED) in the 0xC0000000 (multi-core shared physical address) area.

    Thanks & Regards,

    lei

  • Hi, experts

     Any update?

    Thanks & Regards,

    lei

  • Lei,

    Let's reset the discussion a bit.

    Can you state once more what you want to do?

    Do you want to implement:

    1: A multi-process mutex, i.e. mutual exclusion between Linux user space processes

    2: A multi-CPU mutex, i.e. mutual exclusion between Linux user space processes on the A72 and BIOS tasks on the C6x, C7x, R5F, and so on.

    I will elaborate below how to do 2. As part of 2, 1 also gets implemented.

    On Linux A72,

    1. Take a Posix semaphore. This will do mutual exclusion between Linux user space processes

    2. Take a hardware spinlock. This will do mutual exclusion between CPUs on the SoC

    On BIOS side (C6x, C7x, R5F)

    1. Take a BIOS binary semaphore. This will do mutual exclusion between tasks on the same CPU.

    2. Take a hardware spinlock. This will do mutual exclusion between CPUs on the SoC. All CPUs MUST take the same spinlock.

    Note, it is important to first do "CPU local" mutual exclusion using posix semaphore or BIOS semaphore.

    And then do mutual exclusion across CPU.

    When taking the spinlock, since the other CPU is "spinning", it is important not to hold the spinlock for a long duration.

    Use it just for the critical-section update and release it ASAP.

    Sample code is shown below,

    Linux side

    Posix semaphore init (one time per process)

    #include <semaphore.h>

    sem_t *g_semaphore;

            /* create a named semaphore that is used by all TIOVX processes
             * to serialize access to critical resources shared between processes
             * example, obj desc shared memory
             * mode/permissions = 00700 octal = 0x01C0
             */
            g_semaphore = sem_open("/mysem", (O_CREAT), (0x01C0), 1);

    HW spinlock init (one time per process)

    appIpcHwLockInit()

    See file vision_apps/utils/ipc/src/app_ipc_linux_hw_spinlock.c for appIpcHwLockInit, appIpcHwLockAcquire, appIpcHwLockRelease

    See file vision_apps/utils/ipc/src/app_ipc_linux.c for appMemMap

    Take a lock to enter critical section

    sem_wait(g_semaphore);

    appIpcHwLockAcquire(255, APP_IPC_WAIT_FOREVER); /* 255 is the spinlock instance; there are 256 spinlocks from 0..255 */

    Release a lock to leave critical section

    appIpcHwLockRelease(255);

    sem_post(g_semaphore);

    BIOS side

    Init BIOS binary semaphore and HW spinlock (one time per CPU)

    #include <ti/osal/SemaphoreP.h>

        SemaphoreP_Handle handle;
        SemaphoreP_Params semParams;

            /* Default parameter initialization */
            SemaphoreP_Params_init(&semParams);

            semParams.mode = SemaphoreP_Mode_BINARY;

            handle = SemaphoreP_create(1U, &semParams);

            if (NULL == handle)
            {
                status = (vx_status)VX_FAILURE;
            }

    See file vision_apps/utils/ipc/src/app_ipc_sysbios.c for appIpcHwLockAcquire, appIpcHwLockRelease

    There is no appIpcHwLockInit on BIOS side.

    Take a lock

            retVal = SemaphoreP_pend((SemaphoreP_Handle)handle,
                SemaphoreP_WAIT_FOREVER);

            if (SemaphoreP_OK != retVal)
            {
                status = (vx_status)VX_FAILURE;
            }

            appIpcHwLockAcquire(255, APP_IPC_WAIT_FOREVER); /* 255 is the spinlock instance; there are 256 spinlocks from 0..255. Make sure to take the same HW spinlock as the Linux side */

    Release a lock

    appIpcHwLockRelease(255);

    SemaphoreP_post((SemaphoreP_Handle)handle);

    Hope this helps

    regards
    Kedar