Spinlock for Baremetal Starterware ARM to PRU communication on BeagleBone Black?

My question is: How do I send data in a multiprocessor-safe way from the ARM Cortex-A8 to PRU0 via a FIFO queue?

(The data might just be a structure with two uint32_t's.)

The PRU cape demo doesn't seem to have any examples of communicating between the ARM and the PRU (other than loading an image :) ).

I am using CCSv6 and a BeagleBone Black with AM335X StarterWare and a TI PRU cape with the PRU software support package.

I can recompile and run a Debug version of the PRU cape demo. Using the TI XDS100v2 USB debug probe I can connect to both the Cortex-A8 and PRU0 and single-step both. All good.

To continue, I have set up an example based on the PRU cape demo "PRU_LED0", in which the ARM C program asks the user to enter a period (in PRU clock ticks). The period is sent to the PRU via a shared variable "shared" defined in the link.cmd file (see below). This works, and I can change the PRU cape LED flashing period interactively.

To protect "shared" against simultaneous access by the ARM and the PRU, I thought I could use a simple spinlock mechanism built on the Sitara processor's spinlock registers, which provide atomic accesses.

However, when I try to access the spinlock registers (e.g. by using HWREG(0x480CA010) = 0x02;) I get an UndefInstHandler() error.

Any suggestions for ways to accomplish this basic communication task between the ARM and a PRU or to fix my spinlock problems with my approach would be very much appreciated.

Are there any relevant examples available?

Bryan

Some More Detail

The method I have used for setting up the shared buffer is as follows (any comments appreciated):

In the ARM C program, declare:

extern uint32_t shared[10];

In the ARM link cmd file:
MEMORY
{
        MYSHARED_MEM        : org = 0x80000000  len = 0x30
        DDR_MEM        : org = 0x80000030  len = 0x8000000-0x30           /* RAM 128MB*/
}

/* SPECIFY THE SECTIONS ALLOCATION INTO MEMORY */

SECTIONS
{
    .text:Entry : load > 0x80000030

    .text    : load > DDR_MEM              /* CODE                          */
    .data    : load > DDR_MEM              /* INITIALIZED GLOBAL AND STATIC VARIABLES */
    .bss     : load > DDR_MEM              /* UNINITIALIZED OR ZERO INITIALIZED */
                                           /* GLOBAL & STATIC VARIABLES */
                    RUN_START(bss_start)
                    RUN_END(bss_end)
    .const   : load > DDR_MEM              /* GLOBAL CONSTANTS              */
    .cinit     : load > DDR_MEM
    .stack   : load > DDR_MEM HIGH //0x87FFF000           /* SOFTWARE SYSTEM STACK         */
    LED0_text: {PRU_LED0_image.obj(.text)} load > DDR_MEM run_start(LED0_INST)
    LED0_data: {PRU_LED0_image.obj(.data)} load > DDR_MEM run_start(LED0_DATA)
    MyShared: load > MYSHARED_MEM RUN_START(shared) RUN_SIZE(shared_size)
}

In the PRU_LED0 C program, declare:

volatile uint32_t pru_shared[10] __attribute__((cregister("DDR")));

(The PRU_LED0 link.cmd file remains unchanged).

 

Now, I thought I could use a spinlock to protect the accesses:

while (readlock()==1);

shared[0] = 57;        // Or on the PRU pru_shared[0] = 37;

writelock(0);

  • Hi,

    Maybe this is late, but:

    Refer to the bsp_hwspinlock_init API in the sdk\protocols\ethercat_slave\ecat_appl\EcatStack\tiescbsp.c file from the Industrial SDK: downloads.ti.com/.../index_FDS.html

    *(volatile Uint32 *)(CM_PER_SPINLOCK_CLKCTRL_OFFSET) = 2; //Enable spinlock clock in CM_PER_SPINLOCK_CLKCTRL
    regval = *(volatile Uint32 *)(ICSS_CFG_BASE+ICSS_CFG_SYSCFG);
    *(volatile Uint32 *)(ICSS_CFG_BASE+ICSS_CFG_SYSCFG) = regval&~(0x10); //Enable OCP master ports for accessing spinlock from PRUs
  • Hi Pratheesh,

    Thanks for your response and sorry for my delay in answering. Your answer confirmed the direction I was taking.

    Using TI Starterware defines such as:

    #define HWREG(x)  (*((volatile unsigned int *)(x)))

    and the spinlock module's LOCK_REG_0 register at offset 0x800, the following code seemed to work:


        HWREG(SOC_PRCM_REGS + CM_PER_SPINLOCK_CLKCTRL) = 0x02;    // Enable the spinlock module.
       
        /* Waiting for field to reflect the written value. */
        while(0x02 != (HWREG(SOC_PRCM_REGS + CM_PER_SPINLOCK_CLKCTRL)));

        int reg = HWREG(0x480CA000 + 0);
        ConsoleUtilsPrintf("\nSpinlock revision = 0x%x\n", reg);
        reg = HWREG(0x480CA000 + 0x10);
        ConsoleUtilsPrintf("\nSpinlock config = 0x%x\n", reg);

        reg = HWREG(0x480CA000 + 0x14);
        ConsoleUtilsPrintf("\nSpinlock status = 0x%x\n", reg);

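        uint32_t lock;
        /* Reading LOCK_REG_0 is the acquire attempt: 0 means the lock was free and is now ours. */
        if ((lock = HWREG(0x480CA000 + 0x800)) == 0) {
            ConsoleUtilsPrintf("Lock taken\n");
        }
        /* A second read returns 1 because the lock is already held; writing 0 releases it. */
        if ((lock = HWREG(0x480CA000 + 0x800)) == 1) {
            ConsoleUtilsPrintf("Lock already taken\n");
            HWREG(0x480CA000 + 0x800) = 0;
        }

        // If the lock is taken here then the pru queue is stopped.
        // Wait until the lock is released by another process; the read that returns 0 also acquires it.
        while (HWREG(0x480CA000 + 0x800) == 1);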
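
    The read-to-acquire / write-0-to-release pattern used above could be wrapped in a few helpers. This is only a sketch: the function names are hypothetical, the base address and the 0x800 offset are taken from the code above, and the count of 32 lock registers is an assumption that should be checked against the TRM. It also assumes the spinlock clock has already been enabled as shown above. On a host, plain memory does not emulate the read side effect, so the helpers can only be exercised against a fake register there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Taken from the code above: module base and LOCK_REG_0 offset. */
#define SPINLOCK_BASE       0x480CA000u
#define SPINLOCK_LOCK_REG0  0x800u
/* Assumption: 32 lock registers; check the TRM for your device. */
#define SPINLOCK_COUNT      32u

/* Hypothetical helper: address of lock register n relative to a module base. */
static inline volatile uint32_t *spinlock_reg(uintptr_t base, uint32_t n)
{
    assert(n < SPINLOCK_COUNT);
    return (volatile uint32_t *)(base + SPINLOCK_LOCK_REG0 + 4u * n);
}

/* One read is one acquire attempt: 0 means the lock was free and is now
 * owned by the reader, 1 means somebody else already holds it. */
static inline bool spinlock_try_acquire(volatile uint32_t *reg)
{
    return (*reg & 1u) == 0u;
}

/* Spin until a read returns 0, i.e. until the lock has been granted. */
static inline void spinlock_acquire(volatile uint32_t *reg)
{
    while (!spinlock_try_acquire(reg)) {
        /* busy-wait */
    }
}

/* Writing 0 marks the lock as free again. */
static inline void spinlock_release(volatile uint32_t *reg)
{
    *reg = 0u;
}
```

    On the target this would be used as spinlock_acquire(spinlock_reg(SPINLOCK_BASE, 0)); followed by the access to the shared data and then spinlock_release(...) on the same register.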
      

    -------------------------------------------------------------------------------

     

    However, when I set up a FIFO between the ARM and the PRU using a spinlock, I sometimes got bad timing. Note that the FIFO was being used heavily, with new values queued as often as every 20 microseconds.

    Around the same time as my original post on this forum, I also posted on the BeagleBoard Google Groups forum.

    See https://groups.google.com/forum/#!searchin/beagleboard/bryanb/beagleboard/CU-FlZDoOt8/Hph3DTX-1VcJ

    I had a helpful response from Charles Steinkuehler, and I put the queue into PRU memory rather than ARM memory.

    It now seems that I do not need any synchronization when accessing the queue. I don't completely understand this yet, but I presume that the read and write operations are atomic because of the way the SoC bus works.

    Any more definite information on this would be most helpful. I did see something like this alluded to in a post somewhere, but I can't recall where.

    Thanks again,

    Bryan

     

     

     

  • Hi,

    Thanks for pointing to the thread; it is a really good discussion. Regarding the query below:

    "I had a helpful response from Charles Steinkuehler, and I put the queue into PRU memory rather than ARM memory.

    It now seems that I do not need any synchronization when accessing the queue. I don't completely understand this yet, but I presume that the read and write operations are atomic because of the way the SoC bus works."

    I can confirm that if the ARM does an aligned 4-byte read or write to ICSS data memory, all 4 bytes are read from or written to ICSS memory at the same time, so the access is atomic.
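
    One way to take advantage of this: if each queue index is written by only one side and every access is an aligned 32-bit word, a single-producer/single-consumer ring buffer can work without a lock. The following is a minimal sketch under those assumptions, not a tested ARM/PRU implementation. The names (item_t, spsc_queue_t, queue_put, queue_get) and the slot count are hypothetical, and the structure is assumed to be placed at a 4-byte-aligned address in PRU data memory. It compiles on a host with a C11 compiler; on real hardware a memory barrier between writing the payload and publishing the index may also be needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: slot count is a power of two; one slot is always left empty. */
#define QUEUE_SLOTS 16u
static_assert((QUEUE_SLOTS & (QUEUE_SLOTS - 1u)) == 0u,
              "QUEUE_SLOTS must be a power of two");

/* Payload as described in the question: two uint32_t values. */
typedef struct {
    uint32_t a;
    uint32_t b;
} item_t;

typedef struct {
    volatile uint32_t head;              /* written only by the producer (ARM) */
    volatile uint32_t tail;              /* written only by the consumer (PRU) */
    volatile item_t   slots[QUEUE_SLOTS];
} spsc_queue_t;

/* Producer side: returns false if the queue is full. */
static bool queue_put(spsc_queue_t *q, item_t item)
{
    uint32_t head = q->head;
    uint32_t next = (head + 1u) & (QUEUE_SLOTS - 1u);

    if (next == q->tail) {
        return false;                    /* full */
    }
    q->slots[head].a = item.a;           /* fill the slot first ... */
    q->slots[head].b = item.b;
    /* A barrier may be required here on real hardware (assumption). */
    q->head = next;                      /* ... then publish it with one word write */
    return true;
}

/* Consumer side: returns false if the queue is empty. */
static bool queue_get(spsc_queue_t *q, item_t *out)
{
    uint32_t tail = q->tail;

    if (tail == q->head) {
        return false;                    /* empty */
    }
    out->a = q->slots[tail].a;
    out->b = q->slots[tail].b;
    q->tail = (tail + 1u) & (QUEUE_SLOTS - 1u);  /* release the slot */
    return true;
}
```

    The design choice here is that head and tail each have exactly one writer, so neither side ever performs a read-modify-write on a word the other side also writes.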

  • Hi Pratheesh,
    Thanks for the info. Did you do experiments to confirm the atomic nature of the writes or could you point me to a reference where this is described?
    Regards,
    Bryan
  • Hello


    I may have found a solution. I ran the same test with HWREG() (and, as for you, it did not work), but it is possible to write to the shared memory of the PRU with the memcpy() function. You just need to cast the pointers to unsigned char*:

    memcpy((unsigned char*)(SOC_PRUICSS1_REGS + SOC_PRUICSS_SHARED_RAM_OFFSET),(unsigned char*)test,10);

    The macros SOC_PRUICSS1_REGS and SOC_PRUICSS_SHARED_RAM_OFFSET are defined in the hw_pruss.h file, which is used to load the PRU application.
    test is an array filled with zeros for my test:

    memcpy((unsigned char*)(SOC_PRUICSS1_REGS + SOC_PRUICSS_SHARED_RAM_OFFSET),(unsigned char*)test,10);


    for(i=0;i<1088;i++)
    {
        test[0][0][i]=(char)(i);
    }

    memcpy((unsigned char*)(SOC_PRUICSS1_REGS + SOC_PRUICSS_SHARED_RAM_OFFSET),(unsigned char*)test,10);

    42E5F23B    D55445AB    2810E849    4778DE8E    D11B9485    97F91FA5    EE869E6F    2F6C06EA    Value of the memory before the first memcpy
    00000000    00000000    28100000    4778DE8E    D11B9485    97F91FA5    EE869E6F    2F6C06EA    Value of the memory after the first memcpy
    03020100    07060504    28100908    4778DE8E    D11B9485    97F91FA5    EE869E6F    2F6C06EA    Value of the memory after the second memcpy

    Before starting to write to the memory, you need to call this function:

     //Sets and Enables clock, Zeros memory, resets PRU
        PRUICSSInit();
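
    If the atomicity of aligned 4-byte accesses discussed earlier in this thread matters for your data, a word-wise copy may be a safer choice than memcpy(), since the access width memcpy() uses is up to the library. A minimal sketch, assuming both pointers are 4-byte aligned; copy_words is a hypothetical helper name, not part of StarterWare:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n_words 32-bit words, one aligned word
 * store per element, into a volatile destination such as PRU shared RAM. */
static void copy_words(volatile uint32_t *dst, const uint32_t *src, size_t n_words)
{
    /* Both addresses must be 4-byte aligned for word accesses. */
    assert(((uintptr_t)dst & 3u) == 0u);
    assert(((uintptr_t)src & 3u) == 0u);

    for (size_t i = 0; i < n_words; i++) {
        dst[i] = src[i];
    }
}
```

    On the target, dst would be (volatile uint32_t *)(SOC_PRUICSS1_REGS + SOC_PRUICSS_SHARED_RAM_OFFSET), with PRUICSSInit() called beforehand as described above.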

    I hope this is helpful.