AM5728: DSP response time to PRU event

Part Number: AM5728


We started from

pdk_am57xx_1_0_12/packages/ti/drv/pruss/test/am572x/c66/bios

to build a joint DSP-PRU program, in which the PRU reads its GPI inputs and passes the data,
after preprocessing if need be, to the DSP. The program works as expected when the frequency of
PRU0_ARM_EVENT is rather low (e.g. __delay_cycles(1000) in the PRU source code) but no
longer works at a higher frequency of PRU0_ARM_EVENT (e.g. __delay_cycles(100) in the PRU
source code). This was observed with the release build. We wonder whether the minimum event period
can be decreased, e.g. by changing the BIOS configuration.

The C code for the PRU, adapted from

pdk_am57xx_1_0_12/packages/ti/drv/pruss/test/src/pru_firmware/pruss_test_pru0.txt

is

#include <stdint.h>
#include <pru_cfg.h>
#include <pru_intc.h>

#define AM33XX
#ifdef AM33XX
#define PRU0_ARM_EVENT       (17)
#else
#define PRU0_ARM_EVENT       (33)
#endif


volatile register uint32_t __R31;

#pragma DATA_SECTION(gpi, ".gpiData")
volatile uint32_t gpi;

/**
 * main.c
 */
int main(void)
{

    while (1) {
        /* Read input */
        gpi = __R31;

        /* Send notification to host */
#ifdef AM33XX
            __R31 = PRU0_ARM_EVENT+16;
#else
            __R31 = PRU0_ARM_EVENT;
#endif

        /* Wait */
        __delay_cycles(1000);
    }
    
    /* Should never return */
    return 0;
}
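
For reference, the value written to __R31 above can be derived as follows. This is a minimal sketch, assuming the usual PRU-ICSS convention that bit 5 of an R31 write acts as the strobe and bits 3:0 select one of the PRU-generated system events, which start at index 16. The helper name r31_value_for_event and both macros are hypothetical and are not part of the PDK; please check the encoding against the TRM.

```c
#include <stdint.h>

/* Assumption: bit 5 of the value written to R31 is the "vector valid" strobe. */
#define R31_VEC_VALID   (1u << 5)
/* Assumption: PRU-generated system events start at index 16. */
#define PRU_EVENT_BASE  16u

/* Hypothetical helper: R31 value that raises system event sys_event (16..31). */
static inline uint32_t r31_value_for_event(uint32_t sys_event)
{
    /* Low four bits carry the event offset relative to PRU_EVENT_BASE. */
    return R31_VEC_VALID | ((sys_event - PRU_EVENT_BASE) & 0xFu);
}
```

With sys_event = 17 the helper returns 33, which is the value that both branches of the #ifdef in the firmware end up writing (17 + 16 in one case, 33 directly in the other).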

The resulting content of

pdk_am57xx_1_0_12/packages/ti/drv/pruss/test/src/pru_firmware/pruss_test_pru0_bin.h

is

const unsigned int XYZ[] =  {
0x240000c0,
0x24010080,
0x0504e0e2,
0x2eff818e,
0x230007c3,
0x240001ee,
0x230010c3,
0x240100e1,
0xe100219f,
0x240021ff,
0x240000c0,
0x2401f380,
0x0501e0e0,
0x6f00e0ff,
0x10000000,
0x21000700,
0x230012c3,
0x21001100,
0x10000000,
0x20c30000};

  • Hi Gilbert, to understand a little bit better, please help me with some additional information.

    - What behavior, or what issue, do you see when you reduce __delay_cycles(1000)?

    - How do you inform the DSP that the GPI data is ready? I see PRU0_ARM_EVENT, but I am not sure whether you notify the ARM first and then the DSP.

    - Is anything else running on your system? Any additional code running on the DSP and/or ARM?

    Just trying to understand better the flow and your test.

    thank you,

    Paula

  • Hi Paula,

    Thank you very much for taking time to answer my question.

    Please find hereafter the answers to your questions and some more information:

    The DSP program is based on
    /opt/PHYTEC_BSPs/rtos_ti/install/pdk_am57xx_1_0_12/packages/ti/drv/pruss/test/src/main.c

    However, it has only one task, since we are only interested in PRU2. The task basically performs the following endless loop,
    which displays the increase in the PRU event count between two loop iterations:

        while(1)
        {
            SLEEP(400);

            PRINT("Num. PRU events : %d\n", pruEventCount2-lastPruEventCount2);
            lastPruEventCount2 = pruEventCount2;
        }

    In the interrupt routine, the PRU event is acknowledged, 4 bytes are copied from the PRU, and the counter of PRU events
    is incremented by 1:

    void pruss_isr2(void * ptr)
    {
        PRUICSS_pruClearEvent(handle2,PRU0_ARM_EVENT);
        PRUICSS_pruReadMemory(handle2,PRUICSS_PRU0_DATARAM,0,&gpi,4);
        pruEventCount2++;
    }

    When the number of __delay_cycles decreases (100 instead of 1000), the increment of PRU event counts falls suddenly
    to zero.

    I compiled and loaded the program for the DSP with CCS v8. Since the original test program (main.c from the package) was
    working, I understood that the DSP was receiving the PRU0_ARM_EVENT, but I may be wrong.

    The ARM booted from the SD card on which the RTOS SDK was installed.

    Best regards,
    Gilbert
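
To put the two delay values in perspective, the following sketch estimates the event period and the DSP cycle budget per event. It assumes a 200 MHz PRU clock and ignores the few extra cycles of loop overhead in the firmware, so the results are lower bounds on the period. The DSP clock is passed in as a parameter because it depends on the operating point. All names here are hypothetical.

```c
#include <stdint.h>

/* Assumption: PRU core clock of 200 MHz; verify for your configuration. */
#define PRU_CLOCK_HZ 200000000u

/* Approximate time between two events, in nanoseconds, for a given
 * __delay_cycles() argument (loop overhead is ignored). */
static inline uint32_t pru_event_period_ns(uint32_t delay_cycles)
{
    return (uint32_t)(((uint64_t)delay_cycles * 1000000000u) / PRU_CLOCK_HZ);
}

/* Approximate number of DSP cycles available per event, i.e. the budget
 * for interrupt dispatch plus the body of the ISR. */
static inline uint32_t dsp_cycles_per_event(uint32_t delay_cycles,
                                            uint32_t dsp_clock_hz)
{
    return (uint32_t)(((uint64_t)delay_cycles * dsp_clock_hz) / PRU_CLOCK_HZ);
}
```

Under these assumptions, __delay_cycles(1000) gives about 5000 ns between events and __delay_cycles(100) about 500 ns. At an assumed 600 MHz DSP clock, the latter leaves roughly 300 DSP cycles per event for interrupt dispatch, the event clear, and the 4-byte read.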

  • Gilbert, thank you! I got a better picture of your test and intention. Let me check with some of our PRU experts and come back to you.

    Paula

  • Gilbert, after checking with some colleagues, the consensus is that the PRU should not be the limiting factor. More likely, pruss_isr2() cannot keep up with the rate at which the PRU sends interrupts. In other words, the latency of the DSP's reads and writes to the PRU's interrupt bit and data could be the limiting factor. Some cycles could be saved by replacing PRUICSS_pruClearEvent() with a single register write, and with other changes along those lines.

    I wonder whether you have tested PRU -> ARM only. If so, do you see the same behaviour and limitations? If not, it may be worth exploring direct interrupt handling between the PRU and the DSP.

    thank you,

    Paula
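
As an illustration of the single-register-write idea, here is a minimal sketch. It assumes that the PRU-ICSS interrupt controller exposes a System Event Status Indexed Clear Register (SICR) at byte offset 0x24 from the INTC base, and that writing an event index to it clears that event's status. The offset, the helper name pru_intc_clear_event, and the way the INTC base pointer is obtained are assumptions to be checked against the TRM; this is not the PDK's implementation.

```c
#include <stdint.h>

/* Assumption: byte offset of SICR within the PRU-ICSS INTC register block. */
#define PRU_INTC_SICR_OFFSET 0x24u

/* Hypothetical helper: clear one system event with a single 32-bit write.
 * intc_base must point at the start of the INTC register block as seen
 * from the core that runs the ISR. */
static inline void pru_intc_clear_event(volatile uint32_t *intc_base,
                                        uint32_t sys_event)
{
    intc_base[PRU_INTC_SICR_OFFSET / sizeof(uint32_t)] = sys_event;
}
```

In the ISR, the call to PRUICSS_pruClearEvent(handle2, PRU0_ARM_EVENT) could then be replaced with pru_intc_clear_event(intc_base, PRU0_ARM_EVENT), where intc_base is a pointer the application sets up once at initialization.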

  • Hello Paula,

    Thank you for your suggestions. I am going to try replacing the call to PRUICSS_pruClearEvent() with a single register write.

    Concerning the direct interrupt between the PRU and the DSP, is this not already achieved in the function Board_initPruss

    #ifndef SOC_K2G
    void Board_initPruss(void)
    {
    #ifdef __ARM_ARCH_7A__
        CSL_xbarMpuIrqConfigure(CSL_XBAR_INST_MPU_IRQ_134, CSL_XBAR_PRUSS1_IRQ_HOST2);
        CSL_xbarMpuIrqConfigure(CSL_XBAR_INST_MPU_IRQ_135, CSL_XBAR_PRUSS2_IRQ_HOST2);
    #elif _TMS320C6X
        CSL_xbarDspIrqConfigure(1, CSL_XBAR_INST_DSP1_IRQ_92, CSL_XBAR_PRUSS1_IRQ_HOST2);
        CSL_xbarDspIrqConfigure(1, CSL_XBAR_INST_DSP1_IRQ_93, CSL_XBAR_PRUSS2_IRQ_HOST2);
    #else
        CSL_xbarIpuIrqConfigure(1, CSL_XBAR_INST_IPU1_IRQ_48, CSL_XBAR_PRUSS1_IRQ_HOST2);
        CSL_xbarIpuIrqConfigure(1, CSL_XBAR_INST_IPU1_IRQ_49, CSL_XBAR_PRUSS2_IRQ_HOST2);
    #endif
    }
    #endif

    if _TMS320C6X is defined?

    Best regards,
    Gilbert

  • Gilbert,

    _TMS320C6X must be defined so that the crossbar is configured to map the interrupt from the PRUSS to the DSP.

    Regards,
    Garrett

  • Hi Garrett,

    Thank you very much for helping in finding a solution.

    Yes, _TMS320C6X has to be defined and it was defined for compiling the code I executed.

    However, I still find the time the DSP takes to process the interrupts generated by the PRU underwhelming...

    Best regards,

    Gilbert

  • Hi Gilbert,

    Is it possible for you to upload your DSP and PRU CCS projects for the issue so we can try to reproduce it? Also, you may try this with the latest PRSDK v6.0 - www.ti.com/.../PROCESSOR-SDK-AM57X

    Regards,

    Garrett

  • Hi Garrett,

    Thank you very much for being willing to help me solve the problem.

    As you proposed, I have uploaded the DSP and PRU CCS projects. I will also try to move to the latest PRSDK v6.0.

    Regards,
    Gilbert

    pruss_GPI_firmware_20190722.zip
    DSP_proc_pru_gpi_20190722.zip

  • Hi Gilbert,

    Can you please include your shared/ folder in the DSP_proc project as well? The config.bld file is missing:

    ----------

    gmake: *** No rule to make target 'C:/Users/workspace/DSP_proc/shared/config.bld', needed by 'configPkg/compiler.opt'.

    Regards,

    Garrett

  • Hi Garrett,

    Thank you very much for trying to reproduce the issue I have encountered.

    In the release configuration, I did not encounter this error. The shared folder is not needed. You can remove the option -b,
    which is set unintentionally for the XDC Tools.

    Best regards,
    Gilbert

  • Hi Gilbert,

    The DSP project built after removing the shared folder or building the release configuration. I don't have ti-cgt-c6000_8.2.3 on my PC, so I built with ti-cgt-c6000_8.2.4, and the output is

    Num. PRU events : 16

    with a delay of 100 cycles (#include "pruss_GPI_firmware_100.h").

    There is some interrupt latency in the RTOS.

    You may try the latest Processor SDK 6.0, which includes a newer SYS/BIOS and compiler, to see whether it brings further improvement.

    Regards,
    Garrett

  • Hi Garrett,

    Thank you very much for this explanation.

    Best regards,

    Gilbert