CCS/EK-TM4C1294XL: 'Slow' GPIO?

Part Number: EK-TM4C1294XL

Tool/software: Code Composer Studio

A new project has shown up.

We want to read in a (long) string of 32 data words from our equipment and send it out over Ethernet.

Just to test the GPIO speed, I created a small program that reads 32 bits from ports A, K and M (8 bits each) and C and D (4 bits each), stitches them together into a single uint32_t, and returns the result.

I added a toggle output so I could measure loop time with an oscilloscope.

It turns out one pass through the loop takes 1.3 µs. If I comment out reading the 32 bits (assigning values to a-e instead of reading from the ports), the loop takes 300 ns.

In other words, each of the five port reads takes about 200 ns ((1.3 µs - 300 ns) / 5), while everything else in the loop takes the remaining 300 ns.

In this project, however, I need the data rate to be at least three times higher!

Are the macros I am using simply too slow? Are there alternatives?

Or is the processor, running at 120 MHz, simply not capable of more?
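
To put these scope readings into processor terms, it can help to convert them to clock cycles. The following is only a rough sketch: the helper name `ns_to_cycles` is made up for illustration, and it assumes the core really is running at the 120 MHz requested from `SysCtlClockFreqSet()`.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a measured duration in nanoseconds to CPU clock cycles,
 * rounded to the nearest cycle. (Hypothetical helper, not part of TivaWare.) */
uint32_t ns_to_cycles(uint32_t duration_ns, uint32_t clock_hz)
{
    assert(clock_hz > 0);
    /* 64-bit intermediate so duration_ns * clock_hz cannot overflow. */
    uint64_t scaled = (uint64_t)duration_ns * clock_hz;
    return (uint32_t)((scaled + 500000000u) / 1000000000u);
}
```

At 120 MHz, 1.3 µs corresponds to 156 cycles per loop pass, 300 ns to 36 cycles, and 200 ns to 24 cycles per port read.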

uint32_t Get32(void)
{
    int a,b,c,d,e;
    uint32_t res;

    a=GPIOPinRead(GPIO_PORTA_BASE,0xFF);           //Read port A
    b=GPIOPinRead(GPIO_PORTK_BASE,0xFF);           //Read port K
    c=GPIOPinRead(GPIO_PORTM_BASE,0xFF);           //Read port M
    d=GPIOPinRead(GPIO_PORTD_BASE,0x0F);           //Read port D
    e=GPIOPinRead(GPIO_PORTC_BASE,0xF0);           //Read port C

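    // '+' binds tighter than '<<' in C, so each shift needs its own parentheses.
    // Port C is read with mask 0xF0, so its bits are already in positions 7:4
    // and only need a shift of 24 to land in bits 31:28.
    res = (uint32_t)a | ((uint32_t)b << 8) | ((uint32_t)c << 16) | ((uint32_t)d << 24) | ((uint32_t)e << 24);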

    return res;
}

void main(void)
{
    int Indicator;

    SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ | SYSCTL_OSC_MAIN | SYSCTL_USE_PLL | SYSCTL_CFG_VCO_480), 120000000);

    Setup();


    //
    // Loop forever.
    //
    Indicator=0;

    while (1) {
        Get32();
        if (Indicator==0)
            Indicator=4;
        else
            Indicator=0;

        GPIOPinWrite(GPIO_PORTN_BASE,GPIO_PIN_2,Indicator);
    }
}
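
A side note on stitching the five readings together: in C, `+` has higher precedence than `<<`, so an unparenthesized expression of the form `a + b<<8 + c<<16` does not pack bytes; the sums end up as shift counts. The sketch below is host-side code with made-up function names (`pack32`, `pack32_unparenthesized`) that shows the intended packing next to the way the compiler reads the unparenthesized form. It assumes port C was read with mask 0xF0, so its value arrives in bits 7:4.

```c
#include <assert.h>
#include <stdint.h>

/* Pack five port readings into one 32-bit word.
 * a, b, c: full 8-bit ports; d: low nibble (mask 0x0F);
 * e: high nibble read with mask 0xF0, so it arrives in bits 7:4. */
uint32_t pack32(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t e)
{
    assert(a <= 0xFF && b <= 0xFF && c <= 0xFF);
    assert(d <= 0x0F && (e & ~0xF0u) == 0);
    return a | (b << 8) | (c << 16) | (d << 24) | (e << 24);
}

/* The unparenthesized form "a + b<<8 + c", written out the way the
 * compiler parses it: the additions happen first, then the shift. */
uint32_t pack32_unparenthesized(uint32_t a, uint32_t b, uint32_t c)
{
    return (a + b) << (8 + c);
}
```

For example, `pack32(0x11, 0x22, 0x33, 0x4, 0x50)` gives `0x54332211`, whereas `pack32_unparenthesized(1, 2, 0)` gives `0x300` rather than the `0x201` one would expect from byte packing. None of this changes the timing measurement, only the value returned.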

  • Use of the generalized GPIOPinRead() function does add some overhead that can be avoided in this application. This may be a good case to use direct register reads (which we usually try to avoid) or assembly language (which most people try to avoid). Let me work on this a bit and see if I can come up with a more efficient solution.

  • I did a similar test toggling PN0.

    #include <stdint.h>
    #include <stdbool.h>
    #include "inc/hw_types.h"
    #include "inc/hw_memmap.h"
    #include "driverlib/sysctl.h"
    #include "driverlib/gpio.h"
    
    //*****************************************************************************
    //
    // Define pin to LED mapping.
    //
    //*****************************************************************************
    #define USER_LED1  GPIO_PIN_0
    #define USER_LED2  GPIO_PIN_1
    
    void Setup(void)
    {
        //
        // Enable the ports and wait for port N to be ready for access
        //
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOA);
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOK);
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOM);
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOD);
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOC);
        SysCtlPeripheralEnable(SYSCTL_PERIPH_GPION);
        while(!SysCtlPeripheralReady(SYSCTL_PERIPH_GPION))
        {
        }
        
        //
        // Configure the GPIO port for the LED operation.
        //
        GPIOPinTypeGPIOOutput(GPIO_PORTN_BASE, (USER_LED1|USER_LED2));
    
    }
    
    uint32_t Get32(void) // Total loop time 1.42 us
    {
        int a,b,c,d,e;
        uint32_t res;
    
        a=GPIOPinRead(GPIO_PORTA_BASE,0xFF);           //Read port A
        b=GPIOPinRead(GPIO_PORTK_BASE,0xFF);           //Read port K
        c=GPIOPinRead(GPIO_PORTM_BASE,0xFF);           //Read port M
        d=GPIOPinRead(GPIO_PORTD_BASE,0x0F);           //Read port D
        e=GPIOPinRead(GPIO_PORTC_BASE,0xF0);           //Read port C
    
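        // '+' binds tighter than '<<', so each shift is parenthesized; port C
        // (mask 0xF0) already sits in bits 7:4 and is shifted by 24, not 28.
        res = (uint32_t)a | ((uint32_t)b << 8) | ((uint32_t)c << 16) | ((uint32_t)d << 24) | ((uint32_t)e << 24);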
    
        return res;
    }
    
    uint32_t Get32b(void) // Total loop time 740 ns
    {
        uint32_t ret;
    
        ret = HWREG(GPIO_PORTA_BASE + (0xFF << 2));
        ret |= (HWREG(GPIO_PORTK_BASE + (0xFF << 2)) << 8);
        ret |= (HWREG(GPIO_PORTM_BASE + (0xFF << 2)) << 16);
        ret |= (HWREG(GPIO_PORTD_BASE + (0x0F << 2)) << 24);
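        // Note: mask 0xF0 returns PC4-PC7 in bits 7:4, so "<< 24" would place
        // them in bits 31:28; the "<< 28" timed here shifts them out of range.
        ret |= (HWREG(GPIO_PORTC_BASE + (0xF0 << 2)) << 28);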
    
        return ret;
    }
    
    uint32_t Get32c(void) // Total loop time 160 ns
    {
        return 0;
    }
    
    void main(void)
    {
        int Indicator;
    
        SysCtlClockFreqSet((SYSCTL_XTAL_25MHZ | SYSCTL_OSC_MAIN | SYSCTL_USE_PLL | SYSCTL_CFG_VCO_480), 120000000);
    
        Setup();
    
    
        //
        // Loop forever.
        //
        Indicator=0;
    
        while (1)
        {
            Get32c();
            Indicator ^= USER_LED1;
            GPIOPinWrite(GPIO_PORTN_BASE,USER_LED1,Indicator);
        }
    }
    

    Using the TivaWare drivers, I saw a loop time of 1.42 µs, similar to your 1.3 µs.

    Using direct register reads, the loop time drops to 740 ns, about half.

    Using a dummy function that is basically removed by the optimizer, the loop time is 160 ns.

    Looking at the assembly generated by the compiler with the direct register reads, I don't think you can get much faster in assembly.

              Get32b():
    00000604:   B510                push       {r4, lr}
     82           ret = HWREG(GPIO_PORTA_BASE + (0xFF << 2));
    00000606:   4809                ldr        r0, [pc, #0x24]
     83           ret |= (HWREG(GPIO_PORTK_BASE + (0xFF << 2)) << 8);
    00000608:   4B09                ldr        r3, [pc, #0x24]
     84           ret |= (HWREG(GPIO_PORTM_BASE + (0xFF << 2)) << 16);
    0000060a:   4A0A                ldr        r2, [pc, #0x28]
     85           ret |= (HWREG(GPIO_PORTD_BASE + (0x0F << 2)) << 24);
    0000060c:   490A                ldr        r1, [pc, #0x28]
     82           ret = HWREG(GPIO_PORTA_BASE + (0xFF << 2));
    0000060e:   6800                ldr        r0, [r0]
     83           ret |= (HWREG(GPIO_PORTK_BASE + (0xFF << 2)) << 8);
    00000610:   681B                ldr        r3, [r3]
     84           ret |= (HWREG(GPIO_PORTM_BASE + (0xFF << 2)) << 16);
    00000612:   6812                ldr        r2, [r2]
     85           ret |= (HWREG(GPIO_PORTD_BASE + (0x0F << 2)) << 24);
    00000614:   F8D14C7C            ldr.w      r4, [r1, #0xc7c]
     86           ret |= (HWREG(GPIO_PORTC_BASE + (0xF0 << 2)) << 28);
    00000618:   6809                ldr        r1, [r1]
     83           ret |= (HWREG(GPIO_PORTK_BASE + (0xFF << 2)) << 8);
    0000061a:   EA402003            orr.w      r0, r0, r3, lsl #8
     84           ret |= (HWREG(GPIO_PORTM_BASE + (0xFF << 2)) << 16);
    0000061e:   EA404002            orr.w      r0, r0, r2, lsl #16
     85           ret |= (HWREG(GPIO_PORTD_BASE + (0x0F << 2)) << 24);
    00000622:   EA406004            orr.w      r0, r0, r4, lsl #24
     86           ret |= (HWREG(GPIO_PORTC_BASE + (0xF0 << 2)) << 28);
    00000626:   EA407001            orr.w      r0, r0, r1, lsl #28
    0000062a:   BD10                pop        {r4, pc}
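
The `(mask << 2)` term in the direct reads, and the `#0xc7c` offset in the listing, both follow from how the GPIODATA register is addressed: address bits 9:2 act as a bit mask, so each mask value has its own alias address. The sketch below is host-side code for illustration only. The function names are made up, and the two base addresses are assumptions taken from TivaWare's `inc/hw_memmap.h` (only their 0x1000 spacing matters here).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base addresses, as defined in TivaWare's inc/hw_memmap.h. */
#define PORTC_BASE 0x40006000u
#define PORTD_BASE 0x40007000u

/* Address of the GPIODATA alias that exposes only the pins in 'mask'.
 * Address bits 9:2 act as the bit mask, hence the shift by 2. */
uint32_t gpio_masked_data_addr(uint32_t port_base, uint32_t mask)
{
    assert(mask <= 0xFF);
    return port_base + (mask << 2);
}

/* Host-side model of what a read through such an alias returns:
 * unmasked bits read as 0, masked bits stay in their own positions. */
uint32_t gpio_masked_read_model(uint32_t pin_levels, uint32_t mask)
{
    assert(mask <= 0xFF);
    return pin_levels & mask;
}
```

With these assumptions, the port D alias for mask 0x0F lies 0xC7C bytes above the port C alias for mask 0xF0, which would explain why the compiler loads one address into `r1` and reaches port D with `[r1, #0xc7c]`. The model also shows that a read with mask 0xF0 returns its bits in positions 7:4, not 3:0.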

  • Thanks, this may just be the input I need to go on.

    I will try some of your code myself, and then report the results back to the group. Credit to this forum, of course!

    Assembly is probably out of the question. I did pretty well in 80286 code (non-DPMI) in my time, but it all sort of disappeared with Win 95.

    In the end we could lower our data transfer rate to something closer to normal use. That can be done fairly easily in my FPGA code.

    The high speed is very seldom in use, so....

    In normal use the transfer rate is closer to 250 kHz, and around there we do have a fighting chance!
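
As a rough check of that last point, the time available per word can be compared with the measured loop times. This is a sketch under stated assumptions: "250 kHz" is taken to mean 250,000 32-bit words per second, the function names are made up, and the loop times cover only the GPIO read and pin toggle, not the Ethernet transmission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Time available per word, in nanoseconds, at a given word rate. */
uint32_t word_period_ns(uint32_t words_per_second)
{
    assert(words_per_second > 0);
    return 1000000000u / words_per_second;
}

/* True if one loop pass fits within the per-word time budget. */
bool loop_keeps_up(uint32_t loop_time_ns, uint32_t words_per_second)
{
    return loop_time_ns <= word_period_ns(words_per_second);
}
```

At 250,000 words per second the budget is 4000 ns per word, so both the 1.3 µs driver-based loop and the 740 ns direct-register loop fit, with the remainder left for everything else the application has to do.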