AHB_BUS toggle speed

Other Parts Discussed in Thread: TM4C123GH6PM

Hello guys. I want to reach a GPIO toggle speed up to the CPU clock speed, or at least the maximum achievable toggle speed on an I/O pin. I read that the AHB bus is the one I need to use for the fastest GPIO frequency. I wrote code that should toggle a GPIO at the maximum possible speed, but I reached only 3.7 MHz at a 40 MHz clock speed.

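#include <stdint.h>
#include <stdbool.h>
#include "inc/hw_memmap.h"      // GPIO_PORTF_AHB_BASE
#include "inc/tm4c123gh6pm.h"   // GPIO_PORTF_AHB_DATA_R
#include "driverlib/sysctl.h"
#include "driverlib/gpio.h"

int main(void)
{
    // 16 MHz crystal, PLL output divided by 5: 40 MHz system clock
    SysCtlClockSet(SYSCTL_XTAL_16MHZ | SYSCTL_SYSDIV_5 | SYSCTL_USE_PLL | SYSCTL_OSC_MAIN);
    SysCtlPeripheralEnable(SYSCTL_PERIPH_GPIOF);
    SysCtlGPIOAHBEnable(SYSCTL_PERIPH_GPIOF);   // access Port F through the AHB aperture
    SysCtlDelay(20);                            // let the peripheral clock settle

    GPIOPinTypeGPIOOutput(GPIO_PORTF_AHB_BASE, GPIO_PIN_1 | GPIO_PIN_2 | GPIO_PIN_3);
    while(1)
    {
        GPIO_PORTF_AHB_DATA_R = 0x0E;   // PF1..PF3 high
        GPIO_PORTF_AHB_DATA_R = 0x00;   // PF1..PF3 low
    }
}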

I believe the TM4C123G can go further.

Thank you in advance.

  • Does not your "while" loop degrade your GPIO toggle?   (Ans:  Surely!)

    Suggest

    while(1)
        {
        GPIO_PORTF_AHB_DATA_R  = 0x0E;
        GPIO_PORTF_AHB_DATA_R  = 0x00;

        GPIO_PORTF_AHB_DATA_R  = 0x0E;
        GPIO_PORTF_AHB_DATA_R  = 0x00;

        GPIO_PORTF_AHB_DATA_R  = 0x0E;
        GPIO_PORTF_AHB_DATA_R  = 0x00;

        GPIO_PORTF_AHB_DATA_R  = 0x0E;
        GPIO_PORTF_AHB_DATA_R  = 0x00;

        GPIO_PORTF_AHB_DATA_R  = 0x0E;
        GPIO_PORTF_AHB_DATA_R  = 0x00;

        GPIO_PORTF_AHB_DATA_R  = 0x0E;
        GPIO_PORTF_AHB_DATA_R  = 0x00;


        }

    Bet this will enhance the I/O rate.  You may wish to revert to the old/slow bus - just for comparison purposes.

    And - I know of NO MCUs which, "reach toggle speed up to the CPU clock speed!"  (my "knowledge" is unattackable - reality may prove different...)

    BTW - believe your usage of Direct Register is clever.  (properly bounded - used only where most needed!)

    As an experiment (and learning exercise) you may wish to employ assembler - as well as API Port Output function - for a complete "exe" comparison.

     

  • Unfortunately it doesn't help either. The problem is that the compiler performs a read-modify-write operation for each GPIO state change. I only want to write to the output pin. Currently the compiler reads, then modifies, then writes the changed data back to the pin. I want to write using one instruction, just a write to the pin register. Maybe the compiler optimizations are not properly set? I don't know.
  • Radoslav Marinov said:
    Unfortunately it doesn't help either.

    Sure does - w/our pro IAR!   Scope cap shows nice, "speed up."  (I'm almost in "disbelief" that the "relaxation" of the while loop - as shown - does not improve it.  It has to!)

    Do check your compiler optimizations - I would not have made the effort on a whim nor guess...

    Suggested ASM as well - but that effort is yours...

  • Radoslav Marinov said:
    Currently the compiler reads, then modifies, then writes the changed data back to the pin. I want to write using one instruction, just a write to the pin register. Maybe the compiler optimizations are not properly set? I don't know.

    On a newly created CCS project, in which the CCS Project Properties -> CCS Build -> ARM Compiler -> Optimization -> Optimization Level was set to "off" the compiler generated the following assembler for the while loop, with repeated register loads for the GPIO_PORTF_AHB_DATA_R peripheral address and values to be written:

    ||$C$L1||:    
    	.dwpsn	file "../main.c",line 23,column 9,is_stmt,isa 1
            LDR       A2, $C$CON4           ; [DPU_3_PIPE] |23| 
            MOVS      A1, #14               ; [DPU_3_PIPE] |23| 
            STR       A1, [A2, #0]          ; [DPU_3_PIPE] |23| 
    	.dwpsn	file "../main.c",line 24,column 9,is_stmt,isa 1
            LDR       A2, $C$CON4           ; [DPU_3_PIPE] |24| 
            MOVS      A1, #0                ; [DPU_3_PIPE] |24| 
            STR       A1, [A2, #0]          ; [DPU_3_PIPE] |24| 
    	.dwpsn	file "../main.c",line 21,column 11,is_stmt,isa 1
            B         ||$C$L1||             ; [DPU_3_PIPE] |21| 
            ; BRANCH OCCURS {||$C$L1||}      ; [] |21| 
    

    By simply changing the Optimization Level to "0 Register Optimizations" the compiler generated the following more efficient code for the while loop:

            LDR       A1, $C$CON4           ; [DPU_3_PIPE] |23| 
            MOVS      A2, #0                ; [DPU_3_PIPE] |24| 
            MOVS      A3, #14               ; [DPU_3_PIPE] |23| 
    ;* --------------------------------------------------------------------------*
    ;*   BEGIN LOOP ||$C$L1||
    ;*
    ;*   Loop source line                : 21
    ;*   Loop closing brace source line  : 25
    ;*   Known Minimum Trip Count        : 1
    ;*   Known Maximum Trip Count        : 4294967295
    ;*   Known Max Trip Count Factor     : 1
    ;* --------------------------------------------------------------------------*
    ||$C$L1||:    
    	.dwpsn	file "../main.c",line 23,column 9,is_stmt,isa 1
            STR       A3, [A1, #0]          ; [DPU_3_PIPE] |23| 
    	.dwpsn	file "../main.c",line 24,column 9,is_stmt,isa 1
            STR       A2, [A1, #0]          ; [DPU_3_PIPE] |24| 
    	.dwpsn	file "../main.c",line 21,column 11,is_stmt,isa 1
            B         ||$C$L1||             ; [DPU_3_PIPE] |21| 
            ; BRANCH OCCURS {||$C$L1||}      ; [] |21| 
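
    A note on the read-modify-write concern: neither listing above contains one; each state change is already a single STR. Where only some pins of a port should change with one store, the TM4C123 GPIODATA register can be written through a masked address, in which address bits [9:2] select the pins the write is allowed to affect. The following is a minimal sketch of that address arithmetic, not code from this thread. The helper names are hypothetical, and the Port F AHB base address of 0x4005D000 is an assumption that should be checked against inc/hw_memmap.h.

```c
#include <stdint.h>

/* Assumed AHB aperture base of GPIO Port F (GPIO_PORTF_AHB_BASE in TivaWare). */
#define PORTF_AHB_BASE_ASSUMED 0x4005D000u

/* Hypothetical helper: byte offset of the masked GPIODATA alias.
 * Address bits [9:2] act as the pin mask, so the mask is shifted left by 2. */
static inline uint32_t gpio_masked_offset(uint8_t pin_mask)
{
    return (uint32_t)pin_mask << 2;
}

/* Hypothetical helper: full address of the masked alias for the given pins. */
static inline uint32_t gpio_masked_addr(uint32_t port_base, uint8_t pin_mask)
{
    return port_base + gpio_masked_offset(pin_mask);
}

/* Hypothetical helper: one store that updates only the pins in pin_mask.
 * Only meaningful on the target; on a host this address is not mapped. */
static inline void gpio_masked_write(uint32_t port_base, uint8_t pin_mask, uint8_t value)
{
    *(volatile uint32_t *)(uintptr_t)gpio_masked_addr(port_base, pin_mask) = value;
}
```

    For pins 1 to 3 of Port F, gpio_masked_write(base, 0x0E, 0x0E) would set the three pins and gpio_masked_write(base, 0x0E, 0x00) would clear them, leaving the other pins of the port untouched.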
    

  • cb1- said:
    I know of NO MCUs which, "reach toggle speed up to the CPU clock speed!

    It looks like the PRU-ICSS can achieve a toggle speed of the CPU clock speed - look at page 11 of http://processors.wiki.ti.com/images/3/34/Sitara_boot_camp_pru-module1-hw-overview.pdf 

  • Greetings Chester,

    Thanks for that info - I was being a bit "provocative" w/that comment - but truly had never encountered an MCU able to toggle GPIO @ system clock speed.

    Have to wonder if some trade-off or other issue or sensitivity may arise (i.e. temperature, lot, aging) with such a capability.  Again thanks for your input.

    One final (original post) comment - do not you agree that poster's code - making that while "roll over" so dominant - has to restrict output toggle rate?

  • Chester Gillon said:
    cb1-
    I know of NO MCUs which, "reach toggle speed up to the CPU clock speed!

    It looks like the PRU-ICSS can achieve a toggle speed of the CPU clock speed - look at page 11 of http://processors.wiki.ti.com/images/3/34/Sitara_boot_camp_pru-module1-hw-overview.pdf 

    MSP430 with fast Timer D can toggle faster than the CPU clock.

    C2000 series with accelerator can toggle faster than the CPU clock and the instruction cycle; EPWM too is faster than the CPU clock.

    The same applies to the Freescale series, starting from the good old 68332.

    On the 680x0 series a coprocessor interface was available to perform tasks too heavy for the main CPU. (I remember a big CAD system where an external accelerator was attached to the main CPU to do simulation in real time; this was many years ago, maybe "acsis" or a similar name.)

    Yes, it is not driven by CPU code, but it toggles faster than the CPU clock.

  • I fear that we are (no longer) comparing "apples to apples."

    If we (truly) want high speed toggle - beyond (normal) MCU control means - does not FPGA/other better, "fill that bill?"
  • The TM4C123GH6PM data sheet says for the GPIO:

    Fast toggle capable of a change every clock cycle for ports on AHB, every two clock cycles for ports on APB

    cb1_mobile said:
    One final (original post) comment - do not you agree that poster's code - making that while "roll over" so dominant - has to restrict output toggle rate?

    Looking at the timing given in "3.3.1. Cortex-M4 instructions" and annotating the expected cycle count of each instruction, the unoptimized code gives:

    ||$C$L1||:    
        .dwpsn  file "../main.c",line 23,column 9,is_stmt,isa 1
            LDR       A2, $C$CON4           ; 2 cycles
            MOVS      A1, #14               ; 1 cycle
            STR       A1, [A2, #0]          ; 1 cycle
        .dwpsn  file "../main.c",line 24,column 9,is_stmt,isa 1
            LDR       A2, $C$CON4           ; 2 cycles
            MOVS      A1, #0                ; 1 cycle
            STR       A1, [A2, #0]          ; 1 cycle
        .dwpsn  file "../main.c",line 21,column 11,is_stmt,isa 1
            B         ||$C$L1||             ; 2 to 4 cycles (depending on the alignment and width of the target instruction, and whether the processor manages to speculate the address early)
            ; BRANCH OCCURS {||$C$L1||}      ; [] |21| 

    So, each loop will take 10 to 12 cycles, which at a 40 MHz clock rate means a toggle rate of 3.3 MHz to 4 MHz. The 3.7 MHz measured in the first post falls within that range.

    Whereas the optimized code gives:

    ||$C$L1||:    
        .dwpsn  file "../main.c",line 23,column 9,is_stmt,isa 1
            STR       A3, [A1, #0]          ; 1 cycle
        .dwpsn  file "../main.c",line 24,column 9,is_stmt,isa 1
            STR       A2, [A1, #0]          ; 1 cycle
        .dwpsn  file "../main.c",line 21,column 11,is_stmt,isa 1
            B         ||$C$L1||             ; 2 to 4 cycles

    So, each loop is expected to take 4 to 6 cycles, which at a 40 MHz clock rate means a toggle rate of 6.67 MHz to 10 MHz. With this optimized code the overhead of the branch is more than the instructions that toggle the GPIO port, so I would expect multiple toggles per loop to increase the toggle rate.

    This would need to be tested and timed to confirm the effect, as I may have missed some detail of pipelining stages in the processor and/or the effect of the write buffer.
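
    The estimate above can be reproduced with a few lines of C. This is only a sketch of the arithmetic: the cycle counts are the ones annotated in the listings, the function and array names are hypothetical, and real hardware may differ because of pipelining and the write buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the per-instruction cycle counts of one loop iteration. */
static uint32_t loop_cycles(const uint8_t *cycles, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += cycles[i];
    }
    return total;
}

/* One loop iteration produces one full output period (high, then low),
 * so the toggle frequency is the clock divided by the cycles per loop. */
static uint32_t toggle_freq_hz(uint32_t clock_hz, uint32_t cycles_per_loop)
{
    return clock_hz / cycles_per_loop;
}

/* Cycle counts from the annotated listings, with the branch at its best case (2). */
static const uint8_t unoptimized_best[] = { 2, 1, 1, 2, 1, 1, 2 };
static const uint8_t optimized_best[]   = { 1, 1, 2 };
```

    Adding 2 cycles to each total covers the worst-case branch of 4 cycles, which gives the lower end of each frequency range.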

  • Again - thanks Chester.

    Perhaps a year past - we employed nearly that same "while loop" as poster - noted the toggle rate and then increased the number of toggles by 10 fold. Indeed - as expected - the while loop's inhibition of the toggle rate was much reduced.

    Poster's test methodology "fails" as it significantly subjects the GPIO toggle to the limits of the while loop's execution  - which should (properly) be avoided.  (or at least minimized...)
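
    The effect described here can be estimated with a simple model. The sketch below is an assumption-laden estimate, not a measurement: it takes one cycle per STR and 2 to 4 cycles for the branch, as in the annotated listing earlier in the thread, and the function name is hypothetical. It returns the average number of high/low periods per second; the waveform itself is uneven, because the low time that spans the branch is longer than the others.

```c
#include <stdint.h>

/* Average output frequency when the loop body holds `pairs` set/clear pairs.
 * Each pair costs 2 cycles (two STR); the branch costs `branch_cycles` once per loop. */
static uint32_t unrolled_avg_freq_hz(uint32_t clock_hz, uint32_t pairs, uint32_t branch_cycles)
{
    uint64_t cycles_per_loop = 2ull * pairs + branch_cycles;
    return (uint32_t)(((uint64_t)clock_hz * pairs) / cycles_per_loop);
}
```

    With six pairs per loop, as in the unrolled loop suggested earlier, the model gives roughly 15 to 17 MHz at a 40 MHz clock, against 6.67 to 10 MHz for a single pair.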


  • Replied to wrong user

    Why would you want to toggle the GPIO so fast?

    You could use a Timer PWM output at 40 MHz.

    If you really want just one pulse and one pulse only, get an interrupt on the PWM falling edge and stop the timer output. At 40 MHz it's a bit hard to do that in time, even with assembly (it's half the processor clock after all).
    Why not send SPI data 0x1 or 0x80 at max speed and use the TX line for that?
    For the TM4C1294 at 120 MHz I believe it was 60 MHz; for the TM4C123 at 80 MHz it's 40 MHz, if I am not mistaken.

    Also, look up this forum thread; it's about the Stellaris but it should apply:
    http://forum.stellarisiti.com/topic/395-how-fast-is-the-io/

  • Luis Afonso said:
    Why would you want to toggle the GPIO so fast?

    Luis - should not you ask that of o.p.?  

    I identified major weakness in poster's "test" methodology - and suggested he modify as well as consider ASM for fastest toggle.

    As a (general) answer - creating a programmable, easily controllable, fine-grained, stable, variable frequency oscillator has long been a, "holy grail" of engineering.  Modern MCUs have made major strides - yet high-frequency & "fine-grain" (frequency resolution) remain (somewhat) of a challenge.

  • (shouts mean words to the forum)... now it's a user-specific reply? Well, can I come here once without getting annoyed at a new "feature"?

    I'm sorry cb1, I was not aware of this. Yes, I think an FPGA or a dedicated circuit would be the best way.

    I will try to correct my reply so it's replying to the o.p.
  • Thanks to all of you guys for guiding me. I forgot to mention that I am really a newbie in this matter. So with the register optimization I managed to reach a speed of 10 MHz, as Chester pointed out. What I wanted to achieve was to change the state of a GPIO with one instruction. I have, so thank you.