
AM2434: AM2434 GPIO Interrupt Latency

Part Number: AM2434

Hi expert,

     A customer is asking how to shorten the interrupt latency. According to https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1097703/lp-am243-low-latency-interrupts?tisearch=e2e-sitesearch&keymatch=GPIO%252525252520latency#

the interrupt latency can be less than 100 ns.

We modified the demo code gpio_input_interrupt (NoRTOS) as follows:

1. GPIO1_35 is used as the interrupt trigger source.

2. GPIO1_36 is tied to GPIO1_35 and toggled from low to high once per second to trigger the interrupt.

3. In the ISR, we just toggle GPIO1_8 to indicate that the first line of the ISR has executed.

Here is the test code:

gpio_latency_am243x-lp.zip

We manipulate registers directly:

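static void __attribute__((section("isr_tcma"))) GPIO_bankIsrFxn(void *args)
{
    /* Pulse GPIO1_8 so the scope shows when the ISR starts executing */
    *GPIO1_8_SET_ADDRESS = GPIO1_8_MASK;
    *GPIO1_8_CLEAR_ADDRESS = GPIO1_8_MASK;
    /* Clear the pending GPIO1_35 interrupt status */
    *GPIO1_35_INTSTAT_ADDRESS = GPIO1_35_INTSTAT_MASK;
    gGpioIntrDone++;
}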

The ISR is configured as FIQ and placed in TCMA.

Since this is simple test code, we put almost everything into TCMA and TCMB0 except .sysmem. We don't use malloc, so that should not be a problem.

GROUP {
.text.hwi: palign(8)
.text.cache: palign(8)
.text.mpu: palign(8)
.text.boot: palign(8)
.text:abort: palign(8) /* this helps in loading symbols when using XIP mode */
isr_tcma: palign(8)
} > R5F_TCMA

/* This is rest of code. This can be placed in DDR if DDR is available and needed */
GROUP {
.text: {} palign(8) /* This is where code resides */
.rodata: {} palign(8) /* This is where const's go */
} > R5F_TCMB0

/* This is rest of initialized data. This can be placed in DDR if DDR is available and needed */
GROUP {
.data: {} palign(8) /* This is where initialized globals and static go */
.bss: {} palign(8) /* This is where uninitialized globals go */
RUN_START(__BSS_START)
RUN_END(__BSS_END)
.stack: {} palign(8) /* This is where the main() stack goes */
} > R5F_TCMA

/* This is rest of uninitialized data. This can be placed in DDR if DDR is available and needed */
GROUP {
.sysmem: {} palign(8) /* This is where the malloc heap goes */
} > MSRAM

/* This is where the stacks for different R5F modes go */
GROUP {
.irqstack: {. = . + __IRQ_STACK_SIZE;} align(8)
RUN_START(__IRQ_STACK_START)
RUN_END(__IRQ_STACK_END)
.fiqstack: {. = . + __FIQ_STACK_SIZE;} align(8)
RUN_START(__FIQ_STACK_START)
RUN_END(__FIQ_STACK_END)
.svcstack: {. = . + __SVC_STACK_SIZE;} align(8)
RUN_START(__SVC_STACK_START)
RUN_END(__SVC_STACK_END)
.abortstack: {. = . + __ABORT_STACK_SIZE;} align(8)
RUN_START(__ABORT_STACK_START)
RUN_END(__ABORT_STACK_END)
.undefinedstack: {. = . + __UNDEFINED_STACK_SIZE;} align(8)
RUN_START(__UNDEFINED_STACK_START)
RUN_END(__UNDEFINED_STACK_END)
} > R5F_TCMA

/* Sections needed for C++ projects */
GROUP {
.ARM.exidx: {} palign(8) /* Needed for C++ exception handling */
.init_array: {} palign(8) /* Contains function pointers called before main */
.fini_array: {} palign(8) /* Contains function pointers called after main */
} > R5F_TCMB0

The build is set to release mode with the optimization level set to fast.

We programmed the device and measured the result.

You can see the ISR is clean. It still takes 309 ns.

The customer's target is to toggle the GPIO within 200 ns. Can you please advise what more we can do?

Regards

Andre

 

  • Hi Andre

    I will work on this thread with internal team support and get back to you shortly.

    Regards

    Sri Vidya

  • Hi Andre,

    Here is the screen capture of the AM243x Out-of-Box Benchmark demo. It shows the interrupt latency number is around 160ns:

      

    The only difference is that the benchmark demo uses the timer and the timer counter register to calculate the time from the interrupt trigger to the first instruction executed, while you are using a GPIO interrupt and a GPIO output. The R5F interrupts go through the same VIM, so I do not think the extra latency is VIM or HwiP related. I think it comes from the GPIO output. Can you measure the execution time of "*GPIO1_8_SET_ADDRESS = GPIO1_8_MASK" using the DPL function CycleCounterP_getCount32?

    In fact, your scope capture more or less confirms that "*GPIO1_8_CLEAR_ADDRESS = GPIO1_8_MASK" took about 200 ns (from high to low).
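
    The measurement suggested above could look roughly like the following sketch. It is not code from the SDK or from this thread: the helper names and the host-side stub counter are hypothetical, and the counter is passed in as a function pointer so that CycleCounterP_getCount32 could be supplied on the target. The 800 MHz core clock in the usage note is an assumption and should be replaced with the actual R5F frequency.

```c
#include <stdint.h>

/* Reads a free-running 32-bit cycle counter. On the target this could be
 * CycleCounterP_getCount32 from the SDK DPL; that wiring is an assumption. */
typedef uint32_t (*cycle_read_fn)(void);

/* Elapsed cycles between two reads. Unsigned subtraction stays correct
 * across a single 32-bit counter wrap. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to nanoseconds for a given CPU clock in Hz. */
static double cycles_to_ns(uint32_t cycles, double cpu_hz)
{
    return (double)cycles * 1.0e9 / cpu_hz;
}

/* Time one 32-bit store to a register. The result includes the overhead
 * of the two counter reads themselves. */
static uint32_t time_register_write(cycle_read_fn read_cycles,
                                    volatile uint32_t *reg, uint32_t value)
{
    uint32_t start = read_cycles();
    *reg = value;
    uint32_t end = read_cycles();
    return elapsed_cycles(start, end);
}

/* Hypothetical host-side stub: advances by 80 "cycles" on every read so
 * the helpers can be exercised without hardware. */
static uint32_t fake_counter_value = 0u;

static uint32_t fake_cycle_read(void)
{
    fake_counter_value += 80u;
    return fake_counter_value;
}
```

    On the target, the call would be something like time_register_write(CycleCounterP_getCount32, GPIO1_8_SET_ADDRESS, GPIO1_8_MASK). At an assumed 800 MHz core clock, 160 cycles would correspond to 200 ns.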

    Best regards,

    Ming

  • Ming,

        Q1: What are the FIQ and IRQ interrupt latencies (in cycles) on the R5F?

        Q2: Is the ~200 ns GPIO latency confirmed?

    Regards

    Andre

     

  • Hi Andre,

    Q1: The above measurement is for IRQ interrupt latency. I have not measured the FIQ interrupt latency.

    Q2: We have not confirmed the ~200 ns GPIO latency. According to the AM243x datasheet, the GPIO1 latency should be 2P + 2.6 ns, where P is the GPIO functional clock period in ns. The functional clock should be 500 MHz / 4 = 125 MHz, so P = 8 ns and the latency should be 2 × 8 + 2.6 = 18.6 ns.

    We think the extra delay of 184 ns - 18.6 ns = 165.4 ns is most likely interrupt latency, because the "*GPIO1_8_SET_ADDRESS = GPIO1_8_MASK" in GPIO_bankIsrFxn may cause another GPIO interrupt. Can you run a GPIO output test without the GPIO interrupt enabled?
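
    The arithmetic in this post can be reproduced with a small sketch. The function names are hypothetical. The 2P + 2.6 formula, the 125 MHz functional clock, and the 184 ns figure are the values quoted in this post and are not independently verified here.

```c
/* GPIO output delay in ns using the formula quoted in the post:
 * 2P + 2.6, where P is the functional clock period in ns. */
static double gpio_output_delay_ns(double func_clk_hz)
{
    double period_ns = 1.0e9 / func_clk_hz;
    return 2.0 * period_ns + 2.6;
}

/* Portion of a measured delay that the GPIO output delay does not
 * account for. */
static double unexplained_delay_ns(double measured_ns, double gpio_delay_ns)
{
    return measured_ns - gpio_delay_ns;
}
```

    With func_clk_hz = 125.0e6 the first function gives 18.6 ns, and unexplained_delay_ns(184.0, 18.6) gives the 165.4 ns mentioned above.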

    Best regards,

    Ming 

  • Ming,

      This is a very simple test program: GPIO1_36 toggles GPIO1_35 every second. GPIO1_35 is the only interrupt in the system; no other interrupt is enabled in this test program. The ISR does only one thing: it toggles GPIO1_8 and clears the status. Please refer to the source code I attached before.

        Please don't mix this up with the other discussion thread. In this thread we focus on interrupt latency.

    You can see I used two channels to capture the result. GPIO1_36 and GPIO1_35 are connected together.

    1. GPIO1_36 rises and triggers the FIQ (channel 1).

    2. GPIO1_8 toggles (channel 2).

    3. We focus on the time from the rising edge of GPIO1_36 to the rising edge of GPIO1_8. It takes 309 ns.

    The mask has only one bit set.

    #define GPIO1_8_MASK          ((uint32_t)0x00000100)
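
    As a quick sanity check of the single-bit claim, a sketch like the one below could be used. The helper names are hypothetical, and the mapping of 32 pins per register is an assumption about the GPIO register layout that should be checked against the TRM. Only the 0x00000100 mask value comes from this post.

```c
#include <stdint.h>

/* Assumption: each GPIO data register covers 32 consecutive pins. */
#define PINS_PER_REG 32u

/* Index of the register that holds the given pin. */
static uint32_t gpio_reg_index(uint32_t pin)
{
    return pin / PINS_PER_REG;
}

/* Single-bit mask for the given pin within its register. */
static uint32_t gpio_bit_mask(uint32_t pin)
{
    return (uint32_t)1u << (pin % PINS_PER_REG);
}

/* Returns 1 if exactly one bit is set in the mask, otherwise 0. */
static int is_single_bit(uint32_t mask)
{
    return (mask != 0u) && ((mask & (mask - 1u)) == 0u);
}
```

    Under that assumption, gpio_bit_mask(8) matches the 0x00000100 defined above, and is_single_bit confirms that the write touches only one pin.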

    The system has only one interrupt, so I don't understand why you concluded that "*GPIO1_8_SET_ADDRESS = GPIO1_8_MASK" in GPIO_bankIsrFxn may cause another GPIO interrupt.

    Again, we are focusing on interrupt latency here. Based on your calculation, the interrupt latency should be 309 ns - 18.6 ns ≈ 290 ns.

    Would you please check whether anything is wrong in the code and advise how to improve the interrupt latency to 100 ns?

    Regards

    Andre

    Again, please don't mix this up with the other thread, which focuses on the latency from setting the GPIO register to the GPIO level change.

     

     

     

  • Hi AndreTseng,

    We are already discussing GPIO latency in a different thread.

    We need to confirm the issue in two stages:

    • Step 1 - We will confirm on the GPIO output latency.
    • Step 2 - We will confirm on the GPIO input latency as well.

    Since you mentioned the performance on the AM263x-LP in another thread, can you share the same results for the AM263x-LP? That would allow an apples-to-apples comparison.

    Thanks and Regards,
    Aakash

  • Aakash,

        Please refer to the result on the LP-AM2634: GPIO44 is the interrupt source and GPIO65 is toggled in the ISR. The ISR runs in TCM, and the MPU region for the GPIO registers is set to strongly ordered.

    According to the measurement, the interrupt latency is 440 ns and the GPIO latency is 34 ns, which is identical to the measurement in the other thread.  https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1122978/am2434-am2434-gpio-interrupt-latency

    Here is the test code:

    gpio_interrupt_latency_am263x-lp.zip

    For the LP-AM243x measurement, even if I subtract the GPIO latency, the interrupt latency is 309 ns - 184 ns = 125 ns. It looks like more work is needed to achieve 100 ns latency (if 100 ns is the minimum interrupt latency). Can you give the customer some suggestions?

    Regards

    Andre

          

  • Hi AndreTseng,

    As per our email discussion, I am closing this thread.

    BR,
    Aakash