AM5728: IPU-M4 Unexpected GPIO Latency

Part Number: AM5728

I am using IPU1 on the Sitara processor, with its image loaded at boot time. The IPU runs bare metal, and the code is very simple.

The IPU is operating at 212 MHz, according to running the command "omapconf show opp".

GPIO1_28 was removed from the Linux device tree, so no processor is initializing or controlling this GPIO besides what is shown in the samples below.

I am targeting the 0x6xxx_xxxx address space: the IPU has access to physical memory, but from the IPU it is found at 0x6xxx_xxxx instead of 0x4xxx_xxxx.

I am driving GPIO1_28 by writing directly to the set and clear registers in rapid succession. Below is the code, including the pad configuration for GPMC_A6:

#include <stdint.h>

int main(void)
{
    *(volatile uint32_t *)0x6A003458 = 0x0000000E;   // Pad GPMC_A6: mux mode 14 selects GPIO1_28
    *(volatile uint32_t *)0x6AE10134 &= ~(1u << 28); // GPIO_OE: clear bit 28 to enable output on GPIO1_28

    const uint32_t shiftValue = 1u << 28;
    while (1)
    {
        *(volatile uint32_t *)0x6AE10194 = shiftValue; // GPIO_SETDATAOUT: set the GPIO
        *(volatile uint32_t *)0x6AE10190 = shiftValue; // GPIO_CLEARDATAOUT: clear the GPIO
    }
}

Currently, I am probing the gate of a 2N7002, which is connected to GPIO1_28, with a Saleae logic analyzer, and I am seeing a maximum toggling rate of 2.5 MHz (200 ns per set or clear). I would expect quite a bit faster than this when clocked at 212 MHz.
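To put the measurement in perspective, the arithmetic can be sketched as below. This is only a back-of-the-envelope helper (the function name is made up for illustration), and it assumes each of the two writes accounts for half of the output period, with loop overhead ignored.

```c
// Core clock cycles that elapse per GPIO write, given the core clock
// and the observed square-wave frequency on the pin.
// Assumption: one output period = one set write + one clear write,
// each taking the same time.
static double cycles_per_write(double core_hz, double toggle_hz)
{
    double period_s = 1.0 / toggle_hz; // full high+low period
    double write_s = period_s / 2.0;   // time attributed to a single write
    return write_s * core_hz;          // same time expressed in core cycles
}
```

With the numbers from this post, cycles_per_write(212e6, 2.5e6) comes out to roughly 42 core cycles per write, which suggests the time is spent waiting on the access rather than executing instructions.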

Is there anything I am missing in my configuration that would let me drive this pin faster? Is there any documentation describing how quickly the GPIOs can be toggled or read? The end goal is to read a parallel bus at a higher frequency than what can be achieved with the writes here, but I have not seen any documentation describing how quickly these interfaces can be accessed from a real-time system.
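For the read side, a polling loop might look like the sketch below. It is a minimal illustration, not tested on hardware: the helper names are hypothetical, the register pointer is passed in so the caller decides the address, and the suggested address (GPIO1 GPIO_DATAIN at offset 0x138, i.e. 0x6AE10138 under the same 0x6xxx_xxxx mapping used above) is an assumption to verify against the TRM.

```c
#include <stddef.h>
#include <stdint.h>

// Capture n back-to-back samples of a GPIO data-in register.
// On the IPU the caller would pass something like
// (const volatile uint32_t *)0x6AE10138 (assumed GPIO1 GPIO_DATAIN).
static void sample_port(const volatile uint32_t *datain, uint32_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        buf[i] = *datain; // one register read per sample
    }
}

// Extract a parallel-bus field of 'width' bits (width < 32) starting at bit 'lsb'.
static uint32_t extract_bus(uint32_t raw, unsigned lsb, unsigned width)
{
    return (raw >> lsb) & ((1u << width) - 1u);
}
```

Each iteration performs exactly one read of the register, so the achievable sample rate is presumably set by the per-access latency rather than by the loop itself.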

  • Hi Lucas,

    GPIO is connected to L4_PER1, which introduces delay; see TRM Figure 14-1, Interconnect Overview, and Figure 27-1, General-Purpose Interface Overview.

    Although it was measured with the ARM and PRU-ICSS cores, the GPIO toggling benchmark result on page 11 might help: processors.wiki.ti.com/.../Sitara_boot_camp_pru-module1-hw-overview.pdf

    Regards,
    Garrett
  • Hi Garrett,

    Thanks for the response! The PRU document you pointed to was useful for explaining how I could potentially get to 5 MHz from my 2.5 MHz if I needed to.

    Does this sort of metric hold up for reading GPIOs through the L4 interconnect? That is, if I were to reconfigure the clocking architecture to drive GPIO8 (for example) at its maximum clock rate (200+ MHz) and reconfigure the timing for the individual pads, would the IPU be able to read a signal faster than the maximum output rate of 5 MHz?

    As a clarifying point: is the PRU able to reach that output speed because it bypasses the L4 interconnect?

  • Lucas,

    PRU-ICSS has its own dedicated enhanced GPIO that connects directly to the PRU core; see Figure 30-1, PRU-ICSS Overview, in the TRM. The eGPIO pins are controlled through the PRU R30 register bits, as shown in the example code in the slides.

    I could not find the GPIO timing in the AM572x data sheet, but I think yes, you should be able to read a signal faster if you use the maximum clock rate and reconfigure the timings for the individual pads.

    Regards,
    Garrett
  • Hi Garrett,

    Thanks for the information so far. My understanding at this point is that I can change the value in CONFIG_REG_2 (following the outlined procedure for recalibrating and updating it) to update the GPIO clock accordingly. If I also change the respective CFG_xxx_IN registers to adjust their timings, could I potentially start reading a 6 MHz signal? It looks like GPIO8_ICLK can be clocked at 266 MHz using L4PER_L3_GICLK.

    Otherwise, it looks like the manual timings for VOUT3_MANUAL2 enable a 1.2 ns delay for vin1_d8. Can I just enable the mode select in the PAD register and select MANUAL2 as the setting?
  • Lucas, you might be able to optimize the GPIO toggle slightly by changing timings, but ultimately the bottleneck through the L4 interconnect is still there. You would need something like write buffering (you may be able to do this with the MMU) or maybe a burst write with a DMA. Single-cycle accesses from the IPU will have very high latencies, which will be hard to optimize.

    For what you are trying to do, Garrett's suggestion of using the PRU is recommended. Among other things, the PRU is intended for exactly this kind of low-latency GPIO access.

    Regards,

    James 

  • After plenty of testing, I agree. I don't think I could achieve anything faster than maybe 3 MHz through the L4 interconnect. My reads mostly peaked at around 1.5 MHz, which is a quarter of what I am looking for. With the PRU, I should have no issues.
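For readers landing here later, the PRU approach discussed in this thread can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the thread or the slides: `__R30` is the output register variable as exposed by TI's PRU C compiler, while `BUILD_FOR_PRU`, `OUT_BIT`, `fake_r30` and `toggle_n` are hypothetical names introduced here. The output bit that reaches a given pin depends on the pinmux and must be checked against the data sheet.

```c
#include <stdint.h>

#ifdef BUILD_FOR_PRU                 // hypothetical switch: define it when building for the PRU
volatile register uint32_t __R30;    // PRU direct-output register (TI PRU compiler convention)
#define OUT_REG __R30
#else
static volatile uint32_t fake_r30;   // host stand-in so the sketch compiles off-target
#define OUT_REG fake_r30
#endif

#define OUT_BIT 5u                   // hypothetical PRU output bit; depends on pinmux

// Toggle the output n times: each iteration drives the bit high, then low.
// On the PRU these are register operations, with no interconnect access involved.
static void toggle_n(uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        OUT_REG |= (1u << OUT_BIT);  // drive high
        OUT_REG &= ~(1u << OUT_BIT); // drive low
    }
}
```

Because R30 is a core register rather than a memory-mapped peripheral, the writes do not go through the L4 interconnect, which is the reason given above for the PRU's much lower GPIO latency.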