AM6548: PCIe endpoint configuration

Part Number: AM6548

Dear TI support team,

we are using the PCIe subsystem of the AM6548 to establish a PCIe connection to an x86 CPU. In this setup, the AM6548 runs in Endpoint mode, and our code runs on the R5F using TI-RTOS / Processor SDK 06.01.

Since the PCIe driver that comes with the Processor SDK (pdk_am65xx_1_0_6\packages\ti\drv\pcie\src\v2) appears to be an example specifically tailored to two AM* devices connected together, we've written our own code for setting up the PCIe endpoint.

The endpoint is detected by the RC and accesses to configuration space work. The lspci command executed on the x86 shows the default BAR layout:

# lspci -vvv -n -s 01:00.0
...
        Region 0: Memory at <unassigned> (64-bit, non-prefetchable) [size=1M]
        Region 2: Memory at <unassigned> (64-bit, non-prefetchable) [size=8M]
        Region 4: I/O ports at <unassigned> [disabled] [size=256]
        Region 5: Memory at <unassigned> (32-bit, non-prefetchable) [size=2G]
...

This layout matches the layout from section 12.2.2.4.15 of the TRM (SPRUID7E), even though this section is only valid for SR2.0. A corresponding section for SR1.0 does not exist.

  • Why is section 12.2.2.4.15 of the TRM only valid for SR2.0? How about SR1.0?

We want to change this layout. Changing the BAR type and memory type (to MEM 32-bit) works and is correctly detected by the RC.

However, changing the BAR's size does not work as expected. According to the manual, the mask register is a shadow register at the address of the BAR register.

We performed the following steps to change it (as mentioned by the manual):

  • Set the field DBI_CS2 in PCIE_EP_CMD_STATUS (Bit 5 at 05500004h)
  • Write the new BAR mask (e.g. write 0000ffffh to register 05501024h)
  • Clear the field DBI_CS2 in PCIE_EP_CMD_STATUS

After that, the size of the BAR did not change.

When the value ffffffffh is written to 05501024h and immediately read back, the returned value is still 80000000h. This corresponds to a BAR size of 2G (as in the default configuration).
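
For reference, the 2G figure follows from the standard PCI BAR sizing rule: after all ones are written to a BAR, the address bits that still read back as zero (ignoring the low type bits) determine the size. A minimal Python sketch of that arithmetic follows; the helper name is hypothetical and not taken from the Processor SDK.

```python
def bar_size_from_readback(readback: int, is_io: bool = False) -> int:
    """Size in bytes implied by the value read back after writing
    all ones to a 32-bit BAR (standard PCI sizing rule)."""
    # The low bits encode the BAR type (mem/io, 32/64, prefetch), not the address.
    type_bits = 0x3 if is_io else 0xF
    addr_bits = readback & ~type_bits & 0xFFFFFFFF
    if addr_bits == 0:
        return 0  # BAR not implemented or disabled
    # Invert the writable address bits and add one to get the size.
    return (~addr_bits & 0xFFFFFFFF) + 1
```

With the value from this post, `bar_size_from_readback(0x80000000)` evaluates to 0x80000000 bytes, i.e. 2 GiB.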

The KeyStone Architecture PCI User Guide (which does not apply to the AM6548) suggests that bit 0 of the BAR mask register enables or disables the BAR.
Indeed, after setting DBI_CS2, writing the value fffffffeh and clearing DBI_CS2, the BAR is disabled. This implies that writes to the mask register do have an effect. However, changing the BAR's size does not work.

There is code in the PCIe sample that comes with the processor SDK, but since that code never uses the content of the BAR registers that example wouldn't detect if programming of the BAR size went wrong.

  • How can the BAR size be changed?
  • Is there a BAR_ENABLE bit for the AM6548 too (The manual does not mention it, or rather it doesn't document the BAR mask registers at all)?
  • Which mask values correspond to which BAR sizes (our guess is: 00000fffh is 4k, 00001fffh is 8k, and so on)?
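
The guess in the last question can be written down as a small Python helper. This is only a sketch of that guess (mask = size - 1 for power-of-two sizes); the thread does not confirm that the AM6548 follows this encoding, and the function name is hypothetical.

```python
def bar_mask_for_size(size_bytes: int) -> int:
    """BAR mask value for a given size, ASSUMING mask = size - 1
    (the unconfirmed guess from the question above)."""
    # BAR sizes must be powers of two; anything else has no valid mask.
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("BAR size must be a power of two")
    return size_bytes - 1
```

Under this assumption every valid mask has bit 0 set, which would be consistent with bit 0 acting as a BAR enable as suggested by the KeyStone guide.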

Best Regards,

Dominic

  • Hi Dominic,

    The 12.2.2.4.15 of the TRM (SPRUID7E) is just a clarification. The function is there for both SR1.0 and SR 2.0.

    According to the TRM

    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    12.2.2.5.1.79 PCIE_EP_BAR0_REG Register (Offset = 1010h) [reset = 4h]

    PCIE_EP_BAR0_REG is shown in Figure 12-803 and described in Table 12-1882.

    BAR0 and BAR Mask. For a description of this standard PCIe register, see the PCI Express Specification.

    The mask for this BAR exists (if implemented) as a shadow register at this address. The assertion of CS2 (that is, assert the dbi_cs2 input, or the CS2 address bit for the AXI bridge) is required to write to the second register at this address.

    -----------------------------------------------------------------------------------------------------------

    BAR0 and the BAR0 mask use the same register address (0x1010), with DBI_CS2 serving as the switch. If DBI_CS2 is set to 1, reads and writes to the register access the mask; otherwise they access the start address.

    • Set the field DBI_CS2 in PCIE_EP_CMD_STATUS (Bit 5 at 05500004h)
    • Write the new BAR mask (e.g. write 0000ffffh to register 05501024h)
    • Clear the field DBI_CS2 in PCIE_EP_CMD_STATUS

    will set the new BAR mask value, but to read it, you will need the following:

    • Set the field DBI_CS2 in PCIE_EP_CMD_STATUS (Bit 5 at 05500004h)
    • Read the new BAR mask (e.g. read 0000ffffh from register 05501024h)
    • Clear the field DBI_CS2 in PCIE_EP_CMD_STATUS
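
The two sequences above share the same set / access / clear pattern around DBI_CS2. The Python sketch below only illustrates the order of the register accesses, using the address and bit position quoted in this thread. `FakeBus` is a hypothetical host-side stand-in that records accesses; it does not model the shadow register, and nothing here confirms how the AM65x hardware responds.

```python
PCIE_EP_CMD_STATUS = 0x05500004  # address as quoted in this thread
DBI_CS2 = 1 << 5                 # bit 5, as quoted in this thread


class FakeBus:
    """Hypothetical stand-in for memory-mapped access; records every access."""

    def __init__(self):
        self.regs = {}
        self.log = []

    def read32(self, addr):
        self.log.append(("r", addr))
        return self.regs.get(addr, 0)

    def write32(self, addr, value):
        self.log.append(("w", addr, value))
        self.regs[addr] = value & 0xFFFFFFFF


def with_dbi_cs2(bus, access):
    """Run access(bus) with DBI_CS2 set and clear the bit again afterwards."""
    cmd = bus.read32(PCIE_EP_CMD_STATUS)
    bus.write32(PCIE_EP_CMD_STATUS, cmd | DBI_CS2)
    try:
        return access(bus)
    finally:
        cmd = bus.read32(PCIE_EP_CMD_STATUS)
        bus.write32(PCIE_EP_CMD_STATUS, cmd & ~DBI_CS2 & 0xFFFFFFFF)


def write_bar_mask(bus, bar_addr, mask):
    """Set DBI_CS2, write the mask, clear DBI_CS2."""
    with_dbi_cs2(bus, lambda b: b.write32(bar_addr, mask))


def read_bar_mask(bus, bar_addr):
    """Set DBI_CS2, read the mask, clear DBI_CS2."""
    return with_dbi_cs2(bus, lambda b: b.read32(bar_addr))
```

For example, `write_bar_mask(FakeBus(), 0x05501024, 0x0000FFFF)` performs the three writes listed above in that order. Clearing DBI_CS2 in a `finally` clause keeps the bit from staying set if the access in the middle fails.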

    Ming

  • Hello Ming,

    we're aware of BAR0 and BAR0 mask being accessed via the same address, and we've set the DBI_CS2 field, but we seem to be unable to read-back the value programmed into the BAR0 mask register.

    We're able to see an effect of writing to this register if we clear bit [0], which disables the corresponding BAR. If we clear bit [0], the BAR shows up as disabled on the host Linux.

    No matter what we set the other bits [30]-[01] to, the size of the BAR is always determined to be the same (BAR0 1M, BAR2 8M, BAR5 2G).

    This matches the description from an older version of the DWC PCIe reference manual that says

    • BAR mask registers are writeable only, not readable
    • To disable any BAR, the application can write a 0 to bit 0 of the corresponding BAR Mask register.
    • If BAR0_MASK_WRITABLE_N = 0, the BAR 0 Mask register is not writable through the DBI.

    Is there a chance that the PCIe controller in the AM65x SR1.0 was implemented with BARn_MASK_WRITABLE_N = 0?

    Have you been able to verify that your description works for you? Are you able to read back the value written to the BAR mask register?

    Regards,

    Dominic

  • Hi Dominic,

    I have to check with the design team on this and get back to you soon.

    Ming

  • Hi Dominic,

    I got confirmation from our Linux team that the size of some BARs (0/1/2) can be changed. Our RTOS code also changes the BAR0 and BAR1 size in pcieCfgEP to PCIE_BAR_MASK (0x0FFFFFFF). Can you try changing this to see whether it really works?

    Ming

  • Hello Ming,

    we've been further investigating this, and it seems that we're able to change the size of BARs 0, 2 and 5, but using the resizable BAR capability registers, not using the BAR mask registers.

    We don't fully understand the configurability of the BARs yet, but we've just performed a test where we configured BAR 0 to 8 MB and BAR 2 to 16 MB, and the BARs showed up in Linux with the desired size.

    We programmed PCIE_EP_RESBAR_CAP_REG_0_REG from 0x10 (1 MB) to 0x80 (8 MB) and PCIE_EP_RESBAR_CAP_REG_1_REG from 0xf0 (1, 2, 4 or 8 MB) to 0x100 (16 MB).
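
The size annotations above follow the bit layout of the standard PCIe Resizable BAR capability register, where bit 4 advertises 1 MB and each higher bit doubles the size. A minimal Python sketch that decodes such a value (the function name is hypothetical):

```python
def resbar_supported_sizes_mib(cap: int) -> list:
    """Sizes in MiB advertised by a Resizable BAR capability register value."""
    # Bit 4 advertises 1 MB; every following bit doubles the size.
    return [1 << (bit - 4) for bit in range(4, 32) if cap & (1 << bit)]
```

Applied to the values in this post, 0x10 decodes to [1], 0x80 to [8], 0xf0 to [1, 2, 4, 8] and 0x100 to [16].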

    I know that the RTOS code configures the BAR mask registers, but that example never relies on the content of the BARs. The example would work the same even if the BAR mask wasn't changed.

    Regards,

    Dominic

  • I've looked into this some more, and it seems that the setting of PCIE_EP_RESBAR_CAP_REG_0_REG controls the mask for BAR0. The largest size reported in the capability bits defines the number of bits in the BAR that can't be written from the host, i.e. if I write the capability register to 0xf0 (1, 2, 4 or 8 MB) the BAR register has bits [22:0] fixed to b0 (writing 0xffffffff yields 0xff800000).
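
The observation in the previous paragraph can be expressed as arithmetic: take the largest advertised size and fix all BAR bits below it to zero. The Python sketch below models only that observed behavior for a 32-bit BAR. It is not a documented rule from the TRM, and the function name is hypothetical.

```python
def bar_readback_for_resbar_cap(cap: int) -> int:
    """Value a 32-bit BAR would read back after writing 0xffffffff,
    ASSUMING the largest advertised size fixes the low bits to zero."""
    size_bits = cap >> 4  # drop the low reserved bits; bit 0 now means 1 MB
    if size_bits == 0:
        raise ValueError("capability value advertises no size")
    # The highest set bit selects the largest advertised size.
    largest = (1 << 20) << (size_bits.bit_length() - 1)
    return 0xFFFFFFFF & ~(largest - 1)
```

For the capability value 0xf0 this gives 0xff800000, matching the readback reported above.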

    I checked if I could write the mask for BAR4, since that BAR doesn't have a resizable BAR register, but it appears that this BAR is stuck at 256 bytes.

    I'm still curious about how the PCI EP BARs are meant to be configured on the AM65x, since so far this is all trial and error.

    Ming Wei said:
    The 12.2.2.4.15 of the TRM (SPRUID7E) is just a clarification. The function is there for both SR1.0 and SR 2.0.

    I'm not sure what you're trying to tell me here. What is "the function" that you're referring to? What differences are there between SR1.0 and SR2.0?

    The first configuration appears to show the reset default of the BAR configuration. The BAR registers configure the type of BAR via bits [3:0] (prefetch, 32/64, mem/io), and the RESBAR registers configure the supported sizes for BAR0, BAR2 and BAR5. This configuration should be just as valid for SR1.0, unless I'm missing something.

    The second configuration would be ONE possible configuration, if all BARs were configured as 32-bit, but I don't think it's the only possible configuration. BARs could be disabled and could have different types, BAR0/1 could be 32 bit while BAR2 might still be 64 bit, and so on.

    Is there a chance you could find out why the note limits this chapter to SR 2.0?

    Regards,

    Dominic

  • Hi Dominic,

    I am sorry for any confusion.

    I have confirmed that section 12.2.2.4.15 of the TRM (SPRUID7E) is applicable to both SR1.0 and SR2.0. The "SR2.0 only" note is a typo.

    Ming

  • Hello Ming,

    we've made some progress, but I guess we still don't fully understand what options we have for configuring the PCIe EP.

    • What do TRGT0 and TRGT1 in Tables 12-1997 / 12-1998 refer to?
    • Which "application registers" are accessed via BAR0 (see 12.2.2.4.7.2 PCIe Inbound Address Translation, "BAR0 Exception for Inbound Address Translation")?
      • We assumed that maybe we would be able to access the PCIe EP register space by mapping BAR0 to 0x05500000, but apparently that didn't work
      • We realized that we can access our EP registers 0x000 to 0x200 at offset 0xD00 in our device's config space, but that's not using BAR0
      • We managed to map arbitrary (DDR) memory via BAR0, after setting the DEFAULT_TARGET bit in PCIE_EP_MISC_CONTROL_1_OFF, but we don't really understand WHY that works / what the implications of this bit are.
    • Is it possible to map arbitrary AM65x peripheral registers (e.g. PCIe, but also others) via an inbound iATU to allow them to be accessed from an external RC via PCIe?
    • The TRM tells us that we ought to be able to have 32 iATU regions in each direction (instead of 16) if we use only 32-bit addresses (see 12.2.2.4.7 PCIe Subsystem Address Translation) - how can we configure these 32 regions? The TRM only lists registers for 16 regions.
    • Table 12-1996 tells us that there is a "PCIe remote configuration space (Remote PF0)" at local offset 0xA0000 - what does that mean? We couldn't find any other reference to this offset.
      • Is it like a fixed outbound translation window to generate config space accesses?

    • What is the "PCIe IO configuration space (Remote IO)" at local offset 0xB0000 listed in Table 12-1996?
      • At first glance "IO configuration space" looks like a typo.
      • Is that like a fixed outbound translation window to generate I/O accesses?
    • What are the implications of the note below Table 12-1996, which tells us that we can't access the "PCIe local application registers" if the MSI-X address match is enabled?
      • We want the RC to be able to generate interrupts in the EP (AM65xx). As far as we understand we should use the PCIE_EP_MMRx_IRQ_* registers for this, but that means we need access to the "PCIe local application registers".
    • How does the "MSI-X address match" feature work? Chapter 12.2.2.4.4.1.3 MSI-X Interrupt Generation tells us that there is an "integrated MSI-X Transmit (iMSIX-TX) feature", but other than that there's not much information on this feature.
      • The description for the PCIE_EP_MSIX_ADDRESS_MATCH_* registers tells us that there is a "MSI-X Table RAM feature" if (MSIX_TABLE_EN=1), but I believe that MSIX_TABLE_EN is something that TI configured when integrating the PCIe IP core. Is the PCIe IP core implemented with that feature enabled in the AM65x?
      • Which address are we supposed to configure for the MSI-X address match? How does that address relate to the note below Table 12-1996?
    • How does the iATU "INVERT_MODE" work? The TRM tells us that if this bit is set for an iATU region an address match occurs if the address is outside of the defined range (base to limit), but how is the target address calculated in this case? For non-INVERT_MODE the target address is the offset in the defined range (base to limit) plus the configured target address, but that doesn't seem to make any sense for INVERT_MODE?
    • If we configure an inbound iATU for BAR match mode it seems that the base and limit are ignored, but the TRM doesn't tell us if this is "working as intended" or if we have some error in our configuration.
      • What we wanted to configure is a "scattered" mapping of a single BAR to multiple regions in our local address space, e.g.
        • BAR0, Offset 0x00000000 -> 0x80000000
        • BAR0, Offset 0x00100000 -> 0x90000000
      • A configuration like that seems to be possible if we don't use the BAR match mode but the address match mode instead, but that way we need some kind of synchronization with the RC, i.e. we need to wait for the RC to configure our BARs, then configure the inbound iATU with the address from the BAR.
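
The scattered mapping described in the last question can be modelled with two address-match regions, each translating as target + (address - base) for addresses between base and limit. The Python sketch below is a host-side model only. The 1 MiB window size, the example BAR0 bus address and all names are assumptions for illustration, not values from the TRM.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    base: int    # first PCIe bus address matched by the region
    limit: int   # last PCIe bus address matched (inclusive)
    target: int  # local address that `base` translates to


def translate(regions, pci_addr) -> Optional[int]:
    """Translate a PCIe bus address using address-match regions."""
    for region in regions:
        if region.base <= pci_addr <= region.limit:
            return region.target + (pci_addr - region.base)
    return None  # no region matched


def scattered_bar0_regions(bar0_base):
    """Two regions for the BAR0 layout above; window size of 1 MiB is assumed."""
    mib = 1 << 20
    return [
        Region(bar0_base, bar0_base + mib - 1, 0x80000000),
        Region(bar0_base + mib, bar0_base + 2 * mib - 1, 0x90000000),
    ]
```

Here `bar0_base` stands for the address the RC has written into BAR0, which is why the regions can only be built after the RC has configured the BARs.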

    Regards,
    Dominic
  • Hi Dominic,

    • What do TRGT0 and TRGT1 in Tables 12-1997 / 12-1998 refer to?

    [MW] TRGT0 is used to access registers in the controller. TRGT1 is for data transfer

    • Which "application registers" are accessed via BAR0 (see 12.2.2.4.7.2 PCIe Inbound Address Translation, "BAR0 Exception for Inbound Address Translation")?
      • We assumed that maybe we would be able to access the PCIe EP register space by mapping BAR0 to 0x05500000, but apparently that didn't work
      • We realized that we can access our EP registers 0x000 to 0x200 at offset 0xD00 in our device's config space, but that's not using BAR0
      • We managed to map arbitrary (DDR) memory via BAR0, after setting the DEFAULT_TARGET bit in PCIE_EP_MISC_CONTROL_1_OFF, but we don't really understand WHY that works / what the implications of this bit are.

    [MW] The application registers are the PCIe controller registers (0x0000 to 0x0200). By default, BAR0 is set to access TRGT0. Changing DEFAULT_TARGET will route BAR0 accesses to TRGT1.

    • Is it possible to map arbitrary AM65x peripheral registers (e.g. PCIe, but also others) via an inbound iATU to allow them to be accessed from an external RC via PCIe?

    [MW] Yes. If you look into the implementation of the MSI interrupt from EP to RC, you will notice that we are mapping the RC's GIC500 register using the iATU.

    • The TRM tells us that we ought to be able to have 32 iATU regions in each direction (instead of 16) if we use only 32-bit addresses (see 12.2.2.4.7 PCIe Subsystem Address Translation) - how can we configure these 32 regions? The TRM only lists registers for 16 regions.

    [MW] This is an issue in the TRM. We only support 16 inbound iATU regions.

    • Table 12-1996 tells us that there is a "PCIe remote configuration space (Remote PF0)" at local offset 0xA0000 - what does that mean? We couldn't find any other reference to this offset.
      • Is it like a fixed outbound translation window to generate config space accesses?

    [MW] Yes.

    • What is the "PCIe IO configuration space (Remote IO)" at local offset 0xB0000 listed in Table 12-1996?
      • At first glance "IO configuration space" looks like a typo.
      • Is that like a fixed outbound translation window to generate I/O accesses? [MW] Yes

    [MW] This is the PCIe IO space – backwards compatibility to PCI

    • What are the implications of the note below Table 12-1996, which tells us that we can't access the "PCIe local application registers" if the MSI-X address match is enabled?
      • We want the RC to be able to generate interrupts in the EP (AM65xx). As far as we understand we should use the PCIE_EP_MMRx_IRQ_* registers for this, but that means we need access to the "PCIe local application registers".

    [MW] We are only using MemWr from the EP to the RC GIC to create an interrupt, not PCIe MSI-X. In this case, an RC-to-EP MemWr can be used to create interrupts using PCIE_EP_MMRx_IRQ.

    • How does the "MSI-X address match" feature work? Chapter 12.2.2.4.4.1.3 MSI-X Interrupt Generation tells us that there is an "integrated MSI-X Transmit (iMSIX-TX) feature", but other than that there's not much information on this feature.
      • The description for the PCIE_EP_MSIX_ADDRESS_MATCH_* registers tells us that there is a "MSI-X Table RAM feature" if (MSIX_TABLE_EN=1), but I believe that MSIX_TABLE_EN is something that TI configured when integrating the PCIe IP core. Is the PCIe IP core implemented with that feature enabled in the AM65x?
      • Which address are we supposed to configure for the MSI-X address match? How does that address relate to the note below Table 12-1996?

    [MW] The MSI-X feature is implemented according to the PCIe standard. Please refer to the PCIe standard for details.

    • How does the iATU "INVERT_MODE" work? The TRM tells us that if this bit is set for an iATU region an address match occurs if the address is outside of the defined range (base to limit), but how is the target address calculated in this case? For non-INVERT_MODE the target address is the offset in the defined range (base to limit) plus the configured target address, but that doesn't seem to make any sense for INVERT_MODE?


    [MW] I do not know how the iATU "INVERT_MODE" works.

    • If we configure an inbound iATU for BAR match mode it seems that the base and limit are ignored, but the TRM doesn't tell us if this is "working as intended" or if we have some error in our configuration.
      • What we wanted to configure is a "scattered" mapping of a single BAR to multiple regions in our local address space, e.g.
        • BAR0, Offset 0x00000000 -> 0x80000000
        • BAR0, Offset 0x00100000 -> 0x90000000
      • A configuration like that seems to be possible if we don't use the BAR match mode but the address match mode instead, but that way we need some kind of synchronization with the RC, i.e. we need to wait for the RC to configure our BARs, then configure the inbound iATU with the address from the BAR.

    [MW] Yes, the BAR matching mode is PCIe standard, the iATU is not. In order to use the address matching mode, you have to wait for the RC to configure the BARs.

    Ming