AM6548: OSPI configuration

Part Number: AM6548


We're using the OSPI controller to connect to a quad SPI NOR flash, and we're trying to understand the Read Data Capture chapter and have some questions:

The OSPI module appears to be clocked at 133 MHz.

We want to run our flash device at 66 MHz (or more), and TRM chapter 12.3.2.4.2.1 states that "The loopback mode (only for Quad flash devices) can work in two cases. [...] Thus SPI mode 0 is the first of two modes that support high MHz operation (greater than 50 MHz)."

  • We assume that we should use the loopback mode?
  • What clock exactly is looped back? Since the ICLK pin is missing on the AM65xx, is the OSPI_OCLK looped internally? We had a look at the K2G TRM, and it looks as if the OSPI chapter is (partially) a copy of the K2G QSPI documentation. The K2G has a dedicated QSPI_RCLK pin.

Further reading chapter 12.3.2.4.2.1.2 tells us that the DELAY_FLD bit field controls the additional number of read data capture cycles (this is the fast reference clock, running at least x4 of the device clock) that should be applied to the internal read data capture circuit.

  • Does this mean that the delay mode only works if we configure the Baud Rate divisor of the OSPI controller to /4 or more? In our case this would lead to ~33 MHz flash operation, which is less than the 50 MHz stated above and thus does not require the loopback mode.
  • Is the number of cycles of the DELAY_FLD always applied to the internal read data capture circuit or only if the BYPASS_FLD is set to 0x0?

Further comparing the AM65x TRM and the K2G TRM we saw that the K2G contained a paragraph that explained how the DELAY_FLD should be trained using Read ID STIG command, and that we should expect at least three valid settings, and use the middle one. In our tests we only ever had two working DELAY_FLD settings.

  • Are there any guidelines for training the DELAY_FLD value for the AM65x?

Finally we tried to increase the OSPI reference clock in order to verify our assumption about baud rate divisor > 4 (reference clock needs to be at least x4 device clock), but we were unable to configure the OSPI clock to something higher. We tried using the following code snippet:

    /* Fixup frequency of OSPI module */
    uint64_t ospiFunClk = (uint64_t)(266000000);
    Sciclient_pmSetModuleClkFreq(TISCI_DEV_MCU_FSS0, TISCI_DEV_MCU_FSS0_BUS_OSPI0_RCLK_CLK, ospiFunClk, TISCI_MSG_FLAG_AOP, SCICLIENT_SERVICE_WAIT_FOREVER);
    Sciclient_pmGetModuleClkFreq(TISCI_DEV_MCU_FSS0, TISCI_DEV_MCU_FSS0_BUS_OSPI0_RCLK_CLK, &ospiFunClk, SCICLIENT_SERVICE_WAIT_FOREVER);

The returned value is always 133 MHz.

  • Is there a way to find out about supported clock speeds?

Best Regards,

Dominic

  • Dear TI Team,

    do you have any feedback regarding our questions? Do you need additional information in order to answer our questions?

    Best Regards,

    Dominic

  • Dominic,

    What clock exactly is looped back? Since the ICLK pin is missing on the AM65xx, is the OSPI_OCLK looped internally? We had a look at the K2G TRM, and it looks as if the OSPI chapter is (partially) a copy of the K2G QSPI documentation. The K2G has a dedicated QSPI_RCLK pin.

    On the OSPI module, this is called OSPI1_DQS. I see that figure 12-2051 incorrectly implies that MCU_OSPI_DQS and MCU_OSPI_ICLK are different pins. I apologize for the confusion. But OSPI_DQS can serve as the input for a DQS signal provided by the flash device. Alternatively, the loop back clock output (MCU_OSPI1_LBCLKO) can be fed into this pin. You can also loop the clock back internally. It is recommended that you use DQS if your flash device provides it.

    Does this mean that the delay mode only works if we configure the Baud Rate divisor of the OSPI controller to /4 or more? In our case this would lead to ~33 MHz flash operation, which is less than the 50 MHz stated above and thus does not require the loopback mode.

    The maximum frequency of the reference clock is 200 MHz. When using the tap generator, the Baud Rate divider must be /8 for DDR mode, or /4 for SDR mode. With a 200 MHz reference clock, this corresponds to 25 MHz DDR or 50 MHz SDR.

    Is the number of cycles of the DELAY_FLD always applied to the internal read data capture circuit or only if the BYPASS_FLD is set to 0x0?

    BYPASS_FLD refers to bypassing the loopback clock, and is 1 at reset. Clearing this field enables the internal or external loopback clock. DELAY_FLD still applies.

    Further comparing the AM65x TRM and the K2G TRM we saw that the K2G contained a paragraph that explained how the DELAY_FLD should be trained using Read ID STIG command, and that we should expect at least three valid settings, and use the middle one. In our tests we only ever had two working DELAY_FLD settings.

    I see. To debug this it would help to know the following:
    Are you using DDR or SDR protocol?
    What baud divider are you using?
    What sampling clock are you using? There are four sampling clock options on the OSPI module: Delayed Ref_clk, Internal Loopback, External Loopback, and DQS.

    Are there any guidelines for training the DELAY_FLD value for the AM65x?

    Using the READ ID STIG command is still the correct way to train the DELAY_FLD value.

    Finally we tried to increase the OSPI reference clock in order to verify our assumption about baud rate divisor > 4 (reference clock needs to be at least x4 device clock), but we were unable to configure the OSPI clock to something higher. We tried using the following code snippet:

    /* Fixup frequency of OSPI module */
    [...]

    The returned value is always 133 MHz.

    I can't comment on this because I don't know what these functions are doing.  What frequency are you trying to set the clock to?  What registers are you modifying?

    Is there a way to find out about supported clock speeds?

    The maximum frequency for the reference clock is 200 MHz. When using the tap generator, you are limited to 25 MHz DDR / 50 MHz SDR.
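
The divider arithmetic in this reply can be sketched in C as follows. The function and macro names are illustrative assumptions, not PDK or CSL identifiers. The divider minimums (/4 for SDR, /8 for DDR) are the figures quoted above, and the sketch does not model how the divider is encoded in the controller's registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimum baud dividers for tap (non-PHY) mode, as stated in this thread. */
#define OSPI_TAP_MIN_DIV_SDR 4u
#define OSPI_TAP_MIN_DIV_DDR 8u

/* Flash clock in Hz for a given reference clock and baud divider.
 * Hypothetical helper for illustration only. */
uint32_t ospi_flash_clk_hz(uint32_t ref_clk_hz, uint32_t baud_div)
{
    assert(baud_div != 0u);
    return ref_clk_hz / baud_div;
}

/* True if the divider meets the tap-mode minimum for the chosen protocol. */
bool ospi_tap_div_ok(uint32_t baud_div, bool ddr)
{
    return baud_div >= (ddr ? OSPI_TAP_MIN_DIV_DDR : OSPI_TAP_MIN_DIV_SDR);
}
```

With a 133 MHz reference and a /2 divider, `ospi_flash_clk_hz` gives 66.5 MHz, and `ospi_tap_div_ok(2, false)` is false, so that setting falls outside the tap-mode limits described above.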

     

  • Hello Zack,

    thanks a lot for your detailed replies.

    Zack Brown said:
    On the OSPI module, this is called OSPI1_DQS. I see that figure 12-2051 incorrectly implies that MCU_OSPI_DQS and MCU_OSPI_ICLK are different pins. I apologize for the confusion. But OSPI_DQS can serve as the input for a DQS signal provided by the flash device. Alternatively, the loop back clock output (MCU_OSPI1_LBCLKO) can be fed into this pin. You can also loop the clock back internally. It is recommended that you use DQS if your flash device provides it.

    Ok, thanks for clarifying this. We observed some weird behaviour of the clock line, but the analysis has been inconclusive so far. One issue that I'm currently facing is that our design loops back the MCU_OSPI0_CLK instead of MCU_OSPI0_LBCLKO. Can you tell me what the reason is for having a separate "loopback clock output"? Is it just to reduce the load on the clock line, or is there something else I'm missing?

    Zack Brown said:
    The maximum frequency of the reference clock is 200MHz. When using the tap generator, the Baud Rate divider must be /8 for DDR mode, or /4 for SDR mode. This corresponds to 25MHz DDR or 50MHz SDR.

    Ok, so is there any way to achieve higher OSPI frequencies? The data sheet suggests that we should be able to achieve ~142 MHz.

    The K2G TRM had some recommendations on how to set up the QSPI clocking:

    • < 25 MHz: no data training
    • 25 MHz to 50 MHz: data training
    • > 50 MHz: loop back clock + data training

    But that wouldn't be an option if the reference clock is really limited to 200 MHz. Also, how does the limit you mentioned relate to this sentence from the TRM:

    12.3.2.5.1 Configuring the OSPI Controller for Use After Reset
    [...] Assuming the reference clock is operating at 400 MHz after reset,

    Zack Brown said:
    I see. To debug this it would help to know the following:
    Are you using DDR or SDR protocol?
    What baud divider are you using?
    What sampling clock are you using? There are four sampling clock options on the OSPI module: Delayed Ref_clk, Internal Loopback, External Loopback, and DQS.

    • SDR
    • The baud divider is /2, which probably explains why we only have two working delay settings
    • I believe our current hardware design leaves us with three options for the sampling clock:
      • Delayed reference clock
      • Internal loopback (CTRLMMR_MCU_OSPI0_CLKSEL[4], LOOPCLK_SEL = 1)
      • External loopback (as long as it's OK to loop back the actual clock, see above)

    Zack Brown said:
    Using the READ ID STIG command is still the correct way to train the DELAY_FLD value.

    Ok, I get that READ ID STIG is still the correct way. What I'm unsure about is the values for DELAY_FLD. If the reference is running at 4x the device clock, how can there be 16 delay values? Also, shouldn't the training include the SAMPLE_EDGE_SEL_FLD? 4x the clock and two edges would at least leave us with 8 sample points.

    Zack Brown said:
    I can't comment on this because I don't know what these functions are doing. What frequency are you trying to set the clock to? What registers are you modifying?

    As far as I understand we're not supposed to mess with registers for clocking, but have to use the Sciclient driver to call into DMSC firmware to make the necessary register settings for us. At least that's how most of the processor SDK RTOS code seems to operate.
    I was trying to set the clock to 266 MHz but I also tried 200 MHz; neither seemed to work. The parameters are taken from pdk_am65xx_1_0_6/packages/ti/boot/sbl/src/ospi/sbl_ospi.c around line 294, where according to the comments the OSPI reference clock gets set to 133 MHz to work around an issue with "system firmware" configuring the OSPI clock to "unsupported values".

    Zack Brown said:
    The maximum frequency for the reference clock is 200MHz. When using the tap generator, you are limited to 25MHz DDR/50 MHz SDR.

    I was referring to the Sciclient_pmSetModuleClkFreq API. Certainly the API won't be able to fulfill arbitrary requests, so I'm wondering if there's an API to query the API for possible clock speeds of a given peripheral/clock given the current system state, e.g. other clocks that have been configured before.

    Regards,

    Dominic

  • One issue that I'm currently facing is that our design loops back the MCU_OSPI0_CLK instead of MCU_OSPI0_LBCLKO. Can you tell me what the reason is for having a separate "loopback clock output"? Is it just to reduce the load on the clock line, or is there something else I'm missing?

    So currently you are looping back the clock to MCU_OSPI0_DQS?  Assuming that is correct, then you have a network where the MCU_OSPI0_CLK pin is driving the MCU_OSPI0_LBCLKO, with the flash device's clock input being mid flight.  Signals will reflect off of the MCU_OSPI0_DQS pin and interfere with the flash device's clock signal.  I recommend following the diagram in Figure 7-2 of the datasheet.

    Ok, so is there any way to achieve higher OSPI frequencies? The data sheet suggests that we should be able to achieve ~142 MHz.

    Yes, the Fmax I mentioned above are for using delay taps.  If you use the OSPI PHY, you can achieve up to 142 MHz, and there is no need for a baud divider.  The below information about taps does not apply to PHY mode.

    If the reference is running at 4x the device clock, how can there be 16 delay values? Also, shouldn't the training include the SAMPLE_EDGE_SEL_FLD? 4x the clock and two edges would at least leave us with 8 sample points.

    You can delay the data capture logic by 16 ref_clk cycles.  As you point out, a baud division of 4 means that each data window is 4 ref_clk cycles long.  That is why you expect at least 3, but no more than 4, passing taps.  If you find that you need to include the sampling edge in the training to get sufficient resolution, then that would give at most 8 passing values.  However, we have never found the need to use that option.
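
A minimal sketch of the training sweep discussed here, assuming a caller-supplied hook that programs DELAY_FLD, issues Read ID through STIG, and compares the result against the expected ID. `read_id_check_fn`, `ospi_train_delay_fld`, and the simulated window are hypothetical names for illustration and are not part of the PDK. Picking the middle of the passing run follows the K2G guidance mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DELAY_FLD_STEPS 16   /* DELAY_FLD values 0..15 swept during training */

/* Hypothetical hook: program DELAY_FLD, issue Read ID via STIG, and return
 * true if the ID read back matches the expected value. Not a PDK function. */
typedef bool (*read_id_check_fn)(unsigned delay, void *ctx);

/* Sweep all DELAY_FLD values and return the middle of the longest run of
 * consecutive passing values, or -1 if no value passes. For an even-length
 * run the lower of the two middle values is returned. */
int ospi_train_delay_fld(read_id_check_fn check, void *ctx)
{
    int best_start = -1, best_len = 0;
    int run_start = -1, run_len = 0;

    assert(check != NULL);

    for (int d = 0; d < DELAY_FLD_STEPS; d++) {
        if (check((unsigned)d, ctx)) {
            if (run_len == 0) {
                run_start = d;
            }
            run_len++;
            if (run_len > best_len) {
                best_start = run_start;
                best_len = run_len;
            }
        } else {
            run_len = 0;
        }
    }

    if (best_len == 0) {
        return -1;
    }
    return best_start + (best_len - 1) / 2;
}

/* Simulated flash for illustration: Read ID passes for delays in [lo, hi]. */
struct sim_window { unsigned lo, hi; };

bool sim_read_id_ok(unsigned delay, void *ctx)
{
    const struct sim_window *w = ctx;
    return delay >= w->lo && delay <= w->hi;
}
```

With a simulated passing window of 2 to 4 the function returns 3. With only two passing values, as observed at a /2 divider, there is no true middle and the lower value is returned, which leaves little margin.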

    so I'm wondering if there's an API to query the API for possible clock speeds of a given peripheral/clock given the current system state

    Does the source code you are looking at not have a list of supported ref_clk frequencies?  The supportable ref_clk frequencies should be independent of system state.

  • Zack Brown said:
    So currently you are looping back the clock to MCU_OSPI0_DQS?  Assuming that is correct, then you have a network where the MCU_OSPI0_CLK pin is driving the MCU_OSPI0_LBCLKO, with the flash device's clock input being mid flight.  Signals will reflect off of the MCU_OSPI0_DQS pin and interfere with the flash device's clock signal.  I recommend following the diagram in Figure 7-2 of the datasheet.

    Ok, thanks for the explanation. Since there is going to be a new version of the hardware I believe this can be changed.

    Can you tell me about the note above Figure 7-2? Is this relevant if we're using a) delay taps or b) the PHY module?

    Zack Brown said:
    Yes, the Fmax I mentioned above are for using delay taps.  If you use the OSPI PHY, you can achieve up to 142 MHz, and there is no need for a baud divider.  The below information about taps does not apply to PHY mode.

    Is it possible to use the OSPI PHY with a QSPI device?

    Zack Brown said:
    You can delay the data capture logic by 16 ref_clk cycles.  As you point out, a baud division of 4 means that each data window is 4 ref_clk cycles long.  That is why you expect at least 3, but no more than 4, passing taps.  If you find that you need to include the sampling edge in the training to get sufficient resolution, then that would give at most 8 passing values.  However, we have never found the need to use that option.

    Ok, thanks for the explanation.

    Zack Brown said:
    Does the source code you are looking at not have a list of supported ref_clk frequencies?  The supportable ref_clk frequencies should be independent of system state.

    The API ends up sending a TISCI_MSG_SET_FREQ message (http://downloads.ti.com/tisci/esd/18_08_00/2_tisci_msgs/pm/clocks.html#tisci-msg-set-freq). The call doesn't fail, but if I then query the reference clock it tells me it is still running at 133 MHz. Since all of that TISCI / DMSC is pretty much a black box I have no idea how to figure out what's going wrong.

    Regards,

    Dominic

  • Can you tell me about the note above Figure 7-2? Is this relevant if we're using a) delay taps or b) the PHY module?

    It applies to the PHY module, but there are software fixes that can help you improve hold time if you need them.  Use the standard routing of (A to B) = (E to F) = (C to D) / 2

    Is it possible to use the OSPI PHY with a QSPI device?

    Yes.

    As for your clock setting issue, I'm not really experienced with the API you're using.  Can you send a "TISCI_MSG_QUERY_FREQ" message?