AM625: Issue with OSPI and spidev

Part Number: AM625

Hello All

I have been working on our platform, which uses the OSPI interface to control a 256 Mb quad SPI NOR device.

I have successfully initialised the device using the spi-cadence-quadspi driver, as outlined in the TI OSPI/QSPI guide:

https://software-dl.ti.com/processor-sdk-linux/esd/AM62X/08_03_00_19/exports/docs/linux/Foundational_Components/Kernel/Kernel_Drivers/QSPI.html

I understand that the above link is for the 08.03.00.19 release, but it still describes the driver that the kernel needs to support quad SPI.

I am using the current 10.01.10 release (6.6 kernel):

https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/snapshot/ti-linux-kernel-10.01.10.tar.gz

The spi and cadence kernel drivers are enabled:

CONFIG_SPI=y
CONFIG_SPI_DEBUG=y
CONFIG_SPI_MASTER=y
CONFIG_SPI_MEM=y

#
# SPI Master Controller Drivers
#
CONFIG_SPI_BITBANG=y
CONFIG_SPI_CADENCE_QUADSPI=y
CONFIG_SPI_CADENCE_XSPI=y
CONFIG_SPI_GPIO=y
CONFIG_SPI_OMAP24XX=y

#
# SPI Protocol Masters
#
CONFIG_SPI_SPIDEV=m

The kernel identifies the flash correctly when it is used as an MTD device:

[    1.188137] cadence-qspi fc40000.spi: couldn't determine phase-detect-selector
[    1.190538] spi-nor spi0.0: xxxx25nw (32768 Kbytes)

The spi-nor debugfs entries give correct information about the device:

$ cat /sys/kernel/debug/spi-nor/spi0.0/capabilities
Supported read modes by the flash
 1S-1S-1S
  opcode        0x13
  mode cycles   0
  dummy cycles  0
 1S-4S-4S
  opcode        0xec
  mode cycles   2
  dummy cycles  6
 4S-4S-4S
  opcode        0xec
  mode cycles   2
  dummy cycles  6

Supported page program modes by the flash
 1S-1S-1S
  opcode        0x12
 1S-4S-4S
  opcode        0x3e

$ cat /sys/kernel/debug/spi-nor/spi0.0/params
id              xx xx xx 00 00 00
size            32.0 MiB
write size      1
page size       256
address nbytes  4
flags           4B_OPCODES | HAS_4BAIT | HAS_16BIT_SR | SOFT_RESET

opcodes
 read           0xec
  dummy cycles  8
 erase          0xdc
 program        0x3e
 8D extension   invert

protocols
 read           1S-4S-4S
 write          1S-4S-4S
 register       1S-1S-1S

erase commands
 21 (4.00 KiB) [1]
 ff (32.0 KiB) [2]
 dc (64.0 KiB) [3]
 c7 (32.0 MiB)

sector map
 region (in hex)   | erase mask | flags
 ------------------+------------+----------
 00000000-01ffffff |     [ 123] |
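
The two dumps above agree with each other: `capabilities` lists 2 mode cycles plus 6 dummy cycles for the 0xec read, and `params` reports 8 dummy cycles in total. A minimal sketch of that arithmetic follows, assuming the mode and wait cycles are simply summed and then converted to bytes on the bus; `dummy_cycles_to_bytes` is a hypothetical helper name, not a kernel API.

```c
#include <stdint.h>

/* Hypothetical helper (not a kernel API): number of dummy bytes clocked
 * on a bus with `buswidth` data lines for the given cycle counts. */
uint32_t dummy_cycles_to_bytes(uint32_t mode_cycles, uint32_t wait_cycles,
                               uint32_t buswidth)
{
    uint32_t total_cycles = mode_cycles + wait_cycles;

    /* Each clock moves `buswidth` bits; there are 8 bits per byte. */
    return (total_cycles * buswidth) / 8;
}
```

For the 1S-4S-4S read above, 2 + 6 = 8 cycles on 4 lines correspond to 4 dummy bytes.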

Using flashrom I can dump the device without issue. I am having some issues writing to the device, but I will raise those in a separate ticket as I don't think they are linked to my spidev issue.

When I enable the ospi interface and set the flash device to be spidev compatible I get a /dev/spidev0.0 node but cannot do anything with the interface.

Working my way through the driver layers it seems to break in the spi_async operation returning an ENOTSUPP error code.
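
To reproduce the failure from userspace, a small spidev probe can report the errno that the ioctl returns. This is only a sketch: `spidev_transfer` is a hypothetical helper name, and it assumes the kernel UAPI header `linux/spi/spidev.h` is installed. ENOTSUPP is a kernel-internal code (524, as far as I know) without a userspace errno name, so it would likely be printed as an unknown error.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/spi/spidev.h>

/* Full-duplex transfer of `len` bytes on an already opened spidev fd.
 * Returns 0 on success or a negative errno value on failure. */
int spidev_transfer(int fd, const uint8_t *tx, uint8_t *rx, uint32_t len,
                    uint32_t speed_hz)
{
    struct spi_ioc_transfer xfer;

    memset(&xfer, 0, sizeof(xfer));
    xfer.tx_buf = (uintptr_t)tx;   /* buffers are passed as 64-bit values */
    xfer.rx_buf = (uintptr_t)rx;
    xfer.len = len;
    xfer.speed_hz = speed_hz;
    xfer.bits_per_word = 8;

    if (ioctl(fd, SPI_IOC_MESSAGE(1), &xfer) < 0)
        return -errno;
    return 0;
}
```

Calling this on a descriptor opened from /dev/spidev0.0 and on one of the McSPI-backed nodes gives a quick side-by-side comparison of the two controllers.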

dts config:

&ospi0 {
	bootph-all;
	status = "okay";
	pinctrl-names = "default";
	pinctrl-0 = <&ospi0_pins_default>;
	reg = <0x00 0x0fc40000 0x00 0x100>,
		  <0x05 0x00000000 0x01 0x2000000>;

	flash@0 {
		bootph-all;
		compatible = "micron,spi-authenta";
		reg = <0x0>;
		spi-tx-bus-width = <1>;
		spi-rx-bus-width = <1>;
		spi-max-frequency = <20000000>;
		cdns,tshsl-ns = <60>;
		cdns,tsd2d-ns = <60>;
		cdns,tchsh-ns = <60>;
		cdns,tslch-ns = <60>;
		cdns,read-delay = <4>;
	};	
};

I can get spidev to work with the single-line interfaces such as main_spi0, 1 and 2.

For some reason I cannot get spidev to work with the OSPI interface.

Is this possible?

I ask because our design requires the use of spidev for development.

Thank you for your assistance.

Marc

  • When I enable the ospi interface and set the flash device to be spidev compatible I get a /dev/spidev0.0 node but cannot do anything with the interface.

    spidev cannot be used with the OSPI peripheral, as the error message suggests. It is not intended and designed for "basic" SPI-type operation.

    If the reason you are trying to use this is that you need additional SPI interfaces, you can do the following:

    1. Use the software GPIO-bitbang SPI driver already in the Kernel, or
    2. Use a SPI peripheral from a different domain (MCU domain), or
    3. Use the PRU accelerator to implement your own custom SPI peripheral (we don't have specific sample code for this right now unfortunately, but one could start off with a related PRU-based example or driver. I think there's one for UART)

    Also the other error you got of " cadence-qspi fc40000.spi: couldn't determine phase-detect-selector" points to an important missing device tree property. Make sure you are using the correct/updated device-specific device tree files associated with SDK v10.1, and not older/earlier ones.

    Regards, Andreas

  • Hi Andreas,

    As my colleague mentioned, we successfully established communication using GPIO-SPIDEV and MTD. However, the main issue is that GPIO mode is extremely slow.

    We are designing a board with a Secure SPI NOR Flash chip that requires specific read/write commands and only accepts data in a particular encrypted format. This also makes MTD mode challenging to use.

    I modified the device tree to use OSPI in SPIDEV mode. While Linux recognized it as spidev0.0, no data was transmitted or received. Since the generic SPIDEV driver was removed from the Linux kernel, I used a known compatible device (spi-authenta, as it is also an SPI NOR chip). Here is my device tree:

    &ospi0 {
        status = "okay";
        pinctrl-names = "default";
        pinctrl-0 = <&ospi0_pins_default>;
        ti,pindir-d0-out-d1-in = <1>;
        spidev@0 {
            compatible = "spi-authenta";
            status = "okay";
            reg = <0>;
            spi-tx-bus-width = <1>;
            spi-rx-bus-width = <4>;
            cs-gpios = <&main_gpio0 11 GPIO_ACTIVE_HIGH>;
            spi-max-frequency = <25000000>;
            cdns,tshsl-ns = <60>;
            cdns,tsd2d-ns = <60>;
            cdns,tchsh-ns = <60>;
            cdns,tslch-ns = <60>;
            cdns,read-delay = <4>;
            cdns,phy-mode;
            spi-nor,ddr;
        };
    };
    

    Could you advise whether it is possible to interface with this chip in QSPI or OSPI mode using SPIDEV while leveraging hardware acceleration instead of GPIO?

    If not, is there a way to configure GPIO to utilize OSPI and transfer data over 4 or 8 lines?

    Thanks in advance for your guidance.

    Best regards,
    Hossein

  • Could you advise whether it is possible to interface with this chip in QSPI or OSPI mode using SPIDEV while leveraging hardware acceleration instead of GPIO?

    No our QSPI/OSPI peripheral module cannot be used with spidev, this is not supported.

    If not, is there a way to configure GPIO to utilize OSPI and transfer data over 4 or 8 lines?

    I'm not sure what that means. Either you use GPIO, or OSPI. I'm not sure what GPIO "utilizing" OSPI would look like.

    leveraging hardware acceleration instead of GPIO?

    How about you use your external device in single-wire SPI mode? And then use one of the McSPI peripheral modules (which will work w/ spidev)? I think this way you can get it to work much faster than any kind of GPIO-based bit-bang approach even using multiple signals.

    Regards, Andreas

  • No our QSPI/OSPI peripheral module cannot be used with spidev, this is not supported.

    What would be required to support spidev with QSPI/OSPI ?

    I'm not sure what that means. Either you use GPIO, or OSPI. I'm not sure what GPIO "utilizing" OSPI would look like.

    It is referring to use of GPIO SPI bit-bang driver to run in quad or octal mode to speed up the transfer of data because we cannot use spidev driver with cadence_qspi driver.

    How about you use your external device in single-wire SPI mode? And then use one of the McSPI peripheral modules (which will work w/ spidev)? I think this way you can get it to work much faster than any kind of GPIO-based bit-bang approach even using multiple signals.

    This is difficult to do with the AM62x as the design is using the ability for the processor to boot from the QSPI.

    Booting is not a problem with accessing the SPI chip.  Our issue is around access to the non-standard "secure" registers in the SPI chip, for which there is no support in the kernel driver.

    We have looked at implementing an MTD side channel to allow for raw data commands to be sent to the flash using the cadence_qspi.c driver but it all gets very difficult when dummy clock cycles are required to allow commands to execute in the SPI chip. The cadence_qspi driver supports a maximum of 31 dummy cycles where some "secure" commands require 40.
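
One workaround for the 31-cycle limit is to send part of the wait as extra zero bytes in the address phase and leave only the remainder to the controller's dummy field. The sketch below shows that split under loud assumptions: the flash ignores the data line during its dummy period, the padding goes out on a single line (8 clocks per byte), and the number of spare address bytes is limited by the caller. `split_dummy_cycles` and `struct dummy_split` are hypothetical names.

```c
#include <stdint.h>

#define DUMMY_CLKS_MAX      31u /* largest value a 5-bit dummy field holds */
#define DUMMY_CLKS_PER_BYTE 8u  /* one padding byte on a single data line */

struct dummy_split {
    uint32_t pad_bytes;   /* extra zero bytes sent in the address phase */
    uint32_t dummy_clks;  /* value left for the controller's dummy field */
};

/* Returns 0 on success, -1 if more than `max_pad_bytes` would be needed. */
int split_dummy_cycles(uint32_t wanted, uint32_t max_pad_bytes,
                       struct dummy_split *out)
{
    uint32_t pad = 0;

    if (wanted > DUMMY_CLKS_MAX)
        /* Round up so that the remainder fits into the dummy field. */
        pad = (wanted - DUMMY_CLKS_MAX + DUMMY_CLKS_PER_BYTE - 1)
              / DUMMY_CLKS_PER_BYTE;
    if (pad > max_pad_bytes)
        return -1;

    out->pad_bytes = pad;
    out->dummy_clks = wanted - pad * DUMMY_CLKS_PER_BYTE;
    return 0;
}
```

For 40 cycles this gives 2 padding bytes plus 24 dummy clocks. Rounding the byte count down instead would give 1 byte plus 31 clocks, which is only 39 cycles.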

    The use of spidev to manage the chip is a tried and tested method, as it gives us full control over the data flow.

    We would really like to find a hardware-accelerated method of interfacing to the SPI chip from the Linux kernel instead of using the slow GPIO SPI driver.

  • Hi Marc,

    What would be required to support spidev with QSPI/OSPI ?

    There is a "legacy mode" in the OSPI controller that *might* be able to do what you need from a HW POV, but there is no software infrastructure in the current driver for this, and this is likely not something that can be developed in a trivial manner. Also see here...

    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1444100/am6421-more-flexible-use-of-the-ospi-interface/5538259#5538259
    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1444100/am6421-more-flexible-use-of-the-ospi-interface/5540138#5540138

    It is referring to use of GPIO SPI bit-bang driver to run in quad or octal mode

    You would need to develop a completely custom bit-bang based driver for your memory device if you want to use more than 1-bit data. I don't think this can be achieved easily or cleanly by extending the existing GPIO bit-bang driver.

    Booting is not a problem with accessing the SPI chip.  Our issue is around access to the non-standard "secure" registers in the SPI chip, for which there is no support in the kernel driver.

    The AM62x can technically boot from a regular SPI flash. So if you re-wire the connections that might work...

    1) Have ROM boot from your flash chip using the regular SPI interface,
    2) Once the Kernel is up, use spidev on top of the regular SPI interface

    If boot speed is a concern, you could do a hybrid approach as well, of wiring your flash chip to both the OSPI interface (for ROM boot) and the regular SPI interface (for post-boot usage) simultaneously.

    How does the vendor of this flash chip expect Linux users to use that device? Are there any drivers/reference code?

    We have looked at implementing an MTD side channel to allow for raw data commands to be sent to the flash using the cadence_qspi.c driver

    How did you do that? Using the "STIG" mode?

    Regards, Andreas

  • Hello Andreas,

    I would like to ask if it is possible to use the legacy mode for communication in Linux, or is this only feasible with FreeRTOS? If it is possible through Linux, please guide me on how to implement it, as I couldn't find relevant documentation. If it is not possible, please provide more references about STIG mode. Specifically, I would like to know if I can send commands directly in STIG mode and have more than 32 dummy cycles. Providing sample code for AM62x would be greatly helpful.

  • Hi Hossein,

    I would like to ask if it is possible to use the legacy mode for communication in Linux, or is this only feasible with FreeRTOS?

    As stated earlier in this thread, there is no software infrastructure in the current drivers for this (in both Linux and RTOS environments), and this is likely not something that can be developed quickly or trivially.

    please provide more references about STIG mode. Specifically, I would like to know if I can send commands directly in STIG mode and have more than 32 dummy cycles. Providing sample code for AM62x would be greatly helpful.

    The existing Linux driver (./drivers/spi/spi-cadence-quadspi.c) already uses the STIG mode, so you can look at that. Note that STIG mode is meant to read Flash registers and SFDP data etc, and program Flash device registers. It’s not meant to be used to read large data. Hence the driver uses it exactly for those purposes.
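
To illustrate why STIG only suits short register-style reads: the command result comes back in two 32-bit data registers, so a single command can return at most 8 bytes. The sketch below unpacks such a pair into a byte buffer. It assumes the first received byte sits in the least significant byte of the lower register, and `stig_unpack` is a hypothetical helper, not a driver function.

```c
#include <stddef.h>
#include <stdint.h>

#define STIG_DATA_LEN_MAX 8u  /* two 32-bit read-data registers */

/* Copies up to 8 bytes out of the lower/upper read-data register values,
 * least significant byte first. Returns the number of bytes copied. */
size_t stig_unpack(uint32_t lower, uint32_t upper, uint8_t *buf, size_t len)
{
    if (len > STIG_DATA_LEN_MAX)
        len = STIG_DATA_LEN_MAX;

    for (size_t i = 0; i < len; i++) {
        uint32_t word = (i < 4) ? lower : upper;

        buf[i] = (uint8_t)(word >> (8 * (i % 4)));
    }
    return len;
}
```

Anything longer than 8 bytes has to go through the indirect or direct (AHB) access paths instead.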

    There are only 5 bits in the controller register (RD_DATA_CAPTURE_REG) to control the delay behavior, so you cannot use more than 31 dummy cycles.

    Regards, Andreas

  • Hi Andreas,

    I was able to use STIG from userspace by accessing the registers directly. However, since I need to support more than 32 dummy cycles and also transfer more than 16 bytes of data, STIG does not meet my requirements.

    Could you please let me know if it's possible to access OSPI in legacy mode from userspace? If so, is there any API available for this purpose?

    I also tried communicating with the controller through its registers, specifically the Cadence ones, but I wasn't able to read data via either AHB or the indirect method. I've included my code below for reference.

    The indirect_read function causes the system to reset, and DMA doesn't return any data since it's not possible to configure DMA properly from userspace.
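
One thing worth double-checking before mapping the AHB window from userspace is how the `reg` property decodes. The sketch below assumes the parent bus uses two address cells and two size cells (consistent with the four-cell entries in the device tree) and decodes the raw big-endian bytes as printed by hexdump; `fdt_read_cells` is a hypothetical helper name.

```c
#include <stdint.h>

/* Reads `cells` big-endian 32-bit cells (1 or 2) starting at `p`
 * and returns them combined into one 64-bit value. */
uint64_t fdt_read_cells(const uint8_t *p, unsigned int cells)
{
    uint64_t v = 0;

    for (unsigned int i = 0; i < cells * 4; i++)
        v = (v << 8) | p[i];
    return v;
}
```

Applied to the second `reg` entry in the hexdump quoted in the code comments, the address cells `00 00 00 05 00 00 00 00` decode to 0x500000000, which does not fit in 32 bits, and the size cells `00 00 00 01 00 0f 42 40` decode to 0x1000f4240.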

    If there's any other recommended way to establish communication, I would really appreciate your guidance.

    Best regards, Hossein

    #include <stdio.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <string.h>
    #include <time.h>
    #include <stdbool.h>
    #include <errno.h>
    #include <stdlib.h>


    #include "spi-cadence-quadspi.h"
    typedef unsigned long long _loff_t;

    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
    #define lower_32_bits(n) ((uint32_t)(((_loff_t)(n)) & 0xFFFFFFFFUL))
    #define upper_32_bits(n) ((uint32_t)((((_loff_t)(n)) >> 32) & 0xFFFFFFFFUL))
    #define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
    #define __round_mask(x, y) ((__typeof__(x))((y)-1))
    #define round_down(x, y) ((x) & ~__round_mask(x, y))

    #define min(x, y) ((x) < (y) ? (x) : (y))

    #define OSPI_BASE_ADDR 0x0FC40000 // Base address from DTB
    #define OSPI_SIZE 0x100 // Size of register space
    #define OSPI_CMD_WIDTH 0x0 // Command band width spi 0x0 dual 0x1 quad 0x2 octa 0x3;
    #define OSPI_ADDR_WIDTH 0x0 // Address band width spi 0x0 dual 0x1 quad 0x2 octa 0x3;
    #define OSPI_DATA_WIDTH 0x0 // Data band width spi 0x0 dual 0x1 quad 0x2 octa 0x3;
    #define OSPI_CPU_RELAX_time 1000
    #define OSPI_MAX_SPEED_HZ 5000000 // Max speed in Hz for using with digital analyzer can reduce to (OSPI_REF_CLK_HZ-1)/30 on AM6252 sancloude is 5MHz
    #define OSPI_CS 0x0 // Chip select CS0 =0X0 CS1=0X1 CS2=0X2 CS3=0X3
    #define OSPI_rclk_en 0x0
    /* \
    This value is based on the device tree. In our device tree, we inherited from BeagleBone. \
    OSPI0's clock is set with the following configuration: \
    \
    clocks = <&k3_clks 75 7>; \
    \
    On AM6252, this clock generates 166,666,666 Hz. You can retrieve this value using the following command: \
    \
    cat /sys/kernel/debug/clk/clk_summary | grep -e "75:7" -e "count" -e "---" \
    \
    The returned value for AM6252 on the SanCloud board is: \
    \
    clock count count count rate accuracy phase cycle enable consumer id \
    --------------------------------------------------------------------------------------------------------------------------------------------- \
    clk:75:7 0 0 0 166666666 0 0 50000 Y fc40000.spi no_connection_id \
    */
    #define OSPI_REF_CLK_HZ 166666666

    /* base on device tree
    cdns,tshsl-ns = <60>;
    cdns,tsd2d-ns = <60>;
    cdns,tchsh-ns = <60>;
    cdns,tslch-ns = <60>;
    cdns,read-delay = <4>;
    */
    #define OSPI_tshsl_ns 60
    #define OSPI_tchsh_ns 60
    #define OSPI_tslch_ns 60
    #define OSPI_tsd2d_ns 60
    #define OSPI_read_delay 4
    /*
    DUE to device tree reg
    ospi0: spi@fc40000 {
    compatible = "ti,am654-ospi", "cdns,qspi-nor";
    reg = <0x00 0x0fc40000 0x00 0x100>,
    <0x05 0x00000000 0x01 0x00000000>;
    interrupts = <GIC_SPI 139 IRQ_TYPE_LEVEL_HIGH>;
    cdns,fifo-depth = <256>;
    cdns,fifo-width = <4>;
    cdns,trigger-address = <0x0>;
    cdns,phase-detect-selector = <2>;
    clocks = <&k3_clks 75 7>;
    assigned-clocks = <&k3_clks 75 7>;
    assigned-clock-parents = <&k3_clks 75 8>;
    assigned-clock-rates = <166666666>;
    power-domains = <&k3_pds 75 TI_SCI_PD_EXCLUSIVE>;
    #address-cells = <1>;
    #size-cells = <0>;
    status = "disabled";
    };

    &ospi0 {
    bootph-all;
    status = "okay";
    pinctrl-names = "default";
    pinctrl-0 = <&ospi0_pins_default>;
    reg = <0x00 0x0fc40000 0x00 0x100>,
    <0x05 0x00000000 0x01 1000000>;


    flash@0 {
    bootph-all;
    compatible = "jedec,spi-nor";
    reg = <0x0>;
    spi-tx-bus-width = <1>;
    spi-rx-bus-width = <4>;
    spi-max-frequency = <25000000>;
    cdns,tshsl-ns = <60>;
    cdns,tsd2d-ns = <60>;
    cdns,tchsh-ns = <60>;
    cdns,tslch-ns = <60>;
    cdns,read-delay = <4>;
    };
    };

    reg = <0x00 0x0fc40000 0x00 0x100>,
    <0x05 0x00000000 0x01 1000000>;
    ➤ First Entry (registers):
    <0x00 0x0fc40000 0x00 0x100>
    → Address: 0x0FC40000
    → Size: 0x00000100 (256 bytes)
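    ➤ Second Entry (AHB window):
    <0x05 0x00000000 0x01 1000000> → (note: 1000000 is decimal here, i.e. 0x000f4240, as the hexdump below confirms)
    → Address: 0x500000000 (two address cells, 0x05 and 0x00000000, so it needs more than 32 bits)
    → Size: two size cells, 0x01 and 0x000f4240, i.e. 0x1000f4240 bytes as written (probably not the intended 16 MiB)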


    hexdump -c -C /sys/firmware/devicetree/base/bus@f0000/bus@fc00000/spi@fc40000/reg
    0000000 \0 \0 \0 \0 017 304 \0 \0 \0 \0 \0 \0 \0 \0 001 \0
    00000000 00 00 00 00 0f c4 00 00 00 00 00 00 00 00 01 00 |................|
    0000010 \0 \0 \0 005 \0 \0 \0 \0 \0 \0 \0 001 \0 017 B @ ***HOSSEIN ABH address***
    00000010 00 00 00 05 00 00 00 00 00 00 00 01 00 0f 42 40 |..............B@|
    00000020
    */

    #define OSPI_ABH_ADDR 0x500000000UL // AHB window base from reg cells <0x05 0x00000000>; needs a 64-bit type
    #define OSPI_ABH_SIZE 0x1000000UL // AHB size (16 MiB)
    #define OSPI_trigger_address 0x0
    #define OSPI_fifo_width 4


    /*
    * Device tree node for the shared DMA pool
    mcu_m4fss_memory_region: m4f-memory@9cc00000 {
    compatible = "shared-dma-pool";
    reg = <0x00 0x9cc00000 0x00 0xe00000>;
    no-map;
    };

    */
    #define OSPI_DMA_ADDR 0x9cc00000UL // DMA address space
    #define OSPI_DMA_SIZE 0xe00000UL // DMA size (14 MiB)

    #ifdef EXECUTE_TEST
    #define CQSPI_DEBUG 1
    #endif

    #ifdef CQSPI_DEBUG

    static void hex_dump(const void *src, size_t length, size_t line_size,
    char *prefix)
    {
    int i = 0;
    const unsigned char *address = src;
    const unsigned char *line = address;
    unsigned char c;

    printf("%s | ", prefix);
    if (length == 0)
    printf("\n");
    else
    while (length-- > 0)
    {
    printf("%02X ", *address++);
    if (!(++i % line_size) || (length == 0 && i % line_size))
    {
    if (length == 0)
    {
    while (i++ % line_size)
    printf("__ ");
    }
    printf(" |");
    while (line < address)
    {
    c = *line++;
    printf("%c", (c < 32 || c > 126) ? '.' : c);
    }
    printf("|\n");
    if (length > 0)
    printf("%s | ", prefix);
    }
    }
    }
    #endif

    volatile uint32_t *ospi_base = NULL;
    int fd = 0;

    static inline void write_reg(uint32_t offset, uint32_t value)
    {
    *(volatile uint32_t *)((uintptr_t)ospi_base + offset) = value;
    }

    static inline uint32_t read_reg(uint32_t offset)
    {
    return *(volatile uint32_t *)((uintptr_t)ospi_base + offset);
    }

    static inline uint8_t read_dma(uint32_t *base, uint32_t offset)
    {
    return *(volatile uint8_t *)((uintptr_t)base + offset);
    }

    /* Byte-wise copy; avoids arithmetic on void pointers, which is a GNU extension. */
    void L_memcpy(void *dest, const void *src, size_t len)
    {
    uint8_t *d = dest;
    const uint8_t *s = src;

    for (size_t i = 0; i < len; i++)
    d[i] = s[i];
    }


    static int cqspi_waite_for_IRQ(uint32_t mask, unsigned long timeout)
    {

    struct timespec start_time, current_time;

    clock_gettime(CLOCK_MONOTONIC, &start_time);
    unsigned long timeout_ns = start_time.tv_sec * 1000000000L + start_time.tv_nsec +
    (timeout * 1000000L);
    unsigned int irq_status = 0;
    while (1)
    {
    irq_status = read_reg(CQSPI_REG_IRQSTATUS);
    if (irq_status & mask)
    return 0;

    clock_gettime(CLOCK_MONOTONIC, &current_time);
    unsigned long now_ns = current_time.tv_sec * 1000000000L + current_time.tv_nsec;
    if (now_ns > timeout_ns)
    return -ETIMEDOUT;

    usleep(OSPI_CPU_RELAX_time);
    }
    return 0;
    }

    static int cqspi_wait_for_bit(uint32_t offset, const uint32_t mask, bool clr)
    {
    /* Polling for completion. */
    struct timespec start_time, current_time;

    clock_gettime(CLOCK_MONOTONIC, &start_time);
    unsigned long timeout_ns = start_time.tv_sec * 1000000000L + start_time.tv_nsec +
    (CQSPI_TIMEOUT_MS * 1000000L);
    // wait for execution to finish
    while (1)
    {
    uint32_t val = read_reg(offset);
    if (((clr ? ~val : val) & mask) == mask)
    return 0;
    usleep(OSPI_CPU_RELAX_time);
    clock_gettime(CLOCK_MONOTONIC, &current_time);
    unsigned long now_ns = current_time.tv_sec * 1000000000L + current_time.tv_nsec;
    if (now_ns > timeout_ns)
    {
    /* Timeout, in busy mode. */
    fprintf(stderr, "QSPI is still busy after %dms timeout.\n", (int)CQSPI_TIMEOUT_MS);
    return -ETIMEDOUT;
    }
    }
    }

    static int cqspi_setup_opcode_ext(uint16_t cmd, uint32_t shift)
    {
    uint8_t ext;
    if (!(ext = cmd & 0XFF))
    return -2;
    uint32_t reg = read_reg(CQSPI_REG_OP_EXT_LOWER);
    #ifdef CQSPI_DEBUG
    printf("ext:0x%X\n", ext);
    #endif
    reg &= ~(0xff << shift);
    reg |= ext << shift;
    write_reg(CQSPI_REG_OP_EXT_LOWER, reg);
    return 0;
    }
    static bool cqspi_is_idle()
    {
    uint32_t reg = read_reg(CQSPI_REG_CONFIG);
    return reg & (1UL << CQSPI_REG_CONFIG_IDLE_LSB);
    }

    int cqspi_wait_idle()
    {
    const unsigned int poll_idle_retry = 3;
    unsigned int count = 0;
    struct timespec start_time, current_time;

    clock_gettime(CLOCK_MONOTONIC, &start_time);
    unsigned long timeout_ns = start_time.tv_sec * 1000000000L + start_time.tv_nsec +
    (CQSPI_TIMEOUT_MS * 1000000L);

    while (1)
    {
    if (cqspi_is_idle())
    count++;
    else
    count = 0;

    if (count >= poll_idle_retry)
    return 0;

    clock_gettime(CLOCK_MONOTONIC, &current_time);
    unsigned long now_ns = current_time.tv_sec * 1000000000L + current_time.tv_nsec;

    if (now_ns > timeout_ns)
    {
    /* Timeout, in busy mode. */
    fprintf(stderr, "QSPI is still busy after %dms timeout.\n", (int)CQSPI_TIMEOUT_MS);
    return -ETIMEDOUT;
    }
    usleep(OSPI_CPU_RELAX_time);
    }
    }

    static int cqspi_enable_dtr(uint16_t cmd, uint32_t cmdSize, uint32_t shift)
    {
    unsigned int reg;
    int ret;

    reg = read_reg(CQSPI_REG_CONFIG);

    /*
    * We enable dual byte opcode here. The callers have to set up the
    * extension opcode based on which type of operation it is.
    */
    if (cmdSize > 1)
    {
    reg |= CQSPI_REG_CONFIG_DTR_PROTO;
    reg |= CQSPI_REG_CONFIG_DUAL_OPCODE;

    /* Set up command opcode extension. */
    ret = cqspi_setup_opcode_ext(cmd, shift);
    if (ret)
    return ret;
    }
    else
    {
    reg &= ~CQSPI_REG_CONFIG_DTR_PROTO;
    reg &= ~CQSPI_REG_CONFIG_DUAL_OPCODE;
    }

    write_reg(CQSPI_REG_CONFIG, reg);

    return cqspi_wait_idle();
    }

    static unsigned int cqspi_calc_rdreg()
    {
    uint32_t rdreg = 0;

    rdreg |= OSPI_CMD_WIDTH << CQSPI_REG_RD_INSTR_TYPE_INSTR_LSB;
    rdreg |= OSPI_ADDR_WIDTH << CQSPI_REG_RD_INSTR_TYPE_ADDR_LSB;
    rdreg |= OSPI_DATA_WIDTH << CQSPI_REG_RD_INSTR_TYPE_DATA_LSB;

    return rdreg;
    }
    static int cqspi_exec_flash_cmd(uint32_t reg)
    {

    /* Write the CMDCTRL without start execution. */
    write_reg(CQSPI_REG_CMDCTRL, reg);
    /* Start execute */
    reg |= CQSPI_REG_CMDCTRL_EXECUTE_MASK;
    write_reg(CQSPI_REG_CMDCTRL, reg);

    // /* Polling for completion. */
    int ret = cqspi_wait_for_bit(CQSPI_REG_CMDCTRL,
    CQSPI_REG_CMDCTRL_INPROGRESS_MASK, 1);
    if (ret)
    {
    fprintf(stderr, "Flash command execution timed out.\n");
    return ret;
    }

    /* Polling QSPI idle status. */
    return cqspi_wait_idle();
    }

    /* Send a command through STIG and read back up to 8 bytes of data. */
    int cqspi_STIG_read(uint16_t cmd, uint32_t cmdSize, uint32_t address, uint8_t address_len,
    uint8_t dummy_cycles, int read_len, uint8_t *rxbuf)
    {
    int status;

    status = cqspi_enable_dtr(cmd, cmdSize, CQSPI_REG_OP_EXT_STIG_LSB);
    if (status)
    return status;
    /* STIG returns data in two 32-bit registers, so a longer read would overrun them. */
    if (read_len < 0 || read_len > CQSPI_STIG_DATA_LEN_MAX || !rxbuf)
    {
    fprintf(stderr, "Invalid input argument len or rxbuf\n");
    return -EINVAL;
    }
    uint8_t opcode = (cmdSize <= 1) ? cmd : (cmd >> 8);
    // #ifdef CQSPI_DEBUG
    // printf("opcode:0x%X\n", opcode);
    // #endif
    uint32_t reg = opcode << CQSPI_REG_CMDCTRL_OPCODE_LSB;
    uint32_t rdreg = cqspi_calc_rdreg();
    write_reg(CQSPI_REG_RD_INSTR, rdreg);

    uint32_t exdummyBytes = 0;
    if (dummy_cycles > CQSPI_DUMMY_CLKS_MAX)
    {
    exdummyBytes = (dummy_cycles - CQSPI_DUMMY_CLKS_MAX) / CQSPI_DUMMY_CLKS_PER_BYTE;
    dummy_cycles = CQSPI_DUMMY_CLKS_MAX;
    if (address_len == 0)
    address = 0;
    if ((address_len + exdummyBytes - 1) <= CQSPI_REG_CMDCTRL_ADD_BYTES_MASK)
    {
    address_len += exdummyBytes;
    exdummyBytes = 0;
    }
    }

    if (dummy_cycles)
    reg |= (dummy_cycles & CQSPI_REG_CMDCTRL_DUMMY_MASK)
    << CQSPI_REG_CMDCTRL_DUMMY_LSB;
    if (read_len)
    {
    reg |= (0x1 << CQSPI_REG_CMDCTRL_RD_EN_LSB);

    /* 0 means 1 byte. */
    reg |= (((read_len - 1) & CQSPI_REG_CMDCTRL_RD_BYTES_MASK)
    << CQSPI_REG_CMDCTRL_RD_BYTES_LSB);
    }
    /* setup ADDR BIT field */
    if (address_len)
    {
    reg |= (0x1 << CQSPI_REG_CMDCTRL_ADDR_EN_LSB);
    reg |= ((address_len - 1) &
    CQSPI_REG_CMDCTRL_ADD_BYTES_MASK)
    << CQSPI_REG_CMDCTRL_ADD_BYTES_LSB;

    write_reg(CQSPI_REG_CMDADDRESS, address);
    }

    status = cqspi_exec_flash_cmd(reg);
    if (status)
    return status;

    // reading data
    int len = (read_len > 4) ? 4 : read_len;
    if (read_len)
    {
    reg = read_reg(CQSPI_REG_CMDREADDATALOWER);
    L_memcpy(rxbuf, &reg, len);
    }
    if (read_len > 4)
    {
    reg = read_reg(CQSPI_REG_CMDREADDATAUPPER);
    len = read_len - len;
    L_memcpy(rxbuf + 4, &reg, len);
    }

    /* Reset CMD_CTRL Reg once command read completes */
    write_reg(CQSPI_REG_CMDCTRL, 0);
    #ifdef CQSPI_DEBUG
    if (read_len)
    hex_dump(rxbuf, read_len, 32, " RX");
    #endif

    return cqspi_wait_idle();
    }

    static void cqspi_controller_enable(bool enable)
    {
    uint32_t reg;

    reg = read_reg(CQSPI_REG_CONFIG);

    if (enable)
    reg |= CQSPI_REG_CONFIG_ENABLE_MASK;
    else
    reg &= ~CQSPI_REG_CONFIG_ENABLE_MASK;

    write_reg(CQSPI_REG_CONFIG, reg);
    }
    static void cqspi_chipselect(uint32_t cs)
    {
    unsigned int chip_select = cs;
    unsigned int reg;

    reg = read_reg(CQSPI_REG_CONFIG);
    reg &= ~CQSPI_REG_CONFIG_DECODE_MASK;

    /* Convert CS if without decoder.
    * CS0 to 4b'1110
    * CS1 to 4b'1101
    * CS2 to 4b'1011
    * CS3 to 4b'0111
    */
    chip_select = 0xF & ~(1 << chip_select);

    reg &= ~(CQSPI_REG_CONFIG_CHIPSELECT_MASK
    << CQSPI_REG_CONFIG_CHIPSELECT_LSB);
    reg |= (chip_select & CQSPI_REG_CONFIG_CHIPSELECT_MASK)
    << CQSPI_REG_CONFIG_CHIPSELECT_LSB;
    write_reg(CQSPI_REG_CONFIG, reg);
    }
    static void cqspi_config_baudrate_div(uint32_t sclk)
    {
    uint32_t reg, div;

    /* Recalculate the baudrate divisor based on QSPI specification. */
    div = DIV_ROUND_UP(OSPI_REF_CLK_HZ, 2 * sclk) - 1;

    /* Maximum baud divisor */
    if (div > CQSPI_REG_CONFIG_BAUD_MASK)
    {
    div = CQSPI_REG_CONFIG_BAUD_MASK;
    fprintf(stderr, "Unable to adjust clock: divisor clamped to maximum\n");
    }
    reg = read_reg(CQSPI_REG_CONFIG);
    reg &= ~(CQSPI_REG_CONFIG_BAUD_MASK << CQSPI_REG_CONFIG_BAUD_LSB);
    reg |= (div & CQSPI_REG_CONFIG_BAUD_MASK) << CQSPI_REG_CONFIG_BAUD_LSB;
    write_reg(CQSPI_REG_CONFIG, reg);
    }
    static unsigned int calculate_ticks_for_ns(const unsigned int ref_clk_hz,
    const unsigned int ns_val)
    {
    unsigned int ticks;

    ticks = ref_clk_hz / 1000; /* kHz */
    ticks = DIV_ROUND_UP(ticks * ns_val, 1000000);

    return ticks;
    }

    static void cqspi_delay(uint32_t sclk)
    {
    unsigned int tshsl, tchsh, tslch, tsd2d;
    unsigned int reg;
    unsigned int tsclk;

    /* calculate the number of ref ticks for one sclk tick */
    tsclk = DIV_ROUND_UP(OSPI_REF_CLK_HZ, sclk);

    tshsl = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tshsl_ns);
    /* this particular value must be at least one sclk */
    if (tshsl < tsclk)
    tshsl = tsclk;

    tchsh = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tchsh_ns);
    tslch = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tslch_ns);
    tsd2d = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tsd2d_ns);

    reg = (tshsl & CQSPI_REG_DELAY_TSHSL_MASK)
    << CQSPI_REG_DELAY_TSHSL_LSB;
    reg |= (tchsh & CQSPI_REG_DELAY_TCHSH_MASK)
    << CQSPI_REG_DELAY_TCHSH_LSB;
    reg |= (tslch & CQSPI_REG_DELAY_TSLCH_MASK)
    << CQSPI_REG_DELAY_TSLCH_LSB;
    reg |= (tsd2d & CQSPI_REG_DELAY_TSD2D_MASK)
    << CQSPI_REG_DELAY_TSD2D_LSB;
    write_reg(CQSPI_REG_DELAY, reg);
    }
    static void cqspi_readdata_capture(const bool bypass,
    const bool dqs, const unsigned int delay)
    {
    unsigned int reg;

    reg = read_reg(CQSPI_REG_READCAPTURE);

    if (bypass)
    reg |= (1 << CQSPI_REG_READCAPTURE_BYPASS_LSB);
    else
    reg &= ~(1 << CQSPI_REG_READCAPTURE_BYPASS_LSB);

    reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
    << CQSPI_REG_READCAPTURE_DELAY_LSB);

    reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
    << CQSPI_REG_READCAPTURE_DELAY_LSB;

    if (dqs)
    reg |= (1 << CQSPI_REG_READCAPTURE_DQS_LSB);
    else
    reg &= ~(1 << CQSPI_REG_READCAPTURE_DQS_LSB);

    write_reg(CQSPI_REG_READCAPTURE, reg);
    }

    static void cqspi_configure(uint32_t sclk, uint32_t cs)
    {
    static uint32_t current_cs = UINT32_MAX; /* invalid value, so the first call always programs the chip select */
    static uint32_t current_sclk = 0;
    bool switch_cs = (current_cs != cs);
    bool switch_ck = (current_sclk != sclk);

    if (switch_cs || switch_ck)
    cqspi_controller_enable(0);

    /* Switch chip select. */
    if (switch_cs)
    {
    current_cs = cs;
    cqspi_chipselect(cs);
    }

    /* Setup baudrate divisor and delays */
    if (switch_ck)
    {
    current_sclk = sclk;
    cqspi_config_baudrate_div(current_sclk);
    cqspi_delay(current_sclk);
    cqspi_readdata_capture(!(OSPI_rclk_en), false,
    OSPI_read_delay);
    }

    cqspi_controller_enable(1);
    }

    int32_t cqspi_STIG_write(uint16_t cmd, uint32_t cmdSize, uint32_t address, uint8_t address_len,
    const uint8_t *txbuf, uint32_t n_tx)
    {
    uint8_t opcode;
    unsigned int reg;
    unsigned int data;
    size_t write_len;
    int ret;

    ret = cqspi_enable_dtr(cmd, cmdSize, CQSPI_REG_OP_EXT_STIG_LSB);
    if (ret)
    return ret;

    if (n_tx > CQSPI_STIG_DATA_LEN_MAX || (n_tx && !txbuf))
    {
    fprintf(stderr, "Invalid input argument: n_tx too large or txbuf is NULL\n");
    return -EINVAL;
    }

    reg = cqspi_calc_rdreg();
    write_reg(CQSPI_REG_RD_INSTR, reg);

    opcode = (cmdSize <= 1) ? cmd : (cmd >> 8);

    reg = opcode << CQSPI_REG_CMDCTRL_OPCODE_LSB;

    if (address_len)
    {
    reg |= (0x1 << CQSPI_REG_CMDCTRL_ADDR_EN_LSB);
    reg |= ((address_len - 1) &
    CQSPI_REG_CMDCTRL_ADD_BYTES_MASK)
    << CQSPI_REG_CMDCTRL_ADD_BYTES_LSB;

    write_reg(CQSPI_REG_CMDADDRESS, address);
    }

    if (n_tx)
    {
    reg |= (0x1 << CQSPI_REG_CMDCTRL_WR_EN_LSB);
    reg |= ((n_tx - 1) & CQSPI_REG_CMDCTRL_WR_BYTES_MASK)
    << CQSPI_REG_CMDCTRL_WR_BYTES_LSB;
    data = 0;
    write_len = (n_tx > 4) ? 4 : n_tx;
    memcpy(&data, txbuf, write_len);
    txbuf += write_len;
    write_reg(CQSPI_REG_CMDWRITEDATALOWER, data);

    if (n_tx > 4)
    {
    data = 0;
    write_len = n_tx - 4;
    memcpy(&data, txbuf, write_len);
    write_reg(CQSPI_REG_CMDWRITEDATAUPPER, data);
    }
    }

    ret = cqspi_exec_flash_cmd(reg);

    /* Reset CMD_CTRL Reg once command write completes */
    write_reg(CQSPI_REG_CMDCTRL, 0);
    #ifdef CQSPI_DEBUG
    printf("executed cmd:0x%X, cmdsize:[%d] ,address:0x%X, addresslen:[%d],datasize[%d] \n",
    cmd, cmdSize, address, address_len, n_tx);
    #endif

    return ret;
    }

    static int cqspi_read_setup(const uint16_t cmd, uint32_t cmdSize, uint32_t dummy_clk, uint32_t addressSize)
    {
    unsigned int reg;
    int ret;
    uint8_t opcode;

    ret = cqspi_enable_dtr(cmd, cmdSize, CQSPI_REG_OP_EXT_READ_LSB);
    if (ret)
    return ret;

    if (cmdSize > 1)
    opcode = cmd >> 8;
    else
    opcode = cmd;

    reg = opcode << CQSPI_REG_RD_INSTR_OPCODE_LSB;
    reg |= cqspi_calc_rdreg();

    /* Setup dummy clock cycles */
    if (dummy_clk > CQSPI_DUMMY_CLKS_MAX)
    dummy_clk = CQSPI_DUMMY_CLKS_MAX;

    if (dummy_clk)
    reg |= (dummy_clk & CQSPI_REG_RD_INSTR_DUMMY_MASK)
    << CQSPI_REG_RD_INSTR_DUMMY_LSB;

    write_reg(CQSPI_REG_RD_INSTR, reg);

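    /* Set address width (the field holds bytes - 1). */
    reg = read_reg(CQSPI_REG_SIZE);
    reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
    /* Guard addressSize == 0: (0 - 1) would OR 0xFFFFFFFF into the whole register. */
    if (addressSize)
    reg |= (addressSize - 1) & CQSPI_REG_SIZE_ADDRESS_MASK;
    write_reg(CQSPI_REG_SIZE, reg);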
    read_reg(CQSPI_REG_SIZE); /* Flush posted write. */
    return 0;
    }

    static uint32_t cqspi_get_rd_sram_level()
    {
    uint32_t reg = read_reg(CQSPI_REG_SDRAMLEVEL);

    reg >>= CQSPI_REG_SDRAMLEVEL_RD_LSB;
    return reg & CQSPI_REG_SDRAMLEVEL_RD_MASK;
    }

    static inline uint32_t user_ioread32(volatile uint32_t *addr)
    {
    return *((volatile uint32_t *)(addr));
    }

    static inline void user_ioread32_rep(volatile uint32_t *addr, uint32_t *buf, unsigned int count)
    {
    /* Like the kernel's ioread32_rep(): every read targets the same address; it is not advanced. */
    for (unsigned int i = 0; i < count; i++)
    {
    buf[i] = *addr;
    }
    }

    static int cqspi_indirect_read_execute(uint8_t *rxbuf, _loff_t from_addr,
    const size_t n_rx)
    {

    uint32_t * ahb_base = mmap(NULL, OSPI_ABH_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, OSPI_ABH_ADDR);
    if (ahb_base == MAP_FAILED)
    {
    perror("mmap abh");
    return -1;
    }

    unsigned int remaining = n_rx;
    unsigned int mod_bytes = n_rx % 4;
    unsigned int bytes_to_read = 0;
    uint8_t *rxbuf_end = rxbuf + n_rx;
    int ret = 0;

    write_reg(CQSPI_REG_INDIRECTRDSTARTADDR, from_addr);
    write_reg(CQSPI_REG_INDIRECTRDBYTES, remaining);

    /* Clear all interrupts. */
    write_reg(CQSPI_REG_IRQSTATUS, CQSPI_IRQ_STATUS_MASK);

    write_reg(CQSPI_REG_IRQMASK, CQSPI_IRQ_MASK_WR);

    write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_START_MASK);

    read_reg(CQSPI_REG_INDIRECTRD); /* Flush posted write. */

    while (remaining > 0)
    {

    if (!(cqspi_waite_for_IRQ(CQSPI_IRQ_MASK_RD, CQSPI_TIMEOUT_MS)))
    {
    perror("Indirect timeout\n");
    ret = -ETIMEDOUT;
    }

    /*
    * Disable all read interrupts until
    * we are out of "bytes to read"
    */


    bytes_to_read = cqspi_get_rd_sram_level();

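    if (ret && bytes_to_read == 0)
    {
    perror("Indirect read timeout, no bytes\n");
    /* Release the AHB mapping here; the failrd path does not unmap it. */
    munmap(ahb_base, OSPI_ABH_SIZE);
    goto failrd;
    }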

    while (bytes_to_read != 0)
    {
    unsigned int word_remain = round_down(remaining, 4);

    bytes_to_read *= OSPI_fifo_width;
    bytes_to_read = bytes_to_read > remaining ? remaining : bytes_to_read;
    bytes_to_read = round_down(bytes_to_read, 4);
    /* Read 4 byte word chunks then single bytes */
    if (bytes_to_read)
    {
    user_ioread32_rep(ahb_base,(uint32_t *) rxbuf,
    (bytes_to_read / 4));
    }
    else if (!word_remain && mod_bytes)
    {
    unsigned int temp = user_ioread32(ahb_base);

    bytes_to_read = mod_bytes;
    memcpy(rxbuf, &temp, min((unsigned int)(rxbuf_end - rxbuf), bytes_to_read));
    }
    rxbuf += bytes_to_read;
    remaining -= bytes_to_read;
    bytes_to_read = cqspi_get_rd_sram_level();
    }

    if (remaining > 0)
    write_reg(CQSPI_REG_IRQSTATUS, CQSPI_IRQ_STATUS_MASK);
    }
    munmap(ahb_base, OSPI_ABH_SIZE);
    /* Check indirect done status */
    ret = cqspi_wait_for_bit(CQSPI_REG_INDIRECTRD,
    CQSPI_REG_INDIRECTRD_DONE_MASK, 0);
    if (ret)
    {
    perror("Indirect read completion error \n");
    goto failrd;
    }

    /* Disable interrupt */
    write_reg(CQSPI_REG_IRQMASK,0);

    /* Clear indirect completion status */
    write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_DONE_MASK);

    return 0;

    failrd:
    /* Disable interrupt */
    write_reg(CQSPI_REG_IRQMASK, 0);

    /* Cancel the indirect read */
    write_reg(CQSPI_REG_INDIRECTRD,CQSPI_REG_INDIRECTRD_CANCEL_MASK);
    return ret;
    }

    static int cqspi_versal_indirect_read_dma(uint8_t *rxbuf, _loff_t from_addr,
    size_t n_rx, uint32_t exdummy)
    {
    uint32_t reg, bytes_to_dma;
    _loff_t addr = from_addr;
    uint8_t *buf = rxbuf;
    uint32_t *dma_addr;
    uint8_t bytes_rem;
    int ret = 0;
    n_rx += exdummy;
    bytes_rem = n_rx % 4;
    bytes_to_dma = (n_rx - bytes_rem);

    if (!bytes_to_dma)
    goto nondmard;

    // ret = zynqmp_pm_ospi_mux_select(cqspi->pd_dev_id, PM_OSPI_MUX_SEL_DMA);
    // if (ret)
    // return ret;

    cqspi_controller_enable(false);

    reg = read_reg(CQSPI_REG_CONFIG);
    reg |= CQSPI_REG_CONFIG_DMA_MASK;
    write_reg(CQSPI_REG_CONFIG, reg);

    cqspi_controller_enable(true);

    dma_addr = mmap(NULL, OSPI_DMA_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, OSPI_DMA_ADDR);
    if (dma_addr == MAP_FAILED)
    {
    perror("mmap");
    return -1;
    }

    write_reg(CQSPI_REG_INDIRECTRDSTARTADDR, from_addr);
    write_reg(CQSPI_REG_INDIRECTRDBYTES, bytes_to_dma);
    write_reg(CQSPI_REG_INDTRIG_ADDRRANGE, CQSPI_REG_VERSAL_ADDRRANGE_WIDTH_VAL);

    /* Clear all interrupts. */
    write_reg(CQSPI_REG_IRQSTATUS, CQSPI_IRQ_STATUS_MASK);

    /* Enable DMA done interrupt */
    write_reg(CQSPI_REG_VERSAL_DMA_DST_I_EN, CQSPI_REG_VERSAL_DMA_DST_DONE_MASK);

    /* Default DMA periph configuration */
    write_reg(CQSPI_REG_DMA, CQSPI_REG_VERSAL_DMA_VAL);

    /* Configure DMA Dst address */
    write_reg(CQSPI_REG_VERSAL_DMA_DST_ADDR, lower_32_bits(OSPI_DMA_ADDR));
    write_reg(CQSPI_REG_VERSAL_DMA_DST_ADDR_MSB, upper_32_bits(OSPI_DMA_ADDR));

    /* Configure DMA Src address */
    write_reg(CQSPI_REG_VERSAL_DMA_SRC_ADDR, OSPI_trigger_address);

    /* Set DMA destination size */
    write_reg(CQSPI_REG_VERSAL_DMA_DST_SIZE, bytes_to_dma);

    /* Set DMA destination control */
    write_reg(CQSPI_REG_VERSAL_DMA_DST_CTRL, CQSPI_REG_VERSAL_DMA_DST_CTRL_VAL);

    write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_START_MASK);

    /* Wait for DMA done interrupt */
    if (!(cqspi_waite_for_IRQ(CQSPI_IRQ_MASK_RD, CQSPI_TIMEOUT_MS)))
    {
    perror("DMA timeout\n");
    ret = -ETIMEDOUT;
    goto failrd;
    }
    usleep(1000000);
    for (size_t i = exdummy; i < n_rx; i++)
    buf[i - exdummy] = read_dma(dma_addr, i);

    write_reg(CQSPI_REG_VERSAL_DMA_DST_I_DIS, 0x0);

    /* Clear indirect completion status */
    write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_DONE_MASK);

    munmap(dma_addr, OSPI_DMA_SIZE);

    cqspi_controller_enable(0);

    reg = read_reg(CQSPI_REG_CONFIG);
    reg &= ~CQSPI_REG_CONFIG_DMA_MASK;
    write_reg(CQSPI_REG_CONFIG, reg);

    cqspi_controller_enable(1);

    return 0;

    nondmard:
    if (bytes_rem)
    {
    addr += bytes_to_dma;
    buf += bytes_to_dma;
    ret = cqspi_indirect_read_execute(buf, addr,bytes_rem);
    if (ret)
    return ret;
    }
    return 0;

    failrd:
    /* Disable DMA interrupt */
    write_reg(CQSPI_REG_VERSAL_DMA_DST_I_DIS, 0x0);

    /* Cancel the indirect read */
    write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_CANCEL_MASK);

    /* Unmap with the same size that was used for the DMA mapping. */
    munmap(dma_addr, OSPI_DMA_SIZE);

    reg = read_reg(CQSPI_REG_CONFIG);
    reg &= ~CQSPI_REG_CONFIG_DMA_MASK;
    write_reg(CQSPI_REG_CONFIG, reg);
    return ret;
    }

    static ssize_t cqspi_read(const uint16_t cmd, uint32_t cmdSize, uint32_t address, uint32_t addressSize,
    uint32_t dummyCycles, uint32_t dataInSize, uint8_t *dataIn)
    {
    _loff_t from = address;
    size_t len = dataInSize;
    uint8_t *buf = dataIn;
    // uint64_t dma_align = (uint64_t)(uintptr_t)buf;
    int ret;

    uint32_t extradummy = 0;
    if (dummyCycles > CQSPI_DUMMY_CLKS_MAX)
    {
    /* Compute the excess before clamping dummyCycles, otherwise it is always 0. */
    extradummy = (dummyCycles - CQSPI_DUMMY_CLKS_MAX) / CQSPI_DUMMY_CLKS_PER_BYTE;
    dummyCycles = CQSPI_DUMMY_CLKS_MAX;
    }
    ret = cqspi_read_setup(cmd, cmdSize, dummyCycles, addressSize);
    if (ret)
    return ret;


    //ret = cqspi_versal_indirect_read_dma(buf, from, len,

    ret = cqspi_indirect_read_execute(buf, from, len);
    #ifdef CQSPI_DEBUG
    hex_dump(buf, dataInSize, 32, " RX");
    #endif
    return ret;
    }

    #ifdef __cplusplus
    extern "C"
    {
    #endif

    int32_t SendBuffer_cqspi(const uint8_t *dataOutStream, uint32_t cmdSize, uint32_t addressSize,
    uint32_t dataOutSize, uint32_t dummyCycles, uint8_t *dataIn, uint32_t dataInSize)
    {
    fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
    {
    perror("open");
    return -1;
    }

    // Map the OSPI register space.
    ospi_base = mmap(NULL, OSPI_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, OSPI_BASE_ADDR);
    if (ospi_base == MAP_FAILED)
    {
    perror("mmap");
    close(fd);
    return -1;
    }
    // Enable OSPI controller
    cqspi_configure(OSPI_MAX_SPEED_HZ, OSPI_CS);
    uint16_t cmd = 0;
    uint32_t address = 0;
    for (uint32_t i = 0; i < cmdSize; i++)
    {
    cmd <<= 8;
    cmd |= dataOutStream[i];
    }
    for (uint32_t i = 0; i < addressSize; i++)
    {
    address <<= 8;
    address |= dataOutStream[cmdSize + i];
    }

    #ifdef CQSPI_DEBUG
    printf("cmd:0x%X,cmdsize:[%d] ,address:0x%X, addresslen:[%d],datasize[%d] ,DummyCycle:[%d],Datainsize:[%d]\n",
    cmd, cmdSize, address, addressSize, dataOutSize, dummyCycles, dataInSize);
    hex_dump(dataOutStream, cmdSize + addressSize + dataOutSize, 32, " TX");
    #endif

    int32_t ret = 0;

    if (dataInSize)
    {
    if (dataInSize <= CQSPI_STIG_DATA_LEN_MAX)
    ret = cqspi_STIG_read(cmd, cmdSize, address, addressSize, dummyCycles, dataInSize, dataIn);
    else
    ret = cqspi_read(cmd, cmdSize, address, addressSize, dummyCycles, dataInSize, dataIn);
    }
    else if (dataOutSize <= CQSPI_STIG_DATA_LEN_MAX)
    {
    ret = cqspi_STIG_write(cmd, cmdSize, address, addressSize, &dataOutStream[cmdSize + addressSize], dataOutSize);
    }
    else
    {
    // cqspi_DMA_write(cmd, address, dummyCycles, CQSPI_STIG_DATA_LEN_MAX, dataOutSize, &dataOutStream[cmdSize + addressSize]);
    ret = -EINVAL; /* Writes larger than the STIG limit are not implemented. */
    }

    cqspi_controller_enable(0);
    munmap((void *)ospi_base, OSPI_SIZE);
    close(fd);
    /* Report the transfer status to the caller instead of always returning 0. */
    return ret;
    }

    #ifdef __cplusplus
    }
    #endif
    #ifdef EXECUTE_TEST
    // gcc -o qlib_platform_qspi qlib_platform_qspi.c -DEXECUTE_TEST=1 -g3
    int main(void)
    {
    uint8_t dataOutStream[16] = {0};
    uint8_t dataIn[16] = {0};
    // 0xAB: on many SPI-NOR parts this is Release from Deep Power-Down (no address, no data).
    dataOutStream[0] = 0xAB; // Opcode
    SendBuffer_cqspi(dataOutStream, 1 /*cmdsize*/, 0 /*address size*/, 0 /*dataoutsize*/, 0 /*Dummycycles*/, dataIn, 0 /*data in size*/);

    dataOutStream[0] = 0xB7; // Opcode
    SendBuffer_cqspi(dataOutStream, 1 /*cmdsize*/, 4 /*address size*/, 0 /*dataoutsize*/, 0 /*Dummycycles*/, dataIn, 0 /*data in size*/);

    // Example: Read JEDEC ID (opcode 0x9F), 3 bytes in, no dummy cycles.
    dataOutStream[0] = 0x9F; // Opcode
    SendBuffer_cqspi(dataOutStream, 1 /*cmdsize*/, 0 /*address size*/, 0 /*dataoutsize*/, 0 /*Dummycycles*/, dataIn, 3 /*data in size*/);
    SendBuffer_cqspi(dataOutStream, 1 /*cmdsize*/, 0 /*address size*/, 0 /*dataoutsize*/, 0 /*Dummycycles*/, dataIn, 3 /*data in size*/);

    dataOutStream[0] = 0xA2; // Opcode
    SendBuffer_cqspi(dataOutStream, 1 /*cmdsize*/, 0 /*address size*/, 0 /*dataoutsize*/, 40 /*Dummycycles*/, dataIn, 20 /*data in size*/);
    }
    #endif
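
    As a side note on the clock setup that cqspi_configure() relies on: the sketch below works through the baud-rate divisor arithmetic, assuming the controller divides the reference clock by 2 * (div + 1), which is how the Linux spi-cadence-quadspi driver computes the field. The function names baud_div() and actual_sclk() are made up for this illustration, and the 166666666 Hz reference and 12 MHz request are only example inputs.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Divisor field value for a requested SCLK:
 * div = ceil(ref / (2 * sclk)) - 1.
 * Hypothetical helper, not part of any TI or kernel API. */
uint32_t baud_div(uint32_t ref_hz, uint32_t sclk_hz)
{
    return DIV_ROUND_UP(ref_hz, 2 * sclk_hz) - 1;
}

/* SCLK that results from a given divisor field value,
 * under the assumption that the hardware divides by 2 * (div + 1). */
uint32_t actual_sclk(uint32_t ref_hz, uint32_t div)
{
    return ref_hz / (2 * (div + 1));
}
```

    With a 166666666 Hz reference and a 12000000 Hz request, baud_div() gives 6 and actual_sclk() gives 11904761 Hz, slightly below the request. Rounding the divisor up keeps the real clock at or below what was asked for.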

  • Hi Hossein,

    I was able to use STIG from userspace by accessing the registers directly. However, since I need to support more than 32 dummy cycles and also transfer more than 16 bytes of data, STIG does not meet my requirements.

    Could you please let me know if it's possible to access OSPI in legacy mode from userspace? If so, is there any API available for this purpose?

    I'm surprised how much effort you put into this, but I don't think the general approach (and "legacy mode" in particular) is a viable path, given everything you would need to do to manage the transfer, no? I suppose if you are using a "256Mb SPI-NOR Quad device" you have some minimum throughput requirements? Have you considered using a different NOR Flash that works right out of the box and is compatible with the controller? Why do you need to use this specific device? What's the part number? What do you want to use it for? To hold data/filesystem, etc.?

    Also, as stated earlier, we don't have any example code for "legacy mode" operation. I suppose it'll work a bit like a regular SPI peripheral (but giving you access to the entire width of the bus), but you'd need to manage everything manually, which will likely come with a big performance hit.

    Regards, Andreas

  • Hello Andreas,

    First of all, thank you for your previous guidance.

    Regarding the use of other available SPI ports, I would like to clarify that based on the TI AM625 SoC (as referenced in the technical reference manual, page 481, with further details on pages 482 to 485), the only interface supported for boot is OSPI. Therefore, we are unable to use McSPI for this purpose. Given this limitation, we need to be able to use OSPI in legacy mode.

    I would appreciate it if you could explain the general process for using OSPI in legacy mode. Is there any API support for this? Or is it necessary to access register addresses directly? Are there any other recommended approaches? If possible, I would be grateful if you could share an example algorithm or any relevant documentation for using OSPI in legacy mode.

    So far, the only official source I’ve been able to find for AM625 is the TRM. One of the biggest challenges I have with this document is that many register addresses are not clearly specified. For example, on page 1384, it states that RX-FIFO and TX-FIFO are accessible, but I could not find any actual addresses for accessing them. Additionally, many other registers have different names than those listed in Table 2-1. Memory Map (pages 32–46), which makes it even harder to correlate them. If there is a more complete or detailed reference—such as one that explicitly includes the address of OSPI_IND_AHB_ADDR_TRIGGER_REG—I would greatly appreciate it if you could point me to it.
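
    For what it is worth, one way to correlate names with addresses is to start from the register offsets used by the Linux spi-cadence-quadspi driver and add them to the controller base from the device tree. The sketch below does that. It assumes the AM625 OSPI IP uses the same offsets as the driver (to be checked against the TRM), and the helper names ospi_reg_phys() and ospi_fifo_phys() are invented for this example. In the driver, the FIFO has no register address of its own: it is reached through the AHB data window at the configured trigger address.

```c
#include <stdint.h>

/* Controller register base, from the device tree in this thread. */
#define OSPI_BASE_ADDR 0x0FC40000u

/* Offsets as defined in the Linux spi-cadence-quadspi driver
 * (assumed, not confirmed here, to match the AM625 TRM). */
#define CQSPI_REG_CONFIG            0x00u
#define CQSPI_REG_RD_INSTR          0x04u
#define CQSPI_REG_SIZE              0x14u
#define CQSPI_REG_INDIRECTTRIGGER   0x1Cu  /* indirect AHB trigger address */
#define CQSPI_REG_SDRAMLEVEL        0x2Cu  /* SRAM (FIFO) fill level */
#define CQSPI_REG_INDIRECTRD        0x60u
#define CQSPI_REG_INDTRIG_ADDRRANGE 0x80u

/* Physical address of a controller register (hypothetical helper). */
uint64_t ospi_reg_phys(uint32_t offset)
{
    return (uint64_t)OSPI_BASE_ADDR + offset;
}

/* Physical address at which indirect reads and writes reach the FIFO:
 * the AHB window base plus the trigger address (hypothetical helper). */
uint64_t ospi_fifo_phys(uint64_t ahb_base, uint32_t trigger_addr)
{
    return ahb_base + trigger_addr;
}
```

    Under these assumptions the trigger register would sit at 0x0FC4001C, and with cdns,trigger-address = <0x0> the FIFO would be accessed at the start of the AHB window, 0x500000000.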

    Secondly, I’ve made some progress on the code I previously shared. For data sizes larger than the STIG limit, it can now fetch data from the AHB space. However, the data is being received in a sparse manner. I’ve tried adjusting the read size and access method, but have not been able to resolve the issue.

    Here is the issue in more detail:

    • When I request 32 bytes, the digital analyzer shows that 64 bytes are received.

    • I tested with 16 and 48 bytes as well, and in every case, 2× the expected amount of data is received.

    The received buffer behaves as follows:

    • The first 8 bytes repeat the last byte from the previous request.

    • From the received 2x data:

      • The first x bytes are completely ignored.

      • In the second x bytes:

        • The first 8 bytes are ignored.

        • Of the remaining bytes, 4 are read correctly, and the next 4 are skipped.

        • This pattern continues for each 8-byte block.
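
    To pin down a pattern like this without reading the analyzer trace by eye, one option is to map every received 4-byte word back to its offset in known reference data (for example the same region read through the MTD device). The sketch below is only an illustration: find_word() and map_words() are hypothetical helper names, and it assumes the reference data is distinctive enough (such as an incrementing pattern) that each word occurs only once.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the first occurrence of the 4-byte word at `word`
 * inside ref[0..ref_len), or -1 if it does not occur. */
long find_word(const uint8_t *ref, size_t ref_len, const uint8_t *word)
{
    for (size_t off = 0; off + 4 <= ref_len; off++)
        if (memcmp(ref + off, word, 4) == 0)
            return (long)off;
    return -1;
}

/* For each complete 4-byte word in rx, store the offset it came from
 * in ref (or -1) into map[]. Returns the number of words examined.
 * map[] must have room for rx_len / 4 entries. */
size_t map_words(const uint8_t *ref, size_t ref_len,
                 const uint8_t *rx, size_t rx_len, long *map)
{
    size_t n = rx_len / 4;

    for (size_t i = 0; i < n; i++)
        map[i] = find_word(ref, ref_len, rx + 4 * i);
    return n;
}
```

    If the resulting offsets advance by 8 per word instead of 4, every second word on the bus is being dropped, which would match the "4 read, 4 skipped" description.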

    Please note that this code is a modified version of the Cadence driver. I've included the updated code, results, and digital analyzer captures below. Unfortunately, I couldn’t find any documentation from TI that explains how to use OSPI in legacy mode or provides a clear method for implementation.

    Thank you in advance for your assistance.

    Best regards,
    Hossein



    #include <stdio.h>

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <stdint.h>
    #include <string.h>
    #include <time.h>
    #include <stdbool.h>
    #include <errno.h>
    #include <stdlib.h>

    #include "spi-cadence-quadspi.h"
    typedef unsigned long long _loff_t;

    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
    #define min(x, y) ((x) < (y) ? (x) : (y))

    #define OSPI_BASE_ADDR 0x0FC40000 // Base address from DTB
    #define OSPI_SIZE 0x1000          // Size of register space
    #define OSPI_CMD_WIDTH 0x0        // Command bus width: single 0x0, dual 0x1, quad 0x2, octal 0x3
    #define OSPI_ADDR_WIDTH 0x0       // Address bus width: single 0x0, dual 0x1, quad 0x2, octal 0x3
    #define OSPI_DATA_WIDTH 0x0       // Data bus width: single 0x0, dual 0x1, quad 0x2, octal 0x3
    #define OSPI_CPU_RELAX_time 100
    #define OSPI_MAX_SPEED_HZ 12000000 // Max SCLK in Hz, kept low for the digital analyzer; can be reduced to (OSPI_REF_CLK_HZ-1)/30, about 5 MHz on the AM6252 SanCloud board
    #define OSPI_CS 0x0                // Chip select: CS0 = 0x0, CS1 = 0x1, CS2 = 0x2, CS3 = 0x3
    #define OSPI_rclk_en 0x0
    /*
    This value is based on the device tree. In our device tree, we inherited from BeagleBone.
    OSPI0's clock is set with the following configuration:

        clocks = <&k3_clks 75 7>;

    On AM6252, this clock generates 166,666,666 Hz. You can retrieve this value using the following command:

        cat /sys/kernel/debug/clk/clk_summary | grep -e "75:7" -e "count" -e "---"

    The returned value for AM6252 on the SanCloud board is:

    clock                          count    count    count        rate   accuracy phase  cycle    enable   consumer                         id
    ---------------------------------------------------------------------------------------------------------------------------------------------
    clk:75:7                         0       0        0        166666666   0          0     50000      Y      fc40000.spi                     no_connection_id
    */
    #define OSPI_REF_CLK_HZ 166666666

    /*  Based on the device tree:
    cdns,tshsl-ns = <60>;
    cdns,tsd2d-ns = <60>;
    cdns,tchsh-ns = <60>;
    cdns,tslch-ns = <60>;
    cdns,read-delay = <4>;
    */
    #define OSPI_tshsl_ns 60
    #define OSPI_tchsh_ns 60
    #define OSPI_tslch_ns 60
    #define OSPI_tsd2d_ns 60
    #define OSPI_read_delay 4
    /*
     From the device tree reg property:
    ospi0: spi@fc40000 {
                compatible = "ti,am654-ospi", "cdns,qspi-nor";
                reg = <0x00 0x0fc40000 0x00 0x100>,
                      <0x05 0x00000000 0x01 0x00000000>;
                interrupts = <GIC_SPI 139 IRQ_TYPE_LEVEL_HIGH>;
                cdns,fifo-depth = <256>;
                cdns,fifo-width = <4>;
                cdns,trigger-address = <0x0>;
                cdns,phase-detect-selector = <2>;
                clocks = <&k3_clks 75 7>;
                assigned-clocks = <&k3_clks 75 7>;
                assigned-clock-parents = <&k3_clks 75 8>;
                assigned-clock-rates = <166666666>;
                power-domains = <&k3_pds 75 TI_SCI_PD_EXCLUSIVE>;
                #address-cells = <1>;
                #size-cells = <0>;
                status = "disabled";
            };



    &ospi0 {
        bootph-all;
        status = "okay";
        pinctrl-names = "default";
        pinctrl-0 = <&ospi0_pins_default>;
        reg = <0x00 0x0fc40000 0x00 0x100>,
              <0x05 0x00000000 0x01 1000000>;


        flash@0 {
            bootph-all;
            compatible = "winbond,w77q25nw","jedec,spi-nor";
            reg = <0x0>;
            spi-tx-bus-width = <1>;
            spi-rx-bus-width = <4>;
            spi-max-frequency = <25000000>;
            cdns,tshsl-ns = <60>;
            cdns,tsd2d-ns = <60>;
            cdns,tchsh-ns = <60>;
            cdns,tslch-ns = <60>;
            cdns,read-delay = <4>;
        };
    };

    reg = <0x00 0x0fc40000 0x00 0x100>,
          <0x05 0x00000000 0x01 1000000>;
    ➤ First Entry (registers):
        <0x00 0x0fc40000 0x00 0x100>
            → Address:     0x0FC40000
            → Size:        0x00000100 (256 bytes)
    ➤ Second Entry (AHB window):
        <0x05 0x00000000 0x01 1000000>   → (note: 1000000 is decimal = 0x000F4240, as the hexdump below confirms)
            → Address:     0x500000000 (large address, possibly due to a bus mapping)
            → Size:        0x1000F4240 (4 GiB + 1,000,000 bytes; a 16 MiB window would be <0x00 0x01000000>)


    hexdump -c -C  /sys/firmware/devicetree/base/bus@f0000/bus@fc00000/spi@fc40000/reg
    0000000  \0  \0  \0  \0 017 304  \0  \0  \0  \0  \0  \0  \0  \0 001  \0
    00000000  00 00 00 00 0f c4 00 00  00 00 00 00 00 00 01 00  |................|
    0000010  \0  \0  \0 005  \0  \0  \0  \0  \0  \0  \0 001  \0 017   B   @    ***HOSSEIN ABH address***
    00000010  00 00 00 05 00 00 00 00  00 00 00 01 00 0f 42 40  |..............B@|
    00000020
            */
    #define OSPI_AHB_ADDR 0x500000000 // AHB address space
    #define OSPI_AHB_SIZE 0x102000000 // AHB mapping size. NOTE: this value is about 4 GiB; 16 MiB would be 0x01000000
    #define OSPI_trigger_address 0x0
    #define OSPI_fifo_width 4
    #define OSPI_fifo_depth 256


    static void hex_dump(const void *src, size_t length, size_t line_size,
                         char *prefix)
    {
        int i = 0;
        const unsigned char *address = src;
        const unsigned char *line = address;
        unsigned char c;

        printf("%s | ", prefix);
        if (length == 0)
            printf("\n");
        else
            while (length-- > 0)
            {
                printf("%02X ", *address++);
                if (!(++i % line_size) || (length == 0 && i % line_size))
                {
                    if (length == 0)
                    {
                        while (i++ % line_size)
                            printf("__ ");
                    }
                    printf(" |");
                    while (line < address)
                    {
                        c = *line++;
                        printf("%c", (c < 32 || c > 126) ? '.' : c);
                    }
                    printf("|\n");
                    if (length > 0)
                        printf("%s | ", prefix);
                }
            }
    }

    volatile uint32_t *ospi_base = NULL;
    int fd = 0;

    static inline void write_reg(uint32_t offset, uint32_t value)
    {
        *(volatile uint32_t *)((uintptr_t)ospi_base + offset) = value;
    }

    static inline uint32_t read_reg(uint32_t offset)
    {
        return *(volatile uint32_t *)((uintptr_t)ospi_base + offset);
    }

    static inline uint8_t read_dma(uint32_t *base, uint32_t offset)
    {
        return *(volatile uint8_t *)((uintptr_t)base + offset);
    }

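    /* Byte-wise copy; the casts avoid arithmetic on void pointers, which is a GNU extension. */
    void L_memcpy(void *dest, const void *src, size_t len)
    {
        uint8_t *d = (uint8_t *)dest;
        const uint8_t *s = (const uint8_t *)src;

        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    }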

    static int cqspi_wait_for_bit(uint32_t offset, const uint32_t mask, bool clr)
    {
        /* Polling for completion. */
        struct timespec start_time, current_time;

        clock_gettime(CLOCK_MONOTONIC, &start_time);
        unsigned long timeout_ns = start_time.tv_sec * 1000000000L + start_time.tv_nsec +
                                   (CQSPI_TIMEOUT_MS * 1000000L);
        // Wait for the bit(s) to reach the requested state
        while (1)
        {
            uint32_t val = read_reg(offset);
            if (((clr ? ~val : val) & mask) == mask)
                return 0;
            usleep(OSPI_CPU_RELAX_time);
            clock_gettime(CLOCK_MONOTONIC, &current_time);
            unsigned long now_ns = current_time.tv_sec * 1000000000L + current_time.tv_nsec;
            if (now_ns > timeout_ns)
            {
                /* Timeout, in busy mode. perror() does not expand %d, so use fprintf(). */
                fprintf(stderr, "QSPI is still busy after %dms timeout.\n", (int)CQSPI_TIMEOUT_MS);
                return -ETIMEDOUT;
            }
        }
    }

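    static int cqspi_setup_opcode_ext(uint16_t cmd, uint32_t shift)
    {
        /* The low byte of a two-byte command is the opcode extension; zero is rejected. */
        uint8_t ext = cmd & 0xFF;
        if (!ext)
            return -2;
        uint32_t reg = read_reg(CQSPI_REG_OP_EXT_LOWER);
        printf("ext:0x%X\n", ext);
        /* Shift as unsigned so that a shift of 24 does not overflow a signed int. */
        reg &= ~((uint32_t)0xFF << shift);
        reg |= (uint32_t)ext << shift;
        write_reg(CQSPI_REG_OP_EXT_LOWER, reg);
        return 0;
    }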
    static bool cqspi_is_idle()
    {
        uint32_t reg = read_reg(CQSPI_REG_CONFIG);
        return reg & (1UL << CQSPI_REG_CONFIG_IDLE_LSB);
    }

    int cqspi_wait_idle()
    {
        const unsigned int poll_idle_retry = 3;
        unsigned int count = 0;
        struct timespec start_time, current_time;

        clock_gettime(CLOCK_MONOTONIC, &start_time);
        unsigned long timeout_ns = start_time.tv_sec * 1000000000L + start_time.tv_nsec +
                                   (CQSPI_TIMEOUT_MS * 1000000L);

        while (1)
        {
            if (cqspi_is_idle())
                count++;
            else
                count = 0;

            if (count >= poll_idle_retry)
                return 0;

            clock_gettime(CLOCK_MONOTONIC, &current_time);
            unsigned long now_ns = current_time.tv_sec * 1000000000L + current_time.tv_nsec;

            if (now_ns > timeout_ns)
            {
                /* Timeout, in busy mode. perror() does not expand %d, so use fprintf(). */
                fprintf(stderr, "QSPI is still busy after %dms timeout.\n", (int)CQSPI_TIMEOUT_MS);
                return -ETIMEDOUT;
            }
            usleep(OSPI_CPU_RELAX_time);
        }
    }

    static int cqspi_enable_dtr(uint16_t cmd, uint32_t cmdSize, uint32_t shift)
    {
        unsigned int reg;
        int ret;

        reg = read_reg(CQSPI_REG_CONFIG);

        /*
         * We enable dual byte opcode here. The callers have to set up the
         * extension opcode based on which type of operation it is.
         */
        if (cmdSize > 1)
        {
            reg |= CQSPI_REG_CONFIG_DTR_PROTO;
            reg |= CQSPI_REG_CONFIG_DUAL_OPCODE;

            /* Set up command opcode extension. */
            ret = cqspi_setup_opcode_ext(cmd, shift);
            if (ret)
                return ret;
        }
        else
        {
            reg &= ~CQSPI_REG_CONFIG_DTR_PROTO;
            reg &= ~CQSPI_REG_CONFIG_DUAL_OPCODE;
        }

        write_reg(CQSPI_REG_CONFIG, reg);

        return cqspi_wait_idle();
    }

    static unsigned int cqspi_calc_rdreg()
    {
        uint32_t rdreg = 0;

        rdreg |= OSPI_CMD_WIDTH << CQSPI_REG_RD_INSTR_TYPE_INSTR_LSB;
        rdreg |= OSPI_ADDR_WIDTH << CQSPI_REG_RD_INSTR_TYPE_ADDR_LSB;
        rdreg |= OSPI_DATA_WIDTH << CQSPI_REG_RD_INSTR_TYPE_DATA_LSB;

        return rdreg;
    }
    static int cqspi_exec_flash_cmd(uint32_t reg)
    {

        /* Write the CMDCTRL without start execution. */
        write_reg(CQSPI_REG_CMDCTRL, reg);
        /* Start execute */
        reg |= CQSPI_REG_CMDCTRL_EXECUTE_MASK;
        write_reg(CQSPI_REG_CMDCTRL, reg);

        // /* Polling for completion. */
        int ret = cqspi_wait_for_bit(CQSPI_REG_CMDCTRL,
                                     CQSPI_REG_CMDCTRL_INPROGRESS_MASK, 1);
        if (ret)
        {
            perror("Flash command execution timed out.\n");
            return ret;
        }

        /* Polling QSPI idle status. */
        return cqspi_wait_idle();
    }

    /* Send a command and read back up to CQSPI_STIG_DATA_LEN_MAX bytes through STIG. */
    int cqspi_STIG_read(uint16_t cmd, uint32_t cmdSize, uint32_t address, uint8_t address_len,
                        uint8_t dummy_cycles, int read_len, uint8_t *rxbuf)
    {
        int status;

        status = cqspi_enable_dtr(cmd, cmdSize, CQSPI_REG_OP_EXT_STIG_LSB);
        if (status)
            return status;
        /* Reject lengths that would not fit in the two STIG read-data registers. */
        if (!rxbuf || read_len < 0 || read_len > CQSPI_STIG_DATA_LEN_MAX)
        {
            fprintf(stderr, "Invalid input argument: read_len %d or rxbuf\n", read_len);
            return -EINVAL;
        }
        uint8_t opcode = (cmdSize <= 1) ? cmd : (cmd >> 8);
        uint32_t reg = opcode << CQSPI_REG_CMDCTRL_OPCODE_LSB;
        uint32_t rdreg = cqspi_calc_rdreg();
        write_reg(CQSPI_REG_RD_INSTR, rdreg);

        uint32_t exdummyBytes = 0;
        if (dummy_cycles > CQSPI_DUMMY_CLKS_MAX)
        {
            exdummyBytes = (dummy_cycles - CQSPI_DUMMY_CLKS_MAX) / CQSPI_DUMMY_CLKS_PER_BYTE;
            dummy_cycles = CQSPI_DUMMY_CLKS_MAX;
            if (address_len == 0)
                address = 0;
            if ((address_len + exdummyBytes - 1) <= CQSPI_REG_CMDCTRL_ADD_BYTES_MASK)
            {
                address_len += exdummyBytes;
                exdummyBytes = 0;
            }
        }

        if (dummy_cycles)
            reg |= (dummy_cycles & CQSPI_REG_CMDCTRL_DUMMY_MASK)
                   << CQSPI_REG_CMDCTRL_DUMMY_LSB;
        if (read_len)
        {
            reg |= (0x1 << CQSPI_REG_CMDCTRL_RD_EN_LSB);

            /* 0 means 1 byte. */
            reg |= (((read_len - 1) & CQSPI_REG_CMDCTRL_RD_BYTES_MASK)
                    << CQSPI_REG_CMDCTRL_RD_BYTES_LSB);
        }
        /* setup ADDR BIT field */
        if (address_len)
        {
            reg |= (0x1 << CQSPI_REG_CMDCTRL_ADDR_EN_LSB);
            reg |= ((address_len - 1) &
                    CQSPI_REG_CMDCTRL_ADD_BYTES_MASK)
                   << CQSPI_REG_CMDCTRL_ADD_BYTES_LSB;

            write_reg(CQSPI_REG_CMDADDRESS, address);
        }

        status = cqspi_exec_flash_cmd(reg);
        if (status)
            return status;

        // reading data
        int len = (read_len > 4) ? 4 : read_len;
        if (read_len)
        {
            reg = read_reg(CQSPI_REG_CMDREADDATALOWER);
            L_memcpy(rxbuf, &reg, len);
        }
        if (read_len > 4)
        {
            reg = read_reg(CQSPI_REG_CMDREADDATAUPPER);
            len = read_len - len;
            L_memcpy(rxbuf + 4, &reg, len);
        }

        /* Reset CMD_CTRL Reg once command read completes */
        write_reg(CQSPI_REG_CMDCTRL, 0);
    #ifdef CQSPI_DEBUG
        if (read_len)
            hex_dump(rxbuf, read_len, 32, "  RX");
    #endif

        return cqspi_wait_idle();
    }

    static void cqspi_controller_enable(bool enable)
    {
        uint32_t reg;

        reg = read_reg(CQSPI_REG_CONFIG);

        if (enable)
            reg |= CQSPI_REG_CONFIG_ENABLE_MASK;
        else
            reg &= ~CQSPI_REG_CONFIG_ENABLE_MASK;

        write_reg(CQSPI_REG_CONFIG, reg);
    }
    static void cqspi_chipselect(uint32_t cs)
    {
        unsigned int chip_select = cs;
        unsigned int reg;

        reg = read_reg(CQSPI_REG_CONFIG);
        reg &= ~CQSPI_REG_CONFIG_DECODE_MASK;

        /* Convert CS if without decoder.
         * CS0 to 4b'1110
         * CS1 to 4b'1101
         * CS2 to 4b'1011
         * CS3 to 4b'0111
         */
        chip_select = 0xF & ~(1 << chip_select);
        // }

        reg &= ~(CQSPI_REG_CONFIG_CHIPSELECT_MASK
                 << CQSPI_REG_CONFIG_CHIPSELECT_LSB);
        reg |= (chip_select & CQSPI_REG_CONFIG_CHIPSELECT_MASK)
               << CQSPI_REG_CONFIG_CHIPSELECT_LSB;
        write_reg(CQSPI_REG_CONFIG, reg);
    }
    static void cqspi_config_baudrate_div(uint32_t sclk)
    {
        uint32_t reg, div;

        /* Recalculate the baudrate divisor based on QSPI specification. */
        div = DIV_ROUND_UP(OSPI_REF_CLK_HZ, 2 * sclk) - 1;

        /* Maximum baud divisor */
        if (div > CQSPI_REG_CONFIG_BAUD_MASK)
        {
            div = CQSPI_REG_CONFIG_BAUD_MASK;
            perror("Unable to adjust clock ");
        }
        reg = read_reg(CQSPI_REG_CONFIG);
        reg &= ~(CQSPI_REG_CONFIG_BAUD_MASK << CQSPI_REG_CONFIG_BAUD_LSB);
        reg |= (div & CQSPI_REG_CONFIG_BAUD_MASK) << CQSPI_REG_CONFIG_BAUD_LSB;
        write_reg(CQSPI_REG_CONFIG, reg);
    }
    static unsigned int calculate_ticks_for_ns(const unsigned int ref_clk_hz,
                                               const unsigned int ns_val)
    {
        unsigned int ticks;

        ticks = ref_clk_hz / 1000; /* kHz */
        ticks = DIV_ROUND_UP(ticks * ns_val, 1000000);

        return ticks;
    }

    static void cqspi_delay(uint32_t sclk)
    {
        unsigned int tshsl, tchsh, tslch, tsd2d;
        unsigned int reg;
        unsigned int tsclk;

        /* calculate the number of ref ticks for one sclk tick */
        tsclk = DIV_ROUND_UP(OSPI_REF_CLK_HZ, sclk);

        tshsl = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tshsl_ns);
        /* this particular value must be at least one sclk */
        if (tshsl < tsclk)
            tshsl = tsclk;

        tchsh = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tchsh_ns);
        tslch = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tslch_ns);
        tsd2d = calculate_ticks_for_ns(OSPI_REF_CLK_HZ, OSPI_tsd2d_ns);

        reg = (tshsl & CQSPI_REG_DELAY_TSHSL_MASK)
              << CQSPI_REG_DELAY_TSHSL_LSB;
        reg |= (tchsh & CQSPI_REG_DELAY_TCHSH_MASK)
               << CQSPI_REG_DELAY_TCHSH_LSB;
        reg |= (tslch & CQSPI_REG_DELAY_TSLCH_MASK)
               << CQSPI_REG_DELAY_TSLCH_LSB;
        reg |= (tsd2d & CQSPI_REG_DELAY_TSD2D_MASK)
               << CQSPI_REG_DELAY_TSD2D_LSB;
        write_reg(CQSPI_REG_DELAY, reg);
    }
    static void cqspi_readdata_capture(const bool bypass,
                                       const bool dqs, const unsigned int delay)
    {
        unsigned int reg;

        reg = read_reg(CQSPI_REG_READCAPTURE);

        if (bypass)
            reg |= (1 << CQSPI_REG_READCAPTURE_BYPASS_LSB);
        else
            reg &= ~(1 << CQSPI_REG_READCAPTURE_BYPASS_LSB);

        reg &= ~(CQSPI_REG_READCAPTURE_DELAY_MASK
                 << CQSPI_REG_READCAPTURE_DELAY_LSB);

        reg |= (delay & CQSPI_REG_READCAPTURE_DELAY_MASK)
               << CQSPI_REG_READCAPTURE_DELAY_LSB;

        if (dqs)
            reg |= (1 << CQSPI_REG_READCAPTURE_DQS_LSB);
        else
            reg &= ~(1 << CQSPI_REG_READCAPTURE_DQS_LSB);

        write_reg(CQSPI_REG_READCAPTURE, reg);
    }

    static void cqspi_configure(uint32_t sclk, uint32_t cs)
    {
        // struct cqspi_st *cqspi = f_pdata->cqspi;
        static uint32_t current_cs = 0;
        static uint32_t current_sclk = 0;
        bool switch_cs = (current_cs != cs);
        bool switch_ck = (current_sclk != sclk);

        if (switch_cs || switch_ck)
            cqspi_controller_enable(0);

        /* Switch chip select. */
        if (switch_cs)
        {
            current_cs = cs;
            cqspi_chipselect(cs);
        }

        /* Setup baudrate divisor and delays */
        if (switch_ck)
        {
            current_sclk = sclk;
            cqspi_config_baudrate_div(current_sclk);
            cqspi_delay(current_sclk);
            cqspi_readdata_capture(!(OSPI_rclk_en), false,
                                   OSPI_read_delay);
        }

        // if (switch_cs || switch_ck)
        cqspi_controller_enable(1);
    }

    int32_t cqspi_STIG_write(uint16_t cmd, uint32_t cmdSize, uint32_t address, uint8_t address_len,
                             const uint8_t *txbuf, uint32_t n_tx)
    {
        uint8_t opcode;
        unsigned int reg;
        unsigned int data;
        size_t write_len;
        int ret;

        ret = cqspi_enable_dtr(cmd, cmdSize, CQSPI_REG_OP_EXT_STIG_LSB);
        if (ret)
            return ret;

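        if (n_tx > CQSPI_STIG_DATA_LEN_MAX || (n_tx && !txbuf))
        {
            /* Not an errno failure, so report it with fprintf rather than perror. */
            fprintf(stderr, "Invalid input argument: n_tx %u, txbuf %p\n",
                    (unsigned int)n_tx, (const void *)txbuf);
            return -EINVAL;
        }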

        reg = cqspi_calc_rdreg();
        write_reg(CQSPI_REG_RD_INSTR, reg);

        opcode = (cmdSize <= 1) ? cmd : (cmd >> 8);

        reg = opcode << CQSPI_REG_CMDCTRL_OPCODE_LSB;

        if (address_len)
        {
            reg |= (0x1 << CQSPI_REG_CMDCTRL_ADDR_EN_LSB);
            reg |= ((address_len - 1) &
                    CQSPI_REG_CMDCTRL_ADD_BYTES_MASK)
                   << CQSPI_REG_CMDCTRL_ADD_BYTES_LSB;

            write_reg(CQSPI_REG_CMDADDRESS, address);
        }

        if (n_tx)
        {
            reg |= (0x1 << CQSPI_REG_CMDCTRL_WR_EN_LSB);
            reg |= ((n_tx - 1) & CQSPI_REG_CMDCTRL_WR_BYTES_MASK)
                   << CQSPI_REG_CMDCTRL_WR_BYTES_LSB;
            data = 0;
            write_len = (n_tx > 4) ? 4 : n_tx;
            memcpy(&data, txbuf, write_len);
            txbuf += write_len;
            write_reg(CQSPI_REG_CMDWRITEDATALOWER, data);

            if (n_tx > 4)
            {
                data = 0;
                write_len = n_tx - 4;
                memcpy(&data, txbuf, write_len);
                write_reg(CQSPI_REG_CMDWRITEDATAUPPER, data);
            }
        }

        ret = cqspi_exec_flash_cmd(reg);

        /* Reset CMD_CTRL Reg once command write completes */
        write_reg(CQSPI_REG_CMDCTRL, 0);
        printf("executed cmd:0x%X, cmdsize:[%d] ,address:0x%X, addresslen:[%d],datasize[%d] \n",
               cmd, cmdSize, address, address_len, n_tx);

        return ret;
    }

    static int cqspi_read_setup(const uint16_t cmd, uint32_t cmdSize, uint32_t dummy_clk, uint32_t addressSize)
    {
        unsigned int reg;
        int ret;
        uint8_t opcode;

        ret = cqspi_enable_dtr(cmd, cmdSize, CQSPI_REG_OP_EXT_READ_LSB);
        if (ret)
            return ret;

        if (cmdSize > 1)
            opcode = cmd >> 8;
        else
            opcode = cmd;

        reg = opcode << CQSPI_REG_RD_INSTR_OPCODE_LSB;
        reg |= cqspi_calc_rdreg();

        /* Setup dummy clock cycles */
        if (dummy_clk > CQSPI_DUMMY_CLKS_MAX)
            dummy_clk = CQSPI_DUMMY_CLKS_MAX;

        if (dummy_clk)
            reg |= (dummy_clk & CQSPI_REG_RD_INSTR_DUMMY_MASK)
                   << CQSPI_REG_RD_INSTR_DUMMY_LSB;

        write_reg(CQSPI_REG_RD_INSTR, reg);

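        /* Set address width. The field holds (bytes - 1); skip it when addressSize is 0
         * so the subtraction cannot underflow and overwrite the rest of the register. */
        reg = read_reg(CQSPI_REG_SIZE);
        reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
        if (addressSize)
            reg |= (addressSize - 1) & CQSPI_REG_SIZE_ADDRESS_MASK;
        write_reg(CQSPI_REG_SIZE, reg);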
        read_reg(CQSPI_REG_SIZE); /* Flush posted write. */
        return 0;
    }

    static uint32_t cqspi_get_rd_sram_level()
    {
        uint32_t reg = read_reg(CQSPI_REG_SDRAMLEVEL);

        reg >>= CQSPI_REG_SDRAMLEVEL_RD_LSB;
        return reg & CQSPI_REG_SDRAMLEVEL_RD_MASK;
    }

    static int cqspi_indirect_read_execute(uint8_t *rxbuf, _loff_t from_addr,
                                           const size_t n_rx)
    {
        volatile uint32_t *ahb_base = NULL;
        ahb_base = mmap(NULL, OSPI_AHB_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, OSPI_AHB_ADDR);
        if (ahb_base == MAP_FAILED)
        {
            perror("mmap ahb");
            return -1;
        }
        unsigned int remaining = n_rx;
        // unsigned int mod_bytes = n_rx % 4;
        unsigned int bytes_to_read = 0;
        // uint8_t *rxbuf_end = rxbuf + n_rx;
        int ret = 0;
        write_reg(CQSPI_REG_INDIRECTRDSTARTADDR, from_addr);
        write_reg(CQSPI_REG_INDIRECTRDBYTES, n_rx);

        /* Clear all interrupts. */
        write_reg(CQSPI_REG_IRQSTATUS, CQSPI_IRQ_STATUS_MASK);

        write_reg(CQSPI_REG_IRQMASK, CQSPI_REG_IRQ_WATERMARK);

        write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_START_MASK);

        read_reg(CQSPI_REG_INDIRECTRD); /* Flush posted write. */

        struct timespec start_time, current_time;

        clock_gettime(CLOCK_MONOTONIC, &start_time);
        unsigned long timeout_ns = start_time.tv_sec * 1000000000L + start_time.tv_nsec +
                                   (CQSPI_TIMEOUT_MS * 1000000L);
        // wait for the read data to arrive in the SRAM FIFO
        while (1)
        {

            if (cqspi_get_rd_sram_level() * OSPI_fifo_width >= remaining)
                break;
            usleep(OSPI_CPU_RELAX_time);
            clock_gettime(CLOCK_MONOTONIC, &current_time);
            unsigned long now_ns = current_time.tv_sec * 1000000000L + current_time.tv_nsec;
            if (now_ns > timeout_ns)
            {
                /* Timeout: set an error code so the caller does not see 0 (success). */
                printf("indirect read timeout\n");
                ret = -ETIMEDOUT;
                goto failrd;
            }
        }
     
        while (remaining > 0)
        {
            /*
             * Disable all read interrupts until
             * we are out of "bytes to read"
             */
            bytes_to_read = cqspi_get_rd_sram_level();
            while (bytes_to_read != 0)
            {
                bytes_to_read = min(remaining, bytes_to_read * OSPI_fifo_width);

                uint32_t t = 0;
                bytes_to_read = min(bytes_to_read, sizeof(t));

                t = *(volatile uint32_t *)((uintptr_t)ahb_base);
                memcpy(rxbuf, &t, bytes_to_read);

                rxbuf += bytes_to_read;
                remaining -= bytes_to_read;
                bytes_to_read = cqspi_get_rd_sram_level();
            }

            if (remaining > 0)
                write_reg(CQSPI_REG_IRQSTATUS, CQSPI_IRQ_STATUS_MASK);
        }
        /* Check indirect done status */
        ret = cqspi_wait_for_bit(CQSPI_REG_INDIRECTRD,
                                 CQSPI_REG_INDIRECTRD_DONE_MASK, 0);
        if (ret)
        {
            perror("Indirect read completion error \n");
            goto failrd;
        }

        /* Disable interrupt */
        write_reg(CQSPI_REG_IRQMASK, 0);

        /* Clear indirect completion status */
        write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_DONE_MASK);
        munmap((void *)ahb_base, OSPI_AHB_SIZE);
        // close(uio_fd);
        return 0;

    failrd:
        /* Disable interrupt */
        write_reg(CQSPI_REG_IRQMASK, 0);

        /* Cancel the indirect read */
        write_reg(CQSPI_REG_INDIRECTRD, CQSPI_REG_INDIRECTRD_CANCEL_MASK);
        munmap((void *)ahb_base, OSPI_AHB_SIZE);
        return ret;
    }

    static ssize_t cqspi_read(const uint16_t cmd, uint32_t cmdSize, uint32_t address, uint32_t addressSize,
                              uint32_t dummyCycles, uint32_t dataInSize, uint8_t *dataIn)
    {
        _loff_t from = address;
        size_t len = dataInSize;
        uint8_t *buf = dataIn;
        int ret;

        uint32_t extradummy = 0;
        if (dummyCycles > CQSPI_DUMMY_CLKS_MAX)
        {
            /* Compute the excess before clamping; after clamping it would always be 0. */
            extradummy = (dummyCycles - CQSPI_DUMMY_CLKS_MAX) / CQSPI_DUMMY_CLKS_PER_BYTE;
            dummyCycles = CQSPI_DUMMY_CLKS_MAX;
        }
        ret = cqspi_read_setup(cmd, cmdSize, dummyCycles, addressSize);
        if (ret)
            return ret;
        (void)extradummy;
        ret = cqspi_indirect_read_execute(buf, from, len);

        hex_dump(buf, dataInSize, 32, "  RX");
        return ret;
    }

    static void cqspi_controller_init()
    {
        uint32_t reg;

        cqspi_controller_enable(0);

        /* Configure the remap address register, no remap */
        write_reg(CQSPI_REG_REMAP, 0);

        /* Disable all interrupts. */
        write_reg(CQSPI_REG_IRQMASK, 0);

        /* Configure the SRAM split to 1:1 . */
        write_reg(CQSPI_REG_SRAMPARTITION, OSPI_fifo_depth / 2);

        /* Load indirect trigger address. */
        write_reg(CQSPI_REG_INDIRECTTRIGGER, OSPI_trigger_address);

        /* Program read watermark -- 1/2 of the FIFO. */
        write_reg(CQSPI_REG_INDIRECTRDWATERMARK, OSPI_fifo_depth * OSPI_fifo_width / 2);
        /* Program write watermark -- 1/8 of the FIFO. */
        write_reg(CQSPI_REG_INDIRECTWRWATERMARK, OSPI_fifo_depth * OSPI_fifo_width / 8);

        /* Disable direct access controller */
        reg = read_reg(CQSPI_REG_CONFIG);
        reg &= ~CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL;
        write_reg(CQSPI_REG_CONFIG, reg);
     
        /* Disable DMA interface */
        reg = read_reg(CQSPI_REG_CONFIG);
        reg &= ~CQSPI_REG_CONFIG_DMA_MASK;
        write_reg(CQSPI_REG_CONFIG, reg);

        cqspi_controller_enable(1);
    }

    int32_t SendBuffer_cqspi(const uint8_t *dataOutStream, uint32_t cmdSize, uint32_t addressSize,
                             uint32_t dataOutSize, uint32_t dummyCycles, uint8_t *dataIn, uint32_t dataInSize)
    {
        fd = open("/dev/mem", O_RDWR | O_SYNC);
        if (fd < 0)
        {
            perror("open");
            return -1;
        }

        // Map the OSPI register space.
        ospi_base = mmap(NULL, OSPI_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, OSPI_BASE_ADDR);
        if (ospi_base == MAP_FAILED)
        {
            perror("mmap");
            close(fd);
            return -1;
        }

        cqspi_configure(OSPI_MAX_SPEED_HZ, OSPI_CS);
        cqspi_controller_init();
        uint16_t cmd = 0;
        uint32_t address = 0;
        for (uint32_t i = 0; i < cmdSize; i++)
        {
            cmd <<= 8;
            cmd |= dataOutStream[i];
        }
        for (uint32_t i = 0; i < addressSize; i++)
        {
            address <<= 8;
            address |= dataOutStream[cmdSize + i];
        }

        printf("cmd:0x%X,cmdsize:[%d] ,address:0x%X, addresslen:[%d],datasize[%d] ,DummyCycle:[%d],Datainsize:[%d]\n",
               cmd, cmdSize, address, addressSize, dataOutSize, dummyCycles, dataInSize);
        hex_dump(dataOutStream, cmdSize + addressSize + dataOutSize, 32, "  TX");

        if (dataInSize)
        {
            if (dataInSize <= CQSPI_STIG_DATA_LEN_MAX)
                cqspi_STIG_read(cmd, cmdSize, address, addressSize, dummyCycles, dataInSize, dataIn);
            else
                cqspi_read(cmd, cmdSize, address, addressSize, dummyCycles, dataInSize, dataIn);
        }
        else if (dataOutSize <= CQSPI_STIG_DATA_LEN_MAX)
        {
            cqspi_STIG_write(cmd, cmdSize, address, addressSize, &dataOutStream[cmdSize + addressSize], dataOutSize);
        }
        else
        {
            // cqspi_DMA_write(cmd, address, dummyCycles, CQSPI_STIG_DATA_LEN_MAX, dataOutSize, &dataOutStream[cmdSize + addressSize]);
        }

        cqspi_controller_enable(0);
        munmap((void *)ospi_base, OSPI_SIZE);
        close(fd);
        return 0;
    }

    // gcc -o qlib_platform_qspi  qlib_platform_qspi.c -DEXECUTE_TEST=1 -g3
    int main()
    {
        uint8_t dataOutStream[16] = {0};
        uint8_t dataIn[100] = {0};

        dataOutStream[0] = 0xA2; // Opcode
        SendBuffer_cqspi(dataOutStream, 1 /*cmdsize*/, 0 /*address size*/, 0 /*dataoutsize*/, 0 /*Dummycycles*/, dataIn, 16 /*data in size*/);
        return 0;
    }

    Code execution result

    cmd:0xA2,cmdsize:[1] ,address:0x0, addresslen:[0],datasize[0] ,DummyCycle:[0],Datainsize:[16]
    TX | A2 __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ |.|
    RX | 07 07 07 07 07 07 07 07 DE 29 53 B5 8B 05 1A 44 __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ |.........)S....D|

    Digital analyzer result

    Time [s] Packet ID MOSI MISO
    -2.2E-05 0xA2 0xFF
    -2.2E-05 0x00 0xFE
    -2.1E-05 0x00 0xF3
    -2E-05 0x00 0x1A
    -2E-05 0x00 0x4D
    -1.9E-05 0x00 0x2A
    -1.8E-05 0x00 0x08
    -1.7E-05 0x00 0xC8
    -1.7E-05 0x00 0xB2
    -1.6E-05 0x00 0x84
    -1.5E-05 0x00 0xB5
    -1.5E-05 0x00 0x24
    -1.4E-05 0x04 0xDD
    -1.3E-05 0x00 0xEB
    -1.3E-05 0x00 0xAF
    -1.2E-05 0x00 0x6A
    -1.1E-05 0x00 0xA0
    -1.1E-05 0x00 0x90
    -1E-05 0x00 0xE8
    -9.4E-06 0x00 0x12
    -8.7E-06 0x00 0xD8
    -8.1E-06 0x00 0xDE
    -7.4E-06 0x00 0x29
    -6.7E-06 0x00 0x53
    -6.1E-06 0x00 0xB5
    -5.4E-06 0x00 0xA8
    -4.7E-06 0x00 0xD6
    -4E-06 0x00 0x08
    -3.4E-06 0x00 0xE9
    -2.7E-06 0x00 0x8B
    -2E-06 0x00 0x05
    -1.4E-06 0x00 0x1A
    -6.8E-07 0x00 0x44

  • I'm working on some critical assignments right now so it won't be until next week before I can look at this again.

    Regards, Andreas

  • Hi Andreas,

    I hope you’ve successfully completed your assignment.

    I just wanted to let you know that I was able to resolve most of the issues I was previously facing by porting the OSPI-related code from MCU+ SDK to Linux user space. I'm currently using version 11.00.00.16, and all the functions I’m calling are from ospi.h. With this approach, I’ve managed to solve more than 90% of the problems.

    However, I still have two major issues:

    1. Delayed Read Start (Indirect Read):
      The indirect read always starts with a one-byte delay. In other words, the data is not read immediately after the address is sent.
    2. Write Behavior with Larger Addresses:
      When writing data via indirect mode, especially to larger addresses (e.g., addresses above 0x66xxxxxx), the data is sometimes split unexpectedly. The controller asserts and de-asserts the CS line in the middle of the transfer, resends the command and address, and then continues sending the remaining data.

    Additionally, I’ve noticed that when sending more than 256 bytes in a single command, it causes issues for both read and write operations. It seems necessary to keep each transaction under 256 bytes, though this is a manageable limitation.

    Could you please advise:

    • Is there a way (similar to disabling Write Ready CMD) to disable address verification during write operations?
    • How can I ensure the data transfer starts immediately after the address phase in indirect reads?

    For reference, I’ve already disabled the MOD bit, and this delayed read issue doesn’t occur in STIG mode.

    Below is the function I use for this implementation.

    Thank you in advance for your support!

    Best regards,

    Hossein

        uint8_t temp[cashsize] = {0};

        int32_t SendBuffer_ospi(const uint8_t *dataOutStream, uint32_t cmdSize, uint32_t addressSize,
                                uint32_t dataOutSize, uint32_t dummyCycles, uint8_t *dataIn, uint32_t dataInSize)
        {
            int32_t status = SystemP_SUCCESS;
            // Flash_NorOspiObject *obj = (Flash_NorOspiObject *)(config->object);

            uint16_t cmd = 0;

            uint32_t address = addressSize ? 0 : OSPI_CMD_INVALID_ADDR;
            for (uint32_t i = 0; i < cmdSize; i++)
            {
                cmd <<= 8;
                cmd |= dataOutStream[i];
            }
            for (uint32_t i = 0; i < addressSize; i++)
            {
                address <<= 8;
                address |= dataOutStream[cmdSize + i];
            }

            while ((dummyCycles > CSL_OSPI_FLASH_CFG_FLASH_CMD_CTRL_REG_NUM_DUMMY_CYCLES_FLD_MAX) &&
                   (addressSize < 4))
            {
                address <<= 8;
                addressSize++;
                dummyCycles -= 8;
            }
            uint32_t exdummy = 0;

            while (dummyCycles > CSL_OSPI_FLASH_CFG_FLASH_CMD_CTRL_REG_NUM_DUMMY_CYCLES_FLD_MAX)
            {
                dummyCycles -= 8;
                exdummy++;
            }
            if (exdummy)
                memset(temp, 0, cashsize);

    #ifdef OSPIAM62X_DEBUG
            static uint32_t count = 0;
            printf("%d:cmd:0x%X,cmdsize:[%d] ,address:0x%X, addresslen:[%d],datasize[%d] ,DummyCycle:[%d],Datainsize:[%d]\n",
                   count++, cmd, cmdSize, address, addressSize, dataOutSize, dummyCycles + (exdummy * 8), dataInSize);
            hex_dump(dataOutStream, cmdSize + addressSize + dataOutSize, 32, "  TX");

    #endif

            if ((dataInSize > 0) && (dataOutSize > 0))
            {
                status = SystemP_FAILURE;
                perror("dataInSize and dataOutSize should not be both > 0");
                exit(-1);
            }

            if (dataInSize + exdummy)
            {
                if (((dataInSize + exdummy) - 1) <= CSL_OSPI_FLASH_CFG_FLASH_COMMAND_CTRL_MEM_REG_NB_OF_STIG_READ_BYTES_FLD_MAX)
                {
                    OSPI_ReadCmdParams rdParams;
                    OSPI_ReadCmdParams_init(&rdParams);
                    rdParams.cmd = cmd;
                    rdParams.cmdAddr = address;
                    rdParams.numAddrBytes = addressSize;
                    rdParams.rxDataBuf = (0 == exdummy) ? dataIn : temp;
                    rdParams.rxDataLen = dataInSize + exdummy;
                    rdParams.dummyBits = dummyCycles;
                    status = OSPI_readCmd(OspiHandle, &rdParams);
                    if (exdummy)
                        memcpy(dataIn, &temp[exdummy], dataInSize);
                }
                else
                {
                    /* read INDIRECT FOR BIG DATA READ*/
    #ifdef OSPIAM62X_DEBUG
                    if (dummyCycles == 0)
                    {
                        printf("\033[0;33m READ WITH 0 dummy !!!!\n\033[0m");
                        // exit(1);
                    }
    #endif
                    OSPI_setXferOpCodes(OspiHandle, cmd, 0);
                    OSPI_setReadDummyCycles(OspiHandle, dummyCycles);
                    OSPI_Transaction transaction;
                    OSPI_phyReadTunedVal(OspiHandle);
                    OSPI_Transaction_init(&transaction);
                    OSPI_setNumAddrBytes(OspiHandle, addressSize);
                    transaction.addrOffset = address;
                    transaction.buf = (void *)((0 == exdummy) ? dataIn : temp);
                    transaction.count = dataInSize + exdummy;
                    transaction.dmaCopyLowerLimit = OSPI_NOR_DMA_COPY_LOWER_LIMIT;
                    status = OSPI_readIndirect(OspiHandle, &transaction);
                    // OSPI_setXferOpCodes(OspiHandle, 0x03, 0x02);
                    OSPI_setReadDummyCycles(OspiHandle, 0);
                    OSPI_setNumAddrBytes(OspiHandle, 3);
                    if (exdummy)
                        memcpy(dataIn, &temp[exdummy], dataInSize);
                }

    #ifdef OSPIAM62X_DEBUG
                if (status != SystemP_SUCCESS)
                    printf("OSPI_readCmd failed: %d\n", status);
                // else
                hex_dump(dataIn, dataInSize, 32, "  RX");
    #endif
            }
            else if ((dataOutSize + exdummy) <= (1 + CSL_OSPI_FLASH_CFG_FLASH_COMMAND_CTRL_MEM_REG_NB_OF_STIG_READ_BYTES_FLD_MAX))
            {
                OSPI_WriteCmdParams wrParams;
                OSPI_WriteCmdParams_init(&wrParams);
                wrParams.cmd = cmd;
                wrParams.cmdAddr = address;
                wrParams.numAddrBytes = addressSize;

                if (exdummy)
                {
                    wrParams.txDataBuf = (void *)&temp[0];
                    memcpy(&temp[exdummy], &dataOutStream[cmdSize + addressSize], dataOutSize);
                }
                else
                    wrParams.txDataBuf = (void *)&dataOutStream[cmdSize + addressSize];
                wrParams.txDataLen = dataOutSize + exdummy;

                status = OSPI_writeCmd(OspiHandle, &wrParams);
            }
            else
            {
                OSPI_setXferOpCodes(OspiHandle, 0x03, cmd);
                OSPI_setWritedDummyCycles(OspiHandle, dummyCycles);

                OSPI_Transaction transaction;

                OSPI_Transaction_init(&transaction);

                OSPI_setNumAddrBytes(OspiHandle, addressSize);
                transaction.addrOffset = address;
                transaction.buf = (void *)&dataOutStream[cmdSize + addressSize];

                transaction.count = dataOutSize;
                // disable wrn
                const OSPI_Attrs *attrs = &gOspiAttrs[CONFIG_OSPI0];
                const CSL_ospi_flash_cfgRegs *pReg = (const CSL_ospi_flash_cfgRegs *)attrs->baseAddr;
                CSL_REG32_FINS(&pReg->DEV_INSTR_WR_CONFIG_REG,
                               OSPI_FLASH_CFG_DEV_INSTR_WR_CONFIG_REG_WEL_DIS_FLD,
                               1);
                // send command
                status = OSPI_writeIndirect(OspiHandle, &transaction);
                OSPI_setXferOpCodes(OspiHandle, 0x03, 0x02);
                OSPI_setWritedDummyCycles(OspiHandle, 0);
            }

            return status;
        }

  • I'm on PTO this week but I will be picking up E2E support again next week. Thanks.