AM6442: ChipSelect and WriteEnable assertion is one cycle too long during GPMC write bursts

Part Number: AM6442

Hello,

As it turns out, during burst write transactions the GPMC asserts both ChipSelect and WriteEnable for one clock cycle longer than expected.

I'm not observing this extra cycle during single-word write operations.

The overall configuration of the GPMC interface is as follows:

I have to admit that I'm using a very tight timing regime. A single-word write operation takes just 2 clock cycles: one cycle for the address and another for the data word. Both ChipSelect and WriteEnable are active for these two cycles only. This appears to work fine. However, when the GPMC interface is configured to allow burst write transactions and generates one, both ChipSelect and WriteEnable are active for one cycle too long. For example, a burst write with four words should assert ChipSelect and WriteEnable for 5 clocks, but both are in fact asserted for 6 clocks. The same applies to bursts with 8 words, which should be 9 clocks long but are found to be 10 clock cycles long. It might be worth mentioning that the data word transferred within that extra cycle is the same as the formal last word of the transaction. That is, when 8 words indexed 0-7 are to be transmitted, the 9th "phantom" word is a duplicate of word number 7.
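To make the cycle counts above explicit, here is a minimal sketch in C. The function names are hypothetical and only restate the numbers from this post (one address clock, one clock per data word, one extra clock seen on bursts only); it is not derived from the TRM.

```c
#include <assert.h>

/* Clocks CS/WE should stay asserted: address phase plus one data phase per word. */
static unsigned expected_cs_clocks(unsigned addr_clocks, unsigned words,
                                   unsigned clocks_per_word)
{
    assert(words > 0 && clocks_per_word > 0);
    return addr_clocks + words * clocks_per_word;
}

/* Clocks observed in this report: one extra clock, but only for bursts
 * (single-word writes behave as expected). */
static unsigned observed_cs_clocks(unsigned addr_clocks, unsigned words,
                                   unsigned clocks_per_word)
{
    unsigned expected = expected_cs_clocks(addr_clocks, words, clocks_per_word);
    return words > 1 ? expected + 1 : expected;
}
```

With one address clock and one clock per word, a 4-word burst gives 5 expected and 6 observed clocks, and an 8-word burst gives 9 expected and 10 observed clocks.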

It seems there is no explanation for this behavior in the TRM (which may also mean that I just did not find it...).

For reference, I'm providing the relevant Linux device tree settings for the GPMC here:

&gpmc0 {
        pinctrl-names = "default";
        pinctrl-0 = <&gpmc0_pins_default>;
        assigned-clocks = <&k3_clks 80 0>;
        assigned-clock-parents = <&k3_clks 80 1>;

        assigned-clock-rates = <33333333>;

        gpmc,num-cs = <4>;
        gpmc,num-waitpins = <2>;
        ranges = <0 0 0x00 0x50000000 0x01000000>, /* CS0 space. Min partition = 16MB */
                 <1 0 0x00 0x51000000 0x01000000>, /* CS1 space. Min partition = 16MB */
                 <2 0 0x00 0x52000000 0x01000000>, /* CS2 space. Min partition = 16MB */
                 <3 0 0x00 0x53000000 0x01000000>; /* CS3 space. Min partition = 16MB */
        status = "okay";

        ....

        fpga@2,0{
                reg = <2 0 0x001000000>;
                gpmc,mux-add-data = <2>;        // AD multiplexed mode with a single address phase.
                gpmc,device-width = <2>;        // 16 Bit
                gpmc,burst-length = <16>;       // 16 words resp. 32 bytes burst length.
                gpmc,wait-on-read = <0>;        // No wait on read.
                gpmc,wait-on-write = <0>;       // No wait on write.
                //gpmc,sync-read;
                gpmc,sync-write;
                gpmc,burst-read;
                gpmc,burst-write;

                gpmc,sync-clk-ps = <30000>;
                gpmc,page-burst-access-ns = <30>; // 1 clock tick. PAGEBURSTACCESSTIME - The number of clocks per word transmitted.
                gpmc,cs-on-ns = <0>;            // 0 clock ticks. CSONTIME - Immediate ChipSelect assertion without any delay.
                gpmc,cs-rd-off-ns = <150>;      // 5 clock ticks. CSRDOFFTIME - The clock number CS will be deasserted during read resp. when the first word has been captured.
                gpmc,cs-wr-off-ns = <60>;       // 2 clock ticks. CSWROFFTIME - The clock number CS will be deasserted during write resp. when the first word has been captured.
                gpmc,adv-on-ns = <0>;           // 0 clock ticks. ADVONTIME - The clock number the ADV signal will be asserted.
                gpmc,adv-rd-off-ns = <30>;      // 1 clock tick. ADVRDOFFTIME - The clock number the ADV signal will be deasserted during read.
                gpmc,adv-wr-off-ns = <30>;      // 1 clock tick. ADVWROFFTIME - The clock number the ADV signal will be deasserted during write.
                gpmc,wr-data-mux-bus-ns = <30>; // 1 clock tick. WRDATAONADMUXBUS - The clock number the written data will appear or the address is removed.
                gpmc,oe-on-ns = <120>;          // 4 clock ticks. OEONTIME - The clock number the OE will be asserted during read.
                gpmc,oe-off-ns = <150>;         // 5 clock ticks. OEOFFTIME - The clock number the OE will be deasserted during read resp. when the first word has been captured.
                gpmc,we-on-ns = <0>;            // 0 clock ticks. WEONTIME - The clock number the WE will be asserted during write.
                gpmc,we-off-ns = <60>;          // 2 clock ticks. WEOFFTIME - The clock number the WE will be deasserted during write resp. when the first word has been captured.
                gpmc,rd-cycle-ns = <180>;       // 6 clock ticks. RDCYCLETIME - The number of clocks a read takes resp. until one cycle after the first word has been captured.
                gpmc,wr-cycle-ns = <60>;        // 2 clock ticks. WRCYCLETIME - The number of clocks a write takes resp. until the first word has been captured.
                gpmc,wr-access-ns = <60>;       // 2 clock ticks. WRACCESSTIME - The number of clocks a write takes resp. until the first word has been captured.
                gpmc,access-ns = <150>;         // 5 clock ticks. RDACCESSTIME - The clock number the first data word will be sampled during read.
                gpmc,bus-turnaround-ns = <30>;  // 1 clock tick. BUSTURNAROUND - The number of clock ticks inserted between accesses.
                gpmc,cycle2cycle-delay-ns = <30>;
                gpmc,cycle2cycle-diffcsen;
                gpmc,cycle2cycle-samecsen;
        };

};

This is a particular setting for a moderate 33MHz GPMC clock.
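For readers checking the tick counts in the comments above, the following sketch converts a nanosecond value into GPMC clock ticks for a 30000 ps period (the gpmc,sync-clk-ps value). The helper name ns_to_ticks is hypothetical, and rounding up to the next full tick is my understanding of how the Linux driver treats these properties, so treat that as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a time in ns to clock ticks, rounding up to the next full tick.
 * tick_ps is the clock period in picoseconds (30000 for the setup above). */
static uint32_t ns_to_ticks(uint32_t time_ns, uint32_t tick_ps)
{
    assert(tick_ps > 0);
    return (uint32_t)(((uint64_t)time_ns * 1000u + tick_ps - 1u) / tick_ps);
}
```

For example, ns_to_ticks(60, 30000) yields 2 ticks (cs-wr-off, we-off, wr-cycle, wr-access) and ns_to_ticks(150, 30000) yields 5 ticks (cs-rd-off, oe-off, access).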

The transactions are generated by the processor by mmapping the corresponding GPMC window from /dev/mem and then simply accessing it, i.e. something like

        fd = open("/dev/mem", O_RDWR | O_SYNC);

        .....

        fpga_fast_ram_space = (volatile int64_t *) mmap(NULL, FPGA_RAM_SPACE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, FPGA_FAST_RAM_SPACE_ADDR);

        .....

        memcpy( (void*)fpga_fast_ram_space, (void*) buffer1, 64);

It should be noted that this effect can be seen without an actual backend-device attached to the GPMC, since one can also write "into the air"...

Btw., the errata document for the AM6442, namely https://www.ti.com/lit/er/sprz457i/sprz457i.pdf, does not list anything in that direction. I'm still using silicon revision 1.0 of the AM6442 but will upgrade to SR2.0 soon. It might be an issue that has not been discovered before. It might also be that I'm making some mistake in my configuration.

Are there any ideas/suggestions?

Thanks,

Mario

  • I ran many additional experiments with the timings in order to get rid of this additional clock cycle at the end of a write burst transaction - without any success.

    I should also add that the very same thing happens in a different GPMC window where I'm using two cycles per data word (gpmc,page-burst-access-ns set to two clock periods). There, too, a single extra clock cycle is added (btw.: not two extra clock cycles).

    Based on the theory that the GPMC module might have trouble with the tight timing settings, in particular the very short addressing phase, I also made various attempts to delay the start of the data phase by one clock cycle, in the hope that the burst would at least align with the end of the ChipSelect/WriteEnable activity. However, this did not help.

    I also came across the CYCLEOPTIMIZATION and ENABLEOPTIMIZEDACCESS fields in GPMC_PREFETCH_CONFIG1. Although these are part of the NAND flash access mechanism and should have no impact here, I experimented a bit with these settings, as they are intended to shorten some timing parameters by some amount. As expected, nothing changed.

    Since I'm not seeing that extra cycle during single-word transactions and the behavior there is exactly as expected, it seems that the timing parameters I'm using here are essentially fine.

    Currently I see the following options to deal with this situation:

    1. Accept that a burst transmits a final "scrap word" that will be written into the target memory as well. As long as larger blocks are copied bottom-up in multiple bursts, this won't be a problem. One should only keep in mind that the word following the copied block will be overwritten as well. This might be acceptable in some scenarios.
    2. Assume a fixed burst length in the target device. With a fixed length the device can cut off the transaction just in time, hence avoiding overwriting the next word in memory. However, I believe that this is risky, since it might not be guaranteed that the host system always generates the same burst length.
    3. Increase the depth of the write pipeline within the target device so that there is enough time to react to the situation and discard the final "phantom write" just before it takes place within the internal memory of the device.

    Probably the best solution is 3. 
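A rough behavioral model of option 3, written in C for illustration only (the real thing would live in the FPGA logic). The function commit_transaction and its parameters are hypothetical. It assumes a one-stage write pipeline: each word is held back for one data clock and committed only when another data clock follows, so the word still pending when ChipSelect deasserts is dropped. Since single-word writes show no extra cycle, the model further assumes that a transaction with exactly one data clock is a single-word write and commits it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* bus[] holds the word seen on each data clock while CS/WE are asserted,
 * including the trailing duplicate of a burst.
 * Returns the number of words committed to mem[]. */
static size_t commit_transaction(const uint16_t *bus, size_t data_clocks,
                                 uint16_t *mem, size_t mem_words)
{
    size_t committed = 0;

    assert(bus != NULL && mem != NULL);

    /* A word is committed one clock late, when the next data clock arrives. */
    for (size_t i = 1; i < data_clocks; i++) {
        assert(committed < mem_words);
        mem[committed++] = bus[i - 1];
    }

    /* CS deasserted. A lone data clock is treated as a single-word write. */
    if (data_clocks == 1) {
        assert(committed < mem_words);
        mem[committed++] = bus[0];
    }
    /* Otherwise the last word on the bus is the phantom and is discarded. */
    return committed;
}
```

For a 4-word burst seen as 5 data clocks (words 0-3 plus a duplicate of word 3), this model commits exactly 4 words and leaves the following memory location untouched.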

    Of course, there is one general drawback: This additional cycle on the GPMC interface is still present and is eating up potential bandwidth... 

  • I just want to confirm that the issue is also present with silicon revision SR2.0 of the AM6442.