AM3352: Problems trying to get ONFI Page Read parameters using SPL

Part Number: AM3352

Hello Everyone!

We are using a custom board based on the AM3352 processor with a Macronix MX30UF2G18AC 2G-bit NAND Flash Memory.

We received a board with a problem trying to boot, which stops in the SPL bootloader.

We used the Blackhawk USB560v2 STM Emulator with CCS 8.3.1 to load the SPL and U-Boot binaries directly into RAM over JTAG.

While analyzing the issue, we found that the ONFI Read Parameter Page operation is failing: the CRC calculated over the received data does not match. We noticed the problem in the Parameter Page Signature field. We should have received the string "ONFI", but the first read returned "ONGI":

After that, the CRC error occurs. The second read, however, returned the "ONFI" string:

And then the U-Boot console appears.
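As a side note on the two reads above: 'F' is 0x46 and 'G' is 0x47 in ASCII, so "ONGI" differs from "ONFI" in exactly one bit, the least significant bit of the third byte. The short Python sketch below makes that comparison explicit. The helper name `differing_bits` is only illustrative and is not part of any SPL or U-Boot code, and what the single flipped bit says about the root cause is not established in this thread.

```python
def differing_bits(expected: bytes, received: bytes):
    """Return a list of (byte_index, bit_index) for every bit that differs."""
    if len(expected) != len(received):
        raise ValueError("inputs must have the same length")
    diffs = []
    for index, (e, r) in enumerate(zip(expected, received)):
        delta = e ^ r  # bits set here are the bits that differ
        for bit in range(8):
            if delta & (1 << bit):
                diffs.append((index, bit))
    return diffs


# 'F' = 0x46, 'G' = 0x47: only bit 0 of byte 2 differs.
print(differing_bits(b"ONFI", b"ONGI"))  # [(2, 0)]
```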

According to the Open NAND Flash Interface Specification (page 94), this field never changes (it acts as a static header):

We don't know whether this is a problem with the flash memory device, something related to the GPMC configuration, or noise on the communication channel.

We currently have approximately 20,000 boards in the field, and we are worried that this problem could occur on more of them. We can send the schematics and any further information about the code that could help.

Could you please send any suggestions that could help us understand what's going on?

  • Is the same NAND device (Macronix MX30UF2G18AC 2G-bit) the boot media on the board?

    Leonardo Amorim said:
    We received a board with a problem trying to boot, which stops in the SPL bootloader.

    If yes, can we tell from your statement above whether SPL starts to run from on-chip RAM?

  • Hi Hong,

    In normal operation, the NAND device is used to boot the board and to load the binaries (SPL, U-Boot, kernel, rootfs). In this scenario, however, when we use the SPL image loaded from the NAND, we get this error:

    U-Boot SPL 1 - 2013.01.01.UCC3.R18 (Apr 28 2020 - 14:55:29)
    ERROR! NAND - CRC mismatch! Expected 0x5b16, got 0x65e9
    ### ERROR ### Please RESET the board ###

    This led us to think that the SPL image in the flash might be corrupted, so we connected our JTAG emulator and loaded the SPL and U-Boot images directly into RAM. We still got the same problem, even without using the NAND device to load the binaries.
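For reference, the integrity check behind a "CRC mismatch" on the parameter page can be sketched as follows. The ONFI specification defines a CRC-16 with generator polynomial 0x8005 and initial value 0x4F4E, computed over bytes 0 to 253 of the parameter page and stored little-endian in bytes 254 and 255. This is a minimal Python sketch of that check, not the code from the SPL used here. The function names and the page contents are made up for illustration and are not data read from the board.

```python
ONFI_CRC_INIT = 0x4F4E  # initial CRC value defined by the ONFI specification
ONFI_CRC_POLY = 0x8005  # generator polynomial x^16 + x^15 + x^2 + 1


def onfi_crc16(data: bytes, crc: int = ONFI_CRC_INIT) -> int:
    """Bitwise CRC-16 over data, most significant bit first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ ONFI_CRC_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def parameter_page_is_valid(page: bytes) -> bool:
    """Check the signature and the CRC of one 256-byte parameter page copy."""
    if len(page) < 256 or page[:4] != b"ONFI":
        return False
    stored = int.from_bytes(page[254:256], "little")
    return onfi_crc16(page[:254]) == stored


# Illustrative page only: signature followed by zero padding, then its CRC.
body = b"ONFI" + bytes(250)
good = body + onfi_crc16(body).to_bytes(2, "little")
bad = b"ONGI" + good[4:]  # same page with the one-bit signature error

print(parameter_page_is_valid(good), parameter_page_is_valid(bad))  # True False
```

Because a single flipped bit changes the CRC, the corrupted copy would fail this check even if the signature comparison were skipped.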

  • Hello everyone,

    After some changes to the SPL/U-Boot code, we found a way to boot the Linux kernel, but we still see the same problem, which is somehow related to the ONFI Read Parameter Page operation. Here is our kernel log:

    [2020-07-30 11:38:32.569] Using UCC3's environment
    [2020-07-30 11:38:32.569] 
    [2020-07-30 11:38:32.571] 
    [2020-07-30 11:38:40.490] Driver Version: 4.9.206.UCC3.R06.01 - 2020-04-28 15:35:23 -0300
    [2020-07-30 11:38:48.632] [    0.000000] Booting Linux on physical CPU 0x0
    [2020-07-30 11:38:48.638] [    0.000000] Linux version 4.9.206.UCC3.R06 (leonardo.amorim@ABELHA2665-VM-UBUNTU) (gcc version 8.3.0 (crosstool-NG 1.24.0.7-8f74cd1) ) #128 PREEMPT Tue Apr 28 15:35:45 -03 2020
    [2020-07-30 11:38:48.642] [    0.000000] GIT release: 4.9.206.UCC3.R06 (864cfdfaeca0)
    [2020-07-30 11:38:48.648] [    0.000000] GIT tag: 4.9.206.UCC3.R06
    [2020-07-30 11:38:48.656] [    0.000000] GIT last commit: 864cfdfaeca0
    [2020-07-30 11:38:48.660] [    0.000000] Debug symbols NOT compiled into kernel
    [2020-07-30 11:38:48.664] [    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
    [2020-07-30 11:38:48.669] [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
    [2020-07-30 11:38:48.676] [    0.000000] OF: fdt:Machine model: Autotrac UCC3G
    [2020-07-30 11:38:48.681] [    0.000000] Memory policy: Data cache writeback
    [2020-07-30 11:38:48.687] [    0.000000] CPU: All CPU(s) started in SVC mode.
    [2020-07-30 11:38:48.691] [    0.000000] AM335X ES2.1 (neon)
    [2020-07-30 11:38:48.695] [    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260352
    [2020-07-30 11:38:48.709] [    0.000000] Kernel command line: console=ttyO0,115200n8 root=/dev/mtdblock9 ro rootfstype=jffs2 noinitrd omap_wdt.early_enable=1
    [2020-07-30 11:38:48.714] [    0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
    [2020-07-30 11:38:48.721] [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
    [2020-07-30 11:38:48.728] [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
    [2020-07-30 11:38:48.742] [    0.000000] Memory: 1031784K/1047552K available (4222K kernel code, 237K rwdata, 696K rodata, 1024K init, 171K bss, 15768K reserved, 0K cma-reserved, 261120K highmem)
    [2020-07-30 11:38:48.756] [    0.000000] Virtual kernel memory layout:
    [2020-07-30 11:38:48.756] [    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    [2020-07-30 11:38:48.756] [    0.000000]     fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
    [2020-07-30 11:38:48.769] [    0.000000]     vmalloc : 0xf0800000 - 0xff800000   ( 240 MB)
    [2020-07-30 11:38:48.769] [    0.000000]     lowmem  : 0xc0000000 - 0xf0000000   ( 768 MB)
    [2020-07-30 11:38:48.781] [    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    [2020-07-30 11:38:48.781] [    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
    [2020-07-30 11:38:48.790] [    0.000000]       .text : 0xc0008000 - 0xc051fb68   (5215 kB)
    [2020-07-30 11:38:48.791] [    0.000000]       .init : 0xc0600000 - 0xc0700000   (1024 kB)
    [2020-07-30 11:38:48.805] [    0.000000]       .data : 0xc0700000 - 0xc073b660   ( 238 kB)
    [2020-07-30 11:38:48.816] [    0.000000]        .bss : 0xc073b660 - 0xc0766348   ( 172 kB)
    [2020-07-30 11:38:48.816] [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
    [2020-07-30 11:38:48.823] [    0.000000] Preemptible hierarchical RCU implementation.
    [2020-07-30 11:38:48.834] [    0.000000] Build-time adjustment of leaf fanout to 32.
    [2020-07-30 11:38:48.840] [    0.000000] NR_IRQS:16 nr_irqs:16 16
    [2020-07-30 11:38:48.840] [    0.000000] IRQ: Found an INTC at 0xfa200000 (revision 5.0) with 128 interrupts
    [2020-07-30 11:38:48.844] [    0.000000] OMAP clockevent source: timer2 at 24000000 Hz
    [2020-07-30 11:38:48.851] [    0.000018] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
    [2020-07-30 11:38:48.865] [    0.000044] clocksource: timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
    [2020-07-30 11:38:48.869] [    0.000059] OMAP clocksource: timer1 at 24000000 Hz
    [2020-07-30 11:38:48.873] [    0.000100] Calibrating delay loop... 595.96 BogoMIPS (lpj=1191936)
    [2020-07-30 11:38:48.878] [    0.022906] pid_max: default: 32768 minimum: 301
    [2020-07-30 11:38:48.885] [    0.023027] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
    [2020-07-30 11:38:48.893] [    0.023042] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
    [2020-07-30 11:38:48.898] [    0.023753] CPU: Testing write buffer coherency: ok
    [2020-07-30 11:38:48.902] [    0.023822] CPU0: Spectre v2: using BPIALL workaround
    [2020-07-30 11:38:48.909] [    0.024144] Setting up static identity map for 0x80100000 - 0x80100058
    [2020-07-30 11:38:48.913] [    0.025863] devtmpfs: initialized
    [2020-07-30 11:38:48.923] [    0.039401] VFP support v0.3: implementor 41 architecture 3 part 30 variant c rev 3
    [2020-07-30 11:38:48.933] [    0.039881] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
    [2020-07-30 11:38:48.940] [    0.039912] futex hash table entries: 256 (order: -1, 3072 bytes)
    [2020-07-30 11:38:48.940] [    0.040007] pinctrl core: initialized pinctrl subsystem
    [2020-07-30 11:38:48.953] [    0.041066] NET: Registered protocol family 16
    [2020-07-30 11:38:48.958] [    0.041733] DMA: preallocated 256 KiB pool for atomic coherent allocations
    [2020-07-30 11:38:48.963] [    0.130582] pinctrl-single 44e10800.pinmux: 142 pins at pa f9e10800 size 568
    [2020-07-30 11:38:48.966] [    0.137731] OMAP GPIO hardware version 0.1
    [2020-07-30 11:38:48.972] [    0.164454] omap-gpmc 50000000.gpmc: GPMC revision 6.0
    [2020-07-30 11:38:48.976] [    0.165167] Skipping CS0 timing configuration
    [2020-07-30 11:38:48.984] [    0.209295] edma 49000000.edma: TI EDMA DMA engine driver
    [2020-07-30 11:38:48.989] [    0.213090] SCSI subsystem initialized
    [2020-07-30 11:38:48.994] [    0.213418] usbcore: registered new interface driver usbfs
    [2020-07-30 11:38:48.998] [    0.213575] usbcore: registered new interface driver hub
    [2020-07-30 11:38:49.002] [    0.213739] usbcore: registered new device driver usb
    [2020-07-30 11:38:49.012] [    0.216049] omap_i2c 44e0b000.i2c: bus 0 rev0.11 at 400 kHz
    [2020-07-30 11:38:49.012] [    0.217501] Bluetooth: Core ver 2.22
    [2020-07-30 11:38:49.017] [    0.217628] NET: Registered protocol family 31
    [2020-07-30 11:38:49.023] [    0.217640] Bluetooth: HCI device and connection manager initialized
    [2020-07-30 11:38:49.028] [    0.217666] Bluetooth: HCI socket layer initialized
    [2020-07-30 11:38:49.034] [    0.217681] Bluetooth: L2CAP socket layer initialized
    [2020-07-30 11:38:49.041] [    0.217697] Bluetooth: SCO socket layer initialized
    [2020-07-30 11:38:49.046] [    0.219916] clocksource: Switched to clocksource timer1
    [2020-07-30 11:38:49.051] [    0.222433] NET: Registered protocol family 2
    [2020-07-30 11:38:49.057] [    0.223526] TCP established hash table entries: 8192 (order: 3, 32768 bytes)
    [2020-07-30 11:38:49.063] [    0.223647] TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
    [2020-07-30 11:38:49.071] [    0.223757] TCP: Hash tables configured (established 8192 bind 8192)
    [2020-07-30 11:38:49.082] [    0.223874] UDP hash table entries: 512 (order: 1, 8192 bytes)
    [2020-07-30 11:38:49.082] [    0.224611] UDP-Lite hash table entries: 512 (order: 1, 8192 bytes)
    [2020-07-30 11:38:49.087] [    0.224827] NET: Registered protocol family 1
    [2020-07-30 11:38:49.094] [    0.228395] workingset: timestamp_bits=30 max_order=18 bucket_order=0
    [2020-07-30 11:38:49.101] [    0.241897] ntfs: driver 2.1.32 [Flags: R/O].
    [2020-07-30 11:38:49.106] [    0.242396] jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
    [2020-07-30 11:38:49.111] [    0.257350] bounce: pool size: 64 pages
    [2020-07-30 11:38:49.117] [    0.257690] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
    [2020-07-30 11:38:49.122] [    0.257710] io scheduler noop registered
    [2020-07-30 11:38:49.127] [    0.257905] io scheduler cfq registered (default)
    [2020-07-30 11:38:49.136] [    0.261166] Serial: 8250/16550 driver, 32 ports, IRQ sharing disabled
    [2020-07-30 11:38:49.143] [    0.274695] 44e09000.serial: ttyO0 at MMIO 0x44e09000 (irq = 158, base_baud = 3000000) is a OMAP UART0
    [2020-07-30 11:38:49.147] [    0.806004] console [ttyO0] enabled
    [2020-07-30 11:38:49.160] [    0.810531] 48022000.serial: ttyO1 at MMIO 0x48022000 (irq = 159, base_baud = 3000000) is a OMAP UART1
    [2020-07-30 11:38:49.173] [    0.821153] 48024000.serial: ttyO2 at MMIO 0x48024000 (irq = 160, base_baud = 3000000) is a OMAP UART2
    [2020-07-30 11:38:49.179] [    0.831866] 481a6000.serial: ttyO3 at MMIO 0x481a6000 (irq = 161, base_baud = 3000000) is a OMAP UART3
    [2020-07-30 11:38:49.192] [    0.842519] 481a8000.serial: ttyO4 at MMIO 0x481a8000 (irq = 162, base_baud = 3000000) is a OMAP UART4
    [2020-07-30 11:38:49.200] [    0.853094] 481aa000.serial: ttyO5 at MMIO 0x481aa000 (irq = 163, base_baud = 3000000) is a OMAP UART5
    [2020-07-30 11:38:49.212] [    0.867613] nand: Could not find valid ONFI parameter page; aborting
    [2020-07-30 11:38:49.219] [    0.875119] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xaa
    [2020-07-30 11:38:49.225] [    0.881885] nand: Macronix NAND 256MiB 1,8V 8-bit
    [2020-07-30 11:38:49.234] [    0.886864] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
    [2020-07-30 11:38:49.245] [    0.894943] nand: Identifying the BCH from the ONFI parameters.
    [2020-07-30 11:38:49.245] [    0.900933] omap2-nand 8000000.nand: BCH auto failed! The number of ecc_bits read was 0 bits. The expected value should be 4 or 8
    [2020-07-30 11:38:49.261] [    0.916352] vcan: Virtual CAN interface driver
    [2020-07-30 11:38:49.264] [    0.921394] CAN device driver interface
    [2020-07-30 11:38:49.273] [    0.927135] c_can_platform 481d0000.can: c_can_platform device registered (regs=fa1d0000, irq=168)
    [2020-07-30 11:38:49.278] [    0.937169] i2c /dev entries driver
    [2020-07-30 11:38:49.287] [    0.942318] omap_wdt: OMAP Watchdog Timer Rev 0x01: initial timeout 60 sec
    [2020-07-30 11:38:49.295] [    0.953170] NET: Registered protocol family 17
    [2020-07-30 11:38:49.302] [    0.958233] can: controller area network core (rev 20120528 abi 9)
    [2020-07-30 11:38:49.306] [    0.964889] NET: Registered protocol family 29
    [2020-07-30 11:38:49.313] [    0.969622] can: raw protocol (rev 20120528)
    [2020-07-30 11:38:49.319] [    0.974185] can: broadcast manager protocol (rev 20161123 t)
    [2020-07-30 11:38:49.323] [    0.980172] lib80211: common routines for IEEE802.11 drivers
    [2020-07-30 11:38:49.331] [    0.986265] omap_voltage_late_init: Voltage driver support not added
    [2020-07-30 11:38:49.335] [    0.993243] ThumbEE CPU extension supported.
    [2020-07-30 11:38:49.344] [    1.000227] List of all partitions:
    [2020-07-30 11:38:49.349] [    1.004095] No filesystem could mount root, tried: [    1.009103]  jffs2
    [2020-07-30 11:38:49.352] [    1.011210] 
    [2020-07-30 11:38:49.359] [    1.012864] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
    [2020-07-30 11:38:49.368] [    1.021541] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
    [2020-07-30 11:38:49.682] [    1.339912] random: fast init done
    [2020-07-30 11:39:42.172] [   53.819965] random: crng init done

    Regards,

    Leonardo Amorim

    1. Did you modify the SPL/U-Boot NAND driver code to work around the originally reported NAND ID read issue?
    2. Were you able to boot SPL from the same NAND device consistently every time?

    If your answer to Q2 is yes, then the copy of SPL from NAND to on-chip SRAM by the on-chip ROM code works as expected. That would give us a reference point for configuring the GPMC timings to match the NAND device's timing specifications.

     

    Best,

    -Hong

  • Hi Hong,

    Thanks for your reply.

    1. Yes, I modified the NAND driver code in SPL/U-Boot. Previously, we read the ONFI parameter page only once, with no second or third attempt. The code now tries the read up to three times before reporting a failure. Because our problem only occurs on the first read, SPL/U-Boot now seems OK.

    2. With the SPL/U-Boot version that reads the ONFI parameter page only once, the boot always failed. After modifying the code, I loaded the new SPL/U-Boot binaries into RAM over JTAG and then, from the U-Boot console, wrote them to the NAND flash. So my answer is yes: I can now boot the board consistently every time from the NAND flash, and I can also load the kernel, but I get the messages and the kernel panic shown above.
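The retry workaround described in point 1 can be sketched roughly as below. This is a Python illustration of the control flow only, not the actual patch to the SPL/U-Boot NAND driver. `read_page` and `is_valid` are hypothetical callbacks: the first stands in for whatever issues the parameter page read (the thread does not say whether each attempt re-issues the command or reads the next redundant copy), and the second for the signature/CRC check.

```python
from typing import Callable, Optional


def read_onfi_page_with_retry(read_page: Callable[[], bytes],
                              is_valid: Callable[[bytes], bool],
                              max_attempts: int = 3) -> Optional[bytes]:
    """Return the first valid parameter page, or None after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        page = read_page()
        if is_valid(page):
            return page
        print(f"attempt {attempt}: invalid ONFI parameter page")
    return None


# Simulated device: the first read is corrupted, later reads are clean.
reads = iter([b"ONGI", b"ONFI", b"ONFI"])
page = read_onfi_page_with_retry(lambda: next(reads),
                                 lambda p: p[:4] == b"ONFI")
print(page)  # b'ONFI', returned by the second attempt
```

A loop like this hides the first failed read rather than explaining it, so logging each failed attempt (as the sketch does) keeps the symptom visible in the field.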

    Regards,

    Leonardo Amorim

  • Hi Leonardo,

     

    From your descriptions,

    1. SPL copy from NAND to on-chip SRAM by on-chip rom code works every time => NAND device itself would be good.

    Please note that the NAND device detection mechanism in the ROM code is described in TRM section 26.1.8.4.2, Initialization and Detection:

    The ONFI Read ID (command 90h /address 20h) is sent to the NAND device. If it replies with the ONFI signature (4 bytes) then a Read parameters page (command ECh) is sent. If the parameters page does not have the ONFI signature, then the ONFI identification fails. If the ONFI Read ID command fails (it will be the case with any device not supporting ONFI) then the device is reset again with polling for device to be ready (with 200ms timeout). Then, the standard Read ID (command 90h / address 00h) is sent. If the Device ID (2nd byte of the ID byte stream) is recognized as being a supported device then the device parameters are extracted from an internal ROM Code table.

    2. u-boot copy from NAND to DDR by SPL looks good with the workaround (adding 2nd ONFI ID signature read).

    3. 

    Leonardo Amorim said:
    So my answer is yes, because now I can boot the board consistently every time (using the nand flash) and I can also load the Kernel

    Is the kernel copied from NAND to DDR by u-boot before kernel booting?
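The ROM code detection sequence quoted from TRM 26.1.8.4.2 above can be read as the decision flow sketched below. This is only an interpretation in Python, not the ROM code itself. The `dev` object, its method names, and the contents of `rom_table` are hypothetical. The quoted text does not say what happens after a bad parameter page signature or after a reset timeout, so the sketch simply reports failure in both cases, which is an assumption.

```python
ONFI_SIGNATURE = b"ONFI"


def rom_detect_nand(dev, rom_table):
    """Return ("onfi", page), ("table", params), or None if detection fails."""
    # ONFI Read ID: command 90h, address 20h, expect the 4-byte signature.
    if dev.read_id(address=0x20, length=4) == ONFI_SIGNATURE:
        page = dev.read_parameter_page()  # command ECh
        if page[:4] != ONFI_SIGNATURE:
            return None  # "the ONFI identification fails" (assumed: no fallback)
        return ("onfi", page)
    # ONFI Read ID failed: reset and poll for ready with a 200 ms timeout.
    if not dev.reset_and_wait_ready(timeout_ms=200):
        return None  # assumed behavior on timeout
    # Standard Read ID: command 90h, address 00h; the 2nd byte is the Device ID.
    device_id = dev.read_id(address=0x00, length=2)[1]
    if device_id in rom_table:
        return ("table", rom_table[device_id])
    return None


class FakeNand:
    """Stand-in device used only to exercise the flow above."""

    def __init__(self, onfi_id, page, std_id):
        self.onfi_id, self.page, self.std_id = onfi_id, page, std_id

    def read_id(self, address, length):
        data = self.onfi_id if address == 0x20 else self.std_id
        return data[:length]

    def read_parameter_page(self):
        return self.page

    def reset_and_wait_ready(self, timeout_ms):
        return True


table = {0xAA: "illustrative entry"}  # invented, not the real ROM table
ids = bytes([0xC2, 0xAA])             # IDs reported in the kernel log above
print(rom_detect_nand(FakeNand(b"ONFI", b"ONGI" + bytes(252), ids), table))  # None
print(rom_detect_nand(FakeNand(bytes(4), b"", ids), table))  # ('table', 'illustrative entry')
```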

    Best,

    -Hong

  • Hi Hong,

    Thanks again for your reply.

    Yes, the kernel was copied from NAND to DDR by u-boot using the console.

    Having read section 26.1.8.4.2, Initialization and Detection, I would like to ask: do you think the NAND device is working properly?

    Regards,

    Leonardo Amorim

  • Hi Leonardo,

    The on-board NAND device itself looks good based on testing run so far:

    1/. SPL copy from NAND to on-chip SRAM by rom code

    2/. u-boot copy from NAND to DDR by SPL (w/ the workaround)

    3/. kernel copy from NAND to DDR by u-boot

     

    Your kernel log shows that the NAND device parameters were read by the kernel driver.

    [2020-07-30 11:38:49.212] [    0.867613] nand: Could not find valid ONFI parameter page; aborting

    [2020-07-30 11:38:49.219] [    0.875119] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xaa

    [2020-07-30 11:38:49.225] [    0.881885] nand: Macronix NAND 256MiB 1,8V 8-bit

    [2020-07-30 11:38:49.234] [    0.886864] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64

     

    The kernel panic in your log seems to be caused by the rootfs.

    [2020-07-30 11:38:49.359] [    1.012864] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

     

    Best,

    -Hong

  • Hi Hong,

    Thanks again for your reply.

    After you said that the NAND flash appears to be working fine, I erased the kernel and rootfs binaries that were in the flash memory and wrote different kernel and rootfs binaries to it. Surprisingly, the kernel panic no longer occurs, and I can now log in to Buildroot correctly.

    I believe the kernel panic was caused not by the ONFI parameter page read error but by a known failure in our remote rootfs update. In this case, the rootfs image was corrupted. We have other boards facing the same issue right now.

    So we faced two problems at the same time: the first is the ONFI parameter page read error, and the second, which is not related to the flash, is a corrupted rootfs image.

    What still seems strange to me is that the first read of the Parameter Page Signature returns an error and a later read returns it correctly. What could cause this behavior in the NAND flash? And should we be concerned about this error?

    Regards,

    Leonardo Amorim

  • Hi Leonardo, 

    It is good to know that NAND booting works for you now.

    Regarding the strange behavior when reading the ONFI parameter page signature, have you checked with the NAND vendor?

    From ONFI spec www.onfi.org/.../onfi_2_0_gold.pdf

    >>>> 

    5.5. Read ID Definition

    When issuing Read ID in the source synchronous data interface, each data byte is received twice. The host shall only latch one copy of each data byte. See section 4.3.2.4.

    <<<< 

    It seems to me that ONFI page signature is read correctly by rom code when loading SPL from NAND.

    Best,

    -Hong

  • Hi Hong, 

    Thanks for your reply.

    After discussing this within our team, we concluded that our SPL/U-Boot and kernel code are most likely working fine, and we think the NAND device is working fine as well, but we still cannot explain this behavior.

    We haven't checked yet with the NAND vendor about this, but we'll do this soon. For now, the information you gave us is enough to close this topic.

    Thank you once again for your information!



    Regards,

    Leonardo Amorim