"Memory error" on kernel boot



 

Hi,

We developed our own DM355 board; it boots successfully and runs u-boot.

But when I try to boot the kernel, it sometimes prints "Memory error" on the console.

The complete console output is as follows:

DM355 MVK # bootm
## Booting image at 80700000 ...
   Image Name:   Linux-2.6.18_pro500-davinci_evm-
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2024316 Bytes =  1.9 MB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux...

Memory error

 -- System halted

 

 

When I try the Linux 2.6.10 kernel (MontaVista), the output is:

DM355 MVK # bootm
## Booting image at 80700000 ...
   Image Name:   Linux-2.6.10_mvl401_IPNC-1.0.7
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1498644 Bytes =  1.4 MB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
OK

Starting kernel ...

U

 

and it hangs here, without the "Memory error" message.

 

Exactly the same binaries (u-boot, kernel 2.6.18, kernel 2.6.10) run successfully on our DM355 EVM board.

I can't figure out the difference. The only difference between the EVM and our board is that the EVM has a DM355ZCE216 and our board has a DM355ZCE270; our board runs with a 24 MHz XTAL as well. I don't know if this causes the issue.

 

I have run some tests to be sure:

1. To check DDR memory integrity, I booted the board from an SD card (SD card boot and flashing tool for DM355 and DM365 by Constantine Shulyupin) and ran the "test first 16MB of RAM" test, which completed successfully. Then, from the u-boot console, I tested further parts of the RAM with the mw and md commands, and that was successful as well.

2. I came across this wiki page: http://processors.wiki.ti.com/index.php/FAQ_for_DaVinci_Linux so I checked the u-boot bootargs and the kernel configuration; "Low-level debug console UART --> UART0" is selected.

 

Thanks and regards.

 

Selim TEMUR

 

 

  •  

    Hi,

    I have just modified the SD card boot and flashing tool (by Constantine Shulyupin) to test the entire DDR memory (128MB) and the test was successful. I think the problem is not in the DDR memory, but I can't figure out a logical solution for this issue...

     

    Selim


  • Enable early_printk in the linux .config and then add "earlyprintk" to your bootargs.  This may print out additional error messages to help with debugging.

  • Hi Marcus,

    I have the MontaVista Linux 2.6.10 and 2.6.18 kernels supplied by TI, but there is no CONFIG_EARLY_PRINTK option in the .config file.

    This page --> http://cateee.net/lkddb/web-lkddb/EARLY_PRINTK.html   says CONFIG_EARLY_PRINTK depends on CONFIG_DEBUG_LL. So I enabled CONFIG_DEBUG_LL, but CONFIG_EARLY_PRINTK was still not available in either kernel.

    I think these kernels do not support CONFIG_EARLY_PRINTK. Do you know of any patch for this feature, or should I use a newer kernel?

    Regards

    Selim

  •  

    Hi Marcus,

    Today I downloaded the 2.6.38 kernel from the DaVinci GIT Linux kernel tree and compiled it with the Sourcery G++ Lite 2009q1-203 compiler. I tested the binary on the DM355 EVM and it booted fine. Then I tried the same binary on our board, and it hung as shown below:

     

    DM355 MVK # bootm
    ## Booting image at 80700000 ...
       Image Name:   Linux-2.6.38
       Image Type:   ARM Linux Kernel Image (uncompressed)
       Data Size:    1630380 Bytes =  1.6 MB
       Load Address: 80008000
       Entry Point:  80008000
       Verifying Checksum ... OK
    OK

    Starting kernel ...

    Uncompressing Linux...

     

    Then I enabled EARLY_PRINTK, which this kernel already supports, and tried on our board with the u-boot boot arguments "console=ttyS0,115200n8 ip=off eth=00:0C:0C:A0:01:48 root=/dev/mtdblock3 rw rootfstype=cramfs mem=80M earlyprintk=serial,ttyS0,115200n8", but nothing changed. The boot process stopped at "Uncompressing Linux..." again.

    Then I tried the binary on EVM and the boot log is as follows;

    DM355 MVK # bootm
    ## Booting image at 80700000 ...
       Image Name:   Linux-2.6.38
       Image Type:   ARM Linux Kernel Image (uncompressed)
       Data Size:    1630380 Bytes =  1.6 MB
       Load Address: 80008000
       Entry Point:  80008000
       Verifying Checksum ... OK
    OK

    Starting kernel ...

    Uncompressing Linux... done, booting the kernel.
    Linux version 2.6.38 (root@debian-ipnc) (gcc version 4.3.3 (Sourcery G++ Lite 2009q1-203) ) #1 PREEMPT Fri Apr 29 14:04:07 EEST 2011
    CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00053177
    CPU: VIVT data cache, VIVT instruction cache
    Machine: DaVinci DM355 EVM
    bootconsole [earlycon0] enabled
    Memory policy: ECC disabled, Data cache writeback
    DaVinci dm355 variant 0x0
    ....

     

    Looking at the boot log, the only effect of the EARLY_PRINTK option seems to be the "bootconsole [earlycon0] enabled" line in the log.

    So I think the EARLY_PRINTK option doesn't give any further information in our case, since our board stops before that point.


  • If I remember correctly, the console port identifier changed in Linux 2.6.37 and above.

    The issue you're seeing may be because the console port the kernel logs to is different from what's provided in the bootargs.

    Can you modify your boot args so that the console value is:

    console=ttyO2

     

  • Would it be possible to get the u-boot environment variables from the EVM and your custom board? That would be the output of "printenv". Maybe try transferring the env vars from the EVM over to your custom board. If both are the same, I would hazard a guess that it is a DDR HW or layout problem. DDR is difficult to test completely.

  • Hi Norman ,

    Our boot arguments are

    "console=ttyS0,115200n8 ip=off eth=00:0C:0C:A0:01:48 root=/dev/mtdblock3 rw rootfstype=cramfs mem=80M earlyprintk=serial,ttyS0,115200n8"

    on both boards.

     

    You mentioned a DDR HW or layout issue, but as I pointed out earlier, I modified the SD card boot and flashing tool (by Constantine Shulyupin) to test the entire DDR memory (128MB) and the test was successful. In the source code, for each memory cell the application writes an increasing pattern and checks it by reading the cell back, then writes the XOR of the same pattern and checks it again. My board passed this test without a single error, so do you think there may be some internal issue with the DDR?
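For reference, a minimal sketch of the kind of test described above. It is not the code from the SD card tool: the function names are hypothetical, the "increasing pattern" is assumed to be the word index, and "XOR of the same pattern" is assumed to mean XOR with all ones. The second function is a two-pass variant added for comparison. If each cell really is verified immediately after it is written, the first form cannot notice two addresses aliasing onto the same cell, and on a cached CPU the read-back may not even reach the DDR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Per-cell test: write an increasing pattern to a word, read it back,
 * then write the inverted pattern and read it back again.
 * Returns the index of the first failing word, or nwords if all passed.
 */
size_t ram_test_per_cell(volatile uint32_t *base, size_t nwords)
{
    assert(base != NULL);
    for (size_t i = 0; i < nwords; i++) {
        uint32_t pattern  = (uint32_t)i;              /* assumed pattern */
        uint32_t inverted = pattern ^ 0xFFFFFFFFu;    /* assumed XOR mask */

        base[i] = pattern;
        if (base[i] != pattern)
            return i;

        base[i] = inverted;
        if (base[i] != inverted)
            return i;
    }
    return nwords;
}

/*
 * Two-pass variant: fill the whole region first, verify afterwards.
 * Every word must keep its value while all the others are written,
 * so address aliasing shows up as a mismatch.
 */
size_t ram_test_two_pass(volatile uint32_t *base, size_t nwords)
{
    assert(base != NULL);
    for (size_t i = 0; i < nwords; i++)
        base[i] = (uint32_t)i;
    for (size_t i = 0; i < nwords; i++)
        if (base[i] != (uint32_t)i)
            return i;
    return nwords;
}
```

On the target these would be called with the DDR base address and the region size in 32-bit words, ideally with the data cache disabled; a return value equal to nwords means no mismatch was seen.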

     

    Thanks and regards

    Selim

     

  •  

    Hi Marcus,

    I've changed the u-boot bootargs to

    "console=ttyO2,115200n8 ip=off eth=00:0C:0C:A0:01:48 root=/dev/mtdblock3 rw rootfstype=cramfs mem=80M earlyprintk=serial,ttyS0,115200n8"

    but nothing changed: the EVM boots fine, our board does not, and nothing different is printed.

     

    To go further, I looked into the kernel source code and found where it hangs, i.e.,

    in file "arch/arm/boot/compressed/misc.c"

    unsigned long decompress_kernel(unsigned long output_start,unsigned long free_mem_ptr_p,
            unsigned long free_mem_ptr_end_p,
            int arch_id)
    {
        unsigned char *tmp;

        output_data        = (unsigned char *)output_start;
        free_mem_ptr        = free_mem_ptr_p;
        free_mem_end_ptr    = free_mem_ptr_end_p;
        __machine_arch_type    = arch_id;

        arch_decomp_setup();

        tmp = (unsigned char *) (((unsigned long)input_data_end) - 4);
        output_ptr = get_unaligned_le32(tmp);

        putstr("Uncompressing Linux...");
        do_decompress(input_data, input_data_end - input_data,
                output_data, error);
        putstr(" done, booting the kernel.\n");
        return output_ptr;
    }

     

    In the code above, the call to "do_decompress" somehow hangs the kernel.

     

    I am now digging to find the exact code segment responsible for the issue, and I will let you know when I am done.
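One way to narrow this down is to print single-character markers from inside the decompressor. The sketch below assumes a 16550-style UART whose registers are spaced 32 bits apart, which is how I understand the DaVinci UARTs to be laid out; the boot log later in this thread reports ttyS0 at MMIO 0x1c20000. The names dbg_putc and dbg_puthex are hypothetical, and the assert is only there for host-side testing and would be dropped in the decompressor, which has no C library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UART_THR       0      /* transmit holding register (word index) */
#define UART_LSR       5      /* line status register (word index) */
#define UART_LSR_THRE  0x20u  /* transmit holding register empty */

/* Polled output of one character; 'uart' is the UART register base. */
void dbg_putc(volatile uint32_t *uart, char c)
{
    assert(uart != NULL);
    while (!(uart[UART_LSR] & UART_LSR_THRE))
        ;                     /* wait until the transmitter can take a byte */
    uart[UART_THR] = (uint8_t)c;
}

/* Print a 32-bit value as eight hex digits, e.g. a pointer or a counter. */
void dbg_puthex(volatile uint32_t *uart, uint32_t v)
{
    static const char digits[] = "0123456789abcdef";

    for (int shift = 28; shift >= 0; shift -= 4)
        dbg_putc(uart, digits[(v >> shift) & 0xFu]);
}
```

Calling dbg_putc with a different letter before and after each suspect step shows the last step that completed before the hang.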

     

    Regards,

    Selim

     

  • Other u-boot environment variables can change the way it sets up the load and boot of Linux. That was my original thought. On further thought, probably not.

    Some speculation on my part. The bootup sequence might be:

    1. Copy uImage from NAND/EEPROM/TFTP/etc to 80700000
    2. Extract from uImage(80700000) to zImage(80008000).
    3. Jump to 80008000
    4. Linux boot takes over, see head.S.
    5. Code in head.S figures out where to decompress the kernel. It uses its own location and its own header for that calculation. Tries to fit it above itself?
    6. Decompresses kernel. See misc.c.
    7. Jump to kernel entry.

    Assuming the bootm checksum message is about uImage and not about the extracted zImage, the uImage(80700000) appears okay.
    Maybe the zImage(80008000) is corrupted. That means both Linux boot code and compressed data could be corrupt. Or that the much larger uncompressed kernel is corrupt. That would explain the erratic behaviour you are seeing.
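The idea of checking the copy can be sketched as follows. It relies on the legacy uImage header layout as I know it: 64 bytes, big-endian fields, magic 0x27051956, payload size at offset 12 and payload CRC-32 at offset 24. The function names are hypothetical. The same function checks the payload in place (right after the header) and the copy at the load address, so a mismatch on the copy only would point at the copy or the memory under it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UIMAGE_MAGIC     0x27051956u  /* legacy U-Boot image magic */
#define UIMAGE_HDR_SIZE  64u          /* header length in bytes */
#define UIMAGE_OFF_SIZE  12u          /* ih_size: payload length */
#define UIMAGE_OFF_DCRC  24u          /* ih_dcrc: CRC-32 of the payload */

/* Read a big-endian 32-bit field from the header. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Standard reflected CRC-32 (polynomial 0xEDB88320), bit by bit. */
uint32_t crc32_buf(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Returns 1 if 'payload' matches the size and CRC recorded in 'hdr',
 * 0 otherwise. Pass hdr + UIMAGE_HDR_SIZE to check the image in place,
 * or the load address to check the copy.
 */
int uimage_payload_matches(const uint8_t *hdr, const uint8_t *payload)
{
    assert(hdr != NULL && payload != NULL);
    if (be32(hdr) != UIMAGE_MAGIC)
        return 0;
    return crc32_buf(payload, be32(hdr + UIMAGE_OFF_SIZE)) ==
           be32(hdr + UIMAGE_OFF_DCRC);
}
```

With the addresses from this thread, hdr would be 0x80700000 and the copy would be at 0x80008000. If the board's u-boot build includes the crc32 command, the same comparison can be done by hand from the console.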

    DRAM has many failure modes. It's difficult to cover them all. If a test fails, for sure you have bad DRAM. If test passes, it might mean you haven't run the test that would fail.
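Two classic checks that target wiring faults rather than individual cells are a walking-ones data bus test and a power-of-two address bus test. The sketch below is a generic version of both, not code from any tool mentioned in this thread, and the function names are hypothetical. Even if both pass, they say little about timing margins under sustained burst traffic, which is closer to what a decompressor generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walking-ones data bus test on a single location.
 * Returns 0 on success, or the first bit pattern that did not read back.
 */
uint32_t data_bus_test(volatile uint32_t *addr)
{
    assert(addr != NULL);
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        *addr = bit;
        if (*addr != bit)
            return bit;
    }
    return 0;
}

/*
 * Address bus test over nwords 32-bit words (nwords must be a power of two).
 * Returns nwords on success, or a word offset implicated in the failure.
 */
size_t address_bus_test(volatile uint32_t *base, size_t nwords)
{
    const uint32_t pattern = 0xAAAAAAAAu;
    const uint32_t anti    = 0x55555555u;

    assert(base != NULL);
    assert(nwords >= 2 && (nwords & (nwords - 1)) == 0);

    /* Put the default pattern at every power-of-two offset. */
    for (size_t off = 1; off < nwords; off <<= 1)
        base[off] = pattern;

    /* Address bits stuck high: a write to offset 0 lands elsewhere. */
    base[0] = anti;
    for (size_t off = 1; off < nwords; off <<= 1)
        if (base[off] != pattern)
            return off;
    base[0] = pattern;

    /* Address bits stuck low or shorted: a write to one offset shows up at another. */
    for (size_t t = 1; t < nwords; t <<= 1) {
        base[t] = anti;
        if (base[0] != pattern)
            return t;
        for (size_t off = 1; off < nwords; off <<= 1)
            if (off != t && base[off] != pattern)
                return t;
        base[t] = pattern;
    }
    return nwords;
}
```

Both functions are meant to run on the target against the DDR base address, with the data cache off so that every access reaches the memory.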

    Maybe try backing off the DRAM timing.

     

  •  

    Hi all,

    We have made some progress that I want to share for your opinion.

    First of all, we assembled a new PCB using a DM355-216, with all other components the same, and this time kernel 2.6.38 boots successfully; kernels 2.6.10 and 2.6.18 boot only if they are uncompressed. Also, sometimes the kernel does not boot and hangs at the "Starting kernel ..." u-boot message.

    But even when the kernel boots, it hangs while mounting the file system. For example,

    this is the boot log of the 2.6.10 (uncompressed) kernel with an ext2 INITRD (uncompressed) filesystem located at 0x82000000 ---->

    ....

    dm_spi.0: dm355 SPI Controller driver at 0xc5866000 (irq = 42)
    mmc mmc.0: Supporting 4-bit mode
    mmc mmc.0: Using DMA mode
    Registering platform device 'davinci-audio.0'. Parent at platform
    MMC cmd.resp[0] = aa orc=0
    MMC cmd.resp[0] = aa orc=300000
    MMC: selected 50.000MHz transfer rate
    MMC: selected 25.000MHz transfer rate
    mmcblk0: mmc0:e624 SD01G 992000KiB
     mmcblk0: p1 p2
    RAMDISK: ext2 filesystem found at block 0
    RAMDISK: Loading 16384KiB [1 disk] into ram disk... -

     

    When I gzip the ext2 filesystem, it hangs like this as well ---->

    ...

    MMC: selected 25.000MHz transfer rate
    mmcblk0: mmc0:e624 SD01G 992000KiB
     mmcblk0: unknown partition table
    RAMDISK: Compressed image found at block 0

     

    We can't get any further, so I need to ask if somebody knows:

    - Why does the uncompressed kernel boot while the compressed one does not? What kind of hardware failure may cause this?

    - Why does the uncompressed kernel sometimes boot and sometimes not? Could a hardware failure such as voltage glitches on the power supply cause this sort of thing? I really can't figure it out, so any help or suggestion will be appreciated.

     

    Thanks all and Regards

     

    Selim

     

     

  •  

    Hi all,

    We have made some more progress: we have finally mounted the ext2 initrd image successfully. But the device is still unstable, i.e. the kernel sometimes does not boot, and even when it boots it sometimes still hangs while mounting the ramdisk.

    Also, when Linux boots successfully, it waits too long in two places. To make this clear, I am attaching the boot log with timestamps below:

    [11:09:32.542]NAND:  NAND device: Manufacturer ID: 0x2c, Chip ID: 0xd3 (Micron NAND 1GiB 3,3V 8-bit)
    [11:09:32.542]Bad block table found at page 524096, version 0x01
    [11:09:32.542]Bad block table found at page 524032, version 0x01
    [11:09:32.582]NAND device: Manufacturer ID: 0x2c, Chip ID: 0xd3 (Micron NAND 1GiB 3,3V 8-bit)
    [11:09:32.582]Bad block table found at page 524096, version 0x01
    [11:09:32.582]Bad block table found at page 524032, version 0x01
    [11:09:32.618]2048 MiB
    [11:09:32.618]In:    serial
    [11:09:32.618]Out:   serial
    [11:09:32.618]Err:   serial
    [11:09:32.667]ARM Clock :- 216MHz
    [11:09:32.667]DDR Clock :- 171MHz
    [11:09:34.718]Hit any key to stop autoboot:  0
    [11:09:34.903]DM355 EVM #
    [11:09:36.326]DM355 EVM # bootm
    [11:09:36.326]## Booting image at 80700000 ...
    [11:09:36.326]   Image Name:   linux-2.6.10_uncompressed
    [11:09:36.326]   Image Type:   ARM Linux Kernel Image (uncompressed)
    [11:09:36.326]   Data Size:    2175456 Bytes =  2.1 MB
    [11:09:36.326]   Load Address: 80008000
    [11:09:36.326]   Entry Point:  80008000
    [11:09:37.400]   Verifying Checksum ... OK
    [11:09:38.174]OK
    [11:09:38.174]
    [11:09:38.174]Starting kernel ...
    [11:09:38.174]
    [11:09:49.086]Linux version 2.6.10_mvl401_IPNC-1.0.7 (root@debian-ipnc) (gcc version 3.4.3 (MontaVista 3.4.3-25.0.104.0600975 2006-07-06)) #5 Thu Apr 28 10:06:11 EEST 2011

    [11:09:49.086]CPU: ARM926EJ-Sid(wb) [41069265] revision 5 (ARMv5TEJ)
    [11:09:49.086]CPU0: D VIVT write-back cache
    [11:09:49.086]CPU0: I cache: 16384 bytes, associativity 4, 32 byte lines, 128 sets
    [11:09:49.116]CPU0: D cache: 8192 bytes, associativity 4, 32 byte lines, 64 sets
    [11:09:49.116]Machine: DaVinci DM355 IPNetCam
    [11:09:49.116]Memory policy: ECC disabled, Data cache writeback
    [11:09:49.116]DM0350
    [11:09:49.116]Built 1 zonelists
    [11:09:49.116]Kernel command line: console=ttyS0,115200n8 ip=off root=/dev/ram0 rw initrd=0x82000000,16M
    [11:09:49.116]PID hash table entries: 1024 (order: 10, 16384 bytes)
    [11:09:49.305]Console: colour dummy device 80x30
    [11:09:49.305]Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
    [11:09:49.305]Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
    [11:09:49.305]Memory: 128MB = 128MB total
    [11:09:49.305]Memory: 111104KB available (1688K code, 429K data, 136K init)
    [11:09:49.305]Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
    [11:09:49.305]CPU: Testing write buffer coherency: ok
    [11:09:49.305]spawn_desched_task(00000000)
    [11:09:49.305]desched cpu_callback 3/00000000
    [11:09:49.305]ksoftirqd started up.
    [11:09:49.305]desched cpu_callback 2/00000000
    [11:09:49.305]checking if image is initramfs...desched thread 0 started up.
    [11:09:49.305]it isn't (no cpio magic); looks like an initrd
    [11:09:49.305]Freeing initrd memory: 16384K
    [11:09:49.305]Linux NoNET1.0 for Linux 2.6
    [11:09:49.305]Registering platform device 'serial8250.0'. Parent at platform
    [11:09:49.305]Registering platform device 'nand_davinci.0'. Parent at platform
    [11:09:49.305]Registering platform device 'mmc.0'. Parent at platform
    [11:09:49.305]DaVinci I2C DEBUG: 10:05:43 Apr 28 2011
    [11:09:49.305]Registering platform device 'i2c'. Parent at platform
    [11:09:49.305]SCSI subsystem initialized
    [11:09:49.305]Registering platform device 'dm_spi.0'. Parent at platform
    [11:09:49.305]NetWinder Floating Point Emulator V0.97 (double precision)
    [11:09:49.305]JFFS2 version 2.2. (NAND) (C) 2001-2003 Red Hat, Inc.
    [11:09:49.305]yaffs Apr 28 2011 10:05:24 Installing.
    [11:09:49.305]Initializing Cryptographic API
    [11:09:49.305]Registering platform device 'dm355fb.0'. Parent at platform
    [11:09:49.305]Console: switching to colour frame buffer device 90x30
    [11:09:49.305]watchdog: TI DaVinci Watchdog Timer: timer margin 64 sec
    [11:09:49.523]Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
    [11:09:49.523]Registering platform device 'serial8250'. Parent at platform
    [11:09:49.523]ttyS0 at MMIO 0x1c20000 (irq = 40) is a 16550A
    [11:09:49.523]ttyS1 at MMIO 0x1c20400 (irq = 41) is a 16550A
    [11:09:49.523]ttyS2 at MMIO 0x1e06000 (irq = 14) is a 16550A
    [11:09:49.523]io scheduler noop registered
    [11:09:49.523]io scheduler anticipatory registered
    [11:09:49.523]RAMDISK driver initialized: 1 RAM disks of 32768K size 1024 blocksize
    [11:09:49.523]loop: loaded (max 8 devices)
    [11:09:49.523]i2c /dev entries driver
    [11:09:49.523]Linux video capture interface: v1.00
    [11:09:49.523]Registering platform device 'vpfe.1'. Parent at platform
    [11:09:50.806]ipipe major#: 254, minor# 0
    [11:09:50.806]Registering platform device 'dm355_ipipe.2'. Parent at platform
    [11:09:50.806]ipipe driver registered
    [11:09:50.806]aew major#: 253, minor# 0
    [11:09:50.996]Registering platform device 'dm355_aew.2'. Parent at platform
    [11:09:50.996]elevator: using anticipatory as default io scheduler
    [11:09:50.996]nand_davinci nand_davinci.0: Using 4-bit hardware ECC
    [11:09:50.996]No NAND device found!!!
    [11:09:50.996]nand_davinci nand_davinci.0: no nand device detected
    [11:09:50.996]drivers/spi/spi.c:scan_boardinfo:304
    [11:09:50.996] modias AIC26 irq 0 max_speed 2000000 bus_num 65535 chip_select 0 mode 0
    [11:09:50.996]dm_spi.0: dm355 SPI Controller driver at 0xc8860000 (irq = 42)
    [11:09:50.996]mmc mmc.0: Supporting 4-bit mode
    [11:09:50.996]mmc mmc.0: Using DMA mode
    [11:09:50.996]Registering platform device 'davinci-audio.0'. Parent at platform
    [11:09:50.996]MMC cmd.resp[0] = aa orc=0
    [11:09:51.064]MMC cmd.resp[0] = aa orc=300000
    [11:09:51.244]MMC: selected 50.000MHz transfer rate
    [11:09:51.244]MMC: selected 25.000MHz transfer rate
    [11:09:51.244]mmcblk0: mmc0:e624 SD01G 992000KiB
    [11:09:51.244] mmcblk0: unknown partition table
    [11:09:51.244]RAMDISK: Compressed image found at block 0
    [11:10:01.678]EXT2-fs warning: checktime reached, running e2fsck is recommended

    [11:10:01.678]VFS: Mounted root (ext2 filesystem).
    [11:10:01.678]Freeing init memory: 136K
    [11:10:13.339]cmemk: disagrees about version of symbol struct_module
    [11:10:13.423]dm350mmap: disagrees about version of symbol struct_module
    [11:10:13.571]sbull: disagrees about version of symbol struct_module
    [11:10:15.656]musb_hdrc: disagrees about version of symbol struct_module
    [11:10:19.206]FAT: bogus number of reserved sectors
    [11:10:19.206]VFS: Can't find a valid FAT filesystem on dev mmcblk0.
    [11:10:19.206]FAT: bogus number of reserved sectors
    [11:10:19.206]VFS: Can't find a valid FAT filesystem on dev mmcblk0.
    [11:10:19.206]yaffs: dev is 266338304 name is "mmcblk0"
    [11:10:19.236]yaffs: Attempting MTD mount on 254.0, "mmcblk0"
    [11:10:19.236]yaffs: dev is 266338304 name is "mmcblk0"
    [11:10:19.236]yaffs: Attempting MTD mount on 254.0, "mmcblk0"
    [11:10:19.855]davinci-McBSP: McBSP2  io_base: 0xe1204000
    [11:10:21.236]g_file_storage: disagrees about version of symbol struct_module
    [11:10:28.265]INIT: Entering runlevel: 3
    [11:10:28.461]Starting internet superserver: inetd.
    [11:10:29.560]
    [11:10:29.560]MontaVista(R) Linux(R) Professional Edition 4.0.1 (0502020)
    [11:10:29.560]
    [11:10:34.590](none) login: root
    [11:10:34.646]
    [11:10:34.646]
    [11:10:34.646]Welcome to MontaVista(R) Linux(R) Professional Edition 4.0.1 (0502020).

     

    I have marked the two stages that take too long (marked in red).
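Going only by the timestamps in the log above, the longest pauses are about 10.9 s between "Starting kernel ..." (11:09:38.174) and the first kernel line (11:09:49.086), about 10.4 s after "RAMDISK: Compressed image found at block 0" (11:09:51.244 to 11:10:01.678), and about 11.7 s after "Freeing init memory" (11:10:01.678 to 11:10:13.339). A small helper can find such gaps automatically; the sketch below assumes the "[HH:MM:SS.mmm]" prefix used in this log, and its function names are hypothetical. Many lines share one stamp, so the timestamps look coarse and the numbers are approximate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse a "[HH:MM:SS.mmm]" stamp at the start of a log line (leading
 * spaces allowed). Returns milliseconds since midnight, or -1 if the
 * line has no stamp.
 */
long parse_stamp_ms(const char *line)
{
    int h, m, s, ms;

    assert(line != NULL);
    if (sscanf(line, " [%2d:%2d:%2d.%3d]", &h, &m, &s, &ms) != 4)
        return -1;
    return ((h * 60L + m) * 60L + s) * 1000L + ms;
}

/*
 * Find the largest gap between consecutive stamped lines.
 * Returns the index of the line that follows the gap (n if there are
 * fewer than two stamped lines) and stores the gap in *gap_ms.
 */
size_t largest_gap(const char *const lines[], size_t n, long *gap_ms)
{
    long prev = -1, best = -1;
    size_t best_idx = n;

    for (size_t i = 0; i < n; i++) {
        long t = parse_stamp_ms(lines[i]);

        if (t < 0)
            continue;                 /* unstamped line, skip it */
        if (prev >= 0 && t - prev > best) {
            best = t - prev;
            best_idx = i;
        }
        prev = t;
    }
    if (gap_ms != NULL)
        *gap_ms = best;
    return best_idx;
}
```

The helper assumes the log does not cross midnight; lines without a stamp are ignored.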

    Now I would like to ask why it takes so long at these stages, why the device is unstable (sometimes it boots and sometimes it does not), and what kind of HW failure may cause these effects.

    Any kind of suggestion will be appreciated.

    Regards,

    Selim TEMUR

  • I haven't worked on this device or this Linux version, but I would recommend that you review these portions of the log:

    [11:09:32.667]ARM Clock :- 216MHz
    [11:09:32.667]DDR Clock :- 171MHz

    [11:09:49.305]CPU: Testing write buffer coherency: ok
    [11:09:49.305]spawn_desched_task(00000000)
    [11:09:49.305]desched cpu_callback 3/00000000
    [11:09:49.305]ksoftirqd started up.
    [11:09:49.305]desched cpu_callback 2/00000000

    [11:09:49.305]checking if image is initramfs...desched thread 0 started up.

    [11:09:51.244]mmcblk0: mmc0:e624 SD01G 992000KiB
    [11:09:51.244] mmcblk0: unknown partition table

    [11:10:19.206]FAT: bogus number of reserved sectors
    [11:10:19.206]VFS: Can't find a valid FAT filesystem on dev mmcblk0.
    [11:10:19.206]FAT: bogus number of reserved sectors
    [11:10:19.206]VFS: Can't find a valid FAT filesystem on dev mmcblk0.

    [11:10:19.206]yaffs: dev is 266338304 name is "mmcblk0"
    [11:10:19.236]yaffs: Attempting MTD mount on 254.0, "mmcblk0"
    [11:10:19.236]yaffs: dev is 266338304 name is "mmcblk0"
    [11:10:19.236]yaffs: Attempting MTD mount on 254.0, "mmcblk0"

    And later some errors like the one below:

    [11:10:21.236]g_file_storage: disagrees about version of symbol struct_module

    Have you applied RT patches?

  •  

    Hi,

    First of all, these binaries (u-boot, kernel, rootfs) work successfully on the DM355 EVM.

    Then,

    [11:09:51.244]mmcblk0: mmc0:e624 SD01G 992000KiB
    [11:09:51.244] mmcblk0: unknown partition table

    [11:10:19.206]FAT: bogus number of reserved sectors
    [11:10:19.206]VFS: Can't find a valid FAT filesystem on dev mmcblk0.
    [11:10:19.206]FAT: bogus number of reserved sectors
    [11:10:19.206]VFS: Can't find a valid FAT filesystem on dev mmcblk0.

    [11:10:19.206]yaffs: dev is 266338304 name is "mmcblk0"
    [11:10:19.236]yaffs: Attempting MTD mount on 254.0, "mmcblk0"
    [11:10:19.236]yaffs: dev is 266338304 name is "mmcblk0"
    [11:10:19.236]yaffs: Attempting MTD mount on 254.0, "mmcblk0"


    These are about the inserted SD card and are not important for now, and

     

    [11:09:32.667]ARM Clock :- 216MHz
    [11:09:32.667]DDR Clock :- 171MHz

    ...

    [11:09:49.305]checking if image is initramfs...desched thread 0 started up.

     

    these lines are exactly the same on the DM355 EVM boot log.


    Thanks,

    Selim