Linux/AM3352: Network startup differences between SPI boot and NFS boot

Part Number: AM3352

Tool/software: Linux

Hi

I am seeing different behaviour depending on whether I boot from a JFFS2 rootfs on flash or from an NFS rootfs.
I am using a board derived from the BeagleBone Black, running the latest SDK.

The rootfs I am trying to use is a cut-down version of core-image-minimal created via the Arago Project
- our project is very space constrained, so I have configured the bitbake build to use busybox for as much as possible,
e.g.

VIRTUAL-RUNTIME_dev_manager_forcevariable = "busybox-mdev"
VIRTUAL-RUNTIME_login_manager_forcevariable = "busybox"
VIRTUAL-RUNTIME_init_manager_forcevariable = "busybox"
VIRTUAL-RUNTIME_initscripts_forcevariable = "initscripts"

I have converted this image to JFFS2 for running on a Micron SPI NOR flash chip:

- the contents packed into the JFFS2 image are identical to the folder I have exported for the NFS boot.

I can run happily with this as my rootfs over NFS but when running from the SPI Flash there are a couple of problems.
These may well be related to each other.

The first thing I notice is that there is no eth0 created on boot running from SPI flash
- I guess this is triggered in the NFS boot case because of the requirement to find the rootfs?
- the missing section in the boot looks like

[ 1.285988] net eth0: initializing cpsw version 1.12 (0)
[ 1.290956] net eth0: initialized cpsw ale version 1.4
[ 1.295724] net eth0: ALE Table size 1024
[ 1.386304] net eth0: phy found : id is : 0x7c131
[ 1.390755] libphy: PHY 4a101000.mdio:01 not found
[ 1.395207] net eth0: phy "4a101000.mdio:01" not found on slave 1, err -19
[ 1.430579] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4.386685] cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[ 4.405973] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 4.426114] Sending DHCP requests .,
[ 4.515934] OK
[ 4.545932] IP-Config: Got DHCP answer from 10.255.255.254, my address is 10.255.0.152
[ 4.554293] IP-Config: Complete:
[ 4.557374] device=eth0, hwaddr=7c:66:9d:14:11:a6, ipaddr=10.255.0.152, mask=255.255.0.0, gw=10.255.255.250
[ 4.566861] host=10.255.0.152, domain=mydomain.com.au, nis-domain=(none)
[ 4.573724] bootserver=0.0.0.0, rootserver=10.255.0.148, rootpath=
[ 4.579779] nameserver0=10.255.255.1, nameserver1=10.255.255.2

Once booted with the rootfs on SPI flash, I can bring up eth0 manually using "udhcpc -i eth0":

root@elle-board:/etc# udhcpc -i eth0
udhcpc (v1.24.1) started
[20115.764058] net eth0: initializing cpsw version 1.12 (0)
[20115.769100] net eth0: initialized cpsw ale version 1.4
[20115.773871] net eth0: ALE Table size 1024
[20115.856229] net eth0: phy found : id is : 0x7c131
[20115.860682] libphy: PHY 4a101000.mdio:01 not found
[20115.865132] net eth0: phy "4a101000.mdio:01" not found on slave 1, err -19
[20115.878488] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Sending discover...
[20118.856628] cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[20118.864124] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Sending discover...
Sending select for 10.255.0.152...
Lease of 10.255.0.152 obtained, lease time 86400
/etc/udhcpc.d/50default: Adding DNS 10.255.255.1
/etc/udhcpc.d/50default: Adding DNS 10.255.255.2

- is it simply a case of finding the correct place in an init script to kick this off?


Secondly dropbear fails to start citing: " error while loading shared libraries: /lib/libdl.so.2: invalid ELF header"

init started: BusyBox v1.24.1 (2017-03-22 16:26:43 AEDT)
starting pid 67, tty '': '/etc/init.d/rcS'
Starting Dropbear SSH server: dropbearkey: error while loading shared libraries: /lib/libdl.so.2: invalid ELF header
starting pid 81, tty '/dev/ttyS0': '/sbin/getty 115200 ttyS0'

versus the NFS boot:

init started: BusyBox v1.24.1 (2017-02-09 12:59:11 AEDT)
starting pid 67, tty '': '/etc/init.d/rcS'
Starting Dropbear SSH server: [ 5.915631] random: dropbear: uninitialized urandom read (32 bytes read, 91 bits of entropy available)
dropbear.
starting pid 82, tty '/dev/ttyS0': '/sbin/getty 115200 ttyS0'

This I find very confusing.

/lib/libdl.so.2 is a link to lib/libdl-2.21.so, and comparing the start of this file between the two cases shows the same bytes
- the contents themselves look like

root@elle-board:/lib# dd if=/lib/libdl-2.21.so bs=256 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 03 00 28 00 01 00 00 00 3c 09 00 00 34 00 00 00 |..(.....<...4...|
00000020 f0 76 01 00 02 04 00 05 34 00 20 00 06 00 28 00 |.v......4. ...(.|
00000030 25 00 22 00 01 00 00 00 00 00 00 00 00 00 00 00 |%.".............|
00000040 00 00 00 00 68 14 00 00 68 14 00 00 05 00 00 00 |....h...h.......|
00000050 00 00 01 00 01 00 00 00 e4 1e 00 00 e4 1e 01 00 |................|
00000060 e4 1e 01 00 a4 01 00 00 d4 01 00 00 06 00 00 00 |................|
00000070 00 00 01 00 02 00 00 00 f8 1e 00 00 f8 1e 01 00 |................|
00000080 f8 1e 01 00 08 01 00 00 08 01 00 00 06 00 00 00 |................|
00000090 04 00 00 00 04 00 00 00 f4 00 00 00 f4 00 00 00 |................|
000000a0 f4 00 00 00 20 00 00 00 20 00 00 00 04 00 00 00 |.... ... .......|
000000b0 04 00 00 00 51 e5 74 64 00 00 00 00 00 00 00 00 |....Q.td........|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 06 00 00 00 |................|
000000d0 10 00 00 00 52 e5 74 64 e4 1e 00 00 e4 1e 01 00 |....R.td........|
000000e0 e4 1e 01 00 1c 01 00 00 1c 01 00 00 04 00 00 00 |................|
000000f0 01 00 00 00 04 00 00 00 10 00 00 00 01 00 00 00 |................|
00000100 47 4e 55 00 00 00 00 00 02 00 00 00 06 00 00 00 |GNU.............|
00000110 20 00 00 00 12 00 00 00 1d 00 00 00 04 00 00 00 | ...............|
00000120 07 00 00 00 98 00 11 00 00 42 00 00 82 00 00 0a |.........B......|
00000130 93 28 00 d8 1d 00 00 00 00 00 00 00 1e 00 00 00 |.(..............|
00000140 00 00 00 00 00 00 00 00 1f 00 00 00 21 00 00 00 |............!...|
00000150 23 00 00 00 24 00 00 00 25 00 00 00 00 00 00 00 |#...$...%.......|
00000160 26 00 00 00 00 00 00 00 27 00 00 00 00 00 00 00 |&.......'.......|
00000170 00 00 00 00 00 00 00 00 00 00 00 00 af c4 4d 0f |..............M.|
00000180 91 21 fc f8 c0 53 80 18 d9 3d 6c f6 94 b3 5f 19 |.!...S...=l..._.|
00000190 05 e8 07 f9 7f 9e d0 18 61 a2 92 06 eb 16 a9 18 |........a.......|
000001a0 61 af 00 f9 06 02 04 f9 fb 33 fb 0f 00 00 00 00 |a........3......|
000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001c0 40 08 00 00 00 00 00 00 03 00 0a 00 00 00 00 00 |@...............|
000001d0 f4 1e 01 00 00 00 00 00 03 00 13 00 60 01 00 00 |............`...|
000001e0 00 00 00 00 00 00 00 00 12 00 00 00 df 00 00 00 |................|
000001f0 00 00 00 00 00 00 00 00 20 00 00 00 f4 00 00 00 |........ .......|
00000200

Even after manually bringing up eth0 using udhcpc, any attempt to launch dropbear results in the same error.
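
The dump above only covers the first 512 bytes of one file, so damage further into that file (or in another library) would not show up in it. A whole-tree checksum comparison is one way to check. The following is a minimal sketch, assuming `find`, `sort` and `md5sum` are available on both sides (BusyBox can provide all three, depending on how it is configured); `manifest` is a hypothetical helper name, not something from the SDK.

```shell
#!/bin/sh
# manifest DIR
# Print one "checksum  ./relative/path" line per regular file under DIR,
# ordered by path, so that two trees can be compared with diff.
# Symlinks such as libdl.so.2 are skipped; only their targets are hashed.
manifest() {
    (
        cd "$1" || exit 1
        find . -type f | sort | while IFS= read -r f; do
            md5sum "$f"
        done
    )
}

# Example (paths are placeholders):
#   on the target booted from SPI flash:  manifest /lib > /tmp/flash.md5
#   on the NFS server:                    manifest /path/to/export/lib > nfs.md5
#   then compare the two files with diff.
```

Any file listed with a different checksum between the two manifests was not read back from flash as it was written into the image.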

Any suggestions for what I can try to diagnose this further would be greatly appreciated.

Best regards,
Richard

  • Hi Richard,

    cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx


    I see that the kernel actually initializes eth0, so this should be working.

    - is it simply a case of finding the correct place in an init script to kick this off?


    Does your filesystem use systemd, or does udev start the init scripts from /etc/init.d?

    As for the library, this is very strange. Is it built against the kernel version you use?

    Best Regards,
    Yordan
  • Hi Yordan

    Apologies if I was unclear - the lines I posted about eth0 were missing from the SPI boot (but present in the NFS boot).

    I am using busybox (or the mdev within busybox) to run the init scripts - or at least that is my intention via the BitBake variable overrides I posted.

    I am sure everything is built for the correct kernel: when I run with the same files as an NFS boot everything works, and I have not had to modify the files from how bitbake (or the arago project) created them to get it to work.

    I am beginning to suspect that there might be some corruption happening in the creation of/conversion to a JFFS2 image (I had to patch mkfs.jffs2 to allow the small erase block for the chip we are using)

    - I will see what I can uncover in that area

    Any other suggestions you might have for tracking down the possible causes of the library error would be really useful.

    Thanks for the help

    - Richard

  • Hi

    The first issue is solved if I pass "ip=dhcp" to the kernel in the boot arguments, which makes sense.
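
For anyone hitting the same thing: the `ip=` parameter enables the kernel's own IP autoconfiguration, which brings the interface up during boot. An NFS-root command line normally carries it already, which would explain why the eth0 messages only appeared in the NFS case. Below is a minimal sketch for checking whether a given command line includes the parameter; `has_ip_param` is a hypothetical helper name.

```shell
#!/bin/sh
# has_ip_param CMDLINE
# Return 0 if the kernel command line string contains an ip= parameter,
# 1 otherwise. The string is split on whitespace and each argument is
# checked for the "ip=" prefix, so "skip=..." does not match.
has_ip_param() {
    for arg in $1; do
        case "$arg" in
            ip=*) return 0 ;;
        esac
    done
    return 1
}

# Example on the running target:
#   if has_ip_param "$(cat /proc/cmdline)"; then
#       echo "ip= present"
#   else
#       echo "ip= missing"
#   fi
```

If the parameter is missing and changing the boot arguments is not an option, running udhcpc from an init script, as tried manually in the first post, is the userspace alternative.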

    The second is quite a bit more confusing:

    it appears to be due to differences in the mkfs.jffs2 binary used to create the rootfs image.

    I can create a mountable, reliable image if I do so by patching the mkfs.jffs2 used in the arago project build.

    However, if I create the image using the mkfs.jffs2 I have built externally, it misbehaves.

    The versions (and those of the compression dependencies) are different:

    working: mtd-utils 1.5.2  zlib-1.2.8  lzo-2.9

    failing:  mtd-utils 2.0.0   zlib-1.2.11  lzo-2.10

    The patch I apply is the trivial change to allow eraseblocks < 4k; it is the same for both versions of mkfs.jffs2.c.

    So it would appear that something has changed since the versions used in the arago project build that compromises images built with the settings I am using.

    The settings used are: 

      --faketime  -q --pad -l --eraseblock=0x1000 --no-cleanmarkers
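
For reference, the settings above would fit into a complete invocation roughly as follows. This is only a sketch: `build_image` and the `MKFS` variable are hypothetical names, the paths in the example are placeholders, and `--root` and `--output` are the usual mkfs.jffs2 options for the input directory and the output file. Both binaries still need the small-eraseblock patch mentioned above for `--eraseblock=0x1000` to be accepted.

```shell
#!/bin/sh
# Path to the mkfs.jffs2 binary under test; override it to switch builds.
MKFS=${MKFS:-mkfs.jffs2}

# build_image ROOTFS_DIR OUTPUT_FILE
# Run the selected mkfs.jffs2 with the settings quoted in this post.
build_image() {
    "$MKFS" --faketime -q --pad -l --eraseblock=0x1000 --no-cleanmarkers \
        --root="$1" --output="$2"
}

# Example (all paths are placeholders):
#   MKFS=/path/to/mtd-utils-1.5.2/mkfs.jffs2 build_image rootfs image-1.5.2.jffs2
#   MKFS=/path/to/mtd-utils-2.0.0/mkfs.jffs2 build_image rootfs image-2.0.0.jffs2
```

Building one image per binary from the same rootfs directory keeps the tool version as the only variable. The two images need not be byte-identical even when both are good, since the compression libraries differ, so the meaningful comparison is between the files read back from each image once it is mounted.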

    I will make sure I use the mkfs.jffs2 created via the arago project build; if I get some spare time I will try and revisit this to work out where the issue precisely is

    Thanks for the help

    - Richard