Linux crash when SW5 off and network PCIe card present.

Hello,

I have two DaVinci ARM development boards that behave the same way.

I received the latest development board yesterday; it is in its default configuration, with the SD card 2.6.37 512864-0001A.

I have two new PCIe x1 network cards based on the RTL8168B. One misbehaves even on a Linux PC, but the other is detected by lspci on a Linux PC.

When I plug the working network card into the ARM board's PCIe slot, Linux stops booting. The last lines are:

Running bootscript from MMC/SD to set the ENV...
## Executing script at 80900000
reading uImage
2362144 bytes read
## Booting kernel from Legacy Image at 80009000 ...
   Image Name:   Arago/2.6.37-psp04.00.00.10/dm81
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2362080 Bytes = 2.3 MiB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK
Starting kernel ...
Uncompressing Linux... done, booting the kernel.

If PCIe RST is ON, Linux boots, but "ls /sys/bus/pci/devices" never shows anything, even after "echo 1 > /sys/bus/pci/rescan".
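The same check can be scripted so it is easy to repeat after each switch change. This is only a minimal sketch: the function names and the `sysfs_root` parameter are illustrative assumptions, while the `bus/pci/devices` and `bus/pci/rescan` paths are the ones quoted above.

```python
from pathlib import Path


def list_pci_devices(sysfs_root="/sys"):
    """Return {address: (vendor_id, device_id)} for every enumerated PCI device."""
    devices = {}
    base = Path(sysfs_root) / "bus" / "pci" / "devices"
    if not base.is_dir():
        return devices  # no PCI support in the kernel, or sysfs not mounted
    for entry in sorted(base.iterdir()):
        vendor = (entry / "vendor").read_text().strip()
        device = (entry / "device").read_text().strip()
        devices[entry.name] = (vendor, device)
    return devices


def rescan(sysfs_root="/sys"):
    """Same effect as: echo 1 > /sys/bus/pci/rescan (needs root on a real system)."""
    (Path(sysfs_root) / "bus" / "pci" / "rescan").write_text("1")
```

Calling `rescan()` followed by `list_pci_devices()` should return an empty dictionary in the situation described here, since nothing is enumerated.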

My aim is to make sure PCIe works: my company is planning to use an Altera PCIe EP with this ARM PCIe RC over a single PCIe lane.

Any idea what the problem could be?

Regards,

Etienne.

  • A bit more information...

    The behaviour I described only happens on the older board; I just noticed that "revision B" is written in the white box on its PCB.

    The new board, revision E, does not boot (it stops at the same point) whatever the position of SW5.

    I have to say that the older development board (which is working "better") has had its daughter board removed, so I also removed the daughter board from the new board, but that did not change anything.

    Is the kernel able to handle "hot plugging" of PCIe cards?

    Obviously the system boots without problem when I just remove the PCIe cards.

    I also have a completely rebuilt Linux kernel and environment which is behaving the same - we are not planning to use a complete distribution.

    Regards,

    Etienne.

  • I am also very interested in more information on the state of PCIe (PCI Express) support for the DM8168. My questions are:

    • According to the PSP docs and my practical tests, hot plugging of PCIe is not presently supported in the latest EZSDK with the latest EVM. Is this a software limitation (a Linux driver issue), or is it because the hardware itself does not support it?
    • Are there plans to support PCIe hot plugging, and if so, when?

    We would find a PCI hot plug capability very useful.

    Look forward to hearing back from someone at TI.

    Thanks,
    Ralph

  • Again more information...

    I have an Altera board with a PCIe interface that works fully on a PC (Debian 32/64 and Ubuntu 32/64; only Ubuntu 64 LTS does not work, because its kernel is too old for PCIe).

    The revision E Spectrum Digital 389X EVM boots (most of the time) with SW5 PCIe RST OFF (the position away from R195), and it also seems to boot with PCIe RST ON (I only tried once).

    My problem is that the lspci output is wrong with the EZSDK: BAR0 is not detected (it is detected perfectly on a PC):

    root@dm816x-evm:~# LD_PRELOAD=/etienne/libpci.so.3 /etienne/lspci -vv -s 01:00.0
    01:00.0 Class ff00: Device 1172:e001 (rev 01)
        Subsystem: Device a106:2100
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 48
        Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
            Address: 0000000000000000  Data: 0000
        Capabilities: [78] Power Management version 3
            Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
            Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] Express (v2) Endpoint, MSI 00
            DevCap:    MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
            DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                MaxPayload 128 bytes, MaxReadReq 512 bytes
            DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
            LnkCap:    Port #1, Speed 5GT/s, Width x1, ASPM L0s, Latency L0 unlimited, L1 unlimited
                ClockPM- Surprise- LLActRep- BwNot-
            LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
            LnkSta:    Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
            DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
            DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
            LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                 Compliance De-emphasis: -6dB
            LnkSta2: Current De-emphasis Level: -3.5dB
        Capabilities: [100 v1] Virtual Channel
            Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
            Arb:    Fixed- WRR32- WRR64- WRR128-
            Ctrl:    ArbSelect=Fixed
            Status:    InProgress-
            VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                Status:    NegoPending- InProgress-

    That is with the "official" ezsdk/psp/linux-2.6.37-psp04.00.00.10. If I use an official arm linux.com kernel, I get a better lspci output:

    # /lspci -vv -s 01:00.0
    01:00.0 Class ff00: Device 1172:e001 (rev 01)
        Subsystem: Device a106:2100
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 48
        Region 0: Memory at 20000000 (32-bit, non-prefetchable) [disabled] [size=256M]
        Region 1: Memory at 30000000 (32-bit, non-prefetchable) [disabled] [size=256K]
        Region 2: Memory at 30040000 (32-bit, non-prefetchable) [disabled] [size=256K]
        Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
            Address: 0000000000000000  Data: 0000
        Capabilities: [78] Power Management version 3
            Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
            Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [80] Express (v2) Endpoint, MSI 00
            DevCap:    MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
            DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                MaxPayload 128 bytes, MaxReadReq 512 bytes
            DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
            LnkCap:    Port #1, Speed 5GT/s, Width x1, ASPM L0s, Latency L0 unlimited, L1 unlimited
                ClockPM- Surprise- LLActRep- BwNot-
            LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
            LnkSta:    Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
            DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
            DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
            LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                 Compliance De-emphasis: -6dB
            LnkSta2: Current De-emphasis Level: -6dB
        Capabilities: [100 v1] Virtual Channel
            Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
            Arb:    Fixed- WRR32- WRR64- WRR128-
            Ctrl:    ArbSelect=Fixed
            Status:    InProgress-
            VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                Status:    NegoPending- InProgress-

    But it still misbehaves: the PCIe registers at BARx + 0x20 cannot be read:

    Unhandled fault: Precise External Abort on non-linefetch (0x1008) at 0xca896021
    Internal error: : 1008 [#1]
    last sysfs file: /sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/class
    Modules linked in: altpciechdma(+) [last unloaded: altpciechdma]
    CPU: 0    Not tainted  (2.6.37 #3)
    PC is at probe+0x89c/0xda0 [altpciechdma]

    And the module I am inserting does not perform any of the Altera PCIe DMA transfers that are done on the PC.

    Is the PCIe subsystem of the AM389X supposed to work with Linux-2.6.37?

    Regards,

    Etienne.

  • Hi Etienne and Ralph,

    I gather that there are at least 3 issues/queries on this thread:

    1) Kernel not booting with Rev B when SW5 is OFF, while Rev E doesn't boot in either switch setting with the Realtek PCIe card in the slot

    2) PCIe hotplug support with DM816x/AM389x kernel

    3) Resource assignment from the DM816x/AM389x kernel does not match (or is incomplete) compared to that done by a Linux PC when the Altera PCIe board is plugged in

    Since these are 3 separate issues, please open separate threads for the latter two while I try to respond to the 1st issue.

    Regarding the 1st issue:

    RevB case: SW5-1 OFF means the PCIe slot's pin A11 is an input, and I think the PCIe card is keeping this pin low. Can you check the A11 status when the kernel doesn't boot? You can probe R201 (one end is at 3.3V while the other is connected to A11).

    With SW5-1 ON, it is expected that card detection is prevented on an 'unmodified' EVM. Please refer to http://processors.wiki.ti.com/index.php/DM816x_C6A816x_AM389x_PCI_Express_Root_Complex_Driver_User_Guide#Taking_care_of_PERSTn

    RevE case: I will need to review the schematics for any change as I haven't used one yet. But I suspect that in this case the A11 pin remains low and configured as an input all the time, preventing the DM816x device from booting. I will follow up on Rev E later.

       Hemant

  • Hi Hemant,

    Regarding problem 1).

    Please first note that I was trying those PCIe x1 network cards because I did not yet have an Altera PCIe card to test.

    Now that I have one, getting those network cards to work is not really important for my job. Moreover, the revision B board is now being used by someone else.

    But to answer your query: if I boot U-Boot with PCIe RST OFF (the position away from R195), I have 3.32 V on one side of R201 and 3.29 V on the other side.

    If I boot with PCIe RST ON (towards R195), I have 3.32 V on both sides of R201.

    The problem is not that Linux does not detect the PCIe card; it is that Linux does not boot at all, even though U-Boot works perfectly.

     

    Regards,

    Etienne.

  • Etienne,

    Can you enable low-level debugging and rebuild the kernel? Please add "earlyprintk" (without quotes) to the bootargs and provide the log.

    Refer: http://processors.wiki.ti.com/index.php/DM816x_C6A816x_AM389x_PCI_Express_Root_Complex_Driver_User_Guide#Kernel_Configuration

    Regarding Altera BAR0, you said "if I use an official arm linux.com kernel" - what does this mean? Are you trying with a different ARM based board as PCIe root complex?

       Hemant

  • Hemant,

    "if I use an official arm kernel.org" means that we do have a complete development chain that we can fully regenerate in-house, from the official Linux kernel; not a modified kernel compiled by a modified compiler. That is using the exact same hardware.

    That kernel already had "[*] Early printk (NEW)" active, and passing the "earlyprintk" boot parameter does not change anything: nothing is displayed after the last U-Boot message.

    I installed the EZSDK to find out whether the error comes from our configuration or from some common problem, and it is the latter.

    As a note, I had to install the EZSDK as root on Ubuntu 64 LTS because of this error:

    $ Desktop/arm_ti_toolchain/ezsdk_dm816x-evm_5_01_01_80_setuplinux
    unknown user id: 94387
        while executing
    "id user"
        (procedure "::InstallJammer::CommonInit" line 219)
        invoked from within
    "::InstallJammer::CommonInit"
        (procedure "::InstallJammer::InitInstall" line 19)
        invoked from within
    "::InstallJammer::InitInstall"
        (file "/installkitvfs/main2.tcl" line 28613)
        invoked from within
    "source [file join $::installkit::root main2.tcl]"
        (file "/installkitvfs/main.tcl" line 3)

    that is probably because:

    eetilor@lnxdws033:~$ whoami
    eetilor
    eetilor@lnxdws033:~$ grep eetilor /etc/passwd
    eetilor@lnxdws033:~$ getent passwd eetilor
    eetilor:*:97777:22252:Etienne Lorrain:/home/eetilor:/bin/bash
    eetilor@lnxdws033:~$

    But I DO NOT WANT to debug the EZSDK; I do not USE it. If we had a bug in our own build system I could test with the EZSDK (I just need to be root to compile), but my problem is not the EZSDK.

    As I said, the freeze with the network card is a secondary problem; my real and primary problem is that the Altera PCIe implementation works perfectly on a PC but not with the TI PCIe.

    The fact that another PCIe implementation (the unknown one in this network card) does not work is not that important. I am only continuing this thread because it may help us locate problems with the Altera PCIe implementation later on, if there are any.

    Regards,

    Etienne.

  • Hemant,

    If I add the "earlyprintk" to the command line of the EZSDK linux kernel, I get a valid boot at first, but then:

    NOR: Can't request GPMC CS
    registered ti816x_vpss device
    pm_dbg_init: only OMAP3 supported
    Registered ti81xx_fb device
    ti816x_pcie: Invoking PCI BIOS...
    ti816x_pcie: Setting up Host Controller...
    ti816x_pcie: Register base mapped @0xd0820000
    Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0820004
    Internal error: : 1008 [#1]
    last sysfs file:
    Modules linked in:
    CPU: 0    Not tainted  (2.6.37 #1)
    PC is at ti816x_pcie_setup+0x224/0x440
    LR is at 0x0
    pc : [<c0054c28>]    lr : [<00000000>]    psr: 60000013
    sp : cc82be40  ip : d0820000  fp : cc82be6c
    r10: 00000000  r9 : 00000000  r8 : c04b96fc
    r7 : cc86a01c  r6 : cc86a000  r5 : 00000000  r4 : c04811d8
    r3 : 00000604  r2 : d082100a  r1 : d0821014  r0 : d0821010
    Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
    Control: 10c5387f  Table: 80004019  DAC: 00000017
    Process swapper (pid: 1, stack limit = 0xcc82a2e8)
    Stack: (0xcc82be40 to 0xcc82c000)
    be40: c00fd1c8 c00b4cc0 c047d880 cc867fc0 c048a7bc c048a784 00000000 00000000
    be60: cc82be94 cc82be70 c000c270 c0054a10 c047d880 c047d880 c048a784 cc865700
    be80: c04a01b0 00000000 cc82bea4 cc82be98 c00548d8 c000c1e8 cc82beb4 cc82bea8
    bea0: c01d39f0 c0054894 cc82bed4 cc82beb8 c01d295c c01d39dc c047d880 c047d8b4
    bec0: c048a784 cc865700 cc82bef4 cc82bed8 c01d2a84 c01d2898 00000000 c01d2a1c
    bee0: c048a784 cc865700 cc82bf1c cc82bef8 c01d2108 c01d2a28 cc81bfb8 cc8652b0
    bf00: c0184618 c002a9b0 c0008670 c048a784 cc82bf2c cc82bf20 c01d2780 c01d20c4
    bf20: cc82bf5c cc82bf30 c01d19e0 c01d276c c040923e cc82bf40 c002a9b0 c0008670
    bf40: c048a784 00000013 c00117ac 00000000 cc82bf84 cc82bf60 c01d2dbc c01d1938
    bf60: c002a9b0 c0008670 c0063ae0 00000013 c00117ac 00000000 cc82bf94 cc82bf88
    bf80: c01d3cdc c01d2d18 cc82bfa4 cc82bf98 c00117c0 c01d3c9c cc82bfdc cc82bfa8
    bfa0: c00353d8 c00117b8 00000197 c0008670 c0063ae0 00000013 c002a9b0 c0008670
    bfc0: c0063ae0 00000013 00000000 00000000 cc82bff4 cc82bfe0 c000870c c0035314
    bfe0: 00000000 c0008670 00000000 cc82bff8 c0063ae0 c000867c 3385344d 37d52c6c
    Backtrace:
    [<c0054a04>] (ti816x_pcie_setup+0x0/0x440) from [<c000c270>] (pci_common_init+0x94/0x188)
     r8:00000000 r7:00000000 r6:c048a784 r5:c048a7bc r4:cc867fc0
    [<c000c1dc>] (pci_common_init+0x0/0x188) from [<c00548d8>] (ti816x_pcie_probe+0x50/0x68)
     r9:00000000 r8:c04a01b0 r7:cc865700 r6:c048a784 r5:c047d880
    r4:c047d880
    [<c0054888>] (ti816x_pcie_probe+0x0/0x68) from [<c01d39f0>] (platform_drv_probe+0x20/0x24)
    [<c01d39d0>] (platform_drv_probe+0x0/0x24) from [<c01d295c>] (driver_probe_device+0xd0/0x190)
    [<c01d288c>] (driver_probe_device+0x0/0x190) from [<c01d2a84>] (__driver_attach+0x68/0x8c)
     r7:cc865700 r6:c048a784 r5:c047d8b4 r4:c047d880
    [<c01d2a1c>] (__driver_attach+0x0/0x8c) from [<c01d2108>] (bus_for_each_dev+0x50/0x84)
     r7:cc865700 r6:c048a784 r5:c01d2a1c r4:00000000
    [<c01d20b8>] (bus_for_each_dev+0x0/0x84) from [<c01d2780>] (driver_attach+0x20/0x28)
     r6:c048a784 r5:c0008670 r4:c002a9b0
    [<c01d2760>] (driver_attach+0x0/0x28) from [<c01d19e0>] (bus_add_driver+0xb4/0x234)
    [<c01d192c>] (bus_add_driver+0x0/0x234) from [<c01d2dbc>] (driver_register+0xb0/0x13c)
    [<c01d2d0c>] (driver_register+0x0/0x13c) from [<c01d3cdc>] (platform_driver_register+0x4c/0x60)
     r9:00000000 r8:c00117ac r7:00000013 r6:c0063ae0 r5:c0008670
    r4:c002a9b0
    [<c01d3c90>] (platform_driver_register+0x0/0x60) from [<c00117c0>] (ti816x_pcie_rc_init+0x14/0x20)
    [<c00117ac>] (ti816x_pcie_rc_init+0x0/0x20) from [<c00353d8>] (do_one_initcall+0xd0/0x1a4)
    [<c0035308>] (do_one_initcall+0x0/0x1a4) from [<c000870c>] (kernel_init+0x9c/0x154)
    [<c0008670>] (kernel_init+0x0/0x154) from [<c0063ae0>] (do_exit+0x0/0x5e4)
     r5:c0008670 r4:00000000
    Code: e1a01000 e2800010 e1c230b0 e2811014 (e59c3004)
    ---[ end trace 1b75b31a2719ed1c ]---
    Kernel panic - not syncing: Attempted to kill init!
    Backtrace:
    [<c0043b44>] (dump_backtrace+0x0/0x110) from [<c0353b74>] (dump_stack+0x18/0x1c)
     r7:cc828000 r6:cc828000 r5:c0054c2a r4:c04b98d0
    [<c0353b5c>] (dump_stack+0x0/0x1c) from [<c0353bd8>] (panic+0x60/0x17c)
    [<c0353b78>] (panic+0x0/0x17c) from [<c0063b54>] (do_exit+0x74/0x5e4)
     r3:c048d1e0 r2:cc82bc70 r1:cc8280fc r0:c040ed07
    [<c0063ae0>] (do_exit+0x0/0x5e4) from [<c0043ef0>] (die+0x29c/0x2d8)
    [<c0043c54>] (die+0x0/0x2d8) from [<c0043fec>] (arm_notify_die+0x5c/0x60)
    [<c0043f90>] (arm_notify_die+0x0/0x60) from [<c00352f0>] (do_DataAbort+0x88/0x9c)
     r5:c047c7b8 r4:00000007
    [<c0035268>] (do_DataAbort+0x0/0x9c) from [<c0355bac>] (__dabt_svc+0x4c/0x60)
    Exception stack(0xcc82bdf8 to 0xcc82be40)
    bde0:                                                       d0821010 d0821014
    be00: d082100a 00000604 c04811d8 00000000 cc86a000 cc86a01c c04b96fc 00000000
    be20: 00000000 cc82be6c d0820000 cc82be40 00000000 c0054c28 60000013 ffffffff
     r8:c04b96fc r7:cc86a01c r6:cc86a000 r5:cc82be2c r4:ffffffff
    [<c0054a04>] (ti816x_pcie_setup+0x0/0x440) from [<c000c270>] (pci_common_init+0x94/0x188)
     r8:00000000 r7:00000000 r6:c048a784 r5:c048a7bc r4:cc867fc0
    [<c000c1dc>] (pci_common_init+0x0/0x188) from [<c00548d8>] (ti816x_pcie_probe+0x50/0x68)
     r9:00000000 r8:c04a01b0 r7:cc865700 r6:c048a784 r5:c047d880
    r4:c047d880
    [<c0054888>] (ti816x_pcie_probe+0x0/0x68) from [<c01d39f0>] (platform_drv_probe+0x20/0x24)
    [<c01d39d0>] (platform_drv_probe+0x0/0x24) from [<c01d295c>] (driver_probe_device+0xd0/0x190)
    [<c01d288c>] (driver_probe_device+0x0/0x190) from [<c01d2a84>] (__driver_attach+0x68/0x8c)
     r7:cc865700 r6:c048a784 r5:c047d8b4 r4:c047d880
    [<c01d2a1c>] (__driver_attach+0x0/0x8c) from [<c01d2108>] (bus_for_each_dev+0x50/0x84)
     r7:cc865700 r6:c048a784 r5:c01d2a1c r4:00000000
    [<c01d20b8>] (bus_for_each_dev+0x0/0x84) from [<c01d2780>] (driver_attach+0x20/0x28)
     r6:c048a784 r5:c0008670 r4:c002a9b0
    [<c01d2760>] (driver_attach+0x0/0x28) from [<c01d19e0>] (bus_add_driver+0xb4/0x234)
    [<c01d192c>] (bus_add_driver+0x0/0x234) from [<c01d2dbc>] (driver_register+0xb0/0x13c)
    [<c01d2d0c>] (driver_register+0x0/0x13c) from [<c01d3cdc>] (platform_driver_register+0x4c/0x60)
     r9:00000000 r8:c00117ac r7:00000013 r6:c0063ae0 r5:c0008670
    r4:c002a9b0
    [<c01d3c90>] (platform_driver_register+0x0/0x60) from [<c00117c0>] (ti816x_pcie_rc_init+0x14/0x20)
    [<c00117ac>] (ti816x_pcie_rc_init+0x0/0x20) from [<c00353d8>] (do_one_initcall+0xd0/0x1a4)
    [<c0035308>] (do_one_initcall+0x0/0x1a4) from [<c000870c>] (kernel_init+0x9c/0x154)
    [<c0008670>] (kernel_init+0x0/0x154) from [<c0063ae0>] (do_exit+0x0/0x5e4)
     r5:c0008670 r4:00000000

    So basically we get the same kind of error (external abort on non-linefetch) as when I use a module reading BARx mapped memory on my Altera PCIe board.

    The difference is that the "external abort on non-linefetch" occurs in the kernel itself instead of in a module (as with my Altera PCIe board), so the Linux kernel cannot continue and appears frozen.
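For readers comparing the two aborts, the value in parentheses (0x1008) is the ARM fault status register content. The sketch below splits it into fields, assuming the ARMv7 short-descriptor DFSR layout; the function name and the dictionary keys are illustrative and do not come from the kernel sources.

```python
def decode_fsr(fsr):
    """Split an ARMv7 short-descriptor fault status value into its fields."""
    # The 5-bit status code is FS[3:0] in bits 3:0 plus FS[4] in bit 10.
    status = (fsr & 0xF) | ((fsr >> 6) & 0x10)
    return {
        "status": status,
        "domain": (fsr >> 4) & 0xF,
        "write": bool(fsr & (1 << 11)),  # WnR: set when a write caused the abort
        "ext": bool(fsr & (1 << 12)),    # ExT: implementation-defined abort qualifier
    }


fields = decode_fsr(0x1008)
print(fields)
```

For 0x1008 this gives status 8 (a synchronous external abort, which Linux 2.6.37 reports as "external abort on non-linefetch"), domain 0, and a read access. Both the in-kernel crash and the module crash report the same code, which fits the observation that they are the same kind of error.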

    Regards,

    Etienne.

  • Etienne,

    Is this crash consistent? In one of your earlier posts you sent the 'lspci' log.

       Hemant

  • The crash is consistent with this network card. The lspci -vv output was for the Altera board, which has no in-kernel driver, only a module.

    Redone with network card:

    ti816x_pcie: Invoking PCI BIOS...
    ti816x_pcie: Setting up Host Controller...
    ti816x_pcie: Register base mapped @0xd0820000
    Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0820004
    Internal error: : 1008 [#1]
    last sysfs file:
    Modules linked in:
    CPU: 0    Not tainted  (2.6.37 #1)
    PC is at ti816x_pcie_setup+0x224/0x440
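To see which register access faults, the two addresses in this log can be subtracted. The sketch below does that with regular expressions; the `fault_offset` helper and the `LOG` constant are illustrative, and the log lines are copied from the output above.

```python
import re

LOG = """\
ti816x_pcie: Register base mapped @0xd0820000
Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0820004
"""


def fault_offset(log):
    """Return the faulting address minus the mapped register base, or None."""
    base = re.search(r"Register base mapped @(0x[0-9a-fA-F]+)", log)
    fault = re.search(
        r"Unhandled fault: .*\((0x[0-9a-fA-F]+)\) at (0x[0-9a-fA-F]+)", log)
    if not base or not fault:
        return None
    return int(fault.group(2), 16) - int(base.group(1), 16)


print(hex(fault_offset(LOG)))
```

The result is 0x4, so the abort hits a register at offset 4 from the base the driver had just mapped, before any access to the card itself.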

  • Hi,

    1) RTL8168 not detected when SW5-1 is ON - this is expected and is mentioned in the RC Driver User Guide.

    2) Crash when SW5-1 is OFF - I suspect the PCIe module is getting reset, or doesn't come out of reset, when the RC driver tries to access it. Do you have JTAG connectivity available?

    3) Configuration with the Altera card - the AM389x RC has a total window size of 256 MB, so the total size of the BARs of the connected endpoints should not exceed 256 MB, while in the Altera case BAR0 by itself is 256 MB. Is it possible for you to reduce BAR0?
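The arithmetic behind point 3 can be checked against the Region lines of the lspci output posted earlier in the thread. This is a rough sketch only: the helper name and the `SAMPLE` constant are illustrative, the 256 MB figure is the one quoted above, and a real allocator must also respect BAR alignment, so the plain sum is a lower bound.

```python
import re

WINDOW_BYTES = 256 * 1024 * 1024  # RC window size quoted in this thread
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

SAMPLE = """\
Region 0: Memory at 20000000 (32-bit, non-prefetchable) [disabled] [size=256M]
Region 1: Memory at 30000000 (32-bit, non-prefetchable) [disabled] [size=256K]
Region 2: Memory at 30040000 (32-bit, non-prefetchable) [disabled] [size=256K]
"""


def bar_sizes(lspci_text):
    """Return {region_number: size_in_bytes} from `lspci -vv` Region lines."""
    sizes = {}
    pattern = r"Region (\d+): Memory at \S+ .*\[size=(\d+)([KMG]?)\]"
    for number, value, unit in re.findall(pattern, lspci_text):
        sizes[int(number)] = int(value) * UNITS[unit]
    return sizes


sizes = bar_sizes(SAMPLE)
total = sum(sizes.values())
print(total, "bytes requested;", "fits" if total <= WINDOW_BYTES else "exceeds window")
```

With the three regions shown, the total is 256 MB plus 512 KB, so the request exceeds the window even before alignment is considered.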

       Hemant

  • Hi,

    1) Same crash whatever the position of SW5-1 for the network card:

    ON:

    pm_dbg_init: only OMAP3 supported
    Registered ti81xx_fb device
    ti816x_pcie: Invoking PCI BIOS...
    ti816x_pcie: Setting up Host Controller...
    ti816x_pcie: Register base mapped @0xd0820000
    Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0820004
    Internal error: : 1008 [#1]
    last sysfs file:
    Modules linked in:
    CPU: 0    Not tainted  (2.6.37 #1)
    PC is at ti816x_pcie_setup+0x224/0x440
    LR is at 0x0
    pc : [<c0054c28>]    lr : [<00000000>]    psr: 60000013
    sp : cc82be40  ip : d0820000  fp : cc82be6c
    r10: 00000000  r9 : 00000000  r8 : c04b96fc
    r7 : cc86a01c  r6 : cc86a000  r5 : 00000000  r4 : c04811d8
    r3 : 00000604  r2 : d082100a  r1 : d0821014  r0 : d0821010
    Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
    Control: 10c5387f  Table: 80004019  DAC: 00000017
    Process swapper (pid: 1, stack limit = 0xcc82a2e8)

    OFF:

    pm_dbg_init: only OMAP3 supported
    Registered ti81xx_fb device
    ti816x_pcie: Invoking PCI BIOS...
    ti816x_pcie: Setting up Host Controller...
    ti816x_pcie: Register base mapped @0xd0820000
    Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0820004
    Internal error: : 1008 [#1]
    last sysfs file:
    Modules linked in:
    CPU: 0    Not tainted  (2.6.37 #1)
    PC is at ti816x_pcie_setup+0x224/0x440
    LR is at 0x0
    pc : [<c0054c28>]    lr : [<00000000>]    psr: 60000013
    sp : cc82be40  ip : d0820000  fp : cc82be6c
    r10: 00000000  r9 : 00000000  r8 : c04b96fc
    r7 : cc86a01c  r6 : cc86a000  r5 : 00000000  r4 : c04811d8
    r3 : 00000604  r2 : d082100a  r1 : d0821014  r0 : d0821010
    Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
    Control: 10c5387f  Table: 80004019  DAC: 00000017
    Process swapper (pid: 1, stack limit = 0xcc82a2e8)

    That is with a revision E Spectrum Digital board, starting from power OFF.

     

    2) You asked me to create another thread for the Altera board, at:

    http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/717/t/124004.aspx

    I suppose you are talking of the Altera PCIe board because I do not have JTAG on the network board.

    I have JTAG on the Altera PCIe board, but if it were in reset, would lspci still show:

    01:00.0 Class ff00: Device 1172:e001 (rev 01)

     

    3) The Altera demo application does advertise big BARs, but only small portions of them are mapped. On a PC, you get:

    [  177.480091] Succesfully requested IRQ #50 with dev_id 0xf3c07880
    [  177.480093] BAR0 0xd0000000-0xdfffffff flags 0x00040200
    [  177.480094] BAR1 0xcff00000-0xcff3ffff flags 0x00040200
    [  177.480095] BAR2 0xcff40000-0xcff7ffff flags 0x00040200
    [  177.480106] BAR[0] mapped at 0xf8460000 with length 32768(/268435456).
    [  177.480110] BAR[2] mapped at 0xf83e4000 with length 256(/262144).

    I would prefer to keep the default design if possible, to rule out re-generation problems. Do you think that having a big BAR but mapping only a small portion of it is likely to cause the problem?

    Etienne.

  • Etienne,

    Regarding JTAG, I was referring to connecting to the AM389x to see whether the PCIe h/w got reset, which would have resulted in the crash.

       Hemant

  • The crash when reading 32 bits is solved by doing:

    pcie_set_readrq(dev, 256)

    In the driver that I can modify, i.e. the Altera one.

    I am not sure why a maximum-size PCIe read request is issued to read 32 bits, but I do not want to investigate that either.

    So the PCIe is not being reset.
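For context, `pcie_set_readrq()` changes the Max_Read_Request_Size field (bits 14:12) of the endpoint's PCIe Device Control register. The sketch below models only that encoding in user space and is not the kernel code; the helper names are illustrative, and the starting value 0x2810 is an assumption reconstructed from the DevCtl flags in the lspci output earlier in the thread (RlxdOrd+, NoSnoop+, MaxPayload 128, MaxReadReq 512).

```python
MRRS_SHIFT = 12               # Max_Read_Request_Size occupies bits 14:12
MRRS_MASK = 0x7 << MRRS_SHIFT


def mrrs_bytes(devctl):
    """Decode the maximum read request size, in bytes, from a DevCtl value."""
    return 128 << ((devctl & MRRS_MASK) >> MRRS_SHIFT)


def with_mrrs(devctl, size):
    """Return devctl with the read request size field set to `size` bytes."""
    if size < 128 or size > 4096 or size & (size - 1):
        raise ValueError("size must be a power of two from 128 to 4096")
    code = size.bit_length() - 8  # 128 -> 0, 256 -> 1, ..., 4096 -> 5
    return (devctl & ~MRRS_MASK) | (code << MRRS_SHIFT)


devctl = 0x2810                   # assumed value, see the note above
print(mrrs_bytes(devctl))         # 512, matching "MaxReadReq 512 bytes"
print(hex(with_mrrs(devctl, 256)))
```

Under that assumption, lowering the limit to 256 bytes turns 0x2810 into 0x1810, which is the change the call above would make to the register.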

  • Hi Etienne,
    did you solve the PCIE NOT BOOTING issue?
    Regards,
    - Robert