loading code via PCIe - Boot issues

Dear support,

Our goal is to use the C6670 as part of a full system. In this system, a host card (a single-board computer with an OS) will load code into the DSP, and the DSP will then run it. The system will contain 8 cards, all connected via the PCIe bus.

That being said, after the system powers on, we want the C6670 to be ready to receive a new program.

It seemed logical to study how c:\TI_MCSDK\mcsdk_2_01_00_03\tools\boot_loader\examples\pcie\ runs under Linux, in order to learn the process and replicate it on another OS. I am not an expert in Linux, but we more or less got the examples working.

We are having some issues that I need your help with. The first ones are purely Linux questions, and the others are more about loading the system at boot.

We have EEPROM 0x51 loaded with the IBL from mcsdk_2_01_00_03, without recompiling the code. The binary provided is the one in flash.

----------------------- Linux questions ----------------

1) Let's say that I compile pciedemo to run hello_world and then insert the module into the kernel; all seems to be OK. When I remove the module, recompile and reinsert it, the Linux PC hangs completely and needs a power-off before it can run again. This means I need to power off each time I want to try my code. Nothing in the instructions says that I cannot insert, remove and insert again. Should I do something between two insertions? Is there any way I can provide you with more information that would help with this issue?

We run a Fedora distribution with this kernel version: 2.6.35.6-45.fc14.x86_64

2) I searched for this problem on the forum and no one seems to have asked about it, so I hope it is a Linux issue. When trying to run the pcieInterrupt example to use EDMA, I receive a segmentation fault. I am attaching a file:

[root@localhost spectrum]# lspci -d 104c:b005 -xxx -v
08:00.0 Multimedia controller: Texas Instruments Device b005 (rev 01)
	Physical Slot: 4
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at e4400000 (32-bit, non-prefetchable) [size=1M]
	Memory at e4200000 (32-bit, prefetchable) [size=1M]
	Memory at e4000000 (32-bit, prefetchable) [size=2M]
	Memory at e3000000 (32-bit, prefetchable) [size=16M]
	Memory at e4310000 (32-bit, prefetchable) [size=4K]
	Memory at e4300000 (32-bit, prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel modules: windrvr6
00: 4c 10 05 b0 07 01 18 00 01 00 80 04 10 00 00 00
10: 00 00 40 e4 08 00 20 e4 08 00 00 e4 08 00 00 e3
20: 08 00 31 e4 08 00 30 e4 00 00 00 00 00 00 01 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0a 01 00 00
40: 01 50 03 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 00 02 00 01 87 2c 01 1f 28 01 00 22 34 03 00
80: c0 00 22 10 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 1f 00 00 00 00 00 00 00 06 00 00 00
a0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

fedora kernel version
2.6.35.6-45.fc14.x86_64

[ 1641.392501] Finding the device....
[ 1641.392517] Found TI device
[ 1641.392519] TI device: vendor=0x104c, dev=0xb005, irq=0x0000000a
[ 1641.392520] Reading the BAR areas....
[ 1641.394245] Enabling the device....
[ 1641.394259] pci 0000:08:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 1641.394263] pci 0000:08:00.0: setting latency timer to 64
[ 1641.394270] Access PCIE application register ....
[ 1641.394272] Registering the irq 10 ...
[ 1641.394288] Allocating consistent memory ...
[ 1641.402129] Boot entry address is 0x  82ee80
[ 1641.402743] Total 5 sections, 0xfed0 bytes of data were written
[ 1643.462014] Write DMA to DSP ...
[ 1643.468102] Generating interrupt to DSP ...
[ 1643.925097] irq 16: nobody cared (try booting with the "irqpoll" option)
[ 1643.925100] Pid: 2653, comm: insmod Tainted: P          I 2.6.35.6-45.fc14.x86_64 #1
[ 1643.925102] Call Trace:
[ 1643.925103]  <IRQ>  [<ffffffff810a6e2b>] __report_bad_irq.clone.1+0x3d/0x8b
[ 1643.925111]  [<ffffffff810a6f93>] note_interrupt+0x11a/0x17f
[ 1643.925113]  [<ffffffff810a7a73>] handle_fasteoi_irq+0xa8/0xce
[ 1643.925116]  [<ffffffff8100c2ea>] handle_irq+0x88/0x90
[ 1643.925120]  [<ffffffff8146efb4>] do_IRQ+0x5c/0xb4
[ 1643.925122]  [<ffffffff81469513>] ret_from_intr+0x0/0x11
[ 1643.925124]  <EOI>  [<ffffffff81010b17>] ? native_read_tsc+0x6/0x16
[ 1643.925129]  [<ffffffff8122080e>] paravirt_read_tsc+0xe/0x12
[ 1643.925131]  [<ffffffff812208ff>] delay_tsc+0x35/0x74
[ 1643.925133]  [<ffffffff81220859>] __delay+0xf/0x11
[ 1643.925135]  [<ffffffff8122089d>] __const_udelay+0x42/0x44
[ 1643.925139]  [<ffffffffa00f3386>] init_module+0x324/0x424 [pciedemo]
[ 1643.925141]  [<ffffffff81010587>] ? sched_clock+0x9/0xd
[ 1643.925144]  [<ffffffff8103c040>] ? need_resched+0x23/0x2d
[ 1643.925146]  [<ffffffff8103c058>] ? should_resched+0xe/0x2e
[ 1643.925149]  [<ffffffff81467b05>] ? _cond_resched+0xe/0x22
[ 1643.925150]  [<ffffffff8146824a>] ? mutex_lock+0x29/0x50
[ 1643.925153]  [<ffffffff810c2f5c>] ? trace_module_notify+0x2b5/0x2c6
[ 1643.925156]  [<ffffffff810ad93a>] ? tracepoint_module_notify+0x2c/0x30
[ 1643.925158]  [<ffffffff8146c4b4>] ? notifier_call_chain+0x37/0x63
[ 1643.925162]  [<ffffffffa00f3062>] ? init_module+0x0/0x424 [pciedemo]
[ 1643.925165]  [<ffffffff810021a1>] do_one_initcall+0x5e/0x155
[ 1643.925168]  [<ffffffff8107caa9>] sys_init_module+0xa6/0x1e4
[ 1643.925170]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[ 1643.925172] handlers:
[ 1643.925173] [<ffffffffa009160b>] (nouveau_irq_handler+0x0/0x1ac5 [nouveau])
[ 1643.925187] Disabling IRQ #16
[ 1644.467454] DMA write throughput is: 657.24 MB/s
[ 1644.467470] divide error: 0000 [#1] SMP 
[ 1644.467473] last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[ 1644.467475] CPU 1 
[ 1644.467476] Modules linked in: pciedemo(+) fuse sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table mperf uinput snd_hda_codec_realtek windrvr6(P) hp_wmi rfkill snd_hda_intel x38_edac edac_core iTCO_wdt iTCO_vendor_support snd_hda_codec snd_hwdep ppdev parport_pc parport wmi snd_seq snd_seq_device tg3 snd_pcm snd_timer snd soundcore snd_page_alloc microcode nouveau ttm drm_kms_helper drm i2c_algo_bit video output i2c_core [last unloaded: scsi_wait_scan]
[ 1644.467499] 
[ 1644.467501] Pid: 2653, comm: insmod Tainted: P          I 2.6.35.6-45.fc14.x86_64 #1 0AA0h/HP xw4600 Workstation
[ 1644.467503] RIP: 0010:[<ffffffffa00f3404>]  [<ffffffffa00f3404>] init_module+0x3a2/0x424 [pciedemo]
[ 1644.467507] RSP: 0018:ffff88005e39bde8  EFLAGS: 00010246
[ 1644.467509] RAX: 00000000003d0900 RBX: 00000000003d0900 RCX: 0000000000000000
[ 1644.467511] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffffa00f38e3
[ 1644.467512] RBP: ffff88005e39bf08 R08: 0000000000000002 R09: 00000000fffffffe
[ 1644.467514] R10: ffff8800de39bd07 R11: 0000000000000000 R12: 0000000000000340
[ 1644.467516] R13: 00007f28bc572010 R14: 00007f28bc572010 R15: 0000000000000003
[ 1644.467518] FS:  00007f28bc634720(0000) GS:ffff880002080000(0000) knlGS:0000000000000000
[ 1644.467520] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1644.467521] CR2: 00007f48e8605fc8 CR3: 000000005e3c5000 CR4: 00000000000406e0
[ 1644.467523] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1644.467525] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1644.467527] Process insmod (pid: 2653, threadinfo ffff88005e39a000, task ffff88007b30c5c0)
[ 1644.467528] Stack:
[ 1644.467529]  ffff880000000000 ffffffff81010587 ffff88005e39be08 ffffffff8103c040
[ 1644.467532] <0> ffff88005e39be18 ffffffff8103c058 ffff88005e39be28 ffffffff81467b05
[ 1644.467534] <0> ffff88005e39be58 ffffffff8146824a ffff88005e39be48 0000000000000246
[ 1644.467537] Call Trace:
[ 1644.467540]  [<ffffffff81010587>] ? sched_clock+0x9/0xd
[ 1644.467543]  [<ffffffff8103c040>] ? need_resched+0x23/0x2d
[ 1644.467545]  [<ffffffff8103c058>] ? should_resched+0xe/0x2e
[ 1644.467547]  [<ffffffff81467b05>] ? _cond_resched+0xe/0x22
[ 1644.467549]  [<ffffffff8146824a>] ? mutex_lock+0x29/0x50
[ 1644.467552]  [<ffffffff810c2f5c>] ? trace_module_notify+0x2b5/0x2c6
[ 1644.467554]  [<ffffffff810ad93a>] ? tracepoint_module_notify+0x2c/0x30
[ 1644.467556]  [<ffffffff8146c4b4>] ? notifier_call_chain+0x37/0x63
[ 1644.467560]  [<ffffffffa00f3062>] ? init_module+0x0/0x424 [pciedemo]
[ 1644.467562]  [<ffffffff810021a1>] do_one_initcall+0x5e/0x155
[ 1644.467565]  [<ffffffff8107caa9>] sys_init_module+0xa6/0x1e4
[ 1644.467567]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
[ 1644.467568] Code: 33 08 01 00 48 2b 05 1c 08 01 00 31 d2 48 8b 0d 2b 08 01 00 2b 0d 15 08 01 00 48 c7 c7 e3 38 0f a0 69 c0 40 42 0f 00 01 c1 89 d8 <f7> f1 89 c6 6b c2 64 31 d2 f7 f1 89 c2 31 c0 e8 05 3f 37 e1 48 
[ 1644.467590] RIP  [<ffffffffa00f3404>] init_module+0x3a2/0x424 [pciedemo]
[ 1644.467593]  RSP <ffff88005e39bde8>
[ 1644.467596] ---[ end trace a7919e7f17c0a727 ]---



[root@localhost spectrum]# lsmod
Module                  Size  Used by
tcp_lp                  2111  0 
pciedemo               77810  1 


First you will see the info returned by lspci, then the info I received from dmesg.

From this dump, can you help me find out what the problem is?
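One detail in the dump may be worth noting: the `divide error` at `init_module+0x3a2` faults on the bytes `<f7> f1` in the `Code:` line, which decode to `div %ecx`, and the register dump shows RCX = 0. That is consistent with a throughput calculation dividing by an elapsed time of zero. Below is a minimal sketch of a guarded calculation. The function name and the microsecond units are assumptions made for illustration; this is not the actual pciedemo source.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from pciedemo.c): DMA throughput in
 * hundredths of MB/s, given the bytes moved and the elapsed microseconds.
 * Bytes per microsecond equals MB/s when 1 MB is taken as 10^6 bytes.
 * Returns 0 when no time was measured instead of dividing by zero.
 */
uint64_t throughput_centi_mbps(uint64_t bytes, uint64_t elapsed_us)
{
    if (elapsed_us == 0)
        return 0;   /* transfer was never timed: report 0, do not fault */
    return (bytes * 100u) / elapsed_us;
}
```

A caller would print `result / 100` and `result % 100` as the integer and fractional parts, which avoids floating point in kernel code.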

I think there is something wrong with the IRQ. lspci says that the IRQ is 16, and when the device is enabled we see this printout: pci 0000:08:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16. However, later in the dmesg log, IRQ 10 is the one being registered (Registering the irq 10).
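The two numbers can be traced back to the lspci dump itself. In a standard PCI configuration header, byte 0x3C is the Interrupt Line register and byte 0x3D is the Interrupt Pin. Row `30:` of the dump shows `0a 01` at those offsets, that is, line 10 (the firmware-assigned value) on pin INTA, while IRQ 16 is what the kernel reports once the device is enabled. The order of the log messages suggests that the module captured the IRQ number before enabling the device, although that is an inference from the log and not confirmed from the source. A minimal sketch of the decoding, assuming the configuration bytes are already in a buffer (the helper names are made up for illustration):

```c
#include <stdint.h>

#define CFG_INTERRUPT_LINE 0x3C  /* standard PCI header offset */
#define CFG_INTERRUPT_PIN  0x3D  /* 1 = INTA ... 4 = INTD, 0 = none */

/* Hypothetical helpers: read the legacy interrupt fields from a raw
 * configuration-space image such as the one printed by `lspci -xxx`. */
uint8_t cfg_interrupt_line(const uint8_t *cfg)
{
    return cfg[CFG_INTERRUPT_LINE];
}

char cfg_interrupt_pin_letter(const uint8_t *cfg)
{
    uint8_t pin = cfg[CFG_INTERRUPT_PIN];
    return (pin >= 1 && pin <= 4) ? (char)('A' + pin - 1) : '-';
}
```

Applied to the dump above, these helpers return line 10 and pin `A`, matching `irq=0x0000000a` in the first dmesg line.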

----------------------- other questions ----------------

3) Why can't a *.out file be loaded directly into memory? When loading from CCS, does CCS go through the same process of converting the *.out file into a boot table file?

4) Is it required to be in PCIe boot mode to run the examples provided? If I boot from flash with the PCIe module enabled, and the code in flash does similar things to the current IBL, would the examples work?

5) Are you aware of any power-on issues using PCIe boot mode? The reason for this question is that when plugging the EVM into different PCs, the EVM does not work properly. Some PCs need a warm reset before the OS starts, some PCs do not start at all, on some the display gets completely corrupted, and some PCs go through three strange on-off sequences before booting.

Thanks for your help

Aymeric

  • I am not a Linux expert either, so I have forwarded your first two questions to a Linux expert. He might answer those questions soon.

    As for the rest, here you go...

    3. The boot loader only understands code in a certain format. Ultimately, the boot loader decodes the boot table and loads the sections in the same way that CCS decodes the .out file. It is just a different format for the boot loader.
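To illustrate what "decoding the boot table" involves, here is a minimal sketch of walking such a table. It assumes the commonly documented layout: a first 32-bit word holding the entry point, then for each section a byte count, a load address and the data padded to a multiple of 4 bytes, and finally a byte count of zero as terminator. The struct and function names are hypothetical, and the words are assumed to already be in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Summary produced by walking a boot table. */
struct boot_table_info {
    uint32_t entry;       /* first word: boot entry address */
    uint32_t sections;    /* number of sections found */
    uint32_t data_bytes;  /* sum of the section byte counts */
};

/*
 * Walk a boot table held as `nwords` host-order 32-bit words.
 * Returns 0 when the zero-size terminator is reached, or -1 if the
 * table is truncated. A real loader would copy each section's data
 * to its load address at the marked point.
 */
int boot_table_walk(const uint32_t *words, size_t nwords,
                    struct boot_table_info *out)
{
    size_t i = 1;

    if (nwords < 2)
        return -1;
    out->entry = words[0];
    out->sections = 0;
    out->data_bytes = 0;

    while (i < nwords) {
        uint32_t size = words[i];
        size_t payload_words;

        if (size == 0)
            return 0;                       /* terminator reached */
        if (nwords - i < 2)
            return -1;                      /* no room for load address */
        payload_words = ((size_t)size + 3) / 4;
        if (payload_words > nwords - i - 2)
            return -1;                      /* data runs past the buffer */
        /* words[i + 1] is the load address, data starts at words[i + 2] */
        out->sections++;
        out->data_bytes += size;
        i += 2 + payload_words;
    }
    return -1;                              /* terminator never found */
}
```

The section count and byte total gathered here correspond to the kind of summary the demo prints in its log ("Total 5 sections, 0xfed0 bytes of data were written").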

    4. Yes, the bootloader needs to know that the image will be loaded through the PCIe connection.

    5. Not that I know of. This might again be down to how the PC behaves. I am not sure why a PC would have a different sequence to boot a device connected through PCIe. I might be missing something here.

    Thanks,

    Arun.

  • Arun,

    Thanks for the answers and for forwarding the Linux questions.

    To get back to question 4, I am a bit confused by the boot process and your answer. I am not sure whether you are saying: YES, it has to be in PCIe boot mode, or YES, I need to turn on the PCIe interface.

    PCIE boot example version 4, section 9.2, "the role of the IBL in PCIE boot mode", states that the IBL does not "boot again"; it just writes the devstat register.

    Then the IBL monitors the magic address until it is no longer zero and starts the boot again according to this address.

    The descriptions of the examples that follow all share the same idea: they push the boot image data to the correct area of memory via PCIe, then update the magic address with the boot entry address (_c_int00).
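The handshake described above can be sketched as follows. This is only an illustration under stated assumptions: `l2_window` is a hypothetical, 4-byte-aligned host mapping of the DSP core's L2 memory starting at local address 0x00800000 (for example through a PCIe BAR), host and DSP share the same byte order, and the function names are invented. The magic address value is the one discussed in this thread.

```c
#include <stdint.h>

#define L2_LOCAL_BASE 0x00800000u  /* assumed start of core-local L2 */
#define MAGIC_ADDR    0x008ffffcu  /* boot magic address discussed here */

/* Host side, step 1: copy one section of the boot image into L2.
 * The magic address must stay zero until every section is in place. */
void push_section(volatile uint8_t *l2_window, uint32_t load_addr,
                  const uint8_t *data, uint32_t len)
{
    uint32_t i;

    for (i = 0; i < len; i++)
        l2_window[load_addr - L2_LOCAL_BASE + i] = data[i];
}

/* Host side, step 2: publish the entry point (_c_int00) last. */
void write_boot_entry(volatile uint8_t *l2_window, uint32_t entry)
{
    volatile uint32_t *magic =
        (volatile uint32_t *)(l2_window + (MAGIC_ADDR - L2_LOCAL_BASE));

    *magic = entry;
}

/* Boot-code side: a non-zero value means "branch to this address". */
uint32_t poll_boot_entry(const volatile uint8_t *l2_window)
{
    return *(const volatile uint32_t *)
               (l2_window + (MAGIC_ADDR - L2_LOCAL_BASE));
}
```

The ordering is the point of the sketch: all sections are written first and the entry address is written last, so the code polling the magic address never sees a non-zero value before the image is complete.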

    So to get back to my question, must the boot mode be PCIe to allow pushing data into memory via PCIe?

    I think the answer is no, but if I am wrong, could you please explain in a bit more detail what you think I am not understanding in the process described in the manual?

    Thanks a lot

    Aymeric

  • Aymeric,

    For Q1, in the pciedemo.c code, there is a macro called:

     #define LOCAL_RESET 0 =====> please change this to 1 and recompile the code. It performs a DSP local reset 10 seconds after you run the "hello world" demo.
     
    #if LOCAL_RESET
    mdelay(10000);
    dspLocalReset();
    #endif

    Then you should be able to repeatedly remove the module, insert the module, remove, insert ... without power cycling the host PC. Let me know if this works for you.

    For Q2, I don't know what IRQ 16 is, but it does not seem to be the problem. See the last page of examples\pcie\docs\readme.pdf: IRQ 11 is used there, but the log also contains the message PCI INT A -> GSI 16 (level, low) -> IRQ 16. The fact that the host didn't receive the interrupt from the DSP needs some debugging. Do you have CCS emulation access to the DSP? If so, can you check the DSP DDR memory starting at 0x80000000? Did the DSP negate the data from the host (pattern 0x00, 0x01, 0x02 ...) to 0xff, 0xfe, 0xfd ...?
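The memory check suggested above amounts to verifying that every byte is the bitwise complement of the pattern the host sent. A minimal sketch, assuming the host pattern is simply the byte index modulo 256 and that the DDR contents have been copied into a local buffer (the function name is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first byte that is not the bitwise complement
 * of the assumed host pattern (byte i of the pattern is i mod 256), or
 * `len` if the whole buffer has been negated as expected. */
size_t first_unnegated_byte(const uint8_t *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        uint8_t sent = (uint8_t)i;         /* 0x00, 0x01, 0x02, ... */
        if (buf[i] != (uint8_t)~sent)      /* expect 0xff, 0xfe, 0xfd, ... */
            return i;
    }
    return len;
}
```

A return value of 0 would mean the DSP never touched the buffer, while a value equal to `len` would mean the DSP processed the data and the problem is more likely in the interrupt path back to the host.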

    Regards, Eric

  • Eric,

    Thanks for the answers, and sorry for the delay in getting back to you, but I had to investigate a bit more. I really hope you can help us investigate the issue with inserting the module, removing it and running the examples.

    Adding the dspLocalReset function does not resolve the problems. If I only run the reset program over and over, I can continuously insert and remove the pciedemo module from the kernel.

    I am attaching another log (2678.full_logs.txt) that will give you a lot of information from throughout my investigation, but here is what we see:

    1) We have a video card and the EVM on 2 different PCI buses.

    2) In the BIOS, for whatever reason, both have the same IRQ, IRQ 10.

    3) When we start Linux and print the devices on the PCI bus, the video card has received IRQ 16 and the EVM has received IRQ 10.

    4) The first time I insert the module (for the first dmesg, I provided you with all the information since I do not know what will be relevant to you), there seems to be some confusion about which IRQ should be used.

    [  450.067270] Finding the device....
    [  450.067291] Found TI device
    ====> [  450.067294] TI device: vendor=0x104c, dev=0xb005, irq=0x0000000a
    [  450.067296] Reading the BAR areas....
    [  450.070895] Enabling the device....
    ====>[  450.070913] pci 0000:08:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
    [  450.070919] pci 0000:08:00.0: setting latency timer to 64
    [  450.070927] Access PCIE application register ....
    ====>[  450.070930] Registering the irq 10 ...
    [  460.067958] Start local reset assert for core (module id): 23 ...
    [  460.067965] Start local reset assert for core (module id): 24 ...
    [  460.067970] Start local reset assert for core (module id): 26 ...
    [  460.067976] Start local reset assert for core (module id): 28 ...

    But before I remove the module, I can see that IRQ 10 is registered: irq  10:         0 TI

    5) More interestingly, the PCI device list has now changed:

    [root@localhost pcie_test]$ lspci -d 104c:b005 -x -v
    08:00.0 Multimedia controller: Texas Instruments Device b005 (rev 01)
        Physical Slot: 4
        Flags: bus master, fast devsel, latency 0, IRQ 16


    6) I can insert and remove the module several times and it seems fine (when ONLY running reset), and IRQ 16 is used every time. The kernel logs are "coherent":

    ==>[ 1420.509943] TI device: vendor=0x104c, dev=0xb005, irq=0x00000010
    [ 1420.509945] Reading the BAR areas....
    [ 1420.512711] Enabling the device....
    [ 1420.512718] pci 0000:08:00.0: setting latency timer to 64
    ===>[ 1420.512726] Access PCIE application register ....
    ===>[ 1420.512728] Registering the irq 16 ...

    and really coherent:

    [root@localhost pcie_test]# cat /proc/interrupts
               CPU0       CPU1       
     16:          3          5   IO-APIC-fasteoi   nouveau, TI 667x PCIE

    7) I can run the hello_world example (or the DMA example) and it will complete, but when I try to remove the module I get the same kind of errors that I had before.

    We continuously get the same kind of errors when inserting or removing the module or running the examples.

    Regards, Aymeric






  • Aymeric,

    I tested with local reset enabled for the hello world and EDMA demos, and I can repeatedly insert and remove the module. Attached is my log with comments. The first time, the PCIe card is also registered with IRQ 11, and this then changes to 16 from the second time onward. In your crash log, the failure happened in rmmod, so I think this is a Linux system issue. I looked at IRQ 16 in my case, and it is:

      16:        186          0          0          1   IO-APIC-fasteoi   uhci_hcd:usb3, HDA Intel

    Regards, Eric

    3884.0110.log

  • Thanks for the info. I will try to install another Linux version (Ubuntu 10.04, as recommended in the manual).

    I hope this will resolve my issues...

    Aymeric

  • Eric,

    I thought I would provide you with some feedback on this issue. It seems that the BIOS we are using and Fedora are conflicting.

    The issue (as you advised) has nothing to do with the code that is being run.

    I would like to ask one more question. The IBL seems to cause issues at power-on with some PCs (they reset several times before actually booting), and we want our board either to have our own dedicated IBL (which would be the subject of a separate support question) or to boot directly from the RBL.

    I found this post: http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/171960.aspx

    I just want to confirm that the boot magic address for the C6670 is 0x8ffffc, as it is monitored in the IBL or in all the example projects for linux_host_loader.

    Thanks in advance

  • Aymeric,

    [I just want to confirm that the boot magic address for the C6670 is 0x8ffffc, as it is monitored in the IBL or in all the example projects for linux_host_loader.]

    Yes, this is correct.

    Regards, Eric