66AK2H12: MPI over Hyperlink

Part Number: 66AK2H12

Hi,

We have made some progress with our HyperLink testing on our custom board. The HyperLink LLD example program now runs successfully on both devices in non-loopback mode. The question related to this was:

I have run the mpm-transport library tests and they finished successfully. The mpm_transport_hyplnk_remote.out program uses hyperlink-0 as its default port, so I modified it to use hyperlink-1 on one DSP. According to the test program, each port is functioning properly. The program trace is the following:

testing hyperlink mmap write
completed hyperlink mmap read and check
testing hyperlink mpm_transport_write and mpm_transport_read
Testing remote 36-bit put

Testing remote interrupt

 Waiting for interrupt.2..with 5 sec timeout
 wait for interrupt 2 timed out
 Thread 1 created successfully

 Thread: Waiting for interrupt.3..
 Thread 2 created successfully

 Thread: Waiting for interrupt.3 A..
 Thread 3 created successfully

 Background loop 1 executing...
 Thread: Waiting for interrupt.3 B..
 Generating test interrupt from background
 Background loop 1 executing...
 Thread: Wait for interrupt 3 A complete , event bitmap c result: 1
 Thread: Wait for interrupt 3 complete , event bitmap 4 result: 1
 Background loop 3 executing...
 Thread: Wait for interrupt 3 B complete , event bitmap 2 result: 1
 Testing remote interrupt: using model 2

 Waiting for interrupt.4..with 5 sec timeout
 Interrupt 4 wait timed out
mpm transport: hyperlink remote test passed

Our goal is to run the ti-openmpi nbody test program successfully over HyperLink. The testmpi program also runs successfully; only nbody fails. The verbose trace of the nbody run is the following:

/opt/ti-openmpi/bin/mpirun --mca btl_base_verbose 100 --mca btl self,hlink -np 2 -host DSP1,DSP2 ./nbody 1000

[DSP2:02204] mca: base: components_open: Looking for btl components
[DSP2:02204] mca: base: components_open: opening btl components
[DSP2:02204] mca: base: components_open: found loaded component hlink
[DSP2:02204] BTL_HLINK TIMPIDBG: hlink_component_register!!!
[DSP2:02204] This is EVM, using hl0 only!
[DSP2:02204] mca: base: components_open: component hlink register function successful
[DSP2:02204] BTL_HLINK TIMPIDBG: hlink_component_open!!!
[DSP2:02204] BTL_HLINK  BTL HLINK start of HYPLNKINITCFG: 0xb6a70130 
[DSP2:02204] BTL_HLINK [0x21400000]
[DSP2:02204] BTL_HLINK [0x40000000]
[DSP2:02204] BTL_HLINK [0x21400100]
[DSP2:02204] BTL_HLINK [0x28000000]
[DSP2:02204] BTL_HLINK [(nil)]
[DSP2:02204] BTL_HLINK [(nil)]
[DSP2:02204] BTL_HLINK [(nil)]
[DSP2:02204] BTL_HLINK [(nil)]
[DSP2:02204] BTL_HLINK BTL HLINK end of HYPLNKINITCFG
[DSP2:02204] BTL_HLINK: CMEM_init OK!
[DSP2:02204] mca: base: components_open: component hlink open function successful
[DSP2:02204] mca: base: components_open: found loaded component self
[DSP2:02204] mca: base: components_open: component self has no register function
[DSP2:02204] mca: base: components_open: component self open function successful
[DSP1:02191] mca: base: components_open: Looking for btl components
[DSP2:02204] select: initializing btl component hlink
[DSP2:02204] BTL_HLINK TIMPIDBG: hlink_component_init!!!
[DSP2:02204] BTL_HLINK shmem open successfull!!
[DSP2:02204] BTL_HLINK: CMEM physAddr:        22000000 (to a2000000) userAddr:0xb5989000
[DSP2:02204] BTL_HLINK shmem MSMC0 mmap successfull!!
[DSP2:02204] BTL_HLINK shmem MSMC0 mmap successfull!!
[DSP2:02204] BTL_HLINK attempt HyperLink0 then HyperLink1
[DSP2:02204] BTL_HLINK hyplnk0 attempt opening
[DSP2:02204] BTL_HLINK hyplnk0 open successfull!!
[DSP2:02204] BTL_HLINK hyplnk1 attempt opening
[DSP1:02191] mca: base: components_open: opening btl components
[DSP1:02191] mca: base: components_open: found loaded component hlink
[DSP1:02191] BTL_HLINK TIMPIDBG: hlink_component_register!!!
[DSP1:02191] This is EVM, using hl0 only!
[DSP1:02191] mca: base: components_open: component hlink register function successful
[DSP1:02191] BTL_HLINK TIMPIDBG: hlink_component_open!!!
[DSP1:02191] BTL_HLINK  BTL HLINK start of HYPLNKINITCFG: 0xb6a1a130 
[DSP1:02191] BTL_HLINK [0x21400000]
[DSP1:02191] BTL_HLINK [0x40000000]
[DSP1:02191] BTL_HLINK [0x21400100]
[DSP1:02191] BTL_HLINK [0x28000000]
[DSP1:02191] BTL_HLINK [(nil)]
[DSP1:02191] BTL_HLINK [(nil)]
[DSP1:02191] BTL_HLINK [(nil)]
[DSP1:02191] BTL_HLINK [(nil)]
[DSP1:02191] BTL_HLINK BTL HLINK end of HYPLNKINITCFG
[DSP1:02191] BTL_HLINK: CMEM_init OK!
[DSP1:02191] mca: base: components_open: component hlink open function successful
[DSP1:02191] mca: base: components_open: found loaded component self
[DSP1:02191] mca: base: components_open: component self has no register function
[DSP1:02191] mca: base: components_open: component self open function successful
[DSP1:02191] select: initializing btl component hlink
[DSP1:02191] BTL_HLINK TIMPIDBG: hlink_component_init!!!
[DSP1:02191] BTL_HLINK shmem open successfull!!
[DSP1:02191] BTL_HLINK: CMEM physAddr:        22000000 (to a2000000) userAddr:0xb5933000
[DSP1:02191] BTL_HLINK shmem MSMC0 mmap successfull!!
[DSP1:02191] BTL_HLINK shmem MSMC0 mmap successfull!!
[DSP1:02191] BTL_HLINK attempt HyperLink0 then HyperLink1
[DSP1:02191] BTL_HLINK hyplnk0 attempt opening
[DSP2:02204] BTL_HLINK hyplnk1 open failed
[DSP2:02204] BTL_HLINK hyplnk0=0x77cf0 hyplnk1=(nil)
[DSP2:02204] BTL_HLINK hyplnk0 MSMC mmap successfull!!
[DSP2:02204] BTL_HLINK hyplnk0 DDR mmap successfull!!
[DSP2:02204] BTL_HLINK BTL HLINK turned ON (hl0=1 hl1=0)!!!
[DSP2:02204] select: init of component hlink returned success
[DSP2:02204] select: initializing btl component self
[DSP2:02204] select: init of component self returned success
[DSP1:02191] BTL_HLINK hyplnk0 open failed
[DSP1:02191] BTL_HLINK hyplnk1 attempt opening
[DSP1:02191] BTL_HLINK hyplnk1 open successfull!!
[DSP1:02191] BTL_HLINK hyplnk0=(nil) hyplnk1=0x76f38
[DSP1:02191] BTL_HLINK hyplnk1 MSMC mmap successfull!!
[DSP1:02191] BTL_HLINK hyplnk1 DDR mmap successfull!!
[DSP1:02191] BTL_HLINK BTL HLINK turned ON (hl0=0 hl1=1)!!!
[DSP1:02191] select: init of component hlink returned success
[DSP1:02191] select: initializing btl component self
[DSP1:02191] select: init of component self returned success
[DSP1:02191] TIMPIDBG 0 rmt_msgbase=0x9dd0b000 loc_msgbase=0xb6133000
[DSP1:02191] TIMPIDBG B hlink_module_add_procs(0) remote hostname=DSP1 (over HL0:NOTUSED HL1:DSP2, bridge:NOTUSED) rem_rank[0] jobid=c5460001
[DSP1:02191] TIMPIDBG B hlink_module_add_procs(1) remote hostname=DSP2 (over HL0:NOTUSED HL1:DSP2, bridge:NOTUSED) rem_rank[1] jobid=c5460001
[DSP2:02204] TIMPIDBG 0 rmt_msgbase=0xa5d69000 loc_msgbase=0xb5d89000
[DSP2:02204] TIMPIDBG B hlink_module_add_procs(0) remote hostname=DSP1 (over HL0:DSP1 HL1:NOTUSED, bridge:NOTUSED) rem_rank[0] jobid=c5460001
[DSP2:02204] HLINK0: TIMPIDBG[0, nlocal 0] hlink_module_add_procs, I can talk to rank: (0) on DSP1
[DSP2:02204] TIMPIDBG B hlink_module_add_procs(1) remote hostname=DSP2 (over HL0:DSP1 HL1:NOTUSED, bridge:NOTUSED) rem_rank[1] jobid=c5460001
[DSP2:02204] TIMPIDBG hlink_module_add_procs, RANKS: my_rank=1 nprocs=2 nlocal_procs=1 endpoint_count=2
[DSP1:02191] HLINK1: TIMPIDBG[1, nlocal 0] hlink_module_add_procs, I can talk to rank: (1) on DSP2
[DSP1:02191] TIMPIDBG hlink_module_add_procs, RANKS: my_rank=0 nprocs=2 nlocal_procs=1 endpoint_count=2
[DSP1:02191] TIMPIDBG: hlink_module_prepare_src EAGER: 130048 MAX: 131072 (btl=0xb6ac72fc)
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7f900
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7f980
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fa00
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fa80
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fb00
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fb80
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fc00
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fc80
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fd00
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fd80
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fe00
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7fe80
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7ff00
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x7ff80
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80000
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80080
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80100
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80180
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80200
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80280
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80300
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80380
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80400
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80480
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80500
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80580
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80600
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80680
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80700
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80780
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80800
[DSP1:02191] BTL HLINK fragmentInitialized (btl=0xb6ac72fc) frag=0x80880
[DSP1:02191] TIMPIDBG: hlink_module_send frag=0x80880 (base=0x9d8b848c size=28014)!!!
[DSP1:02191] TIMPIDBG: hlink_module_send FIFO_WRITE: nprocs 1(0)->0, frag_hdr=0x9d8b8480 eager_limit=130048
[DSP1:02191] TMPIDBG: rmt_ep 0 send (hyplnk_iface1) (bridging 0) (use_edma 1 seg_len 28014 eager_limit 130048)
[DSP1:02191] BTL_HL[  122.505225] mpirun invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[  122.512698] CPU: 3 PID: 2189 Comm: mpirun Tainted: G           O 3.10.72-gca334de #1
[  122.520431] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.528947] [<c0011abc>] (show_stack+0x10/0x14) from [<c05076a0>] (dump_header.isra.14+0x78/0x198)
[  122.531444] syslog-ng: page allocation failure: order:0, mode:0x201db
[  122.531449] CPU: 0 PID: 2046 Comm: syslog-ng Tainted: G           O 3.10.72-gca334de #1
[  122.531464] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.531475] [<c0011abc>] (show_stack+0x10/0x14) from [<c009db40>] (warn_alloc_failed+0xa8/0x120)
[  122.531485] [<c009db40>] (warn_alloc_failed+0xa8/0x120) from [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8)
[  122.531492] [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8) from [<c009b7b0>] (filemap_fault+0x1c8/0x410)
[  122.535890] 3367 reserved pages
[  122.544463] 4665 total pagecache pages
[  122.549902] 139264 pages of RAM
[  122.549905] 3367 reserved pages
[  122.551925] 4665 total pagecache pages
[  122.557679] 139264 pages of RAM
[  122.557682] 3367 reserved pages
[  122.561911] 4665 total pagecache pages
[  122.567090] 139264 pages of RAM
[  122.567092] 3367 reserved pages
[  122.571902] 4665 total pagecache pages
[  122.576763] 139264 pages of RAM
[  122.576765] 3367 reserved pages
[  122.576767] 2235 slab pages
[  122.576769] 263437 pages shared
[  122.576770] 0 pages swap cached
[  122.581730] syslog-ng: page allocation failure: order:0, mode:0x201db
[  122.581734] CPU: 0 PID: 2046 Comm: syslog-ng Tainted: G           O 3.10.72-gca334de #1
[  122.581747] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.581756] [<c0011abc>] (show_stack+0x10/0x14) from [<c009db40>] (warn_alloc_failed+0xa8/0x120)
[  122.581765] [<c009db40>] (warn_alloc_failed+0xa8/0x120) from [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8)
[  122.581772] [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8) from [<c009b7b0>] (filemap_fault+0x1c8/0x410)
[  122.581779] [<c009b7b0>] (filemap_fault+0x1c8/0x410) from [<c00b7340>] (__do_fault+0x68/0x560)
[  122.581786] [<c00b7340>] (__do_fault+0x68/0x560) from [<c00b9a1c>] (handle_pte_fault+0x90/0x1164)
[  122.581792] [<c00b9a1c>] (handle_pte_fault+0x90/0x1164) from [<c00bbb60>] (handle_mm_fault+0xec/0x154)
[  122.581801] [<c00bbb60>] (handle_mm_fault+0xec/0x154) from [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c)
[  122.581809] [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c) from [<c050e328>] (do_page_fault+0x60/0x6c)
[  122.581816] [<c050e328>] (do_page_fault+0x60/0x6c) from [<c0008490>] (do_PrefetchAbort+0x2c/0x94)
[  122.581822] [<c0008490>] (do_PrefetchAbort+0x2c/0x94) from [<c050cdd4>] (ret_from_exception+0x0/0x10)
[  122.581824] Exception stack(0xdd65dfb0 to 0xdd65dff8)
[  122.581828] dfa0:                                     00000001 bed63b98 00000008 0010efc8
[  122.581833] dfc0: bed63b98 00000455 bed63c48 000000fc bed63c20 00013800 000003e8 bed63c44
[  122.581837] dfe0: 00000000 bed63b94 b6eedbdc b6cd054c 600f0010 ffffffff
[  122.581838] Mem-info:
[  122.581841] DMA per-cpu:
[  122.581844] CPU    0: hi:  186, btch:  31 usd:  39
[  122.581847] CPU    1: hi:  186, btch:  31 usd: 153
[  122.581849] CPU    2: hi:  186, btch:  31 usd: 177
[  122.581852] CPU    3: hi:  186, btch:  31 usd:  65
[  122.581860] active_anon:123198 inactive_anon:3876 isolated_anon:0
[  122.581860]  active_file:25 inactive_file:9 isolated_file:32
[  122.581860]  unevictable:0 dirty:0 writeback:0 unstable:0
[  122.581860]  free:703 slab_reclaimable:902 slab_unreclaimable:2876
[  122.581860]  mapped:48 shmem:4573 pagetables:787 bounce:0
[  122.581860]  free_cma:9
[  122.581870] DMA free:2812kB min:2900kB low:3624kB high:4348kB active_anon:492792kB inactive_anon:15504kB active_file:100kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):128kB prs
[  122.581877] lowmem_reserve[]: 0 0 0 0
[  122.581899] DMA: 1*4kB (R) 0*8kB 0*16kB 1*32kB (R) 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 2852kB
[  122.581900] 4665 total pagecache pages
[  122.586573] 139264 pages of RAM
[  122.586574] 1473 free pages
[  122.586575] 3367 reserved pages
[  122.586577] 2235 slab pages
[  122.586579] 263437 pages shared
[  122.586580] 0 pages swap cached
[  122.591718] syslog-ng: page allocation failure: order:0, mode:0x201db
[  122.591723] CPU: 0 PID: 2046 Comm: syslog-ng Tainted: G           O 3.10.72-gca334de #1
[  122.591736] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.591746] [<c0011abc>] (show_stack+0x10/0x14) from [<c009db40>] (warn_alloc_failed+0xa8/0x120)
[  122.591754] [<c009db40>] (warn_alloc_failed+0xa8/0x120) from [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8)
[  122.591761] [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8) from [<c009b7b0>] (filemap_fault+0x1c8/0x410)
[  122.591768] [<c009b7b0>] (filemap_fault+0x1c8/0x410) from [<c00b7340>] (__do_fault+0x68/0x560)
[  122.591775] [<c00b7340>] (__do_fault+0x68/0x560) from [<c00b9a1c>] (handle_pte_fault+0x90/0x1164)
[  122.591781] [<c00b9a1c>] (handle_pte_fault+0x90/0x1164) from [<c00bbb60>] (handle_mm_fault+0xec/0x154)
[  122.591789] [<c00bbb60>] (handle_mm_fault+0xec/0x154) from [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c)
[  122.591797] [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c) from [<c050e328>] (do_page_fault+0x60/0x6c)
[  122.591804] [<c050e328>] (do_page_fault+0x60/0x6c) from [<c0008490>] (do_PrefetchAbort+0x2c/0x94)
[  122.591810] [<c0008490>] (do_PrefetchAbort+0x2c/0x94) from [<c050cdd4>] (ret_from_exception+0x0/0x10)
[  122.591812] Exception stack(0xdd65dfb0 to 0xdd65dff8)
[  122.591816] dfa0:                                     00000001 bed63b98 00000008 0010efc8
[  122.591821] dfc0: bed63b98 00000455 bed63c48 000000fc bed63c20 00013800 000003e8 bed63c44
[  122.591824] dfe0: 00000000 bed63b94 b6eedbdc b6cd054c 600f0010 ffffffff
[  122.591826] Mem-info:
[  122.591828] DMA per-cpu:
[  122.591831] CPU    0: hi:  186, btch:  31 usd:  39
[  122.591833] CPU    1: hi:  186, btch:  31 usd: 153
[  122.591836] CPU    2: hi:  186, btch:  31 usd: 177
[  122.591838] CPU    3: hi:  186, btch:  31 usd:  65
[  122.591846] active_anon:123198 inactive_anon:3876 isolated_anon:0
[  122.591846]  active_file:25 inactive_file:9 isolated_file:32
[  122.591846]  unevictable:0 dirty:0 writeback:0 unstable:0
[  122.591846]  free:703 slab_reclaimable:902 slab_unreclaimable:2876
[  122.591846]  mapped:48 shmem:4573 pagetables:787 bounce:0
[  122.591846]  free_cma:9
[  122.591857] DMA free:2812kB min:2900kB low:3624kB high:4348kB active_anon:492792kB inactive_anon:15504kB active_file:100kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):128kB prs
[  122.591864] lowmem_reserve[]: 0 0 0 0
[  122.591887] DMA: 1*4kB (R) 0*8kB 0*16kB 1*32kB (R) 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 2852kB
[  122.591889] 4665 total pagecache pages
[  122.596425] 139264 pages of RAM
[  122.596427] 1473 free pages
[  122.596428] 3367 reserved pages
[  122.596430] 2235 slab pages
[  122.596431] 263437 pages shared
[  122.596433] 0 pages swap cached
[  122.601709] syslog-ng: page allocation failure: order:0, mode:0x201db
[  122.601714] CPU: 0 PID: 2046 Comm: syslog-ng Tainted: G           O 3.10.72-gca334de #1
[  122.601728] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.601738] [<c0011abc>] (show_stack+0x10/0x14) from [<c009db40>] (warn_alloc_failed+0xa8/0x120)
[  122.601748] [<c009db40>] (warn_alloc_failed+0xa8/0x120) from [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8)
[  122.601755] [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8) from [<c009b7b0>] (filemap_fault+0x1c8/0x410)
[  122.601763] [<c009b7b0>] (filemap_fault+0x1c8/0x410) from [<c00b7340>] (__do_fault+0x68/0x560)
[  122.601769] [<c00b7340>] (__do_fault+0x68/0x560) from [<c00b9a1c>] (handle_pte_fault+0x90/0x1164)
[  122.601776] [<c00b9a1c>] (handle_pte_fault+0x90/0x1164) from [<c00bbb60>] (handle_mm_fault+0xec/0x154)
[  122.601784] [<c00bbb60>] (handle_mm_fault+0xec/0x154) from [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c)
[  122.601792] [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c) from [<c050e328>] (do_page_fault+0x60/0x6c)
[  122.601799] [<c050e328>] (do_page_fault+0x60/0x6c) from [<c0008490>] (do_PrefetchAbort+0x2c/0x94)
[  122.601805] [<c0008490>] (do_PrefetchAbort+0x2c/0x94) from [<c050cdd4>] (ret_from_exception+0x0/0x10)
[  122.601808] Exception stack(0xdd65dfb0 to 0xdd65dff8)
[  122.601812] dfa0:                                     00000001 bed63b98 00000008 0010efc8
[  122.601816] dfc0: bed63b98 00000455 bed63c48 000000fc bed63c20 00013800 000003e8 bed63c44
[  122.601820] dfe0: 00000000 bed63b94 b6eedbdc b6cd054c 600f0010 ffffffff
[  122.601822] Mem-info:
[  122.601823] DMA per-cpu:
[  122.601826] CPU    0: hi:  186, btch:  31 usd:  39
[  122.601828] CPU    1: hi:  186, btch:  31 usd: 153
[  122.601831] CPU    2: hi:  186, btch:  31 usd: 177
[  122.601834] CPU    3: hi:  186, btch:  31 usd:  65
[  122.601842] active_anon:123198 inactive_anon:3876 isolated_anon:0
[  122.601842]  active_file:25 inactive_file:9 isolated_file:32
[  122.601842]  unevictable:0 dirty:0 writeback:0 unstable:0
[  122.601842]  free:703 slab_reclaimable:902 slab_unreclaimable:2876
[  122.601842]  mapped:48 shmem:4573 pagetables:787 bounce:0
[  122.601842]  free_cma:9
[  122.601854] DMA free:2812kB min:2900kB low:3624kB high:4348kB active_anon:492792kB inactive_anon:15504kB active_file:100kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):128kB prs
[  122.601860] lowmem_reserve[]: 0 0 0 0
[  122.601883] DMA: 1*4kB (R) 0*8kB 0*16kB 1*32kB (R) 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 2852kB
[  122.601884] 4665 total pagecache pages
[  122.606313] 139264 pages of RAM
[  122.606314] 1473 free pages
[  122.606316] 3367 reserved pages
[  122.606317] 2235 slab pages
[  122.606318] 263437 pages shared
[  122.606320] 0 pages swap cached
[  122.611713] syslog-ng: page allocation failure: order:0, mode:0x201db
[  122.611717] CPU: 0 PID: 2046 Comm: syslog-ng Tainted: G           O 3.10.72-gca334de #1
[  122.611730] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.611740] [<c0011abc>] (show_stack+0x10/0x14) from [<c009db40>] (warn_alloc_failed+0xa8/0x120)
[  122.611748] [<c009db40>] (warn_alloc_failed+0xa8/0x120) from [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8)
[  122.611755] [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8) from [<c009b7b0>] (filemap_fault+0x1c8/0x410)
[  122.611761] [<c009b7b0>] (filemap_fault+0x1c8/0x410) from [<c00b7340>] (__do_fault+0x68/0x560)
[  122.611768] [<c00b7340>] (__do_fault+0x68/0x560) from [<c00b9a1c>] (handle_pte_fault+0x90/0x1164)
[  122.611774] [<c00b9a1c>] (handle_pte_fault+0x90/0x1164) from [<c00bbb60>] (handle_mm_fault+0xec/0x154)
[  122.611782] [<c00bbb60>] (handle_mm_fault+0xec/0x154) from [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c)
[  122.611790] [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c) from [<c050e328>] (do_page_fault+0x60/0x6c)
[  122.611797] [<c050e328>] (do_page_fault+0x60/0x6c) from [<c0008490>] (do_PrefetchAbort+0x2c/0x94)
[  122.611803] [<c0008490>] (do_PrefetchAbort+0x2c/0x94) from [<c050cdd4>] (ret_from_exception+0x0/0x10)
[  122.611806] Exception stack(0xdd65dfb0 to 0xdd65dff8)
[  122.611809] dfa0:                                     00000001 bed63b98 00000008 0010efc8
[  122.611814] dfc0: bed63b98 00000455 bed63c48 000000fc bed63c20 00013800 000003e8 bed63c44
[  122.611818] dfe0: 00000000 bed63b94 b6eedbdc b6cd054c 600f0010 ffffffff
[  122.611820] Mem-info:
[  122.611821] DMA per-cpu:
[  122.611824] CPU    0: hi:  186, btch:  31 usd:  39
[  122.611827] CPU    1: hi:  186, btch:  31 usd: 153
[  122.611829] CPU    2: hi:  186, btch:  31 usd: 177
[  122.611831] CPU    3: hi:  186, btch:  31 usd:  65
[  122.611840] active_anon:123198 inactive_anon:3876 isolated_anon:0
[  122.611840]  active_file:25 inactive_file:9 isolated_file:32
[  122.611840]  unevictable:0 dirty:0 writeback:0 unstable:0
[  122.611840]  free:703 slab_reclaimable:902 slab_unreclaimable:2876
[  122.611840]  mapped:48 shmem:4573 pagetables:787 bounce:0
[  122.611840]  free_cma:9
[  122.611850] DMA free:2812kB min:2900kB low:3624kB high:4348kB active_anon:492792kB inactive_anon:15504kB active_file:100kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):128kB prs
[  122.611857] lowmem_reserve[]: 0 0 0 0
[  122.611879] DMA: 1*4kB (R) 0*8kB 0*16kB 1*32kB (R) 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 2852kB
[  122.611881] 4665 total pagecache pages
[  122.616280] 139264 pages of RAM
[  122.616281] 1473 free pages
[  122.616282] 3367 reserved pages
[  122.616284] 2235 slab pages
[  122.616285] 263437 pages shared
[  122.616288] 0 pages swap cached
[  122.621731] syslog-ng: page allocation failure: order:0, mode:0x201db
[  122.621736] CPU: 0 PID: 2046 Comm: syslog-ng Tainted: G           O 3.10.72-gca334de #1
[  122.621750] [<c00159c0>] (unwind_backtrace+0x0/0xf8) from [<c0011abc>] (show_stack+0x10/0x14)
[  122.621759] [<c0011abc>] (show_stack+0x10/0x14) from [<c009db40>] (warn_alloc_failed+0xa8/0x120)
[  122.621767] [<c009db40>] (warn_alloc_failed+0xa8/0x120) from [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8)
[  122.621774] [<c00a0a60>] (__alloc_pages_nodemask+0x614/0x8c8) from [<c009b7b0>] (filemap_fault+0x1c8/0x410)
[  122.621781] [<c009b7b0>] (filemap_fault+0x1c8/0x410) from [<c00b7340>] (__do_fault+0x68/0x560)
[  122.621787] [<c00b7340>] (__do_fault+0x68/0x560) from [<c00b9a1c>] (handle_pte_fault+0x90/0x1164)
[  122.621794] [<c00b9a1c>] (handle_pte_fault+0x90/0x1164) from [<c00bbb60>] (handle_mm_fault+0xec/0x154)
[  122.621803] [<c00bbb60>] (handle_mm_fault+0xec/0x154) from [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c)
[  122.621812] [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c) from [<c050e328>] (do_page_fault+0x60/0x6c)
[  122.621818] [<c050e328>] (do_page_fault+0x60/0x6c) from [<c0008490>] (do_PrefetchAbort+0x2c/0x94)
[  122.621825] [<c0008490>] (do_PrefetchAbort+0x2c/0x94) from [<c050cdd4>] (ret_from_exception+0x0/0x10)
[  122.621828] Exception stack(0xdd65dfb0 to 0xdd65dff8)
[  122.621832] dfa0:                                     00000001 bed63b98 00000008 0010efc8
[  122.621836] dfc0: bed63b98 00000455 bed63c48 000000fc bed63c20 00013800 000003e8 bed63c44
[  122.621840] dfe0: 00000000 bed63b94 b6eedbdc b6cd054c 600f0010 ffffffff
[  122.621842] Mem-info:
[  122.621843] DMA per-cpu:
[  122.621846] CPU    0: hi:  186, btch:  31 usd:  39
[  122.621849] CPU    1: hi:  186, btch:  31 usd: 153
[  122.621851] CPU    2: hi:  186, btch:  31 usd: 177
[  122.621853] CPU    3: hi:  186, btch:  31 usd:  65
[  122.621861] active_anon:123198 inactive_anon:3876 isolated_anon:0
[  122.621861]  active_file:25 inactive_file:9 isolated_file:32
[  122.621861]  unevictable:0 dirty:0 writeback:0 unstable:0
[  122.621861]  free:703 slab_reclaimable:902 slab_unreclaimable:2876
[  122.621861]  mapped:48 shmem:4573 pagetables:787 bounce:0
[  122.621861]  free_cma:9
[  122.621872] DMA free:2812kB min:2900kB low:3624kB high:4348kB active_anon:492792kB inactive_anon:15504kB active_file:100kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):128kB prs
[  122.621879] lowmem_reserve[]: 0 0 0 0
[  122.621901] DMA: 1*4kB (R) 0*8kB 0*16kB 1*32kB (R) 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 2852kB
[  122.621902] 4665 total pagecache pages
[  122.626290] 139264 pages of RAM
[  122.626291] 1473 free pages
[  122.626292] 3367 reserved pages
[  122.626294] 2235 slab pages
[  122.626295] 263437 pages shared
[  122.626296] 0 pages swap cached
[  124.022116] [<c05076a0>] (dump_header.isra.14+0x78/0x198) from [<c009c8a0>] (oom_kill_process+0x368/0x3dc)
[  124.031742] [<c009c8a0>] (oom_kill_process+0x368/0x3dc) from [<c009cdb0>] (out_of_memory+0x2a4/0x2f8)
[  124.040936] [<c009cdb0>] (out_of_memory+0x2a4/0x2f8) from [<c00a0ce8>] (__alloc_pages_nodemask+0x89c/0x8c8)
[  124.050650] [<c00a0ce8>] (__alloc_pages_nodemask+0x89c/0x8c8) from [<c00ba440>] (handle_pte_fault+0xab4/0x1164)
[  124.060709] [<c00ba440>] (handle_pte_fault+0xab4/0x1164) from [<c00bbb60>] (handle_mm_fault+0xec/0x154)
[  124.070069] [<c00bbb60>] (handle_mm_fault+0xec/0x154) from [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c)
[  124.079694] [<c050e134>] (do_page_fault.part.7+0x1d8/0x36c) from [<c050e328>] (do_page_fault+0x60/0x6c)
[  124.089059] [<c050e328>] (do_page_fault+0x60/0x6c) from [<c00083fc>] (do_DataAbort+0x2c/0x94)
[  124.097561] [<c00083fc>] (do_DataAbort+0x2c/0x94) from [<c050cb74>] (__dabt_usr+0x34/0x40)
[  124.105796] Exception stack(0xdf9edfb0 to 0xdf9edff8)
[  124.110838] dfa0:                                     00002021 0001c179 380a1e88 00000000
[  124.118980] dfc0: 00000000 b6e344c8 b6e344f8 3809fe70 000001ff 0001e198 0003d368 b6e34500
[  124.127131] dfe0: b6e344c8 bed32640 b6dab351 b6da952e 00000030 ffffffff
[  124.133727] Mem-info:
[  124.135986] DMA per-cpu:
[  124.138507] CPU    0: hi:  186, btch:  31 usd:  39
[  124.143280] CPU    1: hi:  186, btch:  31 usd: 153
[  124.148046] CPU    2: hi:  186, btch:  31 usd: 177
[  124.152817] CPU    3: hi:  186, btch:  31 usd:  65
[  124.157589] active_anon:123209 inactive_anon:3876 isolated_anon:0
[  124.157589]  active_file:19 inactive_file:11 isolated_file:32
[  124.157589]  unevictable:0 dirty:0 writeback:0 unstable:0
[  124.157589]  free:721 slab_reclaimable:902 slab_unreclaimable:2876
[  124.157589]  mapped:59 shmem:4573 pagetables:795 bounce:0
[  124.157589]  free_cma:0
[  124.188662] DMA free:2884kB min:2900kB low:3624kB high:4348kB active_anon:492836kB inactive_anon:15504kB active_file:76kB inactive_file:44kB unevictable:0kB isolated(anon):0kB isolated(file):128kB preo
[  124.230269] lowmem_reserve[]: 0 0 0 0
[  124.233949] DMA: 1*4kB (R) 0*8kB 0*16kB 2*32kB (UR) 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 2884kB
[  124.245757] 4608 total pagecache pages
[  124.254280] 139264 pages of RAM
[  124.257403] 1531 free pages
[  124.260176] 3367 reserved pages
[  124.263304] 2235 slab pages
[  124.266078] 263245 pages shared
[  124.269198] 0 pages swap cached
[  124.272328] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[  124.280194] [ 1347]     0  1347      661      102       4        0         -1000 udevd
[  124.288085] [ 1961]   997  1961      588       38       4        0             0 dbus-daemon
[  124.296494] [ 1983]     0  1983    35826       62      69        0             0 mpmsrv
[  124.304471] [ 1999]     0  1999      472       41       4        0             0 rmServer.out
[  124.312965] [ 2003]     0  2003      467       16       3        0             0 telnetd
[  124.321027] [ 2006]     0  2006      376       22       3        0             0 lad_tci6638
[  124.329425] [ 2013]     0  2013      660      104       4        0         -1000 udevd
[  124.337312] [ 2014]     0  2014      660      104       4        0         -1000 udevd
[  124.345200] [ 2018]   999  2018      483       48       4        0             0 rpcbind
[  124.353265] [ 2031]   998  2031      419       34       3        0             0 rpc.statd
[  124.361500] [ 2037]     0  2037      536       29       4        0             0 dropbear
[  124.369641] [ 2045]     0  2045      863       42       4        0             0 syslog-ng
[  124.377873] [ 2046]     0  2046     1591      219       6        0             0 syslog-ng
[  124.386105] [ 2182]     0  2182      657       63       4        0             0 login
[  124.393991] [ 2183]     0  2183      652       59       4        0             0 sh
[  124.401619] [ 2184]     0  2184      652       82       4        0             0 bash
[  124.409414] [ 2189]     0  2189   230529   121206     452        0             0 mpirun
[  124.417388] [ 2190]     0  2190      681       46       3        0             0 ssh
[  124.425106] [ 2191]     0  2191   105251      299     209        0             0 nbody
[  124.432993] Out of memory: Kill process 2189 (mpirun) score 868 or sacrifice child
[  124.440535] Killed process 2191 (nbody) total-vm:421004kB, anon-rss:1196kB, file-rss:0kB
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 2191 on node DSP1 exited on signal 9 (Killed).
--------------------------------------------------------------------------
INK send EDMA_TRANSFER (to a2800000 from a2000000, size=28160!
00000, size=28160!
[DSP1:02191] BTL_HLINK no room in FIFO (send)!
[DSP1:02191] BTL_HLINK no room in FIFO (send)!
[DSP1:02191] BTL_HLINK no room in FIFO (send)!
[... the same message repeats 76 more times ...]
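
For the kernel part of the trace, a rough way to read the out-of-memory table is sketched below. It assumes the usual column order for this kernel generation (pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, name) and 4 kB pages; the table header is not part of this excerpt, so both are assumptions. They are at least consistent with the log: 105251 and 299 pages for nbody give exactly the total-vm:421004kB and anon-rss:1196kB of the "Killed process" line. The helper name parse_oom_rows is made up for this illustration.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages (matches the "Killed process" line)

# Assumed column order: [time] [pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*[\d.]+\]\s+\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)"
)


def parse_oom_rows(log_text):
    """Extract per-process rows from an OOM-killer task dump, converting pages to kB."""
    rows = []
    for line in log_text.splitlines():
        m = ROW.search(line)
        if m:
            rows.append({
                "pid": int(m["pid"]),
                "name": m["name"],
                "total_vm_kb": int(m["total_vm"]) * PAGE_KB,
                "rss_kb": int(m["rss"]) * PAGE_KB,
            })
    return rows


# Three rows copied from the trace above.
LOG = """\
[  124.409414] [ 2189]     0  2189   230529   121206     452        0             0 mpirun
[  124.417388] [ 2190]     0  2190      681       46       3        0             0 ssh
[  124.425106] [ 2191]     0  2191   105251      299     209        0             0 nbody
"""

rows = parse_oom_rows(LOG)
# Print the processes sorted by resident memory, largest first.
for row in sorted(rows, key=lambda r: r["rss_kb"], reverse=True):
    print(f'{row["name"]:8s} pid={row["pid"]} rss={row["rss_kb"]} kB total_vm={row["total_vm_kb"]} kB')
```

Under these assumptions mpirun (PID 2189) holds 484824 kB resident while nbody holds only 1196 kB, so the memory pressure appears to come from mpirun rather than from the nbody process that was killed.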

I was looking for the source code of the ti-openmpi module, but I could not find it under version control. I only found this:

But I don't know how this is related to the ti-openmpi in the mcsdk-hpc package.

Please provide the ti-openmpi source code and help us solve this issue. We have a very big project that uses ti-openmpi over Hyperlink, prototyped on evaluation modules, and we need it to work on custom hardware too. The Hyperlink seems to be functioning properly. Do you have any idea what causes the FIFO full issue in the nbody log? What causes the strange kernel errors in the trace? Can this issue be related to the fact that the DSP0 hyperlink-0 port is connected to the DSP1 hyperlink-1 port? I think this should be transparent. I am using mcsdk-3_01_04_07, which is the latest version of MCSDK. We could not migrate to Processor SDK because Hyperlink and OpenMPI are not supported there. Please provide support to us.

Thanks for your help.

Dávid Huber 

  • Hi,

    I forgot to mention some extra information that may be relevant in this case.

    1. The Hyperlink on the two DSPs can be configured only once per power cycle; otherwise every test program fails. I expected the hyplnkRet_e hyplnkReset(int portNum); function in the low-level driver example to make repeated execution of the test program possible, but this is not the case.

    2. The mpmcl program, which initializes the Hyperlink modules under Linux, fails if the other DSP has not booted. Is it normal that mpmcl needs to initialize the Hyperlink on both DSPs at the same time to work properly? We get this error on the first DSP when the remote DSP has not booted: Enabling Hyperlink 1 on DSP1: transport arm-remote-hyplnk-1 failed (error: -114) retval -1001

    Note that the mpmcl initialization of hyperlink must be disabled (this is the mpm-hlink.sh script) if we want to test the interface with the hyplnk-lld example, because of the first issue.
    Best regards
    Dávid Huber
  • Hi David,

    MCSDK is quite old, but as I can see from its user guide, ti-openmpi should be part of the MCSDK package:
       http://processors.wiki.ti.com/index.php/MCSDK_HPC_3.x_Getting_Started_Guide

    You can download the mcsdk hpc from here:
        http://software-dl.ti.com/sdoemb/sdoemb_public_sw/mcsdk_hpc/latest/index_FDS.html

    Documentation on ti-openmpi can be found here:
      http://processors.wiki.ti.com/index.php/MCSDK_HPC_3.x_OpenMPI

      http://processors.wiki.ti.com/index.php/MCSDK_HPC_3.x_MPI_over_Hyperlink

    There are more links with documentation in the MCSDK HPC 3.x OpenMPI link above.

    Best Regards,
    Yordan

  • Hey Yordan,

    Thanks for your response.

    The documentation and the links above are well known to me. I know that MCSDK is old, but TI doesn't provide MPI and Hyperlink support in newer SDK releases. This is why we must use this legacy product.

    I am building the MCSDK root filesystem using the TI Yocto project provided by the oe-layersetup toolchain. We are using the latest version, mcsdk_3_1_04_07. The Yocto project can provide the whole HPC toolchain except ti-openmpi, which is installed separately via the opkg package manager. Compatibility between the two is not guaranteed, but we have been using this environment with ti-openmpi over TCP without problems so far.

    To rule out a software incompatibility, I tried to boot a new filesystem provided by TI on our custom board:

    The latest HPC package you linked led me to the mcsdk_3_00_4_18 package. I assumed that this version is surely compatible with the HPC package you linked. I installed the filesystem, kernel and device tree provided by mcsdk_linux_3_00_4_18. I modified the clocks and the network in the device tree so that it runs on our hardware, and I installed the latest mcsdk-hpc packages. The initialization of the Hyperlink had issues, but I successfully reproduced the "[DSP1:02191] BTL_HLINK no room in FIFO (send)!" error message of the nbody test program when it runs over Hyperlink.

    I've run out of ideas. Please answer my questions, which I summarize here:

    Does it mean that the Hyperlink is functioning properly if the mpm_transport_hyplnk_remote test program runs successfully?

    - I made the following test: I ran mpm_transport_hyplnk_remote on one DSP and inspected the MSMC base address 0x0c000000 of the other DSP via JTAG (CCS9.1). The program transfers data in its read-write test, and the data is present on the remote side at 0x0c000000!

    Does it mean that the Hyperlink is functioning properly if the hyperlink LLD memorymappedExample test program runs successfully?

    If the Hyperlink seems to work properly according to the tests mentioned above, why does ti-openmpi produce that strange "[DSP1:02191] BTL_HLINK no room in FIFO (send)!" error message?

    Why can the Hyperlink interface be configured only once per power cycle?

    Why do the Hyperlink modules on both sides have to be configured simultaneously? I thought that Hyperlink is able to wake up the remote device too.

    Why is the ti-openmpi source code not available in a git repository?

    Is the following Hyperlink topology valid?

    Best Regards,

    Dávid

  • Hi all,

    I think I found the solution.

    The problem is that the topology isn't supported by default. The ti-openmpi implementation expects a node_id variable to be set for each DSP so that it can identify itself in the topology.

    I dug into the source code

    and I found that btl_hlink_component.c reads the /proc/device-tree/sl1500/node/id device-tree property to identify itself in the topology. I added the following lines to the device tree, in the k2hk-evm.dts file:

    sl1500 {
        node {
            id = "1";
        };
    };
    This is the default setting for the DSP1 node. In the DSP0 device tree I set this variable to "0".
    Now DSP0 identifies itself with node_id="0" and DSP1 identifies itself with node_id="1".
    With this change the nbody test program ran successfully.
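
    To check on each board which node id ends up in the device tree, something like the sketch below could be run under Linux. It only assumes the path quoted above and the fact that string properties under /proc/device-tree are stored with a trailing NUL byte; the function name read_node_id is made up for this illustration and is not part of ti-openmpi. The demo at the bottom uses a temporary stand-in file so that it also runs on a machine without that device-tree node.

```python
import tempfile
from pathlib import Path

# Path quoted in the post; btl_hlink_component.c is reported to read it.
NODE_ID_PATH = "/proc/device-tree/sl1500/node/id"


def read_node_id(path=NODE_ID_PATH):
    """Return the node id as a string, or None if the property does not exist."""
    try:
        raw = Path(path).read_bytes()
    except FileNotFoundError:
        return None
    # Device-tree string properties are NUL-terminated; strip the terminator.
    return raw.rstrip(b"\x00").decode("ascii")


# Demo with a stand-in file that mimics id = "1"; on the target, call
# read_node_id() with no argument instead.
with tempfile.TemporaryDirectory() as tmp:
    fake_property = Path(tmp) / "id"
    fake_property.write_bytes(b"1\x00")
    demo_id = read_node_id(fake_property)
    missing_id = read_node_id(Path(tmp) / "absent")

print("stand-in node id:", demo_id)
print("missing property:", missing_id)
```

    On a board where the sl1500 node is absent, read_node_id() returns None, which would correspond to the situation before the device-tree change described above.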