OpenMPI over SRIO Error: Cannot Resolve host name



Hi,

I have two K2H EVM boards and a breakout board, and I am trying to get the SRIO examples to work.

Basically I followed this wiki page:

When I tried to run an OpenMPI executable like this:

/opt/ti-openmpi/bin/mpirun --mca btl self,srio -np 2 -host c1n1,c1n2 ./nbody 1000

I got error messages like:

/usr/bin/ssh: Exited: Error resolving 'c1n1' port '22'. Temporary failure in name resolution
[192.168.0.102:01778] [[39697,0],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 362
[192.168.0.102:01778] [[39697,0],0] attempted to send to [[39697,0],1]: tag 15
[192.168.0.102:01778] [[39697,0],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file base/grpcomm_base_xcast.c at line 166
[192.168.0.102:01778] [[39697,0],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 362
[192.168.0.102:01778] [[39697,0],0] attempted to send to [[39697,0],2]: tag 15
[192.168.0.102:01778] [[39697,0],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file base/grpcomm_base_xcast.c at line 166
/usr/bin/ssh: Exited: Error resolving 'c1n2' port '22'. Temporary failure in name resolution

It seems like mpirun is still trying to use ssh over TCP to reach the other node, even though I selected the srio BTL.
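The ssh lines in the log show that the launcher attempts a name lookup for each entry passed to -host before anything goes over SRIO. A minimal sketch of checking that lookup by itself on the launching board, assuming Python is available there (the function name unresolved_hosts is made up for illustration):

```python
import socket

def unresolved_hosts(hosts, port=22):
    """Return the host names that the local resolver cannot look up."""
    failed = []
    for host in hosts:
        try:
            # Same kind of name lookup ssh needs before it can connect to port 22.
            socket.getaddrinfo(host, port)
        except socket.gaierror:
            # Raised for errors such as "Temporary failure in name resolution".
            failed.append(host)
    return failed
```

Calling unresolved_hosts(["c1n1", "c1n2"]) returns the names that fail to resolve. An empty list would mean the names resolve locally and the problem is elsewhere.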

As for the environment setup, I have updated the DTB file and the kernel image, and SRIO is enabled at boot. After booting I can see rio_mport0 through rio_mport3 under /dev/.

I also have the srio_hosts and srio_topology.bin files in /etc/cluster.

srio_hosts looks like:

1 1 c1n1
1 2 c1n2
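To make the layout of this file explicit, here is a small sketch that parses it. It assumes each non-blank line holds three whitespace-separated fields, and that the two numbers are a cartridge number and a node number; that reading is a guess based on the c1n1/c1n2 names, not something taken from the TI documentation. The function name parse_srio_hosts is hypothetical.

```python
def parse_srio_hosts(text):
    """Parse srio_hosts content into (cartridge, node, hostname) tuples.

    Assumption: the two leading integers are cartridge and node numbers.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got: {line!r}")
        cartridge, node, name = fields
        entries.append((int(cartridge), int(node), name))
    return entries
```

For the file above this gives two entries, one for c1n1 and one for c1n2, which are the same names passed to -host.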

and SRIO_evm.json looks like:

{
   "cartridges":[
      {
         "name":"c1",
         "nodes":[
            {
               "name":"c1n1",
               "connections":[
                  {"port0":"c1n2"},
                  {"port1":"NONE"},
                  {"port2":"NONE"},
                  {"port3":"NONE"}
               ]
            },
            {
               "name":"c1n2",
               "connections":[
                  {"port0":"c1n1"},
                  {"port1":"NONE"},
                  {"port2":"NONE"},
                  {"port3":"NONE"}
               ]
            }
         ]
      }
   ]
}
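As a sanity check on this topology description, the sketch below lists any link that is declared in only one direction. It assumes the structure shown above (cartridges, nodes, connections, with "NONE" marking an unused port). The function name one_way_links is made up for illustration, and the check only looks at node names, not at which port carries the link.

```python
import json

def one_way_links(topology_text):
    """Return (node, peer) pairs whose reverse link is not declared."""
    topology = json.loads(topology_text)
    links = set()
    for cartridge in topology["cartridges"]:
        for node in cartridge["nodes"]:
            for connection in node["connections"]:
                # Each connection is a one-entry object such as {"port0": "c1n2"}.
                for _port, peer in connection.items():
                    if peer != "NONE":
                        links.add((node["name"], peer))
    return sorted((a, b) for (a, b) in links if (b, a) not in links)
```

For the file above, where c1n1 and c1n2 each list the other on port0, the result is an empty list.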

The SRIO connection should be fine since I've tested it with other applications.

Thanks,

Max