Can't initialize RDMA device


When launching FLUENT in parallel, you might see this error:

fluent_mpi.6.3.26: Rank 0:0: MPI_Init: Can't initialize RDMA device
fluent_mpi.6.3.26: Rank 0:0: MPI_Init: MPI BUG: Cannot initialize RDMA

Comments:

The "Can't initialize RDMA device" error indicates that the installed HPMPI or OFED version is incompatible with FLUENT.

OFED 1.2 requires HPMPI 2.2.5.1, so you will need to run FLUENT 6.3.35.

Determining which OFED version you are running:

Run "cat /usr/ofed/BUILD_ID" to display the version. If the file is not at that path, locate it with "find /usr -name BUILD_ID".

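The lookup above can be wrapped in a small shell function. This is only a sketch: it assumes the BUILD_ID file contains a line with the string OFED, as the grep example later in this note suggests. The function name ofed_build_id and its optional root-directory argument are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the OFED line from the first BUILD_ID file
# found under a root directory (default: /usr, as in the note above).
ofed_build_id() {
    root="${1:-/usr}"
    # find may hit unreadable or missing directories; discard those errors.
    file=$(find "$root" -name BUILD_ID -type f 2>/dev/null | head -n 1) || true
    if [ -z "$file" ]; then
        echo "no BUILD_ID found under $root" >&2
        return 1
    fi
    # Assumption: the version is recorded on a line containing "OFED".
    grep OFED "$file"
}
```

Calling ofed_build_id with no argument searches /usr. A non-zero return status means no BUILD_ID file was found, or the file had no OFED line.
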
Determining the Software Version

If InfiniBand drivers are already installed on the host, they may be installed in one of several locations.

To determine the version of the Cisco InfiniBand host drivers, log in to the host and enter the following commands at the shell prompt. If the first command produces output, the Cisco Commercial InfiniBand host drivers are installed. If the second or third command produces a version number, OFED host drivers are installed.

host$ rpm -qa | grep topspin
topspin-ib-mpi-rhel4-3.2.0-118
topspin-ib-mod-rhel4-2.6.9-34.ELsmp-3.2.0-118
topspin-ib-rhel4-3.2.0-118
host$ ofed_info | grep OFED
OFED-1.1
host$ grep OFED /usr/local/ofed/BUILD_ID
OFED-1.1
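
The three checks above can be combined into one script. The sketch below is an illustration under stated assumptions, not a vendor-supplied tool: the function names report_ib_stack and detect_ib_stack are hypothetical, and the BUILD_ID path is the one used in the example above, which may differ on your host.

```shell
#!/bin/sh
# Hypothetical helper: interpret the results of the checks shown above.
# $1: output of `rpm -qa | grep topspin` (empty if nothing matched)
# $2: OFED version string from ofed_info or BUILD_ID (empty if none)
report_ib_stack() {
    topspin_pkgs="$1"
    ofed_version="$2"
    if [ -n "$topspin_pkgs" ]; then
        echo "Cisco Commercial InfiniBand host drivers: installed"
    fi
    if [ -n "$ofed_version" ]; then
        echo "OFED host drivers: $ofed_version"
    fi
    if [ -z "$topspin_pkgs" ] && [ -z "$ofed_version" ]; then
        echo "no InfiniBand host drivers detected"
    fi
    return 0
}

# Hypothetical wrapper: run the same three commands as above and report.
detect_ib_stack() {
    topspin=$(rpm -qa 2>/dev/null | grep topspin) || true
    ofed=$(ofed_info 2>/dev/null | grep OFED | head -n 1) || true
    if [ -z "$ofed" ]; then
        # Fall back to the BUILD_ID file; this path is an assumption.
        ofed=$(grep OFED /usr/local/ofed/BUILD_ID 2>/dev/null | head -n 1) || true
    fi
    report_ib_stack "$topspin" "$ofed"
}
```

Run detect_ib_stack on the host to get a one-line summary for each driver stack found.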

NOTE: Also see the document titled: 'OFED and MPI Compatibility Notes' (ID: 2019423)

Cisco Documentation
http://www.cisco.com/en/US/docs/server_nw_virtual/open_fabrics_enterprise_distribution/ofed_host_driver/release1.2/release_note/rn11537.html#wp16932




