Encyclopaedia Index

Installation of PHOENICS 2012

6 Parallel PHOENICS with MPICH2

6.1 Introduction

This chapter describes the installation of MPICH2 for the user who has purchased a licence for running Parallel PHOENICS on machines running the 64-bit version of Microsoft Windows (XP Professional, Windows 7, Server 2008, etc.). It applies to installations on a single multi-core machine and on clusters of multiple workstations or servers.

A typical configuration for running 64-bit parallel PHOENICS is a multi-core workstation with 2GB of RAM per core. However, if you are running large models, you may require double this capacity or more. Generally, as computer power increases, users build larger and larger models, so when buying new hardware allow room for memory expansion.

The workstations in a cluster should be joined by a switch or hub running at 100Mbps or faster. For efficient operation the processors in a cluster should be of equal performance: if not, the loads will be unbalanced and the slowest processor will determine the speed of the system. MPICH2 also requires the ability to make TCP/IP socket connections between all hosts.

6.2 Preliminary Considerations

6.2.1 Multi-core Workstations

The installation process for Parallel PHOENICS on a single workstation with multiple cores/processors is simpler than that for a cluster. It still requires the installation of MPICH2 to handle the message-passing aspects, but one does not need to be concerned with firewalls or network connectivity issues. MPICH2 should be installed on the single workstation as outlined in section 6.3 (items 1 - 5 only), and the PATH set appropriately so that mpiexec.exe is visible. SMPD should also be installed as a service on the workstation.

6.2.2 PCs on a Cluster

When running Parallel PHOENICS on a cluster it is usual to designate one of the workstations as the Head Node, from which all runs are launched. The other workstations in the cluster are then designated as Compute Nodes.

In order to perform computations across a cluster, it must be possible to use a single user account to log into all the machines in the cluster. Furthermore, that account should have sufficient privileges to launch processes on a remote machine in the cluster. Therefore, before starting the installation, make sure that each workstation in the cluster belongs to the same Workgroup or Domain. To check this, open the 'Control Panel' from the Start button, then click on the 'System and Security' link and then on 'System'. This opens a page headed 'View basic information about your computer' (see the figure below):

Here the computer name is CHAM-CFD1 and it belongs to the Workgroup WORKGROUP. You may choose the name of the workgroup, but all the workstations which are to be in the cluster must belong to the same one. If your workstations are part of a company network then you will find that you have a Domain specified here instead of a Workgroup. If this is the case then all the workstations in the cluster will need to belong to the same domain.

To change the Workgroup, click on 'Advanced system settings' in the left-hand panel. You may be prompted for Administrator login credentials at this point; once the dialog has appeared, go to the 'Computer Name' tab. Here click on the 'Change...' button to reset the Workgroup on any workstation where this is necessary.

It is recommended that MPICH2 is installed on each workstation in the cluster. The two prerequisites for installing MPICH2 are usually already present on a modern workstation: Microsoft .NET Framework version 2.0 (or later) and Microsoft Visual C++ 2005 SP1 (or later). Both packages are available as free downloads from the Microsoft website. For a 64-bit workstation the links to the packages are:

Microsoft .NET Framework version 2.0 Redistributable Package (x64)

Microsoft Visual C++ 2005 SP1 Redistributable Package (x64)

These links were still correct as of November 2013; the download pages also provide links to more recent versions of the packages, at the base of the page under the column 'What others are downloading'.

If you choose not to install MPICH2 on each workstation, you will still need to install SMPD as a service on each workstation. To install MPICH2 or the SMPD service, the user will need to be logged in using an account with Administrator permissions. This need not be the account from which you run PHOENICS, although when running over a cluster it is preferable to use an administrator account.

Most workstations will have a personal firewall installed; since XP Service Pack 2, Windows has had its personal firewall switched on by default. Experienced users may configure the firewall settings to allow parallel PHOENICS to run successfully, but while installing, and until the user is sure that MPICH2 has been configured correctly, it is recommended that any personal firewalls are turned off. For Windows Firewall, open the 'System and Security' page of the Control Panel, then click on the 'Windows Firewall' link. Then click on the link 'Turn Windows Firewall on or off' in the left-hand column (you will require Administrator permissions to do this) and turn off Windows Firewall.

After the user has established that MPICH2 is working correctly, the Firewall should be switched back on and configured to allow Parallel PHOENICS to operate (see section 6.5).

6.3 Installation of MPICH2

It will be assumed in what follows that PHOENICS has been successfully installed in accordance with the instructions in the earlier chapters of this document. A full PHOENICS installation need only be made on one machine in a parallel cluster (the head node, from which jobs will be run); however, it is recommended that an installation is made on each workstation. If PHOENICS is installed only on the head node, it will be necessary to share the \phoenics directory so that all compute nodes may see it. Each workstation in the cluster must also have a valid local licence file 'phoenics.lic' accessible.

On Windows platforms parallel PHOENICS uses MPICH2 as the message-passing interface (MPI) for communication between the different processors. MPICH2 is freely available on the Internet, but for compatibility it is recommended that the installation is made from the MPI installer provided with the PHOENICS package. Installation instructions are as follows:

  1. In order to run the MPICH2 installation program you must first be logged onto the workstation using an account that has Administrator privileges.
  2. Open a Command Prompt window.
    If the workstation is running Vista Business, Windows 7 or Server 2008 you need to ensure that any Command Prompt window you use has Administrator privileges. It is thus necessary, when opening a Command Prompt window, to explicitly select 'Run as Administrator' from the shortcut menu. This shortcut menu is obtained by right-clicking the icon from which the Command Prompt window is launched. Note: a warning window from User Account Control may appear at this point; respond with 'Yes' to allow changes to be made.
  3. Navigate to the folder \phoenics\d_allpro\d_windf\MPI then execute the command:
        > msiexec /I mpich2-1.2.1p1-win-x86-64.msi
    
    It is recommended that the user choose all the default options. This will install MPICH2 within the C:\Program Files directory.
  4. The PHOENICS parallel run scripts assume that mpiexec.exe is located on the user's PATH on the head node. If the default location was chosen, this will be C:\Program Files\MPICH2\bin. The PATH should be set through the System Properties dialog [launched from the Control Panel]. The image below shows the stages in setting the System variable 'path'. Go to the 'Advanced' settings page and click on the 'Environment Variables' button. If the System variable 'path' does not exist, click on 'New'; otherwise highlight 'path' and click on 'Edit'. Add the necessary path entry for mpiexec, separating path entries with semicolons.
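    To confirm that the new PATH is picked up, open a fresh Command Prompt window and ask Windows to locate mpiexec (a quick check, available on Windows Vista/7 and later; the path shown assumes the default installation location):
        > where mpiexec
        C:\Program Files\MPICH2\bin\mpiexec.exe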
  5. It is advisable at this stage to check that the MPICH2 Process Manager has been installed correctly. To do this open the Windows Task Manager. If you have the Services tab available, switch to that tab and try to locate the service named mpich2_smpd. If you do not have the Services tab, then the process smpd.exe should be visible on the Processes tab when 'Show processes from all users' is enabled.
    If for some reason the MPICH2 Process Manager is not running it will be necessary to install it manually. To do this open a Command Prompt window with Administrator permissions (see item 2 above) and execute the command:
        > smpd.exe -install
    
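    As a further check (assuming the installed smpd supports the usual -status option), the process manager can also be queried from a Command Prompt window; it should report that smpd is running on the local host:
        > smpd.exe -status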
  6. It is recommended to register a user account from which to launch MPICH2 on the head node. While it is not essential, it does simplify matters if this account is an administrator on the workstations within the cluster. This account will then be used to launch parallel PHOENICS on the compute nodes. To register an account, open a Command Prompt window [from the Start menu it is located under Accessories], and run the MPICH2 program mpiexec.exe with the option -register. You will be prompted for an account and password, e.g.
        > mpiexec.exe -register
        account:  cfd1
        password: ********
        confirm:  ********
        Do you want this action to be persistent (y/n)? y
    
    If the cluster has been set up within a network Domain (rather than a Workgroup), then in the above, you should also specify the domain as part of the account name. For example, if 'phoenics' is the domain and 'cfd1' is the user account within that domain, enter phoenics\cfd1. The response 'y' ensures that this action is persistent, i.e. this registration process does not have to be repeated for each session on this workstation.
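    Once registered, the stored credentials can be checked with the -validate option of mpiexec; this should report SUCCESS if the registered account and password are accepted:
        > mpiexec.exe -validate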
  7. If PHOENICS is being run across a cluster (as opposed to running on a single multi-core workstation) then it will be useful to identify the workstations within the cluster. This may be done in a Command Prompt window, using smpd.exe. For example, to include the four PCs CHAM-CFD1, CHAM-CFD2, CHAM-CFD3 and CHAM-CFD4 in a cluster, type:
        > smpd.exe -sethosts CHAM-CFD1 CHAM-CFD2 CHAM-CFD3 CHAM-CFD4
    
    To check which hosts have been added to the cluster one can type,
        > smpd.exe -hosts
    
    The response will be a list of the workstations in the cluster. While this step is not essential for running on a cluster, it does mean that when you subsequently run the parallel solver you do not need to keep specifying which nodes to use for computations; smpd will select the hosts from the list.
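    For the four workstations above, the response would be along the following lines (the exact layout may vary between MPICH2 versions):
        CHAM-CFD1
        CHAM-CFD2
        CHAM-CFD3
        CHAM-CFD4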

6.4 Running Parallel PHOENICS

6.4.1 Running the standard parallel Earth

The simplest way to launch parallel EARTH is from the VR-Editor, although it can be run from a Command Prompt window.

If a parallel PHOENICS licence has been purchased, an additional sub-menu, 'Parallel Solver', will appear under the 'Run' menu option in the VR-Editor. Once the parallel solver is chosen, a dialog box will appear on the screen where the user can either specify the number of processes to use or specify an MPI configuration file.

The pull-down combo box provides the user with an option to select up to 64 processes in steps of 2. Users who have more than 64 processors available on their workstation cluster may type the appropriate number into the box.

The 'Cluster Host List' portion of the dialog enables the user to select which hosts in the cluster are used for computation. Here there are three options,

  1. 'Local Only', the default, will just use cores on the local machine (i.e. that on which the instance of the VR-Editor is running).
  2. 'Any': will use a computer-assigned distribution of processes on the nodes in the cluster. These must have been previously identified in the cluster (see step 7 in section 6.3 above).
  3. 'Specify in list': users may select hosts from the scroll list. By default this list will contain the hosts previously identified in the cluster, but one can also add to it using the 'Add' button. Alternatively, one can supply a 'Machine List' file which contains a list of the workstations from which to select. This is simply a text file with the names of the workstations, each on a separate line, as in the sketch below.
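
A minimal example of such a machine list file, using the workstation names from section 6.3 (the file name itself, e.g. machines.txt, is arbitrary):

    cham-cfd1
    cham-cfd2
    cham-cfd3
    cham-cfd4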

This mode of running the parallel solver will always launch the root process on the local machine and a convergence monitoring window will appear on screen (as per the sequential solver).

If running across a cluster, the run will attempt to launch an instance of the solver at the same location on each compute node. If the default locations are used, this will be C:\phoenics\d_earth\d_windf\earexe.exe. If a Private Earth is to be used, then it should also be copied to the equivalent directory on each of the compute nodes.

When running across a cluster, it is important to consider the working directory on the compute nodes. By default, mpiexec will attempt to launch the process in the equivalent directory on all the workstations: if on the head node you are working in c:\phoenics\myprojects\projectbeta, then this directory must also exist on all the workstations in the cluster, otherwise the run will fail.

As it can be difficult to always remember to create the working directory on all the cluster workstations, there is an alternative: one can set up an environment variable PHOE_WORK_DIR on each of the cluster workstations to point to an existing fixed directory, e.g.

    PHOE_WORK_DIR=C:\phoenics\mypar_runs

Then all processes (aside from the launching process) will write their output to this location.
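
One way to set this variable persistently is from an Administrator Command Prompt using setx (a minimal sketch; the directory is the example value above, and the /M switch makes the setting system-wide rather than per-user):

    > setx PHOE_WORK_DIR C:\phoenics\mypar_runs /M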

PLEASE NOTE: The use of PHOE_WORK_DIR is not recommended if you are likely to make multiple parallel runs simultaneously. This is because the second run (and subsequent runs) will overwrite the working files of the first.

The above methods of launching the parallel solver do not allow the user to fix the number of solver instances on each workstation. For that level of control, the user will need to use an MPI configuration file (see section 6.4.4 below).

6.4.2 Automatic domain decomposition

When using the default automatic domain decomposition, parallel PHOENICS only differs from sequential when Earth is run: the problem set-up and post-processing of results can be done in exactly the same way as for the sequential version. A case that has been run in sequential mode can be run in parallel without any changes being made. The output from a parallel PHOENICS simulation will be result and phi files, having the same format as for sequential simulations.

6.4.3 User-specified sub-domains

It is also possible to by-pass the automatic domain decomposition algorithm, and to specify how you want to decompose the calculation domain into sub-domains. This can be done by setting the appropriate data-for-solver arrays in the Q1 file.

For example, to split the domain into 8 sub-domains (2 in each direction), the following arrays must be set in the Q1 file:

    LG(2)=T
    IG(1)=2
    IG(2)=2
    IG(3)=2

The logical LG(2) instructs the splitter to by-pass the automatic domain decomposition and to split the domain according to the settings defined in the IG array, as follows:

IG(1) specifies the number of sub-domains in the x-direction;
IG(2) specifies the number of sub-domains in the y-direction;
IG(3) specifies the number of sub-domains in the z-direction.

With these settings, the domain will be divided into 2 x 2 x 2 = 8 sub-domains, as specified in the Q1 file.
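
As a further minimal sketch, leaving two of the IG entries at 1 gives a one-dimensional decomposition, here into four sub-domains in the z-direction only (it is assumed that the number of sub-domains is chosen to match the number of solver processes launched):

    LG(2)=T
    IG(1)=1
    IG(2)=1
    IG(3)=4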

6.4.4 Configuration File

The MPI configuration file option gives a more flexible way of launching the parallel solver. Assuming PHOENICS is installed on each workstation in the cluster, the following configuration file will use the public earexe.exe to run a single process on each of the four workstations.

    -localroot -n 1 -host cham-cfd1 c:\phoenics\d_earth\d_windf\earexe.exe
    -n 1 -host cham-cfd2 c:\phoenics\d_earth\d_windf\earexe.exe
    -n 1 -host cham-cfd3 c:\phoenics\d_earth\d_windf\earexe.exe
    -n 1 -host cham-cfd4 c:\phoenics\d_earth\d_windf\earexe.exe

The following example launches two processes on each of two workstations where PHOENICS is installed only on the head node:

    -localroot -n 2 -host cham-cfd1 \\cham-cfd1\phoenics\d_earth\d_windf\earexe.exe
    -n 2 -host cham-cfd2 \\cham-cfd1\phoenics\d_earth\d_windf\earexe.exe

Users should create their own configuration and 'run' files, based on the examples provided, tailored to their own installation. These can be located either in \phoenics\d_utils\d_windf or in the local working directory.
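
When launching from a Command Prompt window, such a file is passed to mpiexec via its -configfile option; for example, if the four-workstation file above were saved as cluster4.cfg (a name chosen here purely for illustration):

    > mpiexec -configfile cluster4.cfg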

6.4.5 Cluster Operation

All Nodes in the cluster should belong to the same Workgroup or Domain, and the user should be logged into each Node on the Cluster using the same Workgroup/Domain User account.

PHOENICS must be installed on the Head Node; installation on the other Compute Nodes is strongly recommended. If PHOENICS is only installed on the Head Node then the phoenics folder will need to be shared, with at least Read and Execute permissions, for the other Compute Nodes in the cluster. Write permissions may also be required if the working directory is within the phoenics folder.

The Share name which is chosen when the folder is shared is used in the MPI configuration file; in the example file above the share name is 'phoenics'. In addition, on each Compute Node there must be a folder with the same pathname as that on the Head Node from which PHOENICS has been launched. For example, if, on the Head Node, the program is run from C:\phoenics\d_priv1 then there must be a folder C:\phoenics\d_priv1 on each of the Compute Nodes. This folder must contain a copy of the FLEXlm licence file 'phoenics.lic' (which can be found in C:\phoenics\d_allpro on the Master workstation). The Workgroup/Domain User account used to log into each Compute Node must allow write access to this folder C:\phoenics\d_priv1.

If PHOENICS is installed on each Compute Node (in addition to the Head Node) then the Workgroup/Domain User account used to log into each Compute Node must allow read access to all PHOENICS folders, and write access to the folder C:\phoenics\d_priv1.

For cluster operation it is necessary for MPICH2 to know which processors to use for the run. This is achieved by means of a configuration file (see section 6.4.4 above), or by using the smpd command (see section 6.3 step 7 above).

6.4.6 Command mode operation

In a Command Prompt window, if the EARTH executable is launched directly, then the sequential solver will be used; to run the parallel solver, the program name 'earexe' is used as an argument to mpiexec.

A script RUNPAR.BAT [nnodes] is provided. The optional argument [nnodes] indicates the number of processes to be launched on the current workstation. The default is to launch two processes.

For example, RUNPAR 2 will execute the MPI command:

    > mpiexec -localroot -np 2 \phoenics\d_earth\d_windf\earexe

If a cluster has been defined by SMPD then the command will execute on two processors in the cluster; otherwise it will launch both processes on the local machine. The argument -localroot indicates that the root process will be on the local machine. This argument is essential if any graphical monitoring is required during the solution.

6.4.7 Testing Parallel PHOENICS

The parallel installation should be tested by loading a library case. The different solver used for parallel operation requires a slight modification to the numerical controls. For example, the user may open the main 'Menu' in the VR-Editor and select 'Numerics' and then 'Iteration control': change the number of iterations for TEM1 (temperature) from 20 to 300. (Increasing the relaxation for the velocity components, U1 and W1, from 1.0 to 10.0 will also improve performance.) For parallel operation it is recommended that velocities should be solved whole-field (rather than slab-by-slab); this can be achieved from the VR-Editor (under 'Models', 'Solution control/extra variables') or by direct editing of the Q1 file (by setting 'Y' as the third logical in a SOLUTN command).
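
For the direct-editing route, a minimal Q1 sketch (assuming the usual six-logical form of the PIL SOLUTN command, in which the third logical selects whole-field solution; the other logicals shown are illustrative and any existing settings should be preserved):

    SOLUTN(U1,Y,Y,Y,N,N,N)
    SOLUTN(W1,Y,Y,Y,N,N,N)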

6.5 Windows Firewall settings

When the firewall is activated, running the PHOENICS solver, earexe, may generate a Windows Firewall Security Alert. When running in parallel mode it is essential that earexe is Unblocked, even if you are only using processes on the host workstation. If mpiexec is run with an MPI configuration file instead of the executable program as the argument, then there will be an additional security alert for the mpiexec program. Again, it is essential that this program be unblocked.

With the Windows Firewall, the user may choose to unblock the earexe executable from the security alert dialog mentioned above. However, if you are operating across a cluster, this will not be sufficient to enable Parallel PHOENICS to run: additional settings are needed on both the head and compute nodes.

On the head node, for those using Windows 7, open the Windows Firewall link from the System and Security page of the Control Panel. Then open the link 'Allow a program or feature through Windows Firewall'. You will need Administrator permission to click on the 'Change settings' button before clicking on the 'Allow another program...' button. On the 'Add a Program...' dialog use the 'Browse...' button to add the following programs in turn:

C:\Phoenics\d_earth\d_windf\earexe.exe
C:\Program Files\MPICH2\bin\smpd.exe
C:\Program Files\MPICH2\bin\mpiexec.exe

You may also use the 'Network location types...' button to restrict access to Home/Work (Private) or Public networks only.

On each of the compute nodes, you will need to add the programs

C:\Phoenics\d_earth\d_windf\earexe.exe
C:\Program Files\MPICH2\bin\smpd.exe
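
Alternatively, the equivalent firewall rules can be created from an Administrator Command Prompt with netsh (a sketch using the default paths above; the rule names are arbitrary):

    > netsh advfirewall firewall add rule name="PHOENICS earexe" dir=in action=allow program="C:\Phoenics\d_earth\d_windf\earexe.exe"
    > netsh advfirewall firewall add rule name="MPICH2 smpd" dir=in action=allow program="C:\Program Files\MPICH2\bin\smpd.exe"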

Users of other personal firewalls will need to unblock the above programs in a manner suitable to their firewall software.

6.6 Further Information

An MPICH2 user guide in PDF format is installed as part of the installation; it is accessible from the MPICH menu item on the Start menu. Online documentation is available at

http://www.mcs.anl.gov/research/projects/mpich2/

6.7 Troubleshooting

If the head node has difficulty seeing one or more of the compute nodes, and you are using static IP addresses in your cluster, then you should consider defining the IP addresses in the hosts file on each of the workstations. The hosts file is normally located at

C:\Windows\system32\drivers\etc\hosts

An example 'hosts' file is:

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#   102.54.94.97  rhino.acme.com # source server
#    38.25.63.10  x.acme.com     # x client host

127.0.0.1 localhost
192.168.11.1 cham-cfd1
192.168.11.2 cham-cfd2
192.168.11.3 cham-cfd3
192.168.11.4 cham-cfd4

To determine the IP addresses, open a Command Prompt window on each of the workstations in the cluster and type the command ipconfig. Use the IPv4 Address shown in the section 'Ethernet adapter Local Area Connection' to define the IP address for that workstation in the hosts file.
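
Once the hosts file is in place on each workstation, basic connectivity can be verified from the head node with ping, using one of the names defined above:

    > ping cham-cfd2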