Torque can run in a redundant or high availability mode. This means that there can be multiple instances of the server running and waiting to take over processing in the event that the primary server fails.
The high availability feature is available in the 2.3 and later versions of Torque.
The "native" high availability implementation, as described here, is only suitable for Moab Basic Edition. Contact Adaptive Computing for information on high availability for Enterprise Edition.
For more details, see these sections:
5.703.1 Redundant Server Host Machines
High availability enables Moab HPC Suite to continue running even if pbs_server is brought down. This is done by running multiple copies of pbs_server which have their torque/server_priv directory mounted on a shared file system.
Do not use symlinks when sharing the Torque home directory or server_priv directories. A workaround for this is to use mount --rbind /path/to/share /var/spool/torque. Also, it is highly recommended that you only share the server_priv and not the entire TORQUE_HOMEDIR.
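For example, assuming the share is already NFS-mounted at /mnt/share (a hypothetical path), only the server_priv subdirectory can be bound into place:

```shell
# Sketch only: /mnt/share is an assumed NFS mount point, not a documented path.
# Bind just the server_priv subdirectory into the Torque home directory,
# leaving the rest of /var/spool/torque local to this host.
mount --bind /mnt/share/server_priv /var/spool/torque/server_priv

# Equivalent persistent entry for /etc/fstab:
# /mnt/share/server_priv  /var/spool/torque/server_priv  none  bind  0 0
```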
The torque/server_name file must include the host names of all nodes that run pbs_server. All MOM nodes must also include the host names of all nodes running pbs_server in their torque/server_name file. The syntax of torque/server_name is a comma-delimited list of host names.
For example:
host1,host2,host3
When configuring high availability, do not use $pbsserver in the pbs_mom configuration to specify the server host names. You must use the TORQUE_HOMEDIR/server_name file.
All instances of pbs_server need to be started with the --ha command line option that allows the servers to run at the same time. Only the first server to start will complete the full startup. The second server to start will block very early in the startup when it tries to lock the file torque/server_priv/server.lock. When the second server cannot obtain the lock, it will spin in a loop and wait for the lock to clear. The sleep time between checks of the lock file is one second.
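The lock-and-wait behavior can be illustrated with the generic flock(1) utility; this is a sketch of the same pattern, not the actual pbs_server code, and the lock path below is made up:

```shell
# Sketch of the server.lock pattern using the generic flock(1) utility.
LOCK=/tmp/demo.server.lock
# "First server": grab the lock and hold it briefly in the background.
flock -n "$LOCK" sleep 2 &
sleep 0.2
# "Second server": the non-blocking attempt fails while the lock is held;
# the real backup server sleeps one second between retries.
if flock -n "$LOCK" true; then
    echo "lock acquired; completing startup"
else
    echo "lock busy; waiting for it to clear"
fi
wait
```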
Note that not only can the servers run on independent hardware, but multiple instances of pbs_server can also run on the same machine. This was not possible in earlier versions, where the second instance to start would write an error and quit when it could not obtain the lock.
5.703.2 Enabling High Availability
To use high availability, you must start each instance of pbs_server with the --ha option.
Three server parameters help manage high availability: lock_file, lock_file_update_time, and lock_file_check_time.
The lock_file option allows the administrator to change the location of the lock file. The default location is torque/server_priv. If the lock_file option is used, the new location must be on the shared partition so all servers have access.
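For example, the lock file could be relocated to a shared path (the path below is illustrative, not a required location):

```shell
> qmgr -c "set server lock_file=/nfs/torque_shared/server.lock"
```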
The lock_file_update_time and lock_file_check_time parameters are used by the servers to determine if the primary server is active. The primary pbs_server will update the lock file based on the lock_file_update_time (default value of 3 seconds). All backup pbs_servers will check the lock file as indicated by the lock_file_check_time parameter (default value of 9 seconds). The lock_file_update_time must be less than the lock_file_check_time. When a failure occurs, the backup pbs_server takes up to the lock_file_check_time value to take over.
> qmgr -c "set server lock_file_check_time=5"
In the above example, after the primary pbs_server goes down, the backup pbs_server takes up to 5 seconds to take over. It takes additional time for all MOMs to switch over to the new pbs_server.
The clock on the primary and redundant servers must be synchronized in order for high availability to work. Use a utility such as NTP to ensure your servers have a synchronized time.
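For example, you might verify synchronization on each server with the NTP tooling your distribution provides; the commands below are common possibilities, not Torque requirements:

```shell
# Classic ntpd: show peers and offsets (offset column is in milliseconds).
ntpq -p

# chrony: show the estimated offset from the reference clock.
chronyc tracking
```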
Use only a simple NFS file share that is dedicated to this purpose (i.e., only Moab/Torque uses the file share). Do not use a general-purpose NAS, a parallel file system, or company-wide shared infrastructure when setting up "native" high availability.
5.703.3 Enhanced High Availability with Moab
When Torque is run with an external scheduler such as Moab, and pbs_server is not running on the same host as Moab, pbs_server needs to know where to find the scheduler. To do this, use the -l option as demonstrated in the example below (the port is required; the default is 15004).
> pbs_server -l <moabhost:port>
Alternatively, set the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file to PBS_ARGS=-l <moabhost:port>, where <moabhost> is the name of the alternate server node and <port> is the port on which Moab on the alternate server node is listening (default 15004).
If Moab is running in HA mode, set the -l option for each redundant server.
> pbs_server -l <moabhost1:port> -l <moabhost2:port>
Alternatively, set the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file to PBS_ARGS=-l <moabhost1:port> -l <moabhost2:port>.
If pbs_server and Moab run on the same host, use the --ha option as demonstrated in the example below.
> pbs_server --ha
Alternatively, set the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file to PBS_ARGS=--ha.
The root user of each Moab host must be added to the operators and managers lists of the server. This enables Moab to execute root level operations in Torque.
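For example, assuming Moab runs on hosts moab1 and moab2 (hypothetical names), the qmgr commands would be:

```shell
> qmgr -c "set server managers += root@moab1"
> qmgr -c "set server managers += root@moab2"
> qmgr -c "set server operators += root@moab1"
> qmgr -c "set server operators += root@moab2"
```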
5.703.4 How Commands Select the Correct Server Host
The various commands that send messages to pbs_server usually allow a server name to be specified on the command line; if none is specified, the default server name is used. The default server name comes either from the environment variable PBS_DEFAULT or from the file torque/server_name.
When a command is executed and no explicit server is mentioned, an attempt is made to connect to the first server name in the list of hosts from PBS_DEFAULT or torque/server_name. If this fails, the next server name is tried. If all servers in the list are unreachable, an error is returned and the command fails.
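For example, with host1,host2 in torque/server_name (hypothetical host names), a client can rely on the fallback list or name a server explicitly:

```shell
# Uses the fallback list from PBS_DEFAULT or torque/server_name:
qstat
# Queries a specific server directly, skipping the list:
qstat @host2
# Overrides the default list for this shell session:
export PBS_DEFAULT=host1,host2
```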
Note that there is a period of time after the failure of the current server, while the new server is starting up, during which it is unable to process commands. The new server must read the existing configuration and job information from disk, so the length of this window varies. Commands issued during this period might fail due to expiring timeouts.
Job names normally contain the name of the host machine where pbs_server is running. When job names are constructed, only the server name in PBS_DEFAULT or the first name from the server specification list, TORQUE_HOME/server_name, is used in building the job name.
5.703.6 Persistence of the pbs_server Process
The system administrator must ensure that pbs_server continues to run on the server nodes. This can be as simple as a cron job that counts the number of pbs_server processes in the process table and starts more if needed.
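A minimal sketch of such a watchdog, assuming a SysV-style service named pbs_server (the init-script path is an assumption; adapt the restart command for systemd):

```shell
#!/bin/sh
# Watchdog sketch: restart pbs_server if no instance appears in the
# process table. The init-script path is an assumption for illustration.
count=$(pgrep -cx pbs_server || true)
if [ "$count" -eq 0 ]; then
    # Only attempt a restart where the init script actually exists.
    [ -x /etc/init.d/pbs_server ] && service pbs_server start
    echo "pbs_server was not running; restart attempted"
else
    echo "pbs_server running ($count instance(s))"
fi
```

Run it from root's crontab, for example `* * * * * /usr/local/sbin/pbs_server_watchdog.sh` (path hypothetical).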
5.703.7 High Availability of the NFS Server
Before installing a specific NFS HA solution please contact Adaptive Computing Support for a detailed discussion on NFS HA type and implementation path.
One consideration of this implementation is that it depends on the NFS file system also being redundant; NFS can be set up as a redundant service. Other types of shared file systems can also be used.
5.703.8 Installing Torque in High Availability Mode
The following procedure demonstrates a Torque installation in high availability (HA) mode.
To install Torque in HA mode
> service iptables stop
> chkconfig iptables off
> systemctl stop firewalld
> systemctl disable firewalld
If you are unable to stop the firewall due to infrastructure restriction, open the following ports:
> vi /etc/sysconfig/selinux
SELINUX=disabled
# Torque
export TORQUE_HOME=/var/spool/torque
# Library Path
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${TORQUE_HOME}/lib
# Update system paths
export PATH=${TORQUE_HOME}/bin:${TORQUE_HOME}/sbin:${PATH}
fileServer# mkdir -m 0755 /var/spool/torque
fileServer# mkdir -m 0750 /var/spool/torque/server_priv
/var/spool/torque/server_priv 192.168.0.0/255.255.255.0(rw,sync,no_root_squash)
fileServer# exportfs -r
> service rpcbind restart
> service nfs-server start
> service nfs-lock start
> service nfs-idmap start
> systemctl restart rpcbind.service
> systemctl start nfs-server.service
> systemctl start nfs-lock.service
> systemctl start nfs-idmap.service
server1# mkdir /var/spool/torque/server_priv
Repeat this process for server2.
fileServer:/var/spool/torque/server_priv /var/spool/torque/server_priv nfs rsize=8192,wsize=8192,timeo=14,intr
Repeat this step for server2.
server1# wget http://github.com/adaptivecomputing/torque/branches/6.1.0/torque-6.1.0.tar.gz
server1# tar -xvzf torque-6.1.0.tar.gz
server1# ./configure
server1# make
server1# make install
server1# make packages
server1# make install
If the installation directory is not shared, repeat step 8a-b (downloading and installing Torque) on server2.
server1# service trqauthd start
server1# systemctl start trqauthd
List the host names of all nodes that run pbs_server in the torque/server_name file. You must also include the host names of all nodes running pbs_server in the torque/server_name file of each MOM node. The syntax of torque/server_name is a comma-delimited list of host names.
server1,server2
server1# pbs_server -t create
server1# qmgr -c "set server scheduling=true"
server1# qmgr -c "create queue batch queue_type=execution"
server1# qmgr -c "set queue batch started=true"
server1# qmgr -c "set queue batch enabled=true"
server1# qmgr -c "set queue batch resources_default.nodes=1"
server1# qmgr -c "set queue batch resources_default.walltime=3600"
server1# qmgr -c "set server default_queue=batch"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server managers += root@server1"
server1# qmgr -c "set server managers += root@server2"
server1# qmgr -c "set server operators += root@server1"
server1# qmgr -c "set server operators += root@server2"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server lock_file_check_time=5"
server1# qmgr -c "set server lock_file_update_time=3"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server acl_hosts += server1"
server1# qmgr -c "set server acl_hosts += server2"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
service pbs_server stop
systemctl stop pbs_server
service pbs_server start
systemctl start pbs_server
server1# qmgr -c "p s"
server2# qmgr -c "p s"
The commands above return all settings of the active Torque server from either node.
service pbs_server stop
systemctl stop pbs_server
node1# torque-package-mom-linux-x86_64.sh --install
node2# torque-package-clients-linux-x86_64.sh --install
Repeat this for each compute node. Verify that the /var/spool/torque/server_name file shows all your compute nodes.
node1 np=2
node2 np=2
Change the np value to reflect the number of available processors on that node.
service pbs_server stop
systemctl stop pbs_server
service pbs_mom start
systemctl start pbs_mom
5.703.9 Installing Torque in High Availability Mode on Headless Nodes
The following procedure demonstrates a Torque installation in high availability (HA) mode on nodes with no local hard drive.
To install Torque in HA mode on a node with no local hard drive
> service iptables stop
> chkconfig iptables off
> systemctl stop firewalld
> systemctl disable firewalld
If you are unable to stop the firewall due to infrastructure restriction, open the following ports:
> vi /etc/sysconfig/selinux
SELINUX=disabled
# Torque
export TORQUE_HOME=/var/spool/torque
# Library Path
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${TORQUE_HOME}/lib
# Update system paths
export PATH=${TORQUE_HOME}/bin:${TORQUE_HOME}/sbin:${PATH}
fileServer# mkdir -m 0755 /var/spool/torque
/var/spool/torque/ 192.168.0.0/255.255.255.0(rw,sync,no_root_squash)
fileServer# exportfs -r
> service rpcbind restart
> service nfs-server start
> service nfs-lock start
> service nfs-idmap start
> systemctl restart rpcbind.service
> systemctl start nfs-server.service
> systemctl start nfs-lock.service
> systemctl start nfs-idmap.service
server1# mkdir /var/spool/torque
Repeat this process for server2.
fileServer:/var/spool/torque /var/spool/torque nfs rsize=8192,wsize=8192,timeo=14,intr
Repeat this step for server2.
server1# wget http://github.com/adaptivecomputing/torque/branches/6.1.0/torque-6.1.0.tar.gz
server1# tar -xvzf torque-6.1.0.tar.gz
server1# ./configure --prefix=/var/spool/torque
server1# make
server1# make install
server1# make packages
server1# make install
If the installation directory is not shared, repeat step 8a-b (downloading and installing Torque) on server2.
server1# service trqauthd start
server1# systemctl start trqauthd
List the host names of all nodes that run pbs_server in the torque/server_name file. You must also include the host names of all nodes running pbs_server in the torque/server_name file of each MOM node. The syntax of torque/server_name is a comma-delimited list of host names.
server1,server2
server1# pbs_server -t create
server1# qmgr -c "set server scheduling=true"
server1# qmgr -c "create queue batch queue_type=execution"
server1# qmgr -c "set queue batch started=true"
server1# qmgr -c "set queue batch enabled=true"
server1# qmgr -c "set queue batch resources_default.nodes=1"
server1# qmgr -c "set queue batch resources_default.walltime=3600"
server1# qmgr -c "set server default_queue=batch"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server managers += root@server1"
server1# qmgr -c "set server managers += root@server2"
server1# qmgr -c "set server operators += root@server1"
server1# qmgr -c "set server operators += root@server2"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server lock_file_check_time=5"
server1# qmgr -c "set server lock_file_update_time=3"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server acl_hosts += server1"
server1# qmgr -c "set server acl_hosts += server2"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
service pbs_server stop
systemctl stop pbs_server
service pbs_server start
systemctl start pbs_server
server1# qmgr -c "p s"
server2# qmgr -c "p s"
The commands above return all settings of the active Torque server from either node.
service pbs_server stop
systemctl stop pbs_server
node1 np=2
node2 np=2
Change the np value to reflect the number of available processors on that node.
service pbs_server stop
systemctl stop pbs_server
You can specify command line arguments for pbs_server using the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file. Set PBS_ARGS=--ha -l <host>:<port>, where <host> is the name of the alternate server node and <port> is the port on which pbs_server on the alternate server node is listening (default 15004).
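Putting the pieces together, the /etc/sysconfig/pbs_server file for this step might contain a single line such as the following (host name and port are placeholders):

```shell
# /etc/sysconfig/pbs_server (values are illustrative)
PBS_ARGS="--ha -l server2:15004"
```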
service pbs_mom start
systemctl start pbs_mom
5.703.10 Example Setup of High Availability
# List of all servers running pbs_server
server1,server2
> qmgr -c "set server acl_hosts += server1" > qmgr -c "set server acl_hosts += server2"
[root@server1]$ pbs_server --ha
[root@server2]$ pbs_server --ha
[root@server1]$ systemctl start pbs_server
[root@server2]$ systemctl start pbs_server