Torque can run in a redundant or high availability mode. This means that there can be multiple instances of the server running and waiting to take over processing in the event that the primary server fails.
The high availability feature is available in the 2.3 and later versions of Torque.
The "native" high availability implementation, as described here, is only suitable for Moab Basic Edition. Contact Adaptive Computing for information on high availability for Enterprise Edition.
For more details, see these sections:
5.703.1 Redundant Server Host Machines
High availability enables Moab HPC Suite to continue running even if pbs_server is brought down. This is done by running multiple copies of pbs_server which have their torque/server_priv directory mounted on a shared file system.
Do not use symlinks when sharing the Torque home directory or server_priv directories. A workaround for this is to use mount --rbind /path/to/share /var/spool/torque. Also, it is highly recommended that you only share the server_priv and not the entire TORQUE_HOMEDIR.
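For example, assuming the share is already NFS-mounted at /mnt/share (a hypothetical path), only the server_priv subdirectory can be bound into place:

```shell
# Sketch only: /mnt/share is an assumed NFS mount point, not a documented path.
# Bind just the server_priv subdirectory into the Torque home directory,
# leaving the rest of /var/spool/torque local to this host.
mount --bind /mnt/share/server_priv /var/spool/torque/server_priv

# Equivalent persistent entry for /etc/fstab:
# /mnt/share/server_priv  /var/spool/torque/server_priv  none  bind  0 0
```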
The torque/server_name file must include the host names of all nodes that run pbs_server. All MOM nodes must also include the host names of all nodes running pbs_server in their torque/server_name file. The syntax of torque/server_name is a comma-delimited list of host names.
For example:
host1,host2,host3
When configuring high availability, do not use $pbsserver in the pbs_mom configuration to specify the server host names. You must use the TORQUE_HOMEDIR/server_name file.
All instances of pbs_server need to be started with the --ha command line option that allows the servers to run at the same time. Only the first server to start will complete the full startup. The second server to start will block very early in the startup when it tries to lock the file torque/server_priv/server.lock. When the second server cannot obtain the lock, it will spin in a loop and wait for the lock to clear. The sleep time between checks of the lock file is one second.
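The lock-and-wait behavior can be illustrated with the generic flock(1) utility; this is a sketch of the same pattern, not the actual pbs_server code, and the lock path below is made up:

```shell
# Sketch of the server.lock pattern using the generic flock(1) utility.
LOCK=/tmp/demo.server.lock
# "First server": grab the lock and hold it briefly in the background.
flock -n "$LOCK" sleep 2 &
sleep 0.2
# "Second server": the non-blocking attempt fails while the lock is held;
# the real backup server sleeps one second between retries.
if flock -n "$LOCK" true; then
    echo "lock acquired; completing startup"
else
    echo "lock busy; waiting for it to clear"
fi
wait
```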
Note that not only can the servers run on independent hardware, but multiple instances of pbs_server can also run on the same machine. This was not possible in earlier versions, where the second instance to start would write an error and quit when it could not obtain the lock.
5.703.2 Enabling High Availability
To use high availability, you must start each instance of pbs_server with the --ha option.
Three server parameters help manage high availability: lock_file, lock_file_update_time, and lock_file_check_time.
The lock_file option allows the administrator to change the location of the lock file. The default location is torque/server_priv. If the lock_file option is used, the new location must be on the shared partition so all servers have access.
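For example, the lock file could be relocated to a shared path (the path below is illustrative, not a required location):

```shell
> qmgr -c "set server lock_file=/nfs/torque_shared/server.lock"
```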
The lock_file_update_time and lock_file_check_time parameters are used by the servers to determine if the primary server is active. The primary pbs_server will update the lock file based on the lock_file_update_time (default value of 3 seconds). All backup pbs_servers will check the lock file as indicated by the lock_file_check_time parameter (default value of 9 seconds). The lock_file_update_time must be less than the lock_file_check_time. When a failure occurs, the backup pbs_server takes up to the lock_file_check_time value to take over.
> qmgr -c "set server lock_file_check_time=5"
In the above example, after the primary pbs_server goes down, the backup pbs_server takes up to 5 seconds to take over. It takes additional time for all MOMs to switch over to the new pbs_server.
The clock on the primary and redundant servers must be synchronized in order for high availability to work. Use a utility such as NTP to ensure your servers have a synchronized time.
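For example, you might verify synchronization on each server with the NTP tooling your distribution provides; the commands below are common possibilities, not Torque requirements:

```shell
# Classic ntpd: show peers and offsets (offset column is in milliseconds).
ntpq -p

# chrony: show the estimated offset from the reference clock.
chronyc tracking
```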
Use only a simple NFS file share that is dedicated to this purpose (i.e., only Moab/Torque uses the file share). Do not use a general-purpose NAS, a parallel file system, or company-wide shared infrastructure when setting up "native" high availability.
5.703.3 Enhanced High Availability with Moab
When Torque is run with an external scheduler such as Moab, and pbs_server is not running on the same host as Moab, pbs_server needs to know where to find the scheduler. To do this, use the -l option as demonstrated in the example below (the port is required; the default is 15004).
> pbs_server -l <moabhost:port>
Alternatively, set the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file to PBS_ARGS=-l <moabhost:port>, where <moabhost> is the name of the alternate server node and <port> is the port on which Moab on the alternate server node is listening (default 15004).
If Moab is running in HA mode, set the -l option for each redundant server.
> pbs_server -l <moabhost1:port> -l <moabhost2:port>
Alternatively, set the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file to PBS_ARGS=-l <moabhost1:port> -l <moabhost2:port>.
If pbs_server and Moab run on the same host, use the --ha option as demonstrated in the example below.
> pbs_server --ha
Alternatively, set the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file to PBS_ARGS=--ha.
The root user of each Moab host must be added to the operators and managers lists of the server. This enables Moab to execute root level operations in Torque.
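For example, assuming Moab runs on hosts moab1 and moab2 (hypothetical names), the qmgr commands would be:

```shell
> qmgr -c "set server managers += root@moab1"
> qmgr -c "set server managers += root@moab2"
> qmgr -c "set server operators += root@moab1"
> qmgr -c "set server operators += root@moab2"
```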
5.703.4 How Commands Select the Correct Server Host
The various commands that send messages to pbs_server usually allow a server name to be specified on the command line; if none is specified, the default server name is used. The default server name comes either from the environment variable PBS_DEFAULT or from the file torque/server_name.
When a command is executed and no explicit server is mentioned, an attempt is made to connect to the first server name in the list of hosts from PBS_DEFAULT or torque/server_name. If this fails, the next server name is tried. If all servers in the list are unreachable, an error is returned and the command fails.
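For example, with host1,host2 in torque/server_name (hypothetical host names), a client can rely on the fallback list or name a server explicitly:

```shell
# Uses the fallback list from PBS_DEFAULT or torque/server_name:
qstat
# Queries a specific server directly, skipping the list:
qstat @host2
# Overrides the default list for this shell session:
export PBS_DEFAULT=host1,host2
```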
Note that there is a period of time after the failure of the current server, while the new server is starting up, during which it is unable to process commands. The new server must read the existing configuration and job information from disk, so the length of this window varies. Commands issued during this period might fail due to expiring timeouts.
Job names normally contain the name of the host machine where pbs_server is running. When job names are constructed, only the server name in PBS_DEFAULT or the first name from the server specification list, TORQUE_HOME/server_name, is used in building the job name.
5.703.6 Persistence of the pbs_server Process
The system administrator must ensure that pbs_server continues to run on the server nodes. This can be as simple as a cron job that counts the number of pbs_server processes in the process table and starts more if needed.
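A minimal sketch of such a watchdog, assuming a SysV-style service named pbs_server (the init-script path is an assumption; adapt the restart command for systemd):

```shell
#!/bin/sh
# Watchdog sketch: restart pbs_server if no instance appears in the
# process table. The init-script path is an assumption for illustration.
count=$(pgrep -cx pbs_server || true)
if [ "$count" -eq 0 ]; then
    # Only attempt a restart where the init script actually exists.
    [ -x /etc/init.d/pbs_server ] && service pbs_server start
    echo "pbs_server was not running; restart attempted"
else
    echo "pbs_server running ($count instance(s))"
fi
```

Run it from root's crontab, for example `* * * * * /usr/local/sbin/pbs_server_watchdog.sh` (path hypothetical).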
5.703.7 High Availability of the NFS Server
Before installing a specific NFS HA solution please contact Adaptive Computing Support for a detailed discussion on NFS HA type and implementation path.
One consideration of this implementation is that it depends on the NFS file system also being redundant; NFS can be set up as a redundant service. Other types of shared file systems can also be used.
5.703.8 Installing Torque in High Availability Mode
The following procedure demonstrates a Torque installation in high availability (HA) mode.
To install Torque in HA mode
> service iptables stop
> chkconfig iptables off
> systemctl stop firewalld
> systemctl disable firewalld
If you are unable to stop the firewall due to infrastructure restriction, open the following ports:
> vi /etc/sysconfig/selinux
SELINUX=disabled
# Torque
export TORQUE_HOME=/var/spool/torque
# Library Path
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${TORQUE_HOME}/lib
# Update system paths
export PATH=${TORQUE_HOME}/bin:${TORQUE_HOME}/sbin:${PATH}
fileServer# mkdir -m 0755 /var/spool/torque
fileServer# mkdir -m 0750 /var/spool/torque/server_priv
/var/spool/torque/server_priv 192.168.0.0/255.255.255.0(rw,sync,no_root_squash)
fileServer# exportfs -r
> service rpcbind restart
> service nfs-server start
> service nfs-lock start
> service nfs-idmap start
> systemctl restart rpcbind.service
> systemctl start nfs-server.service
> systemctl start nfs-lock.service
> systemctl start nfs-idmap.service
server1# mkdir /var/spool/torque/server_priv
Repeat this process for server2.
fileServer:/var/spool/torque/server_priv /var/spool/torque/server_priv nfs rsize=8192,wsize=8192,timeo=14,intr
Repeat this step for server2.
server1# wget http://github.com/adaptivecomputing/torque/branches/6.1.0/torque-6.1.0.tar.gz
server1# tar -xvzf torque-6.1.0.tar.gz
server1# ./configure
server1# make
server1# make install
server1# make packages
server1# make install
If the installation directory is not shared, repeat step 8a-b (downloading and installing Torque) on server2.
server1# service trqauthd start
server1# systemctl start trqauthd
List the host names of all nodes that run pbs_server in the torque/server_name file. You must also include the host names of all nodes running pbs_server in the torque/server_name file of each MOM node. The syntax of torque/server_name is a comma-delimited list of host names.
server1,server2
server1# pbs_server -t create
server1# qmgr -c "set server scheduling=true"
server1# qmgr -c "create queue batch queue_type=execution"
server1# qmgr -c "set queue batch started=true"
server1# qmgr -c "set queue batch enabled=true"
server1# qmgr -c "set queue batch resources_default.nodes=1"
server1# qmgr -c "set queue batch resources_default.walltime=3600"
server1# qmgr -c "set server default_queue=batch"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server managers += root@server1"
server1# qmgr -c "set server managers += root@server2"
server1# qmgr -c "set server operators += root@server1"
server1# qmgr -c "set server operators += root@server2"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server lock_file_check_time=5"
server1# qmgr -c "set server lock_file_update_time=3"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server acl_hosts += server1"
server1# qmgr -c "set server acl_hosts += server2"
Because server_priv/* is a shared drive, you do not need to repeat this step on server2.
service pbs_server stop
systemctl stop pbs_server
service pbs_server start
systemctl start pbs_server
server1# qmgr -c "p s"
server2# qmgr -c "p s"
The commands above return all settings of the active Torque server from either node.
service pbs_server stop
systemctl stop pbs_server
node1# torque-package-mom-linux-x86_64.sh --install
node2# torque-package-clients-linux-x86_64.sh --install
Repeat this for each compute node. Verify that the /var/spool/torque/server_name file shows all your compute nodes.
node1 np=2
node2 np=2
Change the np value to reflect the number of available processors on that node.
service pbs_server stop
systemctl stop pbs_server
service pbs_mom start
systemctl start pbs_mom
5.703.9 Installing Torque in High Availability Mode on Headless Nodes
The following procedure demonstrates a Torque installation in high availability (HA) mode on nodes with no local hard drive.
To install Torque in HA mode on a node with no local hard drive
> service iptables stop
> chkconfig iptables off
> systemctl stop firewalld
> systemctl disable firewalld
If you are unable to stop the firewall due to infrastructure restriction, open the following ports:
> vi /etc/sysconfig/selinux
SELINUX=disabled
# Torque
export TORQUE_HOME=/var/spool/torque
# Library Path
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${TORQUE_HOME}/lib
# Update system paths
export PATH=${TORQUE_HOME}/bin:${TORQUE_HOME}/sbin:${PATH}
fileServer# mkdir -m 0755 /var/spool/torque
/var/spool/torque/ 192.168.0.0/255.255.255.0(rw,sync,no_root_squash)
fileServer# exportfs -r
> service rpcbind restart
> service nfs-server start
> service nfs-lock start
> service nfs-idmap start
> systemctl restart rpcbind.service
> systemctl start nfs-server.service
> systemctl start nfs-lock.service
> systemctl start nfs-idmap.service
server1# mkdir /var/spool/torque
Repeat this process for server2.
fileServer:/var/spool/torque /var/spool/torque nfs rsize=8192,wsize=8192,timeo=14,intr
Repeat this step for server2.
server1# wget http://github.com/adaptivecomputing/torque/branches/6.1.0/torque-6.1.0.tar.gz
server1# tar -xvzf torque-6.1.0.tar.gz
server1# ./configure --prefix=/var/spool/torque
server1# make
server1# make install
server1# make packages
server1# make install
If the installation directory is not shared, repeat step 8a-b (downloading and installing Torque) on server2.
server1# service trqauthd start
server1# systemctl start trqauthd
List the host names of all nodes that run pbs_server in the torque/server_name file. You must also include the host names of all nodes running pbs_server in the torque/server_name file of each MOM node. The syntax of torque/server_name is a comma-delimited list of host names.
server1,server2
server1# pbs_server -t create
server1# qmgr -c "set server scheduling=true"
server1# qmgr -c "create queue batch queue_type=execution"
server1# qmgr -c "set queue batch started=true"
server1# qmgr -c "set queue batch enabled=true"
server1# qmgr -c "set queue batch resources_default.nodes=1"
server1# qmgr -c "set queue batch resources_default.walltime=3600"
server1# qmgr -c "set server default_queue=batch"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server managers += root@server1"
server1# qmgr -c "set server managers += root@server2"
server1# qmgr -c "set server operators += root@server1"
server1# qmgr -c "set server operators += root@server2"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server lock_file_check_time=5"
server1# qmgr -c "set server lock_file_update_time=3"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
server1# qmgr -c "set server acl_hosts += server1"
server1# qmgr -c "set server acl_hosts += server2"
Because TORQUE_HOME is a shared drive, you do not need to repeat this step on server2.
service pbs_server stop
systemctl stop pbs_server
service pbs_server start
systemctl start pbs_server
server1# qmgr -c "p s"
server2# qmgr -c "p s"
The commands above return all settings of the active Torque server from either node.
service pbs_server stop
systemctl stop pbs_server
node1 np=2
node2 np=2
Change the np value to reflect the number of available processors on that node.
service pbs_server stop
systemctl stop pbs_server
You can specify command line arguments for pbs_server using the PBS_ARGS environment variable in the /etc/sysconfig/pbs_server file. Set PBS_ARGS=--ha -l <host>:<port>, where <host> is the name of the alternate server node and <port> is the port on which pbs_server on the alternate server node is listening (default 15004).
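Putting the pieces together, the /etc/sysconfig/pbs_server file for this step might contain a single line such as the following (host name and port are placeholders):

```shell
# /etc/sysconfig/pbs_server (values are illustrative)
PBS_ARGS="--ha -l server2:15004"
```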
service pbs_mom start
systemctl start pbs_mom
5.703.10 Example Setup of High Availability
# List of all servers running pbs_server
server1,server2
> qmgr -c "set server acl_hosts += server1" > qmgr -c "set server acl_hosts += server2"
[root@server1]$ pbs_server --ha
[root@server2]$ pbs_server --ha
[root@server1]$ systemctl start pbs_server
[root@server2]$ systemctl start pbs_server