Moab Workload Manager

21.2 Enabling High Availability Features

21.2.1 High Availability Overview

High availability allows Moab to run on two different machines: a primary and a secondary server. The configuration relies on a networked file system shared by the two Moab servers, only one of which operates at a time.

When configured to run on a networked file system (any networked file system that supports file locking is supported), the first Moab server that starts locks a particular file. The second Moab server waits on that lock and begins scheduling only when it gains control of the lock on the file. This method achieves near-instantaneous failover and eliminates the need for the two Moab servers to synchronize information periodically, because both servers access the same database/checkpoint file.
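The lock handoff described above can be illustrated with a minimal Python sketch. It uses POSIX advisory locks through `fcntl.flock` as a stand-in; that choice is an assumption made for illustration and is not a description of how Moab itself takes the lock. The function names and the temporary lock path are hypothetical.

```python
import fcntl
import os
import tempfile


def try_become_active(lock_path):
    """Try to take an exclusive, non-blocking lock on lock_path.

    Returns the open file object if the lock was acquired (the caller is
    now the active server), or None if another holder already has it.
    """
    handle = open(lock_path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle):
    """Drop the lock, as happens when the active server stops or fails."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()


def demo():
    """Walk through primary start, secondary wait, and takeover."""
    lock_path = os.path.join(tempfile.mkdtemp(), "demo_lock")
    primary = try_become_active(lock_path)      # first server gets the lock
    secondary = try_become_active(lock_path)    # second server is refused
    results = [primary is not None, secondary is not None]
    release(primary)                            # primary goes away
    secondary = try_become_active(lock_path)    # secondary now takes over
    results.append(secondary is not None)
    release(secondary)
    return results
```

Calling `demo()` on a POSIX system shows the three stages in order: the first caller acquires the lock, the second is refused while the lock is held, and the second succeeds once the first releases it.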

Note: Because Moab uses timestamps in the lock file to implement high availability, the clocks on both servers must be synchronized; all machines in a cluster must be synchronized to the same time server.

Moab high availability and TORQUE high availability operate independently of each other. If a job is submitted with msub and the primary Moab server is down, msub tries to connect to the fallback Moab server. Once the job is given to TORQUE, if TORQUE can't connect to the primary pbs_server, it tries to connect to the fallback pbs_server. For example:

A job is submitted with msub, but Moab is down on server01, so msub contacts Moab running on server02.

A job is submitted with msub and Moab hands it off to TORQUE, but pbs_server is down on server01, so qsub contacts pbs_server running on server02.
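The try-primary-then-fallback behavior in these two examples can be sketched in Python as follows. This is a generic illustration of the pattern, not the msub or qsub implementation; `submit_with_fallback`, `fake_send`, and the reply strings are hypothetical names introduced for the sketch.

```python
def submit_with_fallback(job, servers, send):
    """Try each server in order; return (server, reply) from the first that answers.

    `send` is a caller-supplied function that raises ConnectionError when
    the server it is given cannot be reached.
    """
    last_error = None
    for server in servers:
        try:
            return server, send(server, job)
        except ConnectionError as err:
            last_error = err  # remember the failure and try the next server
    raise ConnectionError(f"no server reachable: {last_error}")


def fake_send(server, job):
    """Stand-in transport: server01 is down, server02 accepts the job."""
    if server == "server01":
        raise ConnectionError("server01 is down")
    return f"{job} accepted"
```

With `servers=["server01", "server02"]` and `fake_send` as the transport, the job ends up on server02, mirroring the first example above. Because the Moab and TORQUE layers fail over independently, each layer would run its own copy of this loop against its own pair of servers.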

21.2.2.1 Configuring High Availability on a Networked File System

Because the two Moab servers access the same files, configuration is only required in the moab.cfg file. The two hosts that run Moab must be configured with the SERVER and FBSERVER parameters. Enable file locking with the FLAGS=filelockha parameter, and specify the lock file with the HALOCKFILE parameter. The following example illustrates a possible configuration:

SCHEDCFG[Moab]	SERVER=host1:42559
SCHEDCFG[Moab]	FBSERVER=host2
SCHEDCFG[Moab]	FLAGS=filelockha
SCHEDCFG[Moab]	HALOCKFILE=/opt/moab/.moab_lock
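As a quick sanity check before starting either server, the four settings named above can be verified with a small script. The following Python sketch is a hypothetical helper, not a Moab tool: it only understands `SCHEDCFG[...] KEY=VALUE` lines of the shape shown in the example, and the function names are assumptions introduced here.

```python
import re

# The HA-related attributes this section says must be present.
REQUIRED_HA_KEYS = ("SERVER", "FBSERVER", "FLAGS", "HALOCKFILE")


def parse_schedcfg(text):
    """Collect KEY=VALUE attributes from every SCHEDCFG[...] line."""
    attrs = {}
    for line in text.splitlines():
        match = re.match(r"\s*SCHEDCFG\[[^\]]*\]\s+(.*)", line)
        if not match:
            continue  # not a SCHEDCFG line
        for token in match.group(1).split():
            key, sep, value = token.partition("=")
            if sep:
                attrs[key.upper()] = value
    return attrs


def missing_ha_settings(attrs):
    """Return the HA settings that are absent or not set as this section describes."""
    missing = [key for key in REQUIRED_HA_KEYS if key not in attrs]
    if "FLAGS" in attrs and "filelockha" not in attrs["FLAGS"].lower():
        missing.append("FLAGS=filelockha")
    return missing


SAMPLE = """\
SCHEDCFG[Moab]\tSERVER=host1:42559
SCHEDCFG[Moab]\tFBSERVER=host2
SCHEDCFG[Moab]\tFLAGS=filelockha
SCHEDCFG[Moab]\tHALOCKFILE=/opt/moab/.moab_lock
"""
```

Running `missing_ha_settings(parse_schedcfg(SAMPLE))` on the example configuration returns an empty list; removing the HALOCKFILE line makes it report that key.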

Use the HALOCKUPDATETIME parameter to specify how frequently the primary server updates the timestamp on the lock file. Use the HALOCKCHECKTIME parameter to specify how frequently the secondary server checks the timestamp on the lock file.

HALOCKCHECKTIME 9
HALOCKUPDATETIME 3

In the preceding example, the secondary server checks the lock file for updates every 9 seconds. The HALOCKUPDATETIME parameter is set to 3 seconds, giving the primary server three opportunities to update the timestamp between consecutive checks by the secondary server.
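The relationship between the two intervals can be worked through in a few lines of Python. The ratio calculation follows directly from the example values; the staleness rule in `lock_is_stale` is an assumption made for illustration, since this section does not state exactly when the secondary server decides to take over.

```python
def updates_per_check(check_time, update_time):
    """Number of timestamp updates the primary can make between two checks."""
    return check_time // update_time


def lock_is_stale(last_update, now, check_time):
    """Assumed rule: the lock looks abandoned if no update arrived within
    one full check interval. Both times are seconds on a shared clock."""
    return (now - last_update) > check_time


# Values from the example: HALOCKCHECKTIME 9, HALOCKUPDATETIME 3.
CHECK_TIME = 9
UPDATE_TIME = 3
```

Here `updates_per_check(CHECK_TIME, UPDATE_TIME)` evaluates to 3. Because `lock_is_stale` compares a timestamp written by one host against the clock of another, the sketch also shows why the two servers need synchronized clocks: a skew larger than the check interval could make a healthy primary look dead.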

Note: FBSERVER does not take a port number. The primary server's port is used for both the primary server and the fallback server.

21.2.2.2 Confirming High Availability on a Networked File System

Administrators can run the mdiag -S -v command to view which Moab server is currently scheduling and responding to client requests.

21.2.3 Other High Availability Configuration

Moab has many features to improve the availability of a cluster beyond the ability to automatically relocate to another execution server. The following table describes some of these features.

Feature	Description
JOBACTIONONNODEFAILURE	If a node allocated to an active job fails, it is possible for the job to continue running indefinitely even though the output it produces is of no value. Setting this parameter allows the scheduler to automatically preempt these jobs when a node failure is detected, possibly allowing the job to run elsewhere and also allowing other allocated nodes to be used by other jobs.
RECOVERYACTION (SCHEDCFG attribute)	If a catastrophic failure event occurs (a SIGSEGV or SIGILL signal is triggered), Moab can be configured to automatically restart, trap the failure, ignore the failure, or behave in the default manner for the specified signal. These actions are specified using the values RESTART, TRAP, IGNORE, or DIE, as in the following example:

SCHEDCFG[bas] MODE=NORMAL RECOVERYACTION=RESTART