Moab Workload Manager

2.5 Initial Moab Testing

Moab has been designed with a number of key features that allow testing to occur in a no-risk environment. These features allow you to safely run Moab in test mode even while another scheduler is running, whether it is an earlier version of Moab or another scheduler altogether. In test mode, Moab collects real-time job and node information from your resource managers and acts as if it were scheduling live. However, its ability to actually affect jobs (that is, to start, modify, cancel, charge, and so forth) is disabled.

Moab offers several scheduler modes, described in the sections that follow, for verifying proper configuration and operation.

2.5.1 Scheduler Modes

Central to Moab testing is the MODE attribute of the SCHEDCFG parameter. This parameter attribute allows administrators to determine how Moab will run. The possible values for MODE are NORMAL, MONITOR, INTERACTIVE, and SIMULATION. For example, to request monitor mode operation, include the line SCHEDCFG MODE=MONITOR in the moab.cfg file.

2.5.1.1 Normal Mode

If initial evaluation is complete or not required, you can place the scheduler directly into production by setting the MODE attribute of the SCHEDCFG parameter to NORMAL and (re)starting the scheduler.

2.5.1.2 Monitor Mode (or Test Mode)

Monitor mode allows evaluation of new Moab releases, configurations, and policies in a risk-free manner. In monitor mode, the scheduler connects to the resource manager(s) and obtains live resource and workload information. Using the policies specified in the moab.cfg file, the monitor-mode Moab behaves identically to a live or normal-mode Moab, except that the ability to start, cancel, or modify jobs is disabled. In addition, allocation management does not occur in monitor mode. This allows safe diagnosis of the scheduling state and behavior using the various diagnostic client commands. The log output can also be evaluated to see if any unexpected situations have arisen. At any point, the scheduler can be dynamically changed from monitor to normal mode to begin live scheduling.

To set up Moab in monitor mode, do the following:

> vi moab.cfg
  (change the MODE attribute of the SCHEDCFG parameter from NORMAL to MONITOR)
> moab

Remember that Moab running in monitor mode will not interfere with your production scheduler.
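The configuration edit in the steps above can also be scripted. The following is a minimal sketch, not a Moab utility: set_moab_mode is a hypothetical helper name, and it assumes that the SCHEDCFG parameter starts at the beginning of a line and already carries a MODE=<value> attribute. A backup copy of the file is kept before it is rewritten.

```shell
# Hypothetical helper (not shipped with Moab): rewrite the MODE attribute
# on the SCHEDCFG line of a moab.cfg file, keeping a .bak copy.
set_moab_mode() {
  cfg="$1"    # path to moab.cfg
  mode="$2"   # one of the four documented modes
  case "$mode" in
    NORMAL|MONITOR|INTERACTIVE|SIMULATION) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  cp "$cfg" "$cfg.bak" || return 1
  # Only lines beginning with SCHEDCFG are touched; other parameters stay as-is.
  sed "/^SCHEDCFG/s/MODE=[A-Za-z]*/MODE=$mode/" "$cfg.bak" > "$cfg"
}
```

For example, running set_moab_mode "$MOABHOMEDIR/moab.cfg" MONITOR and then starting moab mirrors the manual steps. If the SCHEDCFG line has no MODE attribute, the sketch leaves the file unchanged and the attribute must be added by hand.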

2.5.1.2.1 Running Multiple Moab Instances Simultaneously

If running multiple instances of Moab, whether in simulation, normal, or monitor mode, make certain that each instance resides in a different home directory to prevent conflicts with configuration, log, and statistics files. Before starting each additional Moab, set the MOABHOMEDIR environment variable in the execution environment to point to the local home directory. Also, each instance of Moab should run using a different port to avoid conflicts.

If running multiple versions of Moab, not just different Moab modes or configurations, set the $PATH variable to point to the appropriate Moab binaries.

To point Moab client commands (such as showq) to the proper Moab server, use the appropriate command line arguments or set the environment variable MOABHOMEDIR in the client execution environment as in the following example:

# point moab clients/server to new configuration
> export MOABHOMEDIR=/opt/moab-monitor

# set path to new binaries (optional)
> export PATH=/opt/moab-monitor/bin:/opt/moab-monitor/sbin:$PATH

# start Moab server
> moab

# query Moab server
> showq

moabd is a safe and recommended method of starting Moab if things are not installed in their default locations.
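The separation rules for multiple instances can be checked before a second server is started. The sketch below is illustrative only: moab_port and check_instances are hypothetical helpers, not Moab commands, and the sketch assumes that an explicitly configured port appears as SERVER=<host>:<port> on the SCHEDCFG line of each instance's moab.cfg.

```shell
# Hypothetical pre-flight check (not a Moab command): make sure two Moab
# instances use different home directories and different server ports.
# Assumption: the port, if set, appears as SERVER=<host>:<port> on the
# SCHEDCFG line of <home>/moab.cfg.
moab_port() {
  sed -n 's/^SCHEDCFG.*SERVER=[^ :]*:\([0-9][0-9]*\).*/\1/p' "$1/moab.cfg" | head -n 1
}

check_instances() {
  # Resolve both paths so that symlinked or relative names compare correctly.
  home_a=$(cd "$1" && pwd -P) || return 2
  home_b=$(cd "$2" && pwd -P) || return 2
  if [ "$home_a" = "$home_b" ]; then
    echo "conflict: both instances use $home_a" >&2
    return 1
  fi
  port_a=$(moab_port "$home_a")
  port_b=$(moab_port "$home_b")
  # Two empty values mean neither config sets a port, so both would
  # fall back to the same default.
  if [ "$port_a" = "$port_b" ]; then
    echo "conflict: both instances use port ${port_a:-<default>}" >&2
    return 1
  fi
  return 0
}
```

As a usage example, check_instances /opt/moab /opt/moab-monitor returns 0 only when the two home directories and ports differ. Here /opt/moab is an assumed location for the production instance.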

2.5.1.3 Interactive Mode

Interactive mode allows for evaluation of new versions and configurations in a manner different from monitor mode. Instead of disabling all resource and job control functions, Moab sends the desired change request to the screen and asks for permission to complete it. For example, before starting a job, Moab may post something like the following to the screen:

Command:  start job 1139.ncsa.edu on node list test013,test017,test018,test021
Accept:  (y/n) [default: n]?

The administrator must specifically accept each command request after verifying that it correctly meets desired site policies. Moab will then execute the specified command. This mode is useful in validating scheduler behavior and can be used until configuration is appropriately tuned and all parties are comfortable with the scheduler's performance. In most cases, sites will want to set the scheduling mode to normal after verifying correct behavior.

2.5.1.4 Simulation Mode

Simulation mode is valuable for test driving the scheduler or, when a stable production system exists, for evaluating how various policies could improve current performance.

The initial test drive simulation can be configured using the following steps:

(Consider viewing the simulation configuration demo.)

> vi moab.cfg

(change the MODE attribute of the SCHEDCFG parameter from NORMAL to SIMULATION)
(add 'SIMRESOURCETRACEFILE  traces/Resource.Trace1')
(add 'SIMWORKLOADTRACEFILE  traces/Workload.Trace1')

> moab &

In simulation mode, the scheduler does not background itself as it does in monitor and normal modes.
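The configuration steps above can be scripted when a simulation setup needs to be recreated repeatedly. This is a sketch under stated assumptions: make_sim_cfg is a hypothetical helper, the source configuration has a SCHEDCFG line with a MODE attribute, and the trace paths are the sample ones named in the steps.

```shell
# Hypothetical helper (not part of Moab): derive a simulation moab.cfg
# from an existing configuration file.
make_sim_cfg() {
  src="$1"   # existing moab.cfg
  dst="$2"   # new simulation moab.cfg, ideally in its own home directory
  # Switch the scheduler mode and drop any trace settings already present.
  sed -e '/^SCHEDCFG/s/MODE=[A-Za-z]*/MODE=SIMULATION/' \
      -e '/^SIMRESOURCETRACEFILE/d' \
      -e '/^SIMWORKLOADTRACEFILE/d' "$src" > "$dst" || return 1
  # Append the sample traces named in the steps above.
  {
    echo 'SIMRESOURCETRACEFILE  traces/Resource.Trace1'
    echo 'SIMWORKLOADTRACEFILE  traces/Workload.Trace1'
  } >> "$dst"
}
```

Writing the result to a separate file leaves the production moab.cfg untouched, which fits the advice to keep each Moab instance in its own home directory.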

The sample workload and resource trace files allow the simulation to emulate a 192-node IBM SP. In this mode, all Moab commands can be run as if on a normal system. The mschedctl command can be used to advance the simulation through time. The Simulation section describes the use of the simulator in detail.

If you are familiar with Moab, you may want to use the simulator to tune scheduling policies for your own workload and system. The resource and workload traces are described further in the Collecting Traces section. Generally, at least a week's worth of workload should be collected to make the results of the simulation statistically meaningful. Once the traces are collected, the simulation can be started with some initial policy settings. Typically, the scheduler is able to simulate between 10 and 100 minutes of wallclock time per second for medium to large systems. As the simulation proceeds, various statistics can be monitored if desired. At any point, the simulation can be ended and the statistics of interest recorded. One or more policies can be modified, the simulation re-run, and the results compared. Once you are satisfied with the scheduling results, you can run the scheduler live with the tuned policies.
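To get a feel for how long a simulation run might take, the quoted throughput can be turned into a rough estimate. The sketch below reads the rate as simulated minutes processed per real second, which is an interpretation of the figures above rather than a documented formula. sim_runtime_seconds is a hypothetical helper.

```shell
# Estimate the real seconds needed to simulate a trace, rounded up.
#   $1 = days of workload in the trace
#   $2 = simulated minutes processed per real second (the text cites 10 to 100)
sim_runtime_seconds() {
  minutes=$(( $1 * 24 * 60 ))
  echo $(( (minutes + $2 - 1) / $2 ))
}

sim_runtime_seconds 7 10    # one week at the slow end of the range
sim_runtime_seconds 7 100   # one week at the fast end of the range
```

Under that reading, the two calls print 1008 and 101, so a one-week trace would take roughly 2 to 17 minutes of real time per simulation run.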