Moab supports a scheduling mode called MONITOR. In this mode, the scheduler initializes, contacts the resource manager and other peer services, and conducts scheduling cycles exactly as it would if running in NORMAL or production mode. Jobs are prioritized, reservations created, policies and limits enforced, and administrator and end-user commands enabled. The key difference is that although live resource management information is loaded, MONITOR mode disables Moab's ability to start, preempt, cancel, or otherwise modify jobs or resources. Moab continues to attempt to schedule exactly as it would in NORMAL mode, but its ability to actually impact the system is disabled. Using this mode, a site can quickly verify correct resource manager configuration and scheduler operation. This mode can also be used to validate new policies and constraints. In fact, Moab can be run in MONITOR mode on a production system while another scheduler or even another version of Moab is running on the same system. This unique ability can allow new versions and configurations to be fully tested without any exposure to potential failures and with no cluster downtime.
To run Moab in MONITOR mode, simply set the MODE attribute of the SCHEDCFG parameter to MONITOR and start Moab. Normal scheduler commands can be used to evaluate configuration and performance. Diagnostic commands can be used to look for any potential issues. Further, the Moab log file can be used to determine which jobs Moab attempted to start, and which resources Moab attempted to allocate.
If another instance of Moab is running in production and a site administrator wants to evaluate an alternate configuration or new version, this is easily done, but care should be taken to avoid conflicts with the primary scheduler. Potential conflicts include statistics files, logs, checkpoint files, and user interface ports. One of the easiest ways to avoid these conflicts is to create a new test directory with its own log and statistics subdirectories. The new moab.cfg file can be created from scratch or based on the existing moab.cfg file already in use. In either case, make certain that the PORT attribute of the SCHEDCFG parameter differs from that used by the production scheduler by at least two ports. If testing with the production binary executable, the MOABHOMEDIR environment variable should be set to point to the new test directory to prevent Moab from loading the production moab.cfg file.
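The setup described above can be sketched as a short shell script. This is only an illustration under stated assumptions: the directory path, the scheduler name "test", the host name, and both port numbers are hypothetical, and the SERVER=host:port form is borrowed from the side-by-side example later in this section. Resource manager settings would still need to be added or copied from the existing moab.cfg file.

```shell
# Sketch: prepare an isolated home directory for a MONITOR-mode test instance.
# TEST_HOME, the scheduler name "test", localhost, and port 42030 are
# hypothetical values chosen for illustration only.
TEST_HOME="${TEST_HOME:-/tmp/moab-test}"

# Separate log and statistics subdirectories avoid clashes with production.
mkdir -p "$TEST_HOME/log" "$TEST_HOME/stats"

# Starting-point configuration: MONITOR mode and a port that differs from the
# production scheduler's port (assumed here to be 42020) by more than two.
cat > "$TEST_HOME/moab.cfg" <<'EOF'
SCHEDCFG[test]  MODE=MONITOR SERVER=localhost:42030
EOF

# Point Moab at the test directory so the production moab.cfg is not loaded.
MOABHOMEDIR="$TEST_HOME"
export MOABHOMEDIR
```

With MOABHOMEDIR exported, the test instance would then be started from the same shell in the usual way.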
TEST mode behaves much like MONITOR mode with the exception that Moab will log the scheduling actions it would have taken to the stats/<DAY>.events file. Using this file, sites can determine the actions Moab would have taken if running in NORMAL mode and verify all actions are in agreement with expected behavior.
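As a rough aid for reviewing those logged actions, the helper below picks out the most recently modified events file in a statistics directory. The function name is hypothetical, and the `*.events` pattern simply follows the stats/<DAY>.events naming given above, so it may need adjusting to match the files a given installation actually writes.

```shell
# Sketch: print the path of the newest events file in a stats directory.
# latest_events_file is a hypothetical helper, not a Moab command.
latest_events_file() {
    statsdir="$1"
    # ls -t sorts by modification time, newest first; head keeps the first.
    # Assumes file names without embedded newlines.
    ls -t "$statsdir"/*.events 2>/dev/null | head -n 1
}
```

For example, `tail "$(latest_events_file "$MOABHOMEDIR/stats")"` would show the latest entries, assuming MOABHOMEDIR points at the test instance's home directory.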
INTERACTIVE mode allows for evaluation of new versions and configurations in a manner different from MONITOR mode. Instead of disabling all resource and job control functions, Moab sends the desired change request to the screen and requests permission to complete it. For example, before starting a job, Moab may print something like the following to the screen:
Command: start job 1139.ncsa.edu on node list test013,test017,test018,test021
Accept: (y/n) [default: n]?
The administrator must specifically accept each command request after verifying it correctly meets desired site policies. Moab will then execute the specified command. This mode is highly useful in validating scheduler behavior and can be used until configuration is appropriately tuned and all parties are comfortable with the scheduler's performance. In most cases, sites will want to set the scheduling mode to NORMAL after verifying correct behavior.
By default, Moab runs in a mode called NORMAL, which indicates that it is responsible for the cluster. It loads workload and resource information, and is responsible for managing that workload according to mission objectives and policies. It starts, cancels, preempts, and modifies jobs according to these policies.
If Moab is configured to use a mode called TEST, it loads all information, performs all analysis, but, instead of actually starting or modifying a job, it merely logs the fact that it would have done so. A test instance of Moab can run at the same time as a production instance of Moab. A test instance of Moab can also run while a production scheduler of another type (such as PBS, LSF, or SLURM) is simultaneously running. This multi-scheduler ability allows stability and performance tests to be conducted that can help answer the following questions:
In TEST mode, all of Moab's commands and services operate normally, allowing the use of client commands to perform analysis. In most cases, the mdiag command is of greatest value, displaying loaded values as well as reporting detected failures, inconsistencies, and object corruption. The following table highlights the most common diagnostics performed.
Command | Object |
---|---|
mdiag -n | Compute nodes, storage systems, network systems, and generic resources |
mdiag -j | Applications, dynamic and static jobs |
mdiag -u, mdiag -g, mdiag -a | User, group, and account credentials |
mdiag -c | Queues and policies |
mdiag -R | Resource manager interface and performance |
mdiag -S | Scheduler/system level failures introduced by corrupt information |
These commands not only verify that scheduling objects are loaded properly but also analyze the behavior of each resource manager, recording failures and delivered performance. If any misconfiguration, corruption, interface failure, or internal failure is detected, it can be addressed in the TEST mode instance of Moab with no urgency or risk to production cluster activities.
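One way to run the diagnostics from the table in a single pass is to loop over the flags and save each report to its own file. The following is a minimal sketch: the function name, the MDIAG override variable, and the output file naming are assumptions added for this example, while the flags themselves are the ones listed in the table.

```shell
# Sketch: capture each mdiag diagnostic from the table into its own file.
# MDIAG lets the command be overridden (for example, with a full path);
# it and run_diagnostics are hypothetical names used only in this example.
MDIAG="${MDIAG:-mdiag}"

run_diagnostics() {
    outdir="$1"
    mkdir -p "$outdir" || return 1
    for flag in -n -j -u -g -a -c -R -S; do
        # Output for "mdiag -n" is written to mdiag-n.txt, and so on.
        "$MDIAG" "$flag" > "$outdir/mdiag${flag}.txt" 2>&1 ||
            echo "mdiag $flag exited with a non-zero status" >&2
    done
}
```

Calling `run_diagnostics /tmp/moab-diag` from an environment whose client commands talk to the test instance would leave eight report files that can be reviewed or compared between runs.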
The first aspect of verifying a new policy is verifying correct syntax and semantics. If using Moab Cluster Manager, this step is not necessary as this tool automatically verifies proper policy specification. If manually editing the moab.cfg file, the following command can be used for validation:
> mdiag -C
This command will validate the configuration file and report any misconfiguration.
If concern exists over the impact of a new policy, an administrator can babysit Moab by putting it into INTERACTIVE mode. In this mode, Moab will schedule according to all mission objectives and policies, but before taking any action, it will request that the administrator confirm the action. See the interactive mode overview for more information.
In this mode, only actions approved by the administrator will be carried out. Once proper behavior is verified, the Moab mode can be set to NORMAL.
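Switching between modes amounts to changing the MODE attribute on the SCHEDCFG line of moab.cfg. The helper below is a minimal sketch of that edit, assuming MODE already appears on a line that begins with SCHEDCFG; the function name is hypothetical, and it accepts only the four modes discussed in this section. The scheduler still has to reload its configuration afterwards, which is outside the scope of this sketch.

```shell
# Sketch: rewrite the MODE attribute on the SCHEDCFG line of a moab.cfg file.
# set_sched_mode is a hypothetical helper, not a Moab command.
set_sched_mode() {
    cfg="$1"
    mode="$2"
    case "$mode" in
        NORMAL|MONITOR|TEST|INTERACTIVE) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    # Only lines beginning with SCHEDCFG are changed; all others pass through.
    sed "/^SCHEDCFG/ s/MODE=[A-Za-z]*/MODE=$mode/" "$cfg" > "$cfg.new" &&
        mv "$cfg.new" "$cfg"
}
```

For example, `set_sched_mode "$MOABHOMEDIR/moab.cfg" NORMAL` would switch a verified INTERACTIVE configuration over to NORMAL.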
If a new policy has the potential to impact long-term performance or resource distribution, it may be desirable to run a Moab simulation to evaluate this change. Simulations allow locally recorded workload to be translated into simulation jobs and executed on a virtual cluster that emulates local resources. Simulations import all job and resource attributes that are loaded in a production environment as well as all policies specified in any configuration file. While running, all Moab commands and statistics are fully supported.
Using simulation, a control run can be made using the original policies and the behavior of this run compared to a second run that contains the specified change. Moab Cluster Manager's charting, graphing, and reporting features can be used to report on and visualize the differences in these two runs. Typically, a two-month real-time simulation can be completed in under an hour. For more information on simulations, see the Simulation Overview.
Moab provides an additional evaluation method that allows a production cluster or other resource to be logically partitioned along resource and workload boundaries and allows different instances of Moab to schedule different partitions. The parameters IGNORENODES, IGNORECLASSES, IGNOREJOBS, and IGNOREUSERS are used to specify how the system is to be partitioned. In the following example, a small portion of an existing cluster is partitioned for temporary grid testing so that there is no impact on the production workload.
SCHEDCFG[prod]  MODE=NORMAL SERVER=orion.cxz.com:42020
RMCFG[TORQUE]   TYPE=PBS
IGNORENODES     node61,node62,node63,node64
IGNOREUSERS     gridtest1,gridtest2
...

SCHEDCFG[prod]  MODE=NORMAL SERVER=orion.cxz.com:42030
RMCFG[TORQUE]   TYPE=PBS
IGNORENODES     !node61,node62,node63,node64
IGNOREUSERS     !gridtest1,gridtest2
...
In the previous example, two completely independent Moab servers schedule the cluster. The first server handles all jobs and nodes except for the ones involved in the test. The second server handles only test nodes and test jobs. While both servers actively communicate with a single TORQUE resource manager, the IGNORE* parameters ensure that neither server schedules, or even sees, the other partition and its associated workload.
When enabling Moab side-by-side, each Moab server should have an independent home directory to prevent logging and statistics conflicts. Also, in this environment, each Moab server should communicate with its client commands using a different port as shown in the previous example.
When specifying the IGNORENODES parameter, the exact node names, as returned by the resource manager, should be specified.
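Because the two configurations must list exactly the same nodes and users, with only the leading "!" differing, it can help to generate both from a single list. The sketch below does this and writes each file into its own home directory. The function name and directory layout are hypothetical, while the configuration lines mirror the side-by-side example above.

```shell
# Sketch: derive both side-by-side configurations from one node list and one
# user list so the two IGNORE* settings cannot drift apart.
# write_partition_cfgs and the prod/ and test/ directories are hypothetical.
write_partition_cfgs() {
    base="$1"
    nodes="$2"
    users="$3"
    mkdir -p "$base/prod" "$base/test" || return 1

    # First server: ignores the test nodes and test users.
    cat > "$base/prod/moab.cfg" <<EOF
SCHEDCFG[prod]  MODE=NORMAL SERVER=orion.cxz.com:42020
RMCFG[TORQUE]   TYPE=PBS
IGNORENODES     $nodes
IGNOREUSERS     $users
EOF

    # Second server: the leading "!" restricts it to the test nodes and
    # test users, and it listens on a different port.
    cat > "$base/test/moab.cfg" <<EOF
SCHEDCFG[prod]  MODE=NORMAL SERVER=orion.cxz.com:42030
RMCFG[TORQUE]   TYPE=PBS
IGNORENODES     !$nodes
IGNOREUSERS     !$users
EOF
}
```

For instance, `write_partition_cfgs /tmp/moab-split node61,node62,node63,node64 gridtest1,gridtest2` reproduces the two configurations from the example, with the node names given exactly as the resource manager reports them.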