Moab Workload Manager

18.1 Establishing Script Interaction between Moab and a Power Management Tool

The node on which Moab runs must provide a command that Moab can call to switch node power on or off. The command is usually a script that interacts with your preferred power management tool. Moab calls the script as follows:

<Script Name> <Node Name/List> <ON|OFF>

> /opt/moab/tools/node.power.pl node002 ON

<Node List> is a comma-delimited list of nodes. For example, to power off multiple nodes:

> /opt/moab/tools/node.power.pl node002,node003,node004 OFF
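
The calling convention above can be sketched as follows. This is a minimal illustration in Python rather than the Perl script shipped with Moab, and it makes several assumptions: POWER_TOOL is a hypothetical placeholder for your site's power management command, its "<node> <on|off>" argument order is invented for the example, and the exit codes are a choice of this sketch, not documented Moab behavior.

```python
import subprocess
import sys

# Hypothetical power management command; replace with your site's tool.
POWER_TOOL = "/usr/local/sbin/site-power-tool"


def parse_args(argv):
    """Return (nodes, action) from ['<node name/list>', '<ON|OFF>']."""
    if len(argv) != 2:
        raise ValueError("usage: <script> <node name/list> <ON|OFF>")
    node_list, action = argv
    action = action.upper()
    if action not in ("ON", "OFF"):
        raise ValueError("action must be ON or OFF, got %r" % action)
    # Moab passes one node name or a comma-delimited list of names.
    nodes = [name.strip() for name in node_list.split(",") if name.strip()]
    if not nodes:
        raise ValueError("no node names given")
    return nodes, action


def build_commands(nodes, action):
    """Build one (hypothetical) power-tool command line per node."""
    return [[POWER_TOOL, node, action.lower()] for node in nodes]


def main(argv):
    """Parse the arguments and run the power tool once per node."""
    try:
        nodes, action = parse_args(argv)
    except ValueError as err:
        print(err, file=sys.stderr)
        return 2
    status = 0
    for cmd in build_commands(nodes, action):
        # If the tool fails for any node, this function returns 1.
        if subprocess.run(cmd).returncode != 0:
            status = 1
    return status

# When saved as a script, finish with: sys.exit(main(sys.argv[1:]))
```

Splitting the node list inside the script keeps the interface identical for one node and for many, so the same script serves both invocations shown above.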

To enable script interaction between Moab and your preferred power management tool, configure the NODEPOWERURL parameter on a per-resource manager basis.

RMCFG[prov]     TYPE=NATIVE RESOURCETYPE=PROV
RMCFG[prov]     NODEPOWERURL=exec://$HOME/tools/prov/node.power.pl
RMCFG[prov]     CLUSTERQUERYURL=exec://$HOME/tools/prov/cluster.query.pl

Run mdiag -R -v to see whether the provision resource manager was configured successfully. The output displays a number of nodes if Moab accessed the node power script and the Cluster Query URL script properly. If a number of nodes is not reported, the script is either not running properly or not reporting to Moab correctly. In that case, verify that the script can be run manually and that it does not require any environment variables, such as $PATH, to be set.
Note: The Cluster Query script can call commands that are normally found through $PATH. When $PATH is not set, the script fails, and the commands responsible may be difficult to find.
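
One way to look for hidden environment dependencies is to run the script by hand with every environment variable removed. The sketch below does this in Python; it is a diagnostic aid under stated assumptions, not part of Moab. The function names are invented for this example, and an empty environment is probably stricter than the environment Moab actually provides, so a failure here points to a dependency worth checking rather than proving a fault.

```python
import os
import subprocess


def dropped_variables():
    """Names of the variables the script will NOT see in the test run."""
    return sorted(os.environ)


def run_with_empty_env(command, timeout=30):
    """Run command (a list of strings) with no environment variables."""
    # env={} removes PATH, HOME, and every other inherited variable.
    return subprocess.run(
        command,
        env={},
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )


def report(result):
    """Summarize exit status, error output, and number of output lines."""
    lines = ["exit status: %d" % result.returncode]
    if result.stderr.strip():
        lines.append("stderr: " + result.stderr.strip())
    lines.append("stdout lines: %d" % len(result.stdout.splitlines()))
    return "\n".join(lines)
```

For example, print(report(run_with_empty_env(["/path/to/cluster.query.pl"]))) runs the script at that placeholder path. If it succeeds in your shell but fails here with a "command not found" style error, the script most likely relies on $PATH and should call its helper commands by absolute path.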

The Cluster Query script's purpose is to tell Moab whether power to each node is on or off; it relays this for each node as a POWER value of either "on" or "off." Moab tracks the actual state of nodes internally. When Moab powers off a node, the node is still eligible to run jobs, so even though it is powered off, its state is idle. The script must therefore report a STATE of "Unknown" so that Moab does not consider the node unavailable. When a node is powered off outside of a Moab action (such as a node failure), Moab recognizes the state reported by its resource manager as down, which renders the node unavailable. The following is sample output for cluster.query.pl:

x01 POWER=on STATE=Unknown
x02 POWER=off STATE=Unknown
x03 POWER=off STATE=Unknown
x04 POWER=off STATE=Unknown
x05 POWER=off STATE=Unknown
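
Output in this format could be produced by a few lines of code. The following Python sketch only shows the formatting step; the function names are invented for this example, and how the power status of each node is obtained from your power management tool is left out, since that depends on the tool.

```python
def format_node_line(name, powered_on):
    """Return one report line: '<node> POWER=<on|off> STATE=Unknown'."""
    power = "on" if powered_on else "off"
    # STATE is always Unknown so that Moab keeps tracking node state itself.
    return "%s POWER=%s STATE=Unknown" % (name, power)


def render_cluster_query(power_by_node):
    """Render (node name, powered_on) pairs as one line per node."""
    return "\n".join(format_node_line(name, on) for name, on in power_by_node)


# Example input: x01 is powered on, x02 and x03 are powered off.
print(render_cluster_query([("x01", True), ("x02", False), ("x03", False)]))
```

The example call prints lines that match the first three lines of the sample output above.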