Moab allows organizations to enable generic performance metrics. These metrics allow decisions to be made and reports to be generated based on site-specific environmental factors. This increases Moab's awareness of what is occurring within a given cluster environment and allows arbitrary information to be associated with resources and with the workload within the cluster. Uses of these metrics are widespread and can cover anything from tracking node temperature to memory faults to application effectiveness.
A new generic metric is automatically created and tracked at the server level if it is reported by either a node or a job.
To associate a generic metric with a job or node, a native resource manager must be set up and the GMETRIC attribute must be specified. For example, to associate a generic metric of temp with each node in a TORQUE cluster, the following could be reported by a native resource manager:
# temperature output
node001 GMETRIC[temp]=113
node002 GMETRIC[temp]=107
node003 GMETRIC[temp]=83
node004 GMETRIC[temp]=85
...
Generic metrics are tracked as floating-point values, allowing virtually any number to be reported.
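For instance, fractional readings can be reported directly; the values below are illustrative:

node001 GMETRIC[temp]=98.6
node002 GMETRIC[temp]=101.3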
In the preceding example, the new metric, temp, can now be used to monitor system usage and performance or to allow the scheduler to take action should certain thresholds be reached; both uses are demonstrated in the extended example later in this section.
Generic metric values can be viewed using checkjob, checknode, mdiag -n, mdiag -j, or the Moab Cluster Manager Charting and Reporting Features.
Historical job and node generic metric statistics can be cleared using the mjobctl and mnodectl commands.
As an example, consider a cluster that uses generic metrics for two primary purposes. The first is to track node temperature and adjust scheduling behavior based on it, mitigating the risk of overheated nodes. The second is to track, and charge for, utilization of a locally developed data staging service.
The first step in enabling a generic metric is to create probes to monitor and report this information. Depending on the environment, this information may be distributed or centralized. In the case of temperature monitoring, it is often centralized by a hardware monitoring service and available via command line or an API. If monitoring a locally developed data staging service, the information may need to be collected from multiple remote nodes and aggregated to a central location; a minimal probe sketch for this case follows the table below. The following are popular, freely available monitoring tools:
Tool | Link
---|---
BigBrother | http://www.bb4.org |
Ganglia | http://ganglia.sourceforge.net |
Monit | http://www.tildeslash.com/monit |
Nagios | http://www.nagios.org |
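If no centralized source exists, a small aggregation probe can be written in-house. The following is a minimal sketch, not a supported tool: it assumes each compute node exposes its data staging usage via a hypothetical site-local dsusage command reachable over ssh, and it prints one GMETRIC line per node.

#!/usr/bin/perl

# Minimal aggregation probe sketch (illustrative only). Assumes each
# compute node reports its data staging usage via a hypothetical
# site-local 'dsusage' command reachable over ssh.

use strict;
use warnings;

my @nodes = ("node001", "node002", "node003", "node004");

foreach my $nodeid (@nodes)
{
  # '/usr/local/bin/dsusage' is a placeholder for whatever interface
  # the local data staging service actually provides
  my $usage = `ssh $nodeid /usr/local/bin/dsusage 2>/dev/null`;
  chomp($usage);

  print "$nodeid GMETRIC[dstage]=$usage\n" if length($usage);
}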
Once the needed probes are in place, a native resource manager interface must be created to report this information to Moab. Creating a native resource manager interface should be very simple, and in most cases a script similar to those found in the $TOOLSDIR ($PREFIX/tools) directory can be used as a template. For this example, we will assume centralized information and will use the RM script that follows:
#!/usr/bin/perl

# 'hwctl' outputs information in the format '<NODEID> <TEMP>'
open(TQUERY, "/usr/sbin/hwctl -q temp |");

while (<TQUERY>)
{
  my ($nodeid, $temp) = split /\s+/;

  # GetDSUsage() is a site-local routine that returns the data staging
  # usage for the given node
  my $dstage = GetDSUsage($nodeid);

  print "$nodeid GMETRIC[temp]=$temp GMETRIC[dstage]=$dstage\n";
}

close(TQUERY);
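Run manually, the script emits one line per node. Output should look similar to the following (the values shown are illustrative):

node001 GMETRIC[temp]=113.2 GMETRIC[dstage]=23748
node002 GMETRIC[temp]=107.0 GMETRIC[dstage]=19211
...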
With the script complete, the next step is to integrate this information into Moab. This is accomplished with the following configuration line:
RMCFG[local] TYPE=NATIVE CLUSTERQUERYURL=exec://$TOOLSDIR/node.query.local.pl ...
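For the change to take effect, Moab must be recycled, which can be done with the mschedctl command:

> mschedctl -R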
Once recycled, Moab integrates temperature and data staging usage information into its compute node reports. If the checknode command is run, output similar to the following is reported:
> checknode cluster013

...
Generic Metrics: temp=113.2,dstage=23748
...
Moab Cluster Manager reports full current and historical generic metric information in its visual cluster overview screen.
The next step in configuring Moab is to inform it to take certain actions based on the new information it is tracking. In this example, there are two objectives. The first is to have jobs avoid hot nodes when possible. This is accomplished using the GMETRIC attribute of the Node Allocation Priority function, as in the following example:
NODEALLOCATIONPOLICY PRIORITY
NODECFG[DEFAULT] PRIORITYF=PRIORITY-10*GMETRIC[temp]
...
This simple priority function reduces the priority of the hottest nodes, making them less likely to be allocated. See Node Allocation Priority Factors for a complete list of available priority factors.
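To see the effect, assume a hypothetical base PRIORITY of 1000: a node reporting temp=83 receives a priority of 1000 - (10 * 83) = 170, while a node reporting temp=113 receives 1000 - (10 * 113) = -130, so the cooler node is strongly preferred.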
The example cluster is also interested in notifying administrators if the temperature of a given node ever exceeds a critical threshold. This is accomplished using a trigger. The following line will send email to administrators any time the temperature of a node exceeds 120 degrees.
NODECFG[DEFAULT] TRIGGER=atype=mail,etype=threshold,threshold=gmetric[temp]>120,action='warning: node $OID temp high' ...
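When the trigger fires, $OID is replaced with the ID of the node whose temperature crossed the threshold, so the resulting message identifies the affected node.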