12.5 Managing Resources Directly with the Native Interface

12.5-A Native Interface Overview

The Native interface allows a site to augment or even fully replace a resource manager for managing resources. In some situations, the full capabilities of the resource manager are not needed and a lower-cost or lower-overhead alternative is preferred. In other cases, the nature of the environment may make use of a resource manager impossible due to lack of support. In still other situations, it is desirable to provide information about additional resource attributes, constraints, or state from alternate sources.

In any case, Moab provides the ability to directly query and manage resources alongside or without a resource manager. This interface, called the NATIVE interface, can also be used to launch, cancel, and otherwise manage jobs. The NATIVE interface offers several advantages, although it may also have some drawbacks.

At a high level, the native interface works by launching threaded calls to perform standard resource manager activities such as managing resources and jobs. The desired calls are configured within Moab and used whenever an action or updated information is required.

12.5-B Configuring the Native Interface

Using the native interface consists of defining the interface type and location. As mentioned earlier, a single object may be fully defined by multiple interfaces simultaneously with each interface updating a particular aspect of the object.

Configuring the Resource Manager

The Native resource manager must be configured using the RMCFG parameter. To specify the native interface, the TYPE attribute must be set to NATIVE.

RMCFG[local] TYPE=NATIVE
RMCFG[local] CLUSTERQUERYURL=exec:///tmp/query.sh

Reporting Resources

To indicate the source of the resource information, the CLUSTERQUERYURL attribute of the RMCFG parameter should be specified. This attribute is specified as a URL where the protocols FILE, EXEC, and SQL are allowed. If a protocol is not specified, the protocol EXEC is assumed.

Format Description
EXEC Execute the script specified by the URL path. Use the script stdout as data.
FILE Load the file specified by the URL path. Use the file contents as data.
SQL Load data directly from an SQL database using the FULL format described below.
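The protocol rules above can be illustrated with a short Python sketch. This is not Moab's actual URL parser; the function name split_query_url is hypothetical, and the only behavior taken from the text is the set of allowed protocols and the EXEC default.

```python
ALLOWED_PROTOCOLS = {"EXEC", "FILE", "SQL"}


def split_query_url(url):
    """Split a CLUSTERQUERYURL-style value into (protocol, path).

    Hypothetical helper: if no protocol is present, EXEC is assumed,
    as described in the text above.
    """
    if "://" in url:
        protocol, path = url.split("://", 1)
        protocol = protocol.upper()
    else:
        protocol, path = "EXEC", url
    if protocol not in ALLOWED_PROTOCOLS:
        raise ValueError("unsupported protocol: " + protocol)
    return protocol, path


if __name__ == "__main__":
    for value in ("exec:///tmp/query.sh", "file:///tmp/query.txt", "/tmp/query.sh"):
        print(value, "->", split_query_url(value))
```

Note that the third slash in exec:///tmp/query.sh belongs to the path, so the script path remains absolute.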

Moab considers a NativeRM script to have failed if it returns with a non-zero exit code, or if the CHILDSTDERRCHECK parameter is set and its appropriate conditions are met. In addition, the NativeRM script associated with a job submit URL will be considered as having failed if its standard output stream contains the text ERROR.

This simple example queries a file on the server for information about every node in the cluster. This differs from Moab remotely querying the status of each node individually.

RMCFG[local]    TYPE=NATIVE
RMCFG[local]    CLUSTERQUERYURL=file:///tmp/query.txt
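When writing or testing a NativeRM script, it can help to restate the failure conditions described earlier as a small check. The Python sketch below covers only the two conditions that the text states explicitly (non-zero exit code, and the text ERROR on standard output for a job submit script). The CHILDSTDERRCHECK conditions are not detailed here, so they are deliberately left out; the function name is hypothetical.

```python
def native_script_failed(exit_code, stdout_text, is_job_submit=False):
    """Return True if a NativeRM script run would count as failed.

    Sketch only: CHILDSTDERRCHECK handling is omitted because its
    conditions are not described in this section.
    """
    if exit_code != 0:
        return True
    # Only a script behind a job submit URL is checked for ERROR on stdout.
    if is_job_submit and "ERROR" in stdout_text:
        return True
    return False


if __name__ == "__main__":
    print(native_script_failed(0, "n17 STATE=idle"))
    print(native_script_failed(0, "ERROR: bad queue", is_job_submit=True))
```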

12.5-C Generating Cluster Query Data

Flat Cluster Query Data

If the EXEC or FILE protocol is specified in the CLUSTERQUERYURL attribute, the data should provide flat text strings indicating the state and attributes of the node. The format follows the Moab Resource Manager Language Interface Specification where attributes are delimited by white space rather than ';' (See Resource Data Format):

Describes any set of node attributes with format: <NAME> <ATTR>=<VAL> [<ATTR>=<VAL>]...

<NAME> Name of node
<ATTR> Node attribute
<VAL> Value of node attribute
n17 CPROC=4 AMEMORY=100980 STATE=idle
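A cluster query script only needs to print one such line per node. The following Python sketch shows one way to produce the flat format; the function name and the hard-coded node data are placeholders, and a real script would collect the attribute values from its own data source.

```python
def format_node_line(name, attrs):
    """Build one flat cluster query line: <NAME> <ATTR>=<VAL> ..."""
    parts = [name] + ["{}={}".format(key, value) for key, value in attrs.items()]
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder data matching the example line above.
    nodes = {"n17": {"CPROC": 4, "AMEMORY": 100980, "STATE": "idle"}}
    for node_name, node_attrs in nodes.items():
        print(format_node_line(node_name, node_attrs))
```

Saved as an executable script, output of this shape could be served through an exec URL, or written to a file and served through a file URL.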

12.5-D Interfacing to FLEXlm

Moab can interface with FLEXlm to provide scheduling based on license availability. Informing Moab of license dependencies can reduce the number of costly licenses required by your cluster by allowing Moab to intelligently schedule around license limitations.

Provided with Moab in the tools directory is a Perl script, license.mon.flexLM.pl. This script queries a FLEXlm license server, gathers data about available licenses, and formats this data for Moab to read through a native interface. Any site can use this script to facilitate FLEXlm integration; the only modification necessary is setting @FLEXlmCmd to specify the local command to query FLEXlm. To make this change, edit license.mon.flexLM.pl and, near the top of the file, look for the line:

my @FLEXlmCmd = ("SETME");

Set the @FLEXlmCmd to the appropriate value for your system to query a license server and license file (if applicable). If lmutil is not in the PATH variable, specify its full path. Using the lmutil -a argument will cause it to report all licenses. The -c option can be used to specify an optional license file.

To test this script, run it manually. If working correctly, it will produce output similar to the following:

> ./license.mon.flexLM.pl
GLOBAL UPDATETIME=1104688300 STATE=idle ARES=autoCAD:130,idl_mpeg:160 CRES=autoCAD:200,idl_mpeg:330

If the output looks incorrect, set the $LOGLEVEL variable inside of license.mon.flexLM.pl, run it again, and address the reported failure.
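To sanity-check the script output by hand, the line can be parsed and the license counts compared. The Python sketch below assumes that ARES lists available counts and CRES lists configured counts, so that their difference is the number of licenses in use; the helper names are hypothetical and are not part of license.mon.flexLM.pl.

```python
def parse_node_line(line):
    """Split '<NAME> <ATTR>=<VAL> ...' into (name, {attr: value})."""
    name, *pairs = line.split()
    attrs = dict(pair.split("=", 1) for pair in pairs)
    return name, attrs


def parse_resources(value):
    """Parse 'autoCAD:130,idl_mpeg:160' into {'autoCAD': 130, 'idl_mpeg': 160}."""
    result = {}
    for item in value.split(","):
        res_name, count = item.split(":", 1)
        result[res_name] = int(count)
    return result


def licenses_in_use(line):
    """Configured minus available count per license (assumed meaning of CRES/ARES)."""
    _, attrs = parse_node_line(line)
    available = parse_resources(attrs["ARES"])
    configured = parse_resources(attrs["CRES"])
    return {lic: configured[lic] - available.get(lic, 0) for lic in configured}


if __name__ == "__main__":
    sample = ("GLOBAL UPDATETIME=1104688300 STATE=idle "
              "ARES=autoCAD:130,idl_mpeg:160 CRES=autoCAD:200,idl_mpeg:330")
    print(licenses_in_use(sample))
```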

Once the license interface script is properly configured, the next step is to add a license native resource manager to Moab via the moab.cfg file:

RMCFG[FLEXlm]   TYPE=NATIVE RESOURCETYPE=LICENSE
RMCFG[FLEXlm]   CLUSTERQUERYURL=exec://$TOOLSDIR/flexlm/license.mon.flexLM.pl
...

Once this change is made, restart Moab. The command mdiag -R can be used to verify that the resource manager is properly configured and is in the state Active. Detailed information regarding configured and utilized licenses can be viewed by issuing mdiag -n. Floating licenses (non-node-locked) will be reported as belonging to the GLOBAL node.

Due to the inherent conflict with the plus sign (+), the provided license manager script replaces occurrences of the plus sign in license names with the underscore symbol (_). This replacement requires that licenses with a plus sign in their names be requested with an underscore in place of any plus signs.
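The name translation itself is a plain character substitution, as the following Python sketch illustrates. The function name and the license name matlab+sim are made up for illustration.

```python
def moab_license_name(flexlm_name):
    """Return the name under which a license would be requested in Moab,
    with every plus sign replaced by an underscore."""
    return flexlm_name.replace("+", "_")


if __name__ == "__main__":
    # Hypothetical license name containing a plus sign.
    print(moab_license_name("matlab+sim"))
```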

Interfacing to Multiple License Managers Simultaneously

If multiple license managers are used within a cluster, Moab can interface to each of them to obtain the needed license information. In the case of FLEXlm, this can be done by making one copy of the license.mon.flexLM.pl script for each license manager and configuring each copy to point to a different license manager. Then, within Moab, create one native resource manager interface for each license manager and point it to the corresponding script as in the following example:

RMCFG[FLEXlm1]   TYPE=NATIVE RESOURCETYPE=LICENSE
RMCFG[FLEXlm1]   CLUSTERQUERYURL=exec://$TOOLSDIR/flexlm/license.mon.flexLM1.pl
RMCFG[FLEXlm2]   TYPE=NATIVE RESOURCETYPE=LICENSE
RMCFG[FLEXlm2]   CLUSTERQUERYURL=exec://$TOOLSDIR/flexlm/license.mon.flexLM2.pl
RMCFG[FLEXlm3]   TYPE=NATIVE RESOURCETYPE=LICENSE
RMCFG[FLEXlm3]   CLUSTERQUERYURL=exec://$TOOLSDIR/flexlm/license.mon.flexLM3.pl
...

For an overview of license management, including job submission syntax, see Section 13.7, License Management.

It may be necessary to increase the default limit, MMAX_GRES. See Appendix D for more implementation details.

12.5-E Interfacing to Nagios

Moab can interface with Nagios to provide scheduling based on the availability of network hosts and services.

Nagios installation and configuration documentation can be found at Nagios.org.

Provided with Moab in the tools directory is a Perl script, node.query.nagios.pl. This script reads the Nagios status.dat file and gathers data about network hosts and services. This script then formats data for Moab to read through a native interface. This script can be used by any site to help facilitate Nagios integration. To customize the data that will be formatted for Moab, make the changes in this script.

You may need to customize the associated configuration file in the etc directory, config.nagios.pl. The statusFile line in this file tells Moab where the Nagios status.dat file is located. Make sure that the path name specified is correct for your site. Note that the interval at which Nagios updates the status.dat file is specified in the Nagios nagios.cfg file. Refer to the Nagios documentation for further details.

To make these changes, familiarize yourself with the format of the Nagios status.dat file and make the appropriate additions to the script to include the desired Moab RM language (formerly WIKI) Interface attributes in the Moab output.

To test this script, run it manually. If working correctly, it will produce output similar to the following:

> ./node.query.nagios.pl
gateway STATE=Running
localhost STATE=Running CPULOAD=1.22 ADISK=75332
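For readers who have not yet looked at the status.dat layout, the Python sketch below shows the general idea of turning host entries into lines like those above. It is not a port of node.query.nagios.pl. It assumes that status.dat contains hoststatus { ... } blocks of key=value lines with host_name and current_state fields, and the mapping of current_state 0 to STATE=Running (anything else to STATE=Down) is an assumption made for this illustration.

```python
def parse_hoststatus_blocks(text):
    """Collect key=value pairs from each 'hoststatus { ... }' block."""
    hosts = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("hoststatus") and line.endswith("{"):
            current = {}
        elif line == "}":
            if current is not None:
                hosts.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return hosts


def to_moab_lines(hosts):
    """Format each host as '<NAME> STATE=<state>' (state mapping is assumed)."""
    lines = []
    for host in hosts:
        state = "Running" if host.get("current_state") == "0" else "Down"
        lines.append("{} STATE={}".format(host["host_name"], state))
    return lines


if __name__ == "__main__":
    sample = "hoststatus {\n\thost_name=gateway\n\tcurrent_state=0\n\t}\n"
    print("\n".join(to_moab_lines(parse_hoststatus_blocks(sample))))
```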

Once the Nagios interface script is properly configured, the next step is to add a Nagios native resource manager to Moab via the moab.cfg file:

RMCFG[nagios] TYPE=NATIVE
RMCFG[nagios] CLUSTERQUERYURL=exec://$TOOLSDIR/node.query.nagios.pl
...

Once this change is made, restart Moab. The command mdiag -R can be used to verify that the resource manager is properly configured and is in the state Active. Detailed information regarding configured Nagios nodes can be viewed by issuing mdiag -n -v.

> mdiag -n -v
compute node summary
Name                    State   Procs      Memory         Disk          Swap      Speed   Opsys   Arch Par   Load Rsv Classes                        Network                        Features              
gateway               Running    0:0         0:0           0:0           0:0       1.00       -      - dav   0.00   0 -                              -                              -                   
  WARNING:  node 'gateway' is busy/running but not assigned to an active job
  WARNING:  node 'gateway' has no configured processors
localhost             Running    0:0         0:0       75343:75347       0:0       1.00       -      - dav   0.48   0 -                              -                              -                   
  WARNING:  node 'localhost' is busy/running but not assigned to an active job
  WARNING:  node 'localhost' has no configured processors
-----                     ---    3:8      1956:1956    75345:75349    5309:6273  
Total Nodes: 2  (Active: 2  Idle: 0  Down: 0)

12.5-F Interfacing to Supermon

Moab can integrate with Supermon to gather additional information regarding the nodes in a cluster. A Perl script is provided in the tools directory that allows Moab to connect to the Supermon server. By default, the Perl script assumes that Supermon has been started on port 2709 on localhost. These defaults can be modified by editing the respective parameters in config.supermon.pl in the etc directory. An example setup is shown below.

RMCFG[TORQUE]  TYPE=pbs
RMCFG[supermon] TYPE=NATIVE CLUSTERQUERYURL=exec://$HOME/tools/node.query.supermon.pl

To confirm that Supermon is properly connected to Moab, issue mdiag -R -v. The output should be similar to the following example; in particular, there should be no errors about the CLUSTERQUERYURL.

diagnosing resource managers
RM[TORQUE]  State: Active
  Type:               PBS  ResourceType: COMPUTE
  Server:             keche
  Version:            '2.2.0-snap.200707181818'
  Job Submit URL:     exec:///usr/local/bin/qsub
  Objects Reported:   Nodes=3 (6 procs)  Jobs=0
  Flags:              executionServer
  Partition:          TORQUE
  Event Management:   EPORT=15004  (no events received)
  Note:  SSS protocol enabled
  Submit Command:     /usr/local/bin/qsub
  DefaultClass:       batch
  RM Performance:     AvgTime=0.26s  MaxTime=1.04s  (4 samples)
  RM Languages:       PBS
  RM Sub-Languages:   -
RM[supermon]  State: Active
  Type:               NATIVE  ResourceType: COMPUTE
  Cluster Query URL:  exec://$HOME/node.query.supermon.pl
  Objects Reported:   Nodes=3 (0 procs)  Jobs=0
  Partition:          supermon
  Event Management:   (event interface disabled)
  RM Performance:     AvgTime=0.03s  MaxTime=0.11s  (4 samples)
  RM Languages:       NATIVE
  RM Sub-Languages:   -

Note:  use 'mrmctl -f messages ' to clear stats/failures

Run the Perl script by itself. The script's results should look similar to this:

vm01 GMETRIC[CPULOAD]=0.571428571428571 GMETRIC[NETIN]=133 GMETRIC[NETOUT]=702 GMETRIC[NETUSAGE]=835
vm02 GMETRIC[CPULOAD]=0.428571428571429 GMETRIC[NETIN]=133 GMETRIC[NETOUT]=687 GMETRIC[NETUSAGE]=820
keche GMETRIC[CPULOAD]=31 GMETRIC[NETIN]=5353 GMETRIC[NETOUT]=4937 GMETRIC[NETUSAGE]=10290
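Lines of this shape are easy to check programmatically. The Python sketch below extracts the generic metrics from one line; the function name is hypothetical, and the regular expression assumes metric names consist only of letters, digits, and underscores, as in the sample output.

```python
import re

# Matches GMETRIC[<NAME>]=<VALUE> tokens as they appear in the sample output.
GMETRIC_RE = re.compile(r"GMETRIC\[(\w+)\]=(\S+)")


def parse_gmetrics(line):
    """Return (node_name, {metric_name: float_value}) for one output line."""
    node_name = line.split(None, 1)[0]
    metrics = {name: float(value) for name, value in GMETRIC_RE.findall(line)}
    return node_name, metrics


if __name__ == "__main__":
    sample = ("keche GMETRIC[CPULOAD]=31 GMETRIC[NETIN]=5353 "
              "GMETRIC[NETOUT]=4937 GMETRIC[NETUSAGE]=10290")
    print(parse_gmetrics(sample))
```

In the sample data, NETUSAGE equals NETIN plus NETOUT for every node, which can serve as a quick consistency check on the script output.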

If the script runs correctly, issue a checknode command on one of the nodes for which Supermon is gathering statistics. The output should look similar to the following.

node keche
State:      Idle  (in current state for 00:32:43)
Configured Resources: PROCS: 2  MEM: 1003M  SWAP: 3353M  DISK: 1M
Utilized   Resources: ---
Dedicated  Resources: ---
Generic Metrics:  CPULOAD=33.38,NETIN=11749.00,NETOUT=9507.00,NETUSAGE=21256.00
  MTBF(longterm):   INFINITY  MTBF(24h):   INFINITY
Opsys:      linux     Arch:      ---   
Speed:      1.00      CPULoad:   0.500
Network Load: 0.87 kB/s
Flags:      rmdetected
Network:    DEFAULT
Classes:    [batch 2:2][interactive 2:2]
RM[TORQUE]: TYPE=PBS
EffNodeAccessPolicy: SHARED
Total Time: 2:03:27  Up: 2:03:27 (100.00%)  Active: 00:00:00 (0.00%)
Reservations:  ---

12.5-G Configuring Resource Types

Native resource managers can also perform special tasks when they are given a specific resource type. These types are specified using the RESOURCETYPE attribute of the RMCFG parameter.

Type Description
COMPUTE Normal compute resources (no special handling)
FS File system resource manager (see Multiple Resource Managers for an example)
LICENSE Software license manager (see Interfacing with FLEXlm and License Management)
NETWORK Network resource manager
PROV Provisioning resource manager. This is the RM Moab uses to modify the OS of a node (not a VM) and to power a node on or off.

12.5-H Creating New Tools to Manage the Cluster

Using the scripts found in the $TOOLSDIR ($INSTDIR/tools) directory as templates, new tools can be quickly created to monitor or manage almost any resource. Each tool should be associated with a particular resource manager service and specified using one of the following resource manager URL attributes.

CLUSTERQUERYURL
Description Queries resource state, configuration, and utilization information for compute nodes, networks, storage systems, software licenses, and other resources. For more details, see RM configuration.
Output Node status and configuration for one or more nodes. See Resource Data Format.
Example
RMCFG[v-stor] CLUSTERQUERYURL=exec://$HOME/storquery.pl

Moab will execute the storquery.pl script to obtain information about 'v-stor' resources.

JOBCANCELURL
Description Specifies how Moab cancels jobs via the resource manager. For more details, see RM configuration.
Input <protocol>://[<host>[:<port>]][<path>]
Example
RMCFG[base] JOBCANCELURL=exec:///opt/moab/job.cancel.lsf.pl

Moab executes /opt/moab/job.cancel.lsf.pl to cancel specific jobs.

JOBMODIFYURL
Description Modifies a job or application. For more details, see RM configuration.
Input [-j <JOBEXPR>] [--s[et]|--c[lear]|--i[ncrement]|--d[ecrement]] <ATTR>[=<VALUE>] [<ATTR>[=<VALUE>]]...
Example
RMCFG[v-stor] JOBMODIFYURL=exec://$HOME/jobmodify.pl

Moab will execute the jobmodify.pl script to modify the specified job.

JOBREQUEUEURL
Description Requeues a job.
Input <JOBID>
Example
RMCFG[v-stor] JOBREQUEUEURL=exec://$HOME/requeue.pl

Moab will execute the requeue.pl script to requeue jobs.

JOBRESUMEURL
Description Resumes a suspended job or application.
Input <JOBID>
Example
RMCFG[v-stor] JOBRESUMEURL=exec://$HOME/jobresume.pl

Moab will execute the jobresume.pl script to resume suspended jobs.

JOBSTARTURL
Description Launches a job or application on a specified set of resources.
Input <JOBID> <TASKLIST> <USERNAME> [ARCH=<ARCH>] [OS=<OPSYS>] [IDATA=<STAGEINFILEPATH>[,<STAGEINFILEPATH>]...] [EXEC=<EXECUTABLEPATH>]
Example
RMCFG[v-stor] JOBSTARTURL=exec://$HOME/jobstart.pl

Moab will execute the jobstart.pl script to execute jobs.
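A job start tool has to pick apart the input described above. The Python sketch below shows one possible way to do so, assuming the values arrive as separate command-line arguments and that TASKLIST is a comma-separated list of node names; both points are assumptions, as are the function name and the sample values.

```python
import sys


def parse_jobstart_args(argv):
    """Parse '<JOBID> <TASKLIST> <USERNAME> [KEY=VALUE]...' into a dict."""
    if len(argv) < 3:
        raise ValueError("expected <JOBID> <TASKLIST> <USERNAME>")
    job_id, tasklist, username = argv[:3]
    options = {}
    for arg in argv[3:]:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError("expected KEY=VALUE, got: " + arg)
        options[key] = value
    return {
        "job_id": job_id,
        "tasklist": tasklist.split(","),  # assumed comma-separated
        "username": username,
        "options": options,
    }


if __name__ == "__main__":
    # Falls back to made-up sample values when run without arguments.
    args = sys.argv[1:] or ["job42", "n01,n02", "alice", "EXEC=/bin/hostname"]
    print(parse_jobstart_args(args))
```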

JOBSUBMITURL
Description Submits a job to the resource manager, but it does not execute the job. The job executes when the JOBSTARTURL is called.
Input [ACCOUNT=<ACCOUNT>] [ERROR=<ERROR>] [GATTR=<GATTR>] [GNAME=<GNAME>] [GRES=<GRES>:<Value>[,<GRES>:<Value>]*] [HOSTLIST=<HOSTLIST>] [INPUT=<INPUT>] [IWD=<IWD>] [NAME=<NAME>] [OUTPUT=<OUTPUT>] [RCLASS=<RCLASS>] [REQUEST=<REQUEST>] [RFEATURES=<RFEATURES>] [RMFLAGS=<RMFLAGS>] [SHELL=<SHELL>] [TASKLIST=<TASKLIST>] [TASKS=<TASKS>] [TEMPLATE=<TEMPLATE>] [UNAME=<UNAME>] [VARIABLE=<VARIABLE>] [WCLIMIT=<WCLIMIT>] [ARGS=<Value>[ <Value>]*]

ARGS must be the last submitted attribute because there can be multiple space-separated values for ARGS.

Example
RMCFG[v-stor] JOBSUBMITURL=exec://$HOME/jobsubmit.pl

Moab submits the job to the jobsubmit.pl script for future job execution.
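The requirement that ARGS come last matters when a submit tool parses its input. In the Python sketch below, every token after ARGS= is treated as part of the argument list, even if it contains an equals sign. It assumes that attributes arrive as separate command-line tokens; the function name and the sample values are hypothetical.

```python
def parse_jobsubmit_args(argv):
    """Parse KEY=VALUE tokens; everything from ARGS= onward becomes the ARGS list."""
    attrs = {}
    for index, token in enumerate(argv):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError("expected KEY=VALUE, got: " + token)
        if key == "ARGS":
            first = [value] if value else []
            # ARGS is last, so all remaining tokens belong to it.
            attrs["ARGS"] = first + list(argv[index + 1:])
            break
        attrs[key] = value
    return attrs


if __name__ == "__main__":
    sample = ["UNAME=alice", "TASKS=4", "WCLIMIT=3600", "ARGS=-n", "4", "input=x.dat"]
    print(parse_jobsubmit_args(sample))
```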

JOBSUSPENDURL
Description Suspends in memory an active job or application.
Input <JOBID>
Example
RMCFG[v-stor] JOBSUSPENDURL=exec://$HOME/jobsuspend.pl

Moab will execute the jobsuspend.pl script to suspend active jobs.

NODEMODIFYURL
Description Provides a method to dynamically modify or provision compute resources, including operating system, applications, queues, node features, power states, and so on.
Input <NODEID>[,<NODEID>] [--force] {--set <ATTR>=<VAL>|--clear <ATTR>}
ATTR is one of the node attributes listed in Resource Data Format
Example
RMCFG[warewulf] NODEMODIFYURL=exec://$HOME/provision.pl

Moab will reprovision compute nodes using the provision.pl script.
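The NODEMODIFYURL input syntax maps naturally onto a command-line parser. The Python sketch below uses the standard argparse module to accept a node list, an optional --force flag, and exactly one of --set or --clear. It illustrates the input format only and performs no provisioning; the program name, the destination names, and the sample values are assumptions.

```python
import argparse


def build_parser():
    """Parser for '<NODEID>[,<NODEID>] [--force] {--set ATTR=VAL | --clear ATTR}'."""
    parser = argparse.ArgumentParser(prog="nodemodify-sketch")
    parser.add_argument("nodes", type=lambda text: text.split(","),
                        help="comma-separated node IDs")
    parser.add_argument("--force", action="store_true")
    # Exactly one of --set or --clear must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--set", dest="set_attr", metavar="ATTR=VAL")
    group.add_argument("--clear", dest="clear_attr", metavar="ATTR")
    return parser


if __name__ == "__main__":
    # Made-up sample invocation.
    args = build_parser().parse_args(["n01,n02", "--force", "--set", "OS=linux"])
    print(args.nodes, args.force, args.set_attr, args.clear_attr)
```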

NODEPOWERURL
Description Allows Moab to issue IPMI power commands.
Input <NODEID>[,<NODEID>] ON | OFF
Example
RMCFG[node17rm] NODEPOWERURL=exec://$TOOLSDIR/ipmi.power.pl

Moab will issue a power command contained in the ipmi.power.pl script.

SYSTEMMODIFYURL
Description Provides a method to dynamically modify aspects of the compute environment that are directly associated with cluster resources. For more details, see RM configuration.
SYSTEMQUERYURL
Description Provides a method to dynamically query aspects of the compute environment that are directly associated with cluster resources. For more details, see RM configuration.
Input default <ATTR>
ATTR is one of: images
Output <STRING>
Example
RMCFG[warewulf] SYSTEMQUERYURL=exec://$HOME/checkimage.pl

Moab will load the list of images available from warewulf using the checkimage.pl script.

WORKLOADQUERYURL
Description Provides a method to dynamically query the system workload (jobs, services, etc.) of the compute environment that is associated with managed resources.

Job/workload information should be reported back from the URL (script, file, webservice, etc.) using the Moab RM language (formerly WIKI).

For more details, see RM configuration.
Output <STRING>
Example
RMCFG[xt] WORKLOADQUERYURL=exec://$HOME/job.query.xt3.pl

Moab will load job/workload information by executing the job.query.xt3.pl script.
