12.2 Resource Manager Configuration

12.2-A Defining and Configuring Resource Manager Interfaces

Moab resource manager interfaces are defined using the RMCFG parameter, which allows specification of key aspects of the interface. In most cases, only the TYPE attribute needs to be specified; Moab then determines the defaults required to activate and use the selected interface. In the following example, an interface to a LoadLeveler resource manager is defined.

RMCFG[orion] TYPE=LL...

Note that the resource manager is given a label of orion. This label can be any arbitrary site-selected string and is for local usage only. For sites with multiple active resource managers, the labels can be used to distinguish between them for resource manager specific queries and commands.
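
The label-plus-attributes layout of an RMCFG line can be illustrated with a small parser. This is only a sketch and not part of Moab: the function name parse_rmcfg, the regular expression, and the choice to upper-case attribute names (on the assumption that names are not case-sensitive) are all illustrative.

```python
import re
import shlex

# Hypothetical pattern for "RMCFG[<label>] <ATTR>=<VALUE> ..." lines.
_RMCFG_LINE = re.compile(r"^RMCFG\[(?P<label>[^\]]+)\]\s*(?P<attrs>.*)$")


def parse_rmcfg(lines):
    """Group RMCFG attributes by resource manager label."""
    managers = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = _RMCFG_LINE.match(line)
        if match is None:
            continue  # some other parameter, not RMCFG
        attrs = managers.setdefault(match.group("label"), {})
        # shlex keeps quoted values (such as a DESCRIPTION) in one token.
        for token in shlex.split(match.group("attrs")):
            key, sep, value = token.partition("=")
            if sep:
                # Assumption: attribute names are treated case-insensitively.
                attrs[key.upper()] = value
    return managers


if __name__ == "__main__":
    config = [
        "RMCFG[clusterA] TYPE=PBS HOST=clusterA PORT=15003",
        "RMCFG[clusterB] TYPE=PBS HOST=clusterB PORT=15005",
    ]
    for label, attrs in parse_rmcfg(config).items():
        print(label, attrs)
```

Several RMCFG lines with the same label are merged into one entry, which mirrors how the examples in this section spread the attributes of one resource manager over multiple lines.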

Resource Manager Attributes

The following table lists the possible resource manager attributes that can be configured.

ADMINEXEC
AUTHTYPE
BANDWIDTH
CHECKPOINTSIG
CHECKPOINTTIMEOUT
CLIENT
CLUSTERQUERYURL
CONFIGFILE
DATARM
DEFAULTCLASS
DEFAULTHIGHSPEEDADAPTER
DESCRIPTION
ENV
EPORT
FAILTIME
FLAGS
FNLIST
HOST
IGNHNODES
JOBCANCELURL
JOBEXTENDDURATION
JOBIDFORMAT
JOBMODIFYURL
JOBRSVRECREATE
JOBSTARTURL
JOBSUBMITURL
JOBSUSPENDURL
JOBVALIDATEURL
MAXDSOP
MAXITERATIONFAILURECOUNT
MAXJOBPERMINUTE
MAXJOBS
MINETIME
NMPORT
NODEFAILURERSVPROFILE
NODESTATEPOLICY
OMAP
PORT
PROVDURATION
PTYSTRING
RESOURCECREATEURL
RESOURCETYPE
RMSTARTURL
RMSTOPURL
SBINDIR
SERVER
SLURMFLAGS
SOFTTERMSIG
STAGETHRESHOLD
STARTCMD
SUBMITCMD
SUBMITPOLICY
SUSPENDSIG
SYNCJOBID
SYSTEMMODIFYURL
SYSTEMQUERYURL
TARGETUSAGE
TIMEOUT
TRIGGER
TYPE
USEVNODES
VARIABLES
VERSION
VMOWNERRM
WORKLOADQUERYURL
ADMINEXEC
Format "jobsubmit"
Default NONE
Description Normally, when the JOBSUBMITURL is executed, Moab will drop to the UID and GID of the user submitting the job. Specifying an ADMINEXEC of jobsubmit causes Moab to use its own UID and GID instead (usually root). This is useful for some native resource managers where the JOBSUBMITURL is not a user command (such as qsub) but a script that interfaces directly with the resource manager.
Example
RMCFG[base] ADMINEXEC=jobsubmit

Moab will not use the user's UID and GID for executing the JOBSUBMITURL.

AUTHTYPE
Format One of CHECKSUM, OTHER, PKI, SECUREPORT, or NONE.
Default CHECKSUM
Description Specifies the security protocol to be used in scheduler-resource manager communication.

Only valid with WIKI-based interfaces.

Example
RMCFG[base] AUTHTYPE=CHECKSUM

Moab requires a secret key-based checksum associated with each resource manager message.

BANDWIDTH
Format <FLOAT>[{M|G|T}]
Default -1 (unlimited)
Description Specifies the maximum deliverable bandwidth between the Moab server and the resource manager for staging jobs and data. Bandwidth is specified in units per second and defaults to a unit of MB/s. If a unit modifier is specified, the value is interpreted accordingly (M - megabytes/sec, G - gigabytes/sec, T - terabytes/sec).
Example
RMCFG[base] BANDWIDTH=340G

Moab will reserve up to 340 GB/s of network bandwidth when scheduling job and data staging operations to and from this resource manager.
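
As a rough illustration of the <FLOAT>[{M|G|T}] format, the following sketch converts a BANDWIDTH value to MB/s. The function name is hypothetical, and the factor of 1024 between units is an assumption; the text above does not state whether the units are decimal or binary.

```python
import re

# Multipliers relative to MB/s. ASSUMPTION: 1024 per step; the documentation
# does not say whether G and T are decimal or binary multiples.
_UNIT_TO_MB = {"M": 1.0, "G": 1024.0, "T": 1024.0 * 1024.0}


def bandwidth_mb_per_sec(value):
    """Return the bandwidth in MB/s, or None when the value means unlimited."""
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)([MGT]?)", value.strip())
    if match is None:
        raise ValueError(f"not a valid BANDWIDTH value: {value!r}")
    number = float(match.group(1))
    unit = match.group(2) or "M"  # no modifier means MB/s
    if number < 0:
        return None  # -1 is the documented "unlimited" default
    return number * _UNIT_TO_MB[unit]


if __name__ == "__main__":
    print(bandwidth_mb_per_sec("340G"))
    print(bandwidth_mb_per_sec("-1"))
```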

CHECKPOINTSIG
Format One of suspend, <INTEGER>, or SIG<X>
Description Specifies what signal to send the resource manager when a job is checkpointed. (See Checkpoint Overview.)
Example
RMCFG[base] CHECKPOINTSIG=SIGKILL

Moab routes the signal SIGKILL through the resource manager to the job when a job is checkpointed.

CHECKPOINTTIMEOUT
Format [[[DD:]HH:]MM:]SS
Default 0 (no timeout)
Description Specifies how long Moab waits for a job to checkpoint before canceling it. If set to 0, Moab does not cancel the job if it fails to checkpoint. (See Checkpoint Overview.)
Example
RMCFG[base] CHECKPOINTTIMEOUT=5:00

Moab cancels any job that has not exited 5 minutes after receiving a checkpoint request.
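
Several attributes in this section (CHECKPOINTTIMEOUT, FAILTIME, PROVDURATION, STAGETHRESHOLD) share the [[[DD:]HH:]MM:]SS format. The sketch below shows one way to read such a value as a number of seconds; the function name duration_to_seconds is illustrative and is not a Moab API.

```python
def duration_to_seconds(spec):
    """Convert a [[[DD:]HH:]MM:]SS string into seconds."""
    fields = [int(part) for part in spec.strip().split(":")]
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"expected 1 to 4 colon-separated fields: {spec!r}")
    # Right-align the fields so that the last one is always seconds.
    multipliers = (86400, 3600, 60, 1)[-len(fields):]
    return sum(field * mult for field, mult in zip(fields, multipliers))


if __name__ == "__main__":
    for text in ("30", "5:00", "1:00:00", "1:00:00:00"):
        print(text, "->", duration_to_seconds(text))
```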

CLIENT
Format <PEER>
Default Use name of resource manager for peer client lookup
Description If specified, the resource manager will use the peer value to authenticate remote connections. (See configuring peers.) If not specified, the resource manager will search for a CLIENTCFG[<X>] entry of RM:<RMNAME> in the moab-private.cfg file.
Example
RMCFG[clusterBI] CLIENT=clusterB

Moab will look up and use information for peer clusterB when authenticating the clusterBI resource manager.

CLUSTERQUERYURL
Format [file://<path> | http://<address> | <path>]

If file:// is specified, Moab treats the destination as a flat text file. If http:// is specified, Moab treats the destination as a hypertext transfer protocol file. If just a path is specified, Moab treats the destination as an executable.
Description Specifies how Moab queries the resource manager. (See Native RM, URL Notes, and interface details.)
Example
RMCFG[base] CLUSTERQUERYURL=file:///tmp/cluster.config

Moab reads /tmp/cluster.config when it queries the base resource manager.

CONFIGFILE
Format <STRING>
Description Specifies the resource manager specific configuration file that must be used to enable correct API communication.

Only valid with LL- and SLURM-based interfaces.

Example
RMCFG[base] TYPE=LL CONFIGFILE=/home/loadl/loadl_config

The scheduler uses the specified file when establishing the resource manager/scheduler interface connection.

DATARM
Format <RM NAME>
Description If specified, the resource manager uses the given storage resource manager to handle staging data in and out.
Example
RMCFG[clusterB] DATARM=clusterB_storage

When data staging is required by jobs starting/completing on clusterB, Moab uses the storage interface defined by clusterB_storage to stage and monitor the data.

DEFAULTCLASS
Format <STRING>
Description Specifies the class to use if jobs submitted via this resource manager interface do not have an associated class.
Example
RMCFG[internal] DEFAULTCLASS=batch

Moab assigns the class batch to all jobs from the resource manager internal that do not have a class assigned.

If you are using PBS as the resource manager, a job will never come from PBS without a class, and the default will never apply.

DEFAULTHIGHSPEEDADAPTER
Format <STRING>
Default sn0
Description Specifies the default high speed switch adapter to use when starting LoadLeveler jobs (supported in version 4.2.2 and higher of Moab and 3.2 of LoadLeveler).
Example
RMCFG[base] DEFAULTHIGHSPEEDADAPTER=sn1

The scheduler will start jobs requesting a high speed adapter on sn1.

DESCRIPTION
Format <STRING>
Description Specifies the human-readable description for the resource manager interface. If white space is used, the description should be quoted.
Example
RMCFG[torque] TYPE=NATIVE DESCRIPTION='Torque RM for launching jobs'

Moab annotates the TORQUE resource manager accordingly.

ENV
Format Semi-colon-delimited (;) list of <KEY>=<VALUE> pairs
Default MOABHOMEDIR=<MOABHOMEDIR>
Description Specifies a list of environment variables that will be passed to URLs of type exec:// for that resource manager.
Example
RMCFG[base] ENV=HOST=node001;RETRYTIME=50
RMCFG[base] CLUSTERQUERYURL=exec:///opt/moab/tools/cluster.query.pl
RMCFG[base] WORKLOADQUERYURL=exec:///opt/moab/tools/workload.query.pl

The environment variables HOST and RETRYTIME (with values node001 and 50 respectively) are passed to the /opt/moab/tools/cluster.query.pl and /opt/moab/tools/workload.query.pl when they are executed.
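
The ENV value is a semicolon-delimited list of <KEY>=<VALUE> pairs. A minimal sketch of how such a value could be split into a dictionary follows; parse_env is a hypothetical helper, not something Moab provides.

```python
def parse_env(value):
    """Split a semicolon-delimited KEY=VALUE list into a dictionary."""
    env = {}
    for pair in value.split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, sep, val = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        env[key] = val
    return env


if __name__ == "__main__":
    print(parse_env("HOST=node001;RETRYTIME=50"))
```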

EPORT
Format <INTEGER>
Description Specifies the event port to use to receive resource manager based scheduling events.
Example
RMCFG[base] EPORT=15017

The scheduler will look for scheduling events from the resource manager host at port 15017.

FAILTIME
Format [[[DD:]HH:]MM:]SS
Description Specifies how long a resource manager must be down before any failure triggers associated with the resource manager fire.
Example
RMCFG[base] FAILTIME=3:00

If the base resource manager is down for three minutes, any resource manager failure triggers fire.

FLAGS
Format Comma-delimited list of zero or more of the following: asyncdelete, asyncstart, autostart, autosync, client, fullcp, executionServer, hostingCenter, ignqueuestate, private, pushslavejobupdates, report, shared, slavepeer or static
Description Specifies various attributes of the resource manager. See Flag Details for more information.
Example
RMCFG[base] FLAGS=static,slavepeer

Moab uses this resource manager to perform a single update of node and job objects reported elsewhere.

FNLIST
Format Comma-delimited list of zero or more of the following: clusterquery, jobcancel, jobrequeue, jobresume, jobstart, jobsuspend, queuequery, resourcequery or workloadquery
Description By default, a resource manager utilizes all functions supported to query and control batch objects. If this parameter is specified, only the listed functions are used.
Example
RMCFG[base] FNLIST=queuequery

Moab only uses this resource manager interface to load queue configuration information.

HOST
Format <STRING>
Default localhost
Description The host name of the machine on which the resource manager server is running.
Example
RMCFG[base] host=server1

IGNHNODES
Format <BOOLEAN>
Default FALSE
Description Specifies whether to read in the PBSPro host nodes. This parameter is used in conjunction with USEVNODES. When both are set to TRUE, the host nodes are not queried.
Example
RMCFG[pbs] IGNHNODES=TRUE

JOBCANCELURL
Format <protocol>://[<host>[:<port>]][<path>]
Default ---
Description Specifies how Moab cancels jobs via the resource manager. (See URL Notes below.)
Example
RMCFG[base] JOBCANCELURL=exec:///opt/moab/job.cancel.lsf.pl

Moab executes /opt/moab/job.cancel.lsf.pl to cancel specific jobs.

JOBEXTENDDURATION
Format [[[DD:]HH:]MM:]SS[,[[[DD:]HH:]MM:]SS][!][<] (or <MIN TIME>[,<MAX TIME>][!])
Default ---
Description

Specifies the minimum and maximum amount of time that can be added to a job's walltime if it is possible for the job to be extended. (See MINWCLIMIT.) As the job runs longer than its current specified minimum wallclock limit (-l minwclimit, for example), Moab attempts to extend the job's limit by the minimum JOBEXTENDDURATION. This continues until either the extension can no longer occur (it is blocked by a reservation or job), the maximum JOBEXTENDDURATION is reached, or the user's specified wallclock limit (-l wallclock) is reached. When a job is extended, it is marked as PREEMPTIBLE, unless the ! is appended to the end of the configuration string. If the < is at the end of the string, however, the job is extended the maximum amount possible.

JOBEXTENDDURATION and JOBEXTENDSTARTWALLTIME TRUE cannot be configured together. If they are in the same moab.cfg or are both active, then the JOBEXTENDDURATION will not be honored.

For example, comment out the JOBEXTENDSTARTWALLTIME.

RMCFG[base] JOBEXTENDDURATION=30,1:00:00
#JOBEXTENDSTARTWALLTIME TRUE
Example
RMCFG[base] JOBEXTENDDURATION=30,1:00:00

Moab extends a job's walltime by 30 seconds each time the job is about to run out of walltime until it is bound by one hour, a reservation/job, or the job's original "maximum" wallclock limit.
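
The extension behavior described above can be pictured with a simplified model. This sketch is not Moab's implementation. It assumes that all times are in seconds, that the maximum JOBEXTENDDURATION caps the total time added, and that a final partial step is clamped to whichever limit is hit first. The ! and < modifiers are not modeled, and all names are illustrative.

```python
def simulate_extensions(min_limit, max_limit, step, max_total, blocked_at=None):
    """Return the successive wallclock limits a job would receive.

    min_limit  -- the job's minimum wallclock limit (minwclimit)
    max_limit  -- the user's specified wallclock limit
    step       -- the minimum JOBEXTENDDURATION
    max_total  -- the maximum JOBEXTENDDURATION (assumed to cap total time added)
    blocked_at -- run time at which a reservation or job blocks extension
    """
    limits = [min_limit]
    current = min_limit
    while True:
        proposed = min(current + step, max_limit, min_limit + max_total)
        if blocked_at is not None:
            proposed = min(proposed, blocked_at)
        if proposed <= current:
            break  # no further extension is possible
        current = proposed
        limits.append(current)
    return limits


if __name__ == "__main__":
    # JOBEXTENDDURATION=30,1:00:00 with a 10 minute minimum and 15 minute maximum.
    print(simulate_extensions(600, 900, 30, 3600))
```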

JOBIDFORMAT
Format INTEGER
Default ---
Description Specifies that Moab should use numbers to create job IDs. This eliminates multiple job IDs associated with a single job.
Example
RMCFG[base] JOBIDFORMAT=INTEGER

Job IDs are generated as numbers.

JOBMODIFYURL
Format <protocol>://[<host>[:<port>]][<path>]
Default ---
Description Specifies how Moab modifies jobs via the resource manager. (See URL Notes, and interface details.)
Example
RMCFG[base] JOBMODIFYURL=exec://$TOOLSDIR/job.modify.dyn.pl

Moab executes the job.modify.dyn.pl script located in $TOOLSDIR to modify specific jobs.

JOBRSVRECREATE
Format Boolean
Default TRUE
Description Specifies whether Moab will re-create a job reservation each time job information is updated by a resource manager. (See Considerations for Large Clusters for more information.)
Example
RMCFG[base] JOBRSVRECREATE=FALSE

Moab only creates a job reservation once when the job first starts.

JOBSTARTURL
Format <protocol>://[<host>[:<port>]][<path>]
Default ---
Description Specifies how Moab starts jobs via the resource manager. (See URL Notes below.)
Example
RMCFG[base] JOBSTARTURL=http://orion.bsu.edu:1322/moab/jobstart.cgi

Moab triggers the jobstart.cgi script via http to start specific jobs.

JOBSUBMITURL
Format <protocol>://[<host>[:<port>]][<path>]
Description Specifies how Moab submits jobs to the resource manager. (See URL Notes below.)
Example
RMCFG[base] JOBSUBMITURL=exec://$TOOLSDIR/job.submit.dyn.pl

Moab executes the job.submit.dyn.pl script located in $TOOLSDIR to submit jobs to the resource manager.

JOBSUSPENDURL
Format <protocol>://[<host>[:<port>]][<path>]
Description Specifies how Moab suspends jobs via the resource manager. (See URL Notes below.)
Example
RMCFG[base] JOBSUSPENDURL=EXEC://$HOME/scripts/job.suspend

Moab executes the job.suspend script when jobs are suspended.

JOBVALIDATEURL
Format <protocol>://[<host>[:<port>]][<path>]
Description Specifies how Moab validates newly submitted jobs. (See URL Notes below.) If the script returns with a non-zero exit code, the job is rejected. (See User Proxying/Alternate Credentials.)
Example
RMCFG[base] JOBVALIDATEURL=exec://$TOOLS/job.validate.pl

Moab executes the 'job.validate.pl' script when jobs are submitted to verify they are acceptable.

MAXDSOP
Format <INTEGER>
Default -1 (unlimited)
Description Specifies the maximum number of data staging operations that may be simultaneously active.
Example
RMCFG[ds] MAXDSOP=16

MAXITERATIONFAILURECOUNT
Format <INTEGER>
Default 80
Description Specifies the number of times the RM must fail within a certain iteration before Moab considers it down or corrupt. When an RM is down or corrupt, Moab will not attempt to interact with it.
Example
RMCFG[base] MAXITERATIONFAILURECOUNT=25

The RM base must fail 25 times in a single iteration for Moab to consider it down and cease interacting with it.

MAXJOBPERMINUTE
Format <INTEGER>
Default -1 (unlimited)
Description Specifies the maximum number of jobs allowed to start per minute via the resource manager.
Example
RMCFG[base] MAXJOBPERMINUTE=5

The scheduler only allows five jobs per minute to launch via the resource manager base.

MAXJOBS
Format <INTEGER>
Default 0 (limited only by the Moab MAXJOB setting)
Description Specifies the maximum number of active jobs that this interface is allowed to load from the resource manager.

Only works with Moab peer resource managers at this time.

Example
RMCFG[cluster1] SERVER=moab://cluster1 MAXJOBS=200

The scheduler loads up to 200 active jobs from the remote Moab peer cluster1.

MINETIME
Format <INTEGER>
Default 1
Description Specifies the minimum time in seconds between processing subsequent scheduling events.
Example
RMCFG[base] MINETIME=5

The scheduler batch-processes scheduling events that occur less than five seconds apart.

NMPORT
Format <INTEGER>
Default (any valid port number)
Description Allows specification of the resource manager's node manager port and is only required when this port has been set to a non-default value.
Example
RMCFG[base] NMPORT=13001

The scheduler contacts the node manager located on each compute node at port 13001.

NODEFAILURERSVPROFILE
Format <STRING>
Description Specifies the reservation template to use when placing a reservation onto failed nodes. (See also NODEFAILURERESERVETIME.)
Example
# moab.cfg
RMCFG[base] NODEFAILURERSVPROFILE=long
RSVPROFILE[long] DURATION=25:00
RSVPROFILE[long] USERLIST=john

The scheduler will use the long rsv profile when creating reservations over failed nodes belonging to base.

NODESTATEPOLICY
Format One of OPTIMISTIC or PESSIMISTIC
Default PESSIMISTIC
Description Specifies how Moab should determine the state of a node when multiple resource managers are reporting state.
OPTIMISTIC specifies that if any resource manager reports a state of up, that state will be used.
PESSIMISTIC specifies that if any resource manager reports a state of down, that state will be used.
Example
# moab.cfg
RMCFG[native] TYPE=NATIVE NODESTATEPOLICY=OPTIMISTIC
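
The two policies can be sketched as a small function. This is a deliberately simplified model that assumes each resource manager reports only "up" or "down"; real node states are richer, and the function name resolve_node_state is hypothetical.

```python
def resolve_node_state(reports, policy="PESSIMISTIC"):
    """Combine per-resource-manager reports ("up" or "down") into one state."""
    states = [state.lower() for state in reports]
    if not states:
        raise ValueError("at least one resource manager report is required")
    policy = policy.upper()
    if policy == "OPTIMISTIC":
        # Any report of up wins.
        return "up" if "up" in states else "down"
    if policy == "PESSIMISTIC":
        # Any report of down wins.
        return "down" if "down" in states else "up"
    raise ValueError(f"unknown NODESTATEPOLICY: {policy!r}")


if __name__ == "__main__":
    reports = ["up", "down"]
    print(resolve_node_state(reports, "OPTIMISTIC"))
    print(resolve_node_state(reports, "PESSIMISTIC"))
```

The two policies differ only when the reports disagree; when every resource manager reports the same state, both return that state.
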

OMAP
Format <protocol>://[<host>[:<port>]][<path>]
Description Specifies an object map file that is used to map credentials and other objects when using this resource manager peer.
Example
# moab.cfg
RMCFG[peer1] OMAP=file:///opt/moab/omap.dat

When communicating with the resource manager peer1, objects are mapped according to the rules defined in the /opt/moab/omap.dat file.

PORT
Format <INTEGER>
Default 0
Description Specifies the port on which the scheduler should contact the associated resource manager. The value 0 specifies that the resource manager default port should be used.
Example
RMCFG[base] TYPE=PBS HOST=cws PORT=20001

Moab attempts to contact the PBS server daemon on host cws, port 20001.

PROVDURATION
Format [[[DD:]HH:]MM:]SS
Default 2:30
Description Specifies the upper bound (walltime) of a provisioning request. After this duration, Moab will consider the provisioning attempt failed.
Example
RMCFG[base] PROVDURATION=5:00

When RM base provisions a node for more than 5 minutes, Moab considers the provisioning as having failed.

PTYSTRING
Format <STRING>
Default srun -n1 -N1 --pty
Description

When a SLURM interactive job is submitted, Moab builds an salloc command that gets the requested resources and an srun command that creates a terminal session on one of the nodes. The srun command is called the PTYString. PTYString is configured in moab.cfg.

There are two special things you can do with PTYString:

  1. You can set PTYSTRING=$salloc, which tells Moab to use the default salloc command (SallocDefaultCommand; see the slurm.conf man page) defined in slurm.conf. Internally, Moab won't add a PTYString because SLURM will call the SallocDefaultCommand.
  2. As in the example below, you can add $SHELL. $SHELL will be expanded to either what you request on the command line (such as msub -S /bin/tcsh -l) or to the value of $SHELL in your current session.

PTYString works only with SLURM.

Example
RMCFG[slurm] PTYSTRING="srun -n1 -N1 --pty --preserve-env $SHELL"

RESOURCECREATEURL
Format [exec://<path> | http://<address> | <path>]

If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
Description Specifies a script or method that can be used by Moab to create resources dynamically, such as creating a virtual machine on a hypervisor.
Example
RMCFG[base] RESOURCECREATEURL=exec:///opt/script/vm.provision.py

Moab invokes the vm.provision.py script, passing in data as command line arguments, to request a creation of new resources.


RESOURCETYPE
Format {COMPUTE|FS|LICENSE|NETWORK|PROV}
Description Specifies which type of resource this resource manager is configured to control. See Native Resource Managers for more information.
Example
RMCFG[base] TYPE=NATIVE RESOURCETYPE=FS

Resource manager base will function as a NATIVE resource manager and control file systems.


RMSTARTURL
Format [exec://<path> | http://<address> | <path>]

If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
Description Specifies how Moab starts the resource manager.
Example
RMCFG[base] RMSTARTURL=exec:///tmp/nat.start.pl

Moab executes /tmp/nat.start.pl to start the resource manager base.

RMSTOPURL
Format [exec://<path> | http://<address> | <path>]

If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
Description Specifies how Moab stops the resource manager.
Example
RMCFG[base] RMSTOPURL=exec:///tmp/nat.stop.pl

Moab executes /tmp/nat.stop.pl to stop the resource manager base.

SBINDIR
Format <PATH>
Description For use with TORQUE; specifies the location of the TORQUE system binaries (supported in TORQUE 1.2.0p4 and higher).
Example
RMCFG[base] TYPE=pbs  SBINDIR=/usr/local/torque/sbin

Moab tells TORQUE that its system binaries are located in /usr/local/torque/sbin.

SERVER
Format <URL>
Description Specifies the resource management service to use. If not specified, the scheduler locates the resource manager via built-in defaults or, if available, with an information service.
Example
RMCFG[base] server=ll://supercluster.org:9705

Moab attempts to use the LoadLeveler scheduling API at the specified location.

SLURMFLAGS
Format <STRING>
Description Specifies characteristics of the SLURM resource manager interface. The COMPRESSOUTPUT flag instructs Moab to use the compact host list format for job submissions to SLURM. The flag NODEDELTAQUERY instructs Moab to request delta node updates when it queries SLURM for node configuration.
Example
RMCFG[slurm] SLURMFLAGS=COMPRESSOUTPUT

Moab uses the COMPRESSOUTPUT flag to determine interface characteristics with SLURM.

SOFTTERMSIG
Format <INTEGER> or SIG<X>
Description Specifies what signal to send the resource manager when a job reaches its soft wallclock limit. (See JOBMAXOVERRUN.)
Example
RMCFG[base] SOFTTERMSIG=SIGUSR1

Moab routes the signal SIGUSR1 through the resource manager to the job when a job reaches its soft wallclock limit.

STAGETHRESHOLD
Format [[[DD:]HH:]MM:]SS
Description Specifies the maximum time a job waits to start locally before considering being migrated to a remote peer. In other words, if a job's start time on a remote cluster is less than the start time on the local cluster, but the difference between the two is less than STAGETHRESHOLD, then the job is scheduled locally. The aim is to avoid job/data staging overhead if the difference in start times is minimal.

If this attribute is used, backfill is disabled for the associated resource manager.

Example
RMCFG[remote_cluster] STAGETHRESHOLD=00:05:00

Moab only migrates jobs to remote_cluster if the jobs can start at least five minutes sooner on the remote cluster than they could on the local cluster.
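
The comparison described for STAGETHRESHOLD can be written as a single predicate. The sketch below assumes that start times and the threshold are expressed in seconds and that a difference exactly equal to the threshold allows migration, since the text only says that a difference less than the threshold keeps the job local. The function name should_migrate is illustrative.

```python
def should_migrate(local_start, remote_start, stage_threshold):
    """Return True when the remote start time is early enough to justify staging.

    All arguments are in seconds (for example, epoch times and a duration).
    """
    time_saved = local_start - remote_start
    # A saving smaller than the threshold is not worth the staging overhead.
    return time_saved >= stage_threshold


if __name__ == "__main__":
    threshold = 5 * 60  # STAGETHRESHOLD=00:05:00
    print(should_migrate(1000, 600, threshold))   # remote is 400 s sooner
    print(should_migrate(1000, 800, threshold))   # remote is only 200 s sooner
```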

STARTCMD
Format <STRING>
Description Specifies the full path to the resource manager job start client. If the resource manager API fails, Moab executes the specified start command in a second attempt to start the job.

Moab calls the start command with the format <CMD> <JOBID> -H <HOSTLIST> unless the environment variable MOABNOHOSTLIST is set, in which case Moab will only pass the job ID.

Example
RMCFG[base] STARTCMD=/usr/local/bin/qrun

Moab uses the specified start command if API failures occur when launching jobs.

SUBMITCMD
Format <STRING>
Description Specifies the full path to the resource manager job submission client.
Example
RMCFG[base] SUBMITCMD=/usr/local/bin/qsub

Moab uses the specified submit command when migrating jobs.

SUBMITPOLICY
Format One of NODECENTRIC or PROCCENTRIC
Default PROCCENTRIC
Description If set to NODECENTRIC, each specified node requested by the job is interpreted as a true compute host, not as a task or processor.
Example
RMCFG[base] SUBMITPOLICY=NODECENTRIC

Moab uses the specified submit policy when migrating jobs.

SUSPENDSIG
Format <INTEGER> (valid UNIX signal between 1 and 64)
Default RM-specific default
Description If set, Moab sends the specified signal to a job when a job suspend request is issued.
Example
RMCFG[base] SUSPENDSIG=19

Moab uses the specified suspend signal when suspending jobs within the base resource manager.

SUSPENDSIG should not be used with TORQUE or other PBS-based resource managers.

SYNCJOBID
Format <BOOLEAN>
Description

Specifies that Moab should migrate jobs to the local resource manager with the job's Moab-assigned job ID. In a grid, the grid-head will only pass dependencies to the underlying Moab if SYNCJOBID is set. This attribute can be used with the JOBIDFORMAT attribute and PROXYJOBSUBMISSION flag in order to synchronize job IDs between Moab and the resource manager. For more information about all steps necessary to synchronize job IDs between Moab and TORQUE, see Synchronizing Job IDs in TORQUE and Moab.

Example
RMCFG[slurm] TYPE=wiki:slurm SYNCJOBID=TRUE

SYSTEMMODIFYURL
Format [exec://<path> | http://<address> | <path>]

If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
Description Specifies how Moab modifies attributes of the system. This interface is used in data staging.
Example
RMCFG[base] SYSTEMMODIFYURL=exec:///tmp/system.modify.pl

Moab executes /tmp/system.modify.pl when it modifies system attributes in conjunction with the resource manager base.

SYSTEMQUERYURL
Format [file://<path> | http://<address> | <path>]

If file:// is specified, Moab treats the destination as a flat text file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file; if just a path is specified, Moab treats the destination as an executable.
Description Specifies how Moab queries attributes of the system. This interface is used in data staging.
Example
RMCFG[base] SYSTEMQUERYURL=file:///tmp/system.query

Moab reads /tmp/system.query when it queries the system in conjunction with the base resource manager.

TARGETUSAGE
Format <INTEGER>[%]
Default 90%
Description Amount of resource manager resources to explicitly use. In the case of a storage resource manager, indicates the target usage of data storage resources to dedicate to active data migration requests. If the specified value contains a percent sign (%), the target value is a percent of the configured value. Otherwise, the target value is considered to be an absolute value measured in megabytes (MB).
Example
RMCFG[storage] TYPE=NATIVE RESOURCETYPE=storage
RMCFG[storage] TARGETUSAGE=80%

Moab schedules data migration requests to never exceed 80% usage of the storage resource manager's disk cache and network resources.
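
The two ways of reading a TARGETUSAGE value (a percentage of the configured value, or an absolute number of megabytes) can be sketched as follows. The function name target_usage_mb and the configured_mb argument are hypothetical.

```python
def target_usage_mb(value, configured_mb):
    """Return the target usage in MB for a TARGETUSAGE value.

    value         -- the attribute value, such as "80%" or "2048"
    configured_mb -- the configured size of the resource in MB
    """
    text = value.strip()
    if text.endswith("%"):
        # With a percent sign, the target is a share of the configured value.
        return configured_mb * int(text[:-1]) / 100
    # Without a percent sign, the value is an absolute number of megabytes.
    return float(int(text))


if __name__ == "__main__":
    print(target_usage_mb("80%", 5000))
    print(target_usage_mb("2048", 5000))
```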

TIMEOUT
Format <INTEGER>
Default 30
Description Time (in seconds) the scheduler waits for a response from the resource manager.
Example
RMCFG[base] TIMEOUT=40

Moab waits 40 seconds to receive a response from the resource manager before timing out and giving up. Moab tries again on the next iteration.

TRIGGER
Format <TRIG_SPEC>
Description A trigger specification indicating behaviors to enforce in the event of certain events associated with the resource manager, including resource manager start, stop, and failure.
Example
RMCFG[base] TRIGGER=<X>

TYPE
Format <RMTYPE>[:<RMSUBTYPE>] where <RMTYPE> is one of the following: TORQUE, NATIVE, PBS, RMS, SSS, or WIKI, and the optional <RMSUBTYPE> value is RMS.
Default PBS
Description Specifies type of resource manager to be contacted by the scheduler.

For TYPE WIKI, AUTHTYPE must be set to CHECKSUM. The <RMSUBTYPE> option is currently only used to support Compaq's RMS resource manager in conjunction with PBS. In this case, the value PBS:RMS should be specified.

Example
RMCFG[clusterA] TYPE=PBS HOST=clusterA PORT=15003
RMCFG[clusterB] TYPE=PBS HOST=clusterB PORT=15005

Moab interfaces to two different PBS resource managers, one located on server clusterA at port 15003 and one located on server clusterB at port 15005.

USEVNODES
Format <BOOLEAN>
Default FALSE
Description Specifies whether to schedule on PBS virtual nodes. When set to TRUE, Moab queries PBSPro for vnodes and puts jobs on vnodes rather than hosts. In some systems, such as PBS + Altix, it may not be desirable to read in the host nodes; for such situations refer to the IGNHNODES attribute.
Example
RMCFG[pbs] USEVNODES=TRUE

VARIABLES
Format <VAR>=<VAL>[,<VAR>=<VAL>]
Description Opaque resource manager variables.
Example
RMCFG[base] VARIABLES=SCHEDDHOST=head1

Moab associates the variable SCHEDDHOST with the value head1 on resource manager base.

VERSION
Format <STRING>
Default SLURM: 10200 (i.e., 1.2.0)
Description Resource manager-specific version string.
Example
RMCFG[base] VERSION=10124

Moab assumes that resource manager base has a version number of 1.1.24.
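
The two values shown for VERSION (10200 for 1.2.0 and 10124 for 1.1.24) are consistent with an encoding of major * 10000 + minor * 100 + patch. The sketch below decodes a value on that assumption, which is inferred from these two examples rather than stated by the documentation.

```python
def decode_version(number):
    """Decode a numeric version, assuming major*10000 + minor*100 + patch."""
    major, rest = divmod(int(number), 10000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


if __name__ == "__main__":
    for value in (10200, 10124):
        print(value, "->", decode_version(value))
```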

VMOWNERRM
Format <STRING>
Description Used with provisioning resource managers that can create VMs. It specifies the resource manager that will own any VMs created by the resource manager.
Example
RMCFG[torque]
RMCFG[prov] RESOURCETYPE=PROV VMOWNERRM=torque

WORKLOADQUERYURL
Format [file://<path> | http://<address> | <path>]

If file:// is specified, Moab treats the destination as a flat text file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file; if just a path is specified, Moab treats the destination as an executable.
Description Specifies how Moab queries the resource manager for workload information. (See Native RM, URL Notes, and interface details.)
Example
RMCFG[TORQUE] WORKLOADQUERYURL=exec://$TOOLSDIR/job.query.dyn.pl

Moab executes /opt/moab/tools/job.query.dyn.pl to obtain updated workload information from resource manager TORQUE.

URL Notes

URL parameters can load files by using the file, exec, and http protocols.

For the protocol file, Moab loads the data directly from the text file pointed to by path.

RMCFG[base] SYSTEMQUERYURL=file:///tmp/system.query

For the protocol exec, Moab executes the file pointed to by path and loads the output written to STDOUT. If the script requires arguments, you can use a question mark (?) between the script name and the arguments, and an ampersand (&) for each space.

RMCFG[base] JOBVALIDATEURL=exec://$TOOLS/job.validate.pl
RMCFG[native] CLUSTERQUERYURL=exec:///opt/moab/tools/cluster.query.pl?-group=group1&-arch=x86
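
The ? and & convention for exec URLs can be illustrated with a pair of helpers that build such a URL from a script path and its arguments, and split it back into a command line. This is a sketch for hostless exec URLs only. Both function names are hypothetical, and each argument is assumed to contain no spaces.

```python
def build_exec_url(script_path, args=()):
    """Build an exec URL: '?' precedes the arguments, '&' replaces each space."""
    url = "exec://" + script_path
    if args:
        url += "?" + "&".join(args)
    return url


def split_exec_url(url):
    """Split an exec URL into [script_path, arg1, arg2, ...]."""
    prefix = "exec://"
    if not url.startswith(prefix):
        raise ValueError(f"not an exec URL: {url!r}")
    path, _, query = url[len(prefix):].partition("?")
    return [path] + (query.split("&") if query else [])


if __name__ == "__main__":
    url = build_exec_url("/opt/moab/tools/cluster.query.pl",
                         ["-group=group1", "-arch=x86"])
    print(url)
    print(split_exec_url(url))
```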

Synchronizing Job IDs in TORQUE and Moab

Unless you use an msub submit filter or you're in a grid, it is recommended that you use your RM-specific job submission command (for instance, qsub).

In order to synchronize your job IDs between TORQUE and Moab, you must perform the following steps:

  1. Verify that you are using TORQUE version 2.5.6 or later.
  2. Set SYNCJOBID to TRUE in all resource managers.
    RMCFG[torque] TYPE=PBS SYNCJOBID=TRUE
  3. Set the PROXYJOBSUBMISSION flag. With PROXYJOBSUBMISSION enabled, you must run Moab as a TORQUE manager or operator. Verify that other users can submit jobs using msub. Moab, as a non-root user, should still be able to submit jobs to TORQUE and synchronize job IDs.
    RMCFG[torque] TYPE=PBS SYNCJOBID=TRUE
    RMCFG[torque] FLAGS=PROXYJOBSUBMISSION
  4. Add JOBIDFORMAT=INTEGER to the internal RM. Adding this parameter forces Moab to use only numbers as job IDs and to synchronize those numbers across Moab, TORQUE, and the entire grid. This enhances the end-user experience as it eliminates multiple job IDs associated with a single job.
    RMCFG[torque] TYPE=PBS SYNCJOBID=TRUE
    RMCFG[torque] FLAGS=PROXYJOBSUBMISSION

    RMCFG[internal] JOBIDFORMAT=INTEGER

12.2-B Resource Manager Configuration Details

As with all scheduler parameters, RMCFG follows the syntax described within the Parameters Overview.

Resource Manager Types

The RMCFG parameter allows the scheduler to interface to multiple types of resource managers using the TYPE or SERVER attributes. By specifying these attributes, any of the following resource managers can be supported.

Type | Resource managers | Details
Moab | Moab Workload Manager | Use the Moab peer-to-peer (grid) capabilities to enable grids and other configurations. (See Grid Configuration.)
Native | Moab Native Interface | Used for connecting directly to scripts, files, databases, and Web services. (See Managing Resources Directly with the Native Interface.)
PBS | TORQUE (all versions) | N/A
SSS | Scalable Systems Software Project version 2.0 and higher | N/A
WIKI | Wiki interface specification version 1.0 and higher | Used for LRM, YRM, ClubMASK, BProc, SLURM, and others.

Resource Manager Name

Moab can support more than one resource manager simultaneously. Consequently, the RMCFG parameter takes an index value such as RMCFG[clusterA]. This index value essentially names the resource manager (as done by the deprecated parameter RMNAME). The resource manager name is used by the scheduler in diagnostic displays, logging, and in reporting resource consumption to the allocation manager. For most environments, the selection of the resource manager name can be arbitrary.

Resource Manager Location

The HOST, PORT, and SERVER attributes can be used to specify how the resource manager should be contacted. For many resource managers, the interface correctly establishes contact using default values. These attributes need to be specified only for resource managers that do not include defaults (such as the WIKI interface) or for resource managers that can be configured to run at non-standard locations (such as PBS). In all other cases, the resource manager is automatically located.
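
The following sketch shows how a non-default location might be specified with the HOST and PORT attributes. The host name and port number are hypothetical placeholders, not recommended values; substitute the values used at your site.

# contact a PBS server running at a non-standard location
# (host name and port are illustrative assumptions)
RMCFG[clusterA] TYPE=PBS HOST=pbs-head.example.com PORT=15101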

Resource Manager Flags

The FLAGS attribute can be used to modify many aspects of a resource manager's behavior.

Flag Description
ASYNCSTART Jobs started on this resource manager start asynchronously. In this case, the scheduler does not wait for confirmation that the job correctly starts before proceeding. (See Large Cluster Tuning for more information.)
AUTOSTART Jobs staged to this resource manager do not need to be explicitly started by the scheduler. The resource manager itself handles job launch.
AUTOSYNC Resource manager starts and stops together with Moab.

This requires either that the resource manager support a start and stop API or that the RMSTARTURL and RMSTOPURL attributes be set.

BECOMEMASTER Nodes reported by this resource manager will transfer ownership to this resource manager if they are currently owned by another resource manager that does not have this flag set.
CLIENT A client resource manager object is created for diagnostic/statistical purposes or to configure Moab's interaction with this resource manager. It represents an external entity that consumes server resources or services, allows a local administrator to track this usage, and configures specific policies related to that resource manager. A client resource manager object loads no data and provides no services.
CLOCKSKEWCHECKING Setting CLOCKSKEWCHECKING allows you to configure clock skew adjustments. Most of the time it is sufficient to use an NTP server to keep the clocks in your system synchronized.
COLLAPSEDVIEW

Not supported; this flag does not currently work.

The resource manager masks details about local workload and resources and presents only information relevant to the remote server.

DYNAMICCRED The resource manager creates credentials within the cluster as needed to support workload. (See Identity Manager Overview.)
EXECUTIONSERVER The resource manager is capable of launching and executing batch workload.
FSISREMOTE If the working file system does not exist on the server, add this flag to prevent Moab from validating files and directories at migration.
FULLCP Always checkpoint full job information (useful with Native resource managers).
HOSTINGCENTER The resource manager interface is used to negotiate an adjustment in dynamic resource access.
IGNQUEUESTATE The queue state reported by the resource manager should be ignored. May be used if queues must be disabled inside of a particular resource manager to allow an external scheduler to properly operate.
IGNWORKLOADSTATE

When this flag is applied to a native resource manager, any jobs that are reported via that resource manager's "workload query URL" have their reported state ignored. For example, if an RM has the IGNWORKLOADSTATE flag and it reports that a set of jobs has a state of "Running," this state is ignored and the jobs will either have a default state set or will inherit the state from another RM reporting on that same set of jobs.

This flag only changes the behavior of RMs of type NATIVE.

LOCALWORKLOADEXPORT When set, destination peers share information about local and remote jobs, allowing job management of different clusters at a single peer. For more information, see Workload Submission and Control.
MIGRATEALLJOBATTRIBUTES When set, this flag causes additional job information to be migrated to the resource manager; additional job information includes things such as node features applied via CLASSCFG[name] DEFAULT.FEATURES, the account to which the job was submitted, and job walltime limit.
NOAUTORES If the resource manager does not report CPU usage to Moab because CPU usage is at 0%, Moab assumes full CPU usage. When set, Moab recognizes the resource manager report as 0% usage. This is only valid for PBS.
NOCREATERESOURCE To use resources discovered from this resource manager, they must be created by another resource manager first. For example, if you set NOCREATERESOURCE on RM A, which reports nodes 1 and 2, and RM B only reports node 1, then node 2 will not be created because RM B did not report it.
PRIVATE The resources and workload reported by the resource manager are not reported to non-administrator users.
PROXYJOBSUBMISSION Enables administrator proxy job submission, which means administrators may submit jobs on behalf of other users.
PUSHSLAVEJOBUPDATES Enables job changes made on a grid slave to be pushed to the grid head or master. Without this flag, jobs being reported to the grid head do not show any changes made on the remote Moab server (via mjobctl and so forth).
RECORDGPUMETRICS Enables the recording of GPU metrics for nodes.
RECORDMICMETRICS Enables the recording of MIC metrics for nodes.
REPORT N/A
SHARED Resources of this resource manager may be scheduled by multiple independent sources and may not be assumed to be owned by any single source.
SLAVEPEER Information from this resource manager may not be used to identify new jobs or nodes. Instead, this information may only be used to update jobs and nodes discovered and loaded from other non-slave resource managers.
STATIC This resource manager only provides partial object information and this information does not change over time. Consequently, this resource manager may only be called once per object to modify job and node information.
USEPHYSICALMEMORY

This tells Moab to use a node's physical memory instead of the swap space.

For example, if a node has 12 GB of RAM and an additional 12 GB of swap space, it has 24 GB of virtual memory. Without this flag, if a 4 GB job is assigned to that node, the reported available memory still shows 12 GB because the job is considered to be using the swap space rather than the physical memory. The reported available memory does not decrease until the swap space is used up.

When this flag is set, the 4 GB job immediately reduces the available memory to 8 GB (physical memory - used memory).

USERSPACEISSEPARATE This tells Moab to skip validation of the user's UID and GID when that information does not exist on the Moab server.
 

Example

# resource manager 'torque' should use asynchronous job start 
# and report resources in 'grid' mode
RMCFG[torque] FLAGS=asyncstart,grid
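
As a further sketch, the AUTOSYNC flag can be paired with the RMSTARTURL and RMSTOPURL attributes when the resource manager has no start and stop API of its own. The label native, the exec:// URL form, and the script paths are assumptions made for illustration; the scripts are hypothetical and would need to be provided by the site.

# start and stop the resource manager together with Moab
# (script paths are hypothetical placeholders)
RMCFG[native] TYPE=NATIVE FLAGS=AUTOSYNC
RMCFG[native] RMSTARTURL=exec:///opt/moab/tools/example/rm.start.sh
RMCFG[native] RMSTOPURL=exec:///opt/moab/tools/example/rm.stop.sh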

12.2-C Scheduler/Resource Manager Interactions

In the simplest configuration, Moab interacts with the resource manager using the following four primary functions:

Function Description
GETJOBINFO Collect detailed state and requirement information about idle, running, and recently completed jobs.
GETNODEINFO Collect detailed state information about idle, busy, and defined nodes.
STARTJOB Immediately start a specific job on a particular set of nodes.
CANCELJOB Immediately cancel a specific job regardless of job state.

Using these four simple commands, Moab enables nearly its entire suite of scheduling functions. More detailed information about resource manager specific requirements and semantics for each of these commands can be found in the specific resource manager (such as WIKI) overviews.
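
For a native resource manager, one plausible way to picture these four functions is as four URL attributes, each pointing at a site-provided script. The sketch below is an illustration under stated assumptions, not a definitive mapping: the pairing of each function with an attribute is inferred from the attribute names, and the label, the exec:// URL form, and the script paths are hypothetical.

# GETJOBINFO  -> WORKLOADQUERYURL (assumed pairing)
# GETNODEINFO -> CLUSTERQUERYURL  (assumed pairing)
# STARTJOB    -> JOBSTARTURL      (assumed pairing)
# CANCELJOB   -> JOBCANCELURL     (assumed pairing)
RMCFG[native] TYPE=NATIVE
RMCFG[native] WORKLOADQUERYURL=exec:///opt/moab/tools/example/job.query.sh
RMCFG[native] CLUSTERQUERYURL=exec:///opt/moab/tools/example/node.query.sh
RMCFG[native] JOBSTARTURL=exec:///opt/moab/tools/example/job.start.sh
RMCFG[native] JOBCANCELURL=exec:///opt/moab/tools/example/job.cancel.sh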

In addition to these base commands, other commands are required to support advanced features such as suspend/resume, gang scheduling, and scheduler-initiated checkpoint/restart.

Information on creating a new scheduler resource manager interface can be found in the Adding New Resource Manager Interfaces section.

© 2014 Adaptive Computing