This section describes how to configure, request, and reserve cluster file system space and bandwidth, software licenses, and generic cluster resources.
Shared cluster resources such as file systems, networks, and licenses can be managed by creating a pseudo-node. You configure a pseudo-node via the NODECFG parameter much as you would a normal node, but additional information is required so that the scheduler can contact and synchronize state with the resource.
In the following example, a license manager is added as a cluster resource by defining the GLOBAL pseudo-node and specifying how the scheduler should query and modify its state.
NODECFG[GLOBAL] RMLIST=NATIVE
NODECFG[GLOBAL] QUERYCMD=/usr/local/bin/flquery.sh
NODECFG[GLOBAL] MODIFYCMD=/usr/local/bin/flmodify.sh
If defining a pseudo-node other than GLOBAL, the node name must be added to the RESOURCELIST list.
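For illustration, a hypothetical pseudo-node named fileserver1 might be configured as follows. The node name, script path, and the placement of RESOURCELIST on the native resource manager definition are all assumptions for this sketch, not taken from the original text:

RMCFG[NATIVE] TYPE=NATIVE RESOURCELIST=fileserver1
NODECFG[fileserver1] RMLIST=NATIVE
NODECFG[fileserver1] QUERYCMD=/usr/local/bin/fsquery.sh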
In some cases, pseudo-node resources are comparable to node-locked generic resources; however, a few fundamental differences determine when one method of describing resources should be used over the other. The following table contrasts the two resource types.
Attribute | Pseudo-Node | Generic Resource
---|---|---
Node-locked | No. Resources can be encapsulated as an independent node. | Yes. Must be associated with an existing compute node.
Requires exclusive batch system control over resource | No. Resources (such as file systems and licenses) may be consumed both inside and outside of batch system workload. | Yes. Resources must only be consumed by batch workload; use outside of batch control results in loss of resource synchronization.
Allows scheduler-level allocation of resources | Yes. If required, the scheduler can take external administrative action to allocate the resource to the job. | No. The scheduler can only maintain logical allocation information and cannot take any external action to allocate resources to the job.
Consumable floating resources are configured in the same way as node-locked generic resources, except that the GLOBAL node is used instead of a particular node.
NODECFG[GLOBAL] GRES=tape:4,matlab:2 ...
In this setup, four resources of type tape and two of type matlab are floating and available across all nodes.
Floating resources are requested on a per task basis using native resource manager job submission methods or using the GRES resource manager extensions.
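For example, a TORQUE job might request one floating tape resource per task via the GRES resource manager extension. The resource name and counts here are illustrative:

> qsub -l nodes=2,walltime=1:00:00 -W x=GRES:tape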
Moab allows both the file space and bandwidth attributes of a cluster file system to be tracked, reserved, and scheduled. With this capability, a job or reservation may request a particular quantity of file space and a required amount of I/O bandwidth to this file system. While file system resources are managed as a cluster generic resource, they are specified using the FS attribute of the NODECFG parameter, as in the following example:
NODECFG[GLOBAL] FS=PV1:10000@100,PV2:5000@100 ...
In this example, PV1 defines a 10 GB file system with a maximum throughput of 100 MB/s while PV2 defines a 5 GB file system also possessing a maximum throughput of 100 MB/s.
A job may request cluster file system resources using the fs resource manager extension. For a TORQUE based system, the following could be used:
> qsub -l nodes=1,walltime=1:00:00 -W x=fs:10@50
Jobs may request and reserve software licenses using native methods or using the GRES resource manager extension. If the cluster license manager does not support a query interface, license availability may be specified within Moab using the GRES attribute of the NODECFG parameter.
Example
Configure Moab to support four floating quickcalc and two floating matlab licenses.
NODECFG[GLOBAL] GRES=quickcalc:4,matlab:2 ...
Submit a TORQUE job requesting a node-locked or floating quickcalc license.
> qsub -l nodes=1,software=quickcalc,walltime=72000 testjob.cmd
Moab can be configured to treat generic resources as features in order to provide more control over server access. For instance, if a node is configured with a certain GRES and that GRES is turned off, jobs requesting the node will not run. To turn a GRES into a feature, set the FEATUREGRES attribute of GRESCFG to TRUE in the moab.cfg file.
GRESCFG[gres1] FEATUREGRES=TRUE
Moab now treats gres1 as a scheduler-wide feature rather than a normal generic resource. Note that jobs are submitted normally using the same GRES syntax.
If you are running a grid, verify that FEATUREGRES=TRUE is set on all members of the grid.
You can safely upgrade an existing cluster to use the feature while jobs are running. If you are in a grid, upgrade all clusters at the same time.
Two methods exist for managing GRES features: Moab commands and the resource manager. Feature changes made with Moab commands are not checkpointed; they do not remain in place when Moab restarts. Changes made through the resource manager are reported by the RM, so any changes made before a Moab restart are still present after it.
These methods are mutually exclusive. Use one or the other, but do not mix methods.
In the following example, gres1 and gres2 are configured in the moab.cfg file. gres1 is not currently functioning correctly, so it is set to 0, turning the feature off. Values above 0, and unspecified values, turn the feature on.
NODECFG[GLOBAL] GRES=gres1:0
NODECFG[GLOBAL] GRES=gres2:10000
GRESCFG[gres1] FEATUREGRES=TRUE
GRESCFG[gres2] FEATUREGRES=TRUE
Moab now treats gres1 and gres2 as features. To verify that this is set up correctly, run mdiag -S -v, which returns the following:
> mdiag -S -v
...
Scheduler FeatureGres: gres1:off,gres2:on
Once Moab has started, use mschedctl -m to modify whether the feature is turned on or off.
mschedctl -m sched featuregres:gres1=on
INFO: FeatureGRes 'gres1' turned on
You can verify that the feature turned on or off by once again running mdiag -S -v.
If Moab restarts, it does not checkpoint the state of these changed feature GRES. Instead, it reads the moab.cfg file to determine whether the feature GRES is on or off.
With feature GRES configured, jobs are submitted normally, requesting GRES types gres1 and gres2. Moab ignores GRES counts and reads the feature simply as on or off.
> msub -l nodes=1,walltime=600,gres=gres1
1012
> checkjob 1012
job 1012
AName: STDIN
State: Running
.....
StartTime: Tue Jul 3 15:33:28
Feature GRes: gres1
Total Requested Tasks: 1
If you request a feature that is currently turned off, the job's state is reported as Idle rather than Running, and a message like the following returns:
BLOCK MSG: requested feature gres 'gres2' is off
You can automate the process of having a feature GRES turn on and off by setting up an external tool and configuring Moab to query the tool the same way that Moab queries a license manager. For example:
RMCFG[myRM] CLUSTERQUERYURL=file:///$HOME/tools/myRM.dat TYPE=NATIVE RESOURCETYPE=LICENSE
GRESCFG[gres1] FEATUREGRES=TRUE
GRESCFG[gres2] FEATUREGRES=TRUE
RESOURCETYPE=LICENSE means that the RM does not contain any compute resources and that Moab should not attempt to use it to manage any jobs (start, cancel, submit, and so forth).
The myRM.dat file should contain something like the following:
GLOBAL state=Idle cres=gres1:0 cres=gres2:10
External tools can easily update the file based on filesystem availability. Switching any of the feature GRES to 0 turns it off and switching it to a positive value turns it on. If you use this external mechanism, you do not need to use mschedctl -m to turn a feature GRES on or off. You also do not need to worry about whether Moab has checkpointed the information or not, since the information is provided by the RM and not by any external commands.
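A minimal Python sketch of such an external tool, assuming the single-line GLOBAL format shown above: it rewrites the data file, setting one feature GRES to 0 or a positive count based on an availability check. The file path, function name, and resource names are illustrative, not part of the Moab product.

```python
import re

def set_feature_gres(dat_path, gres_name, available, on_count=1):
    """Rewrite the native RM data file, turning one feature GRES on or off.

    The file is expected to hold a single line such as:
        GLOBAL state=Idle cres=gres1:0 cres=gres2:10
    A count of 0 turns the feature off; any positive count turns it on.
    """
    with open(dat_path) as f:
        line = f.read().strip()
    count = on_count if available else 0
    # Replace only the cres entry for the named GRES, keeping the rest intact.
    new_line, n = re.subn(
        r"cres=%s:\d+" % re.escape(gres_name),
        "cres=%s:%d" % (gres_name, count),
        line,
    )
    if n == 0:
        # GRES not present yet; append a new cres entry.
        new_line = "%s cres=%s:%d" % (line, gres_name, count)
    with open(dat_path, "w") as f:
        f.write(new_line + "\n")
```

A cron job or filesystem monitor could then call, for example, set_feature_gres('/tmp/myRM.dat', 'gres1', os.path.ismount('/scratch')) so that Moab's cluster query picks up the new state on its next iteration.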
Copyright © 2012 Adaptive Computing Enterprises, Inc.®