The documentation below describes how to configure Moab to interface with SLURM.
For Moab-SLURM integration, Moab 6.0 or higher and SLURM 2.2 or higher are recommended. For use with SLURM, the generic version from the downloads page is needed.
slurm.conf
SchedulerType=sched/wiki2
The SchedulerType parameter controls the communication protocol used between Moab and SLURM. This interface can be customized using the wiki.conf configuration file located in the same directory and further documented in the SLURM Admin Manual.
Note: To allow sharing of nodes, the SLURM partition should be configured with the 'Shared=yes' attribute.
To configure Moab to use SLURM, the parameter 'RMCFG' should be set to use the WIKI:SLURM protocol as in the example below.
moab.cfg
SCHEDCFG[base] MODE=NORMAL
RMCFG[base] TYPE=WIKI:SLURM
...
Note: The RMCFG index (set to base in the example above) can be any value chosen by the site. Also, if SLURM is running on a node other than the one on which Moab is running, then the SERVER attribute of the RMCFG parameter should be set.
Note: SLURM has a SchedulerPort parameter, which sets the port used to communicate with the scheduler. Moab auto-detects this port and communicates with SLURM with no explicit configuration required. Do NOT set Moab's SCHEDCFG[] PORT attribute to this value. That attribute controls Moab client communication, and setting it to match the SchedulerPort value will cause conflicts. With no changes, the default configuration will work fine.
Note: If the SLURM client commands/executables are not available on the machine running Moab, SLURM partition information and certain other configuration information will not be automatically imported from SLURM, so this information must be set up manually in Moab. In addition, the SLURM VERSION should be set as an attribute on the RMCFG parameter. If it is not set, the default is version 1.2.0. The following example shows how to set this line if SLURM v1.1.24 is running on a host named Node01 (set using the SERVER attribute).
moab.cfg with SLURM on Host Node01
RMCFG[base] TYPE=WIKI:SLURM SERVER=Node01 VERSION=10124 ...
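The example above writes SLURM v1.1.24 as VERSION=10124. The following Python sketch shows one encoding that is consistent with that single example: major * 10000 + minor * 100 + patch. This scheme is inferred, not stated in this document, and the helper name slurm_version_code is hypothetical, so the result should be checked against the Moab documentation before use.

```python
def slurm_version_code(version: str) -> int:
    """Encode 'major.minor.patch' as a single integer.

    ASSUMPTION: the scheme major*10000 + minor*100 + patch is inferred
    from the one documented pair (1.1.24 -> 10124); it is not confirmed.
    """
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 'major.minor.patch', got {version!r}")
    major, minor, patch = (int(p) for p in parts)
    # Each of minor and patch must fit in two decimal digits for this scheme.
    if major < 0 or not 0 <= minor < 100 or not 0 <= patch < 100:
        raise ValueError(f"version component out of range in {version!r}")
    return major * 10000 + minor * 100 + patch


if __name__ == "__main__":
    code = slurm_version_code("1.1.24")
    print(f"RMCFG[base] TYPE=WIKI:SLURM SERVER=Node01 VERSION={code}")
```

Under the same assumption, the stated default of version 1.2.0 would correspond to 10200.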
Authorizing Users to Use 'Expedite'
To allow users to request 'expedite' jobs, each user must be added to the 'expedite' QoS. This can be accomplished using the MEMBERULIST attribute as in the following example:
MEMBERULIST
# allow josh, steve, and user c1443 to submit 'expedite' jobs
QOSCFG[expedite] MEMBERULIST=josh,steve,c1443
...
Excluding Nodes for 'Expedite' and 'Standby' Usage
Both 'expedite' and 'standby' jobs can be independently excluded from certain nodes by creating a QoS-based standing reservation. Specifically, this is accomplished by creating a reservation with a logical-not QoS ACL and a hostlist indicating which nodes are to be exempted as in the following example:
moab.cfg
# block expedite jobs from reserved nodes
SRCFG[expedite-blocker] QOSLIST=!expedite
SRCFG[expedite-blocker] HOSTLIST=c001[3-7],c200
SRCFG[expedite-blocker] PERIOD=INFINITY
# block standby jobs from rack 13
SRCFG[standby-blocker] QOSLIST=!standby
SRCFG[standby-blocker] HOSTLIST=R:r13-[0-13]
SRCFG[standby-blocker] PERIOD=INFINITY
...
moab.cfg
SCHEDCFG[cluster1] MODE=NORMAL SERVER=head.cluster1.org
RMCFG[slurm] TYPE=wiki:slurm
NODEALLOCATIONPOLICY CONTIGUOUS
...
/usr/local/etc/wiki.conf
AuthKey=4322953 ...
moab.cfg
RMCFG[slurm] TYPE=wiki:slurm AUTHTYPE=CHECKSUM ...
moab-private.cfg
CLIENTCFG[RM:slurm] KEY=4322953 ...
Note: For the CHECKSUM authorization method, the key value specified in the moab-private.cfg file must be a decimal, octal, or hexadecimal value; it cannot be an arbitrary non-numeric string.
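As a pre-flight check for the numeric-key rule above, a site could validate a candidate key before writing it to wiki.conf and moab-private.cfg. The Python sketch below assumes C-style notation (a leading 0x for hexadecimal, a leading 0 for octal); this document does not say which notation Moab accepts, and the function name is_numeric_key is hypothetical.

```python
import re

# ASSUMPTION: C-style literals. Hex is 0x..., octal is 0 followed by
# digits 0-7, decimal has no leading zero. Not confirmed by the source.
_NUMERIC_KEY = re.compile(r"0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*")


def is_numeric_key(key: str) -> bool:
    """Return True if key looks like a decimal, octal, or hex integer."""
    return _NUMERIC_KEY.fullmatch(key) is not None


if __name__ == "__main__":
    for candidate in ("4322953", "0x41F5C9", "0755", "s3cret"):
        verdict = "ok" if is_numeric_key(candidate) else "rejected"
        print(f"{candidate}: {verdict}")
```

The same key must appear in both the SLURM wiki.conf AuthKey setting and the Moab CLIENTCFG KEY setting, as in the examples above.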
slurm.conf
PartitionName=batch Nodes=node[1-64] Default=YES MaxTime=INFINITE State=UP Shared=FORCE
Note: To use SLURM high availability, the SLURM parameter StateSaveLocation must point to a shared directory which is readable and writable by both the primary and backup hosts. See the slurm.conf man page for additional information.