Not all resource managers are created equal: the capabilities available vary widely from system to system. In addition, there is a large body of functionality that many, if not all, resource managers have no concept of. Job QoS is a good example. Because most resource managers have no concept of quality of service, they provide no mechanism for users to specify it. In many cases, Moab can add such capabilities at a global level. However, a number of features require a per-job specification. Resource manager extensions allow this information to be associated with the job.
Specifying resource manager extensions varies by resource manager. TORQUE, OpenPBS, PBSPro, Loadleveler, LSF, S3, and Wiki each allow the specification of an extension field as described in the following table:
Resource Manager | Specification Method | Example
---|---|---
TORQUE 2.0+ | -l | > qsub -l nodes=3,qos=high sleepy.cmd
TORQUE 1.x/OpenPBS | -W x= | > qsub -l nodes=3 -W x=qos:high sleepy.cmd
Loadleveler | #@comment | #@nodes = 3 and #@comment = qos:high (on separate lines)
LSF | -ext | > bsub -ext advres:system.2
PBSPro | -l | > qsub -l advres=system.2
Wiki | comment | comment=qos:high
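The per-resource-manager syntax in the table above can be captured in a small lookup. The following is only an illustrative sketch: the dictionary keys and the function name `format_extension` are hypothetical, and the templates are copied from the example column of the table (note that the LSF row uses a colon, while other `-ext` examples on this page use `=`).

```python
# Hypothetical helper: the templates mirror the table rows above.
# The keys ("torque2", "lsf", ...) are illustrative names, not Moab identifiers.
SYNTAX = {
    "torque2": "-l {attr}={value}",
    "torque1": "-W x={attr}:{value}",
    "loadleveler": "#@comment = {attr}:{value}",
    "lsf": "-ext {attr}:{value}",
    "pbspro": "-l {attr}={value}",
    "wiki": "comment={attr}:{value}",
}


def format_extension(rm: str, attr: str, value: str) -> str:
    """Render one extension in the syntax used by the given resource manager."""
    try:
        template = SYNTAX[rm.lower()]
    except KeyError:
        raise ValueError(f"unknown resource manager: {rm!r}") from None
    return template.format(attr=attr, value=value)
```

For example, `format_extension("torque1", "qos", "high")` yields the `-W x=qos:high` fragment shown in the TORQUE 1.x/OpenPBS row.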
Using the resource manager specific method, the following job extensions are currently available:
ADVRES | |||||
Format: | [!]<RSVID> | ||||
Default: | --- | ||||
Description: |
Specifies that reserved resources are required to run the job. If <RSVID> is specified, then only resources within the specified reservation may be allocated (see Job to Reservation Binding). You can request to not use a specific reservation by using advres=!<reservationname>. |
||||
Example: |
> qsub -l advres=grid.3
> qsub -l advres=!grid.5 |
||||
BANDWIDTH | |||||
Format: | <DOUBLE> (in MB/s) | ||||
Default: | --- | ||||
Description: | Minimum available network bandwidth across allocated resources. (See Network Management.) | ||||
Example: |
> bsub -ext bandwidth=120 chemjob.txt |
||||
DDISK | |||||
Format: | <INTEGER> | ||||
Default: | 0 | ||||
Description: | Dedicated disk per task in MB. | ||||
Example: |
qsub -l ddisk=2000 |
||||
DEADLINE | |||||
Format: | [[[DD:]HH:]MM:]SS | ||||
Default: | --- | ||||
Description: | Relative completion deadline of job (from job submission time). | ||||
Example: |
> qsub -l deadline=2:00:00,nodes=4 /tmp/bio3.cmd |
||||
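Several extensions on this page (DEADLINE, MINPREEMPTTIME, MINWCLIMIT, NODESETDELAY) take a `[[[DD:]HH:]MM:]SS` value. As a rough sketch of how such a value maps to seconds, a client-side script might do the following; `parse_duration` is a hypothetical name and this is not Moab's own parser.

```python
def parse_duration(spec: str) -> int:
    """Convert a [[[DD:]HH:]MM:]SS string to a number of seconds."""
    fields = spec.split(":")
    if not 1 <= len(fields) <= 4 or not all(f.isdecimal() for f in fields):
        raise ValueError(f"not a [[[DD:]HH:]MM:]SS value: {spec!r}")
    # The rightmost field is seconds; multipliers grow towards the left.
    multipliers = (1, 60, 3600, 86400)
    return sum(int(f) * m for f, m in zip(reversed(fields), multipliers))
```

With this sketch, the `deadline=2:00:00` value from the example above corresponds to 7200 seconds after submission.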
DEPEND | |||||
Format: | [<DEPENDTYPE>:][{jobname|jobid}.]<ID>[:[{jobname|jobid}.]<ID>]... | ||||
Default: | --- | ||||
Description: | Allows specification of job dependencies for compute or system jobs. If no ID prefix (jobname or jobid) is specified, the ID value is interpreted as a job ID. | ||||
Example: |
# submit job which will run after job 1301 and 1304 complete
> msub -l depend=orion.1301:orion.1304 test.cmd
orion.1322
# submit jobname-based dependency job
> msub -l depend=jobname.data1005 dataetl.cmd
orion.1428 |
||||
DMEM | |||||
Format: | <INTEGER> | ||||
Default: | 0 | ||||
Description: | Dedicated memory per task in bytes. | ||||
Example: |
msub -l DMEM=20480 |
||||
EPILOGUE | |||||
Format: | <STRING> | ||||
Default: | --- | ||||
Description: | Specifies a user-owned epilogue script which is run
before the system epilogue and epilogue.user scripts at the completion of a job.
The syntax is epilogue=<file>. The file can be designated
with an absolute or relative path.
|
||||
Example: |
msub -l epilogue=epilogue_script.sh job.sh |
||||
EXCLUDENODES | |||||
Format: | {<nodeid>|<node_range>}[:...] | ||||
Default: | --- | ||||
Description: | Specifies nodes that should not be considered for the given job. | ||||
Example: |
msub -l excludenodes=k1:k2:k[5-8]
# Comma separated ranges work only with SLURM
msub -l excludenodes=k[1-2,5-8] |
||||
FEATURE | |||||
Format: | <FEATURE>[{:|}<FEATURE>]... | ||||
Default: | --- | ||||
Description: | Required list of node attribute/node features.
|
||||
Example: |
> qsub -l feature='fastos:bigio' testjob.cmd |
||||
GATTR | |||||
Format: | <STRING> | ||||
Default: | --- | ||||
Description: | Generic job attribute associated with job. The maximum size for an attribute is 63 bytes (the core Moab size limit of 64, including a null byte). | ||||
Example: |
> qsub -l gattr=bigjob |
||||
GEOMETRY | |||||
Format: | {(<TASKID>[,<TASKID>[,...]])[(<TASKID>[,...])...]} | ||||
Default: | --- | ||||
Description: | Explicitly specified task geometry. | ||||
Example: |
> qsub -l nodes=2:ppn=4 -W x=geometry:'{(0,1,4,5)(2,3,6,7)}' quanta2.cmd |
||||
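To make the GEOMETRY format concrete, the sketch below splits a geometry string into its parenthesized groups of task IDs. The function name `parse_geometry` is hypothetical, and how Moab maps each group onto allocated resources is not covered here.

```python
import re


def parse_geometry(spec: str) -> list[list[int]]:
    """Split a geometry string into one list of task IDs per (...) group."""
    # Validate the overall shape first: {(<id>[,<id>...])[(...)...]}
    if not re.fullmatch(r"\{(\(\d+(,\d+)*\))+\}", spec):
        raise ValueError(f"malformed geometry: {spec!r}")
    groups = re.findall(r"\(([^)]*)\)", spec)
    return [[int(task) for task in group.split(",")] for group in groups]
```

Applied to the example above, `parse_geometry("{(0,1,4,5)(2,3,6,7)}")` gives two groups of four task IDs each.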
GMETRIC | |||||
Format: | generic metric requirement for allocated nodes where the requirement is specified using the format <GMNAME>[:{lt:,le:,eq:,ge:,gt:,ne:}<VALUE>] | ||||
Default: | --- | ||||
Description: | Indicates generic constraints that must be found on all allocated nodes. If a <VALUE> is not specified, the node must simply possess the generic metric. (See Generic Metrics for more information.) | ||||
Example: |
> qsub -l gmetric=bioversion:ge:133244 testj.txt |
||||
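The GMETRIC requirement format can be illustrated with a small evaluator that checks one requirement against the metrics reported for a node. This is a sketch under assumptions: `node_satisfies` and the `node_metrics` dictionary are hypothetical, and values are compared as numbers.

```python
import operator

# The six comparison keywords listed in the Format row.
COMPARATORS = {
    "lt": operator.lt, "le": operator.le, "eq": operator.eq,
    "ge": operator.ge, "gt": operator.gt, "ne": operator.ne,
}


def node_satisfies(requirement: str, node_metrics: dict[str, float]) -> bool:
    """Check <GMNAME>[:<cmp>:<VALUE>] against a node's generic metrics."""
    name, sep, rest = requirement.partition(":")
    if not sep:
        # No <VALUE>: the node only has to possess the metric.
        return name in node_metrics
    op, sep, value = rest.partition(":")
    if op not in COMPARATORS or not sep:
        raise ValueError(f"malformed gmetric requirement: {requirement!r}")
    if name not in node_metrics:
        return False
    return COMPARATORS[op](node_metrics[name], float(value))
```

A job using `gmetric=bioversion:ge:133244` would, under this sketch, accept a node reporting `bioversion` as 133300 and reject one reporting 100.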
GPUs | |||||
Format: |
msub -l nodes=<VALUE>:ppn=<VALUE>:gpus=<VALUE>[:mode][:reseterr]
Where mode is one of:
exclusive - The default setting. The GPU is used exclusively by one process thread.
exclusive_thread - The GPU is used exclusively by one process thread.
exclusive_process - The GPU is used exclusively by one process regardless of process thread.
If present, Moab passes the mode and
|
||||
Default: | --- | ||||
Description: | Moab schedules GPUs as a special type of node-locked generic resources. When TORQUE reports GPUs to Moab, Moab can schedule jobs and correctly assign GPUs to ensure that jobs are scheduled efficiently. To have Moab schedule GPUs, configure them in TORQUE then submit jobs using the "GPU" attribute. Moab automatically parses the "GPU" attribute and assigns them in the correct manner. For information about GPU metrics, see GPGPUMetrics. | ||||
Examples: |
> msub -l nodes=2:ppn=2:gpus=1:exclusive_process:reseterr
> msub -l nodes=4:gpus=1,tpn=2
> msub -l nodes=4:gpus=1:reseterr
> msub -l nodes=4:gpus=2+1:ppn=2,walltime=600 |
||||
GRES and SOFTWARE | |||||
Format: | Percent sign (%) delimited list of generic resources where each resource is specified using the format <RESTYPE>[{+|:}<COUNT>] | ||||
Default: | --- | ||||
Description: | Indicates generic resources required by the job. If the generic resource is node-locked, it is a per-task count. If a <COUNT> is not specified, the resource count defaults to 1. | ||||
Example: |
> qsub -W x=GRES:tape+2%matlab+3 testj.txt
> qsub -l gres=tape+2%matlab+3 testj.txt
> qsub -l software=matlab:2 testj.txt |
||||
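The percent-delimited GRES format, including the default count of 1, can be sketched as a small parser. `parse_gres` is a hypothetical name used only for illustration.

```python
import re


def parse_gres(spec: str) -> dict[str, int]:
    """Parse '<RESTYPE>[{+|:}<COUNT>]' items separated by '%'."""
    resources = {}
    for item in spec.split("%"):
        match = re.fullmatch(r"([^+:%]+)(?:[+:](\d+))?", item)
        if match is None:
            raise ValueError(f"malformed generic resource: {item!r}")
        name, count = match.groups()
        # A missing <COUNT> defaults to 1, as stated in the description.
        resources[name] = int(count) if count else 1
    return resources
```

For the examples above, `tape+2%matlab+3` maps to two `tape` and three `matlab` resources, and `matlab:2` maps to two `matlab` resources.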
HOSTLIST | |||||
Format: | '+' delimited list of hostnames; also, ranges and regular expressions | ||||
Default: | --- | ||||
Description: | Indicates an exact set, superset, or subset of nodes on
which the job must run.
|
||||
Examples: |
> msub -l hostlist=nodeA+nodeB+nodeE
hostlist=foo[1-5]
hostlist=foo1+foo[3-9]
hostlist=foo[1,3-9]
hostlist=foo[1-3]+bar[72-79] |
||||
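The range notation in the HOSTLIST examples can be expanded as follows. This is a minimal sketch that handles only plain names and bracketed numeric ranges; regular expressions are passed through untouched, zero padding is not preserved, and `expand_hostlist` is a hypothetical name.

```python
import re


def expand_hostlist(spec: str) -> list[str]:
    """Expand a '+'-delimited host list with [a-b,c] style numeric ranges."""
    hosts = []
    for item in spec.split("+"):
        match = re.fullmatch(r"([^\[\]]+)\[([\d,\-]+)\]", item)
        if match is None:
            hosts.append(item)  # plain hostname, kept as written
            continue
        prefix, ranges = match.groups()
        for part in ranges.split(","):
            first, _, last = part.partition("-")
            # A single index such as "1" is treated as the range 1-1.
            for index in range(int(first), int(last or first) + 1):
                hosts.append(f"{prefix}{index}")
    return hosts
```

Under these assumptions, `foo[1,3-9]` and `foo1+foo[3-9]` expand to the same eight hosts.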
JGROUP | |||||
Format: | <JOBGROUPID> | ||||
Default: | --- | ||||
Description: | ID of job group to which this job belongs (different from the GID of the user running the job). | ||||
Example: |
> msub -l JGROUP=bluegroup |
||||
JOBFLAGS (aka FLAGS) | |||||
Format: | one or more of the following colon delimited job flags including ADVRES[:RSVID], NOQUEUE, NORMSTART, PREEMPTEE, PREEMPTOR, RESTARTABLE, or SUSPENDABLE (see job flag overview for a complete listing) | ||||
Default: | --- | ||||
Description: | Associates various flags with the job. | ||||
Example: |
> qsub -l nodes=1,walltime=3600,jobflags=advres myjob.py
|
||||
JOBREJECTPOLICY | |||||
Format: | One or more of CANCEL, HOLD, IGNORE (beta), MAIL, or RETRY | ||||
Default: | HOLD | ||||
Description: |
Specifies the action to take when the scheduler determines that a job can never run. CANCEL issues a call to the resource manager to cancel the job. HOLD places a batch hold on the job preventing the job from being further evaluated until released by an administrator. (Note: Administrators can dynamically alter job attributes and possibly fix the job with mjobctl -m.) With IGNORE (currently in beta), the scheduler will allow the job to exist within the resource manager queue but will neither process it nor report it. MAIL will send email to both the admin and the user when rejected jobs are detected. If RETRY is set, then Moab will allow the job to remain idle and will only attempt to start the job when the policy violation is resolved. Any combination of attributes may be specified. See QOSREJECTPOLICY. This is a per-job policy specified with msub -l. JOBREJECTPOLICY also exists as a global parameter. |
||||
Example: |
> msub -l jobrejectpolicy=cancel:mail |
||||
LOGLEVEL | |||||
Format: | <INTEGER> | ||||
Default: | --- | ||||
Description: | Per job log verbosity. | ||||
Example: |
> qsub -W x=loglevel:5 bw.cmd |
||||
MAXMEM | |||||
Format: | <INTEGER> (in megabytes) | ||||
Default: | --- | ||||
Description: | Maximum amount of memory the job may consume across all tasks before the JOBMEM action is taken. | ||||
Example: |
> qsub -W x=MAXMEM:1000mb bw.cmd |
||||
MAXPROC | |||||
Format: | <INTEGER> | ||||
Default: | --- | ||||
Description: | Maximum CPU load the job may consume across all tasks before the JOBPROC action is taken. | ||||
Example: |
> qsub -W x=MAXPROC:4 bw.cmd |
||||
MINPREEMPTTIME | |||||
Format: | [[[DD:]HH:]MM:]SS | ||||
Default: | --- | ||||
Description: | Minimum time job must run before being eligible for preemption.
|
||||
Example: |
> qsub -l minpreempttime=900 bw.cmd |
||||
MINPROCSPEED | |||||
Format: | <INTEGER> | ||||
Default: | 0 | ||||
Description: | Minimum processor speed (in MHz) for every node that this job will run on. | ||||
Example: |
> qsub -W x=MINPROCSPEED:2000 bw.cmd |
||||
MINWCLIMIT | |||||
Format: | [[[DD:]HH:]MM:]SS | ||||
Default: | 1:00:00 | ||||
Description: | Minimum wallclock limit job must run before being eligible for extension. (See JOBEXTENDDURATION or JOBEXTENDSTARTWALLTIME.) | ||||
Example: |
> qsub -l minwclimit=300,walltime=16000 bw.cmd |
||||
MSTAGEIN | |||||
Format: | [<SRCURL>[|<SRCURL>...]%]<DSTURL> | ||||
Default: | --- | ||||
Description: | Indicates a job has data staging requirements. The source URL(s) listed will be transferred to the execution system for use by the job. If more than one source URL is specified, the destination URL must be a directory.
The format of <SRCURL> is [PROTO://][HOST[:PORT]][/PATH], where the path is local.
The format of <DSTURL> is [PROTO://][HOST[:PORT]][/PATH], where the path is remote.
PROTO can be any of the following protocols: ssh, file, or gsiftp.
HOST is the name of the host where the file resides.
PATH is the path of the source or destination file. The destination path may be a directory when sending a single file and must be a directory when sending multiple files. If a directory is specified, it must end with a forward slash (/).
Valid variables include:
$JOBID
$HOME - Path the script was run from
$RHOME - Home dir of the user on the remote system
$SUBMITHOST
$DEST - This is the Moab where the job will run
$LOCALDATASTAGEHEAD
|
||||
Example: |
> msub -W x='mstagein=file://$HOME/helperscript.sh|file:///home/dev/datafile.txt%ssh://host/home/dev/' script.sh |
||||
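The staging format shared by MSTAGEIN and MSTAGEOUT separates sources with `|` and the destination with `%`. The sketch below splits such a value and applies the directory rule from the description; `parse_stage_spec` is a hypothetical name, and the sketch assumes the URLs themselves contain no `%` or `|` characters.

```python
def parse_stage_spec(spec: str) -> tuple[list[str], str]:
    """Split '[<SRCURL>[|<SRCURL>...]%]<DSTURL>' into (sources, destination)."""
    sources_part, sep, destination = spec.rpartition("%")
    # Without a '%' the whole value is the destination and there are no sources.
    sources = sources_part.split("|") if sep else []
    if len(sources) > 1 and not destination.endswith("/"):
        raise ValueError("multiple sources need a directory destination ending in '/'")
    return sources, destination
```

For the MSTAGEIN example, this returns the two `file://` sources and the `ssh://host/home/dev/` destination directory.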
MSTAGEOUT | |||||
Format: | [<SRCURL>[|<SRCURL>...]%]<DSTURL> | ||||
Default: | --- | ||||
Description: | Indicates whether a job has data staging requirements. The source URL(s) listed will be transferred from the execution system after the completion of the job. If more than one source URL is specified, the destination URL must be a directory.
The format of <SRCURL> is [PROTO://][HOST[:PORT]][/PATH], where the path is remote.
The format of <DSTURL> is [PROTO://][HOST[:PORT]][/PATH], where the path is local.
PROTO can be any of the following protocols: ssh, file, or gsiftp.
HOST is the name of the host where the file resides.
PATH is the path of the source or destination file. The destination path may be a directory when sending a single file and must be a directory when sending multiple files. If a directory is specified, it must end with a forward slash (/).
Valid variables include:
$JOBID
$HOME - Path the script was run from
$RHOME - Home dir of the user on the remote system
$SUBMITHOST
$DEST - This is the Moab where the job will run
$LOCALDATASTAGEHEAD
|
||||
Example: |
> msub -W x='mstageout=ssh://$DEST/$HOME/resultfile1.txt|ssh://host/home/dev/resultscript.sh%file:///home/dev/' script.sh |
||||
NACCESSPOLICY
|
|||||
Format: | one of SHARED,
SINGLEJOB, SINGLETASK, SINGLEUSER, or UNIQUEUSER |
||||
Default: | --- |
||||
Description: | Specifies how node resources should be accessed. (See Node Access Policies for more information.)
|
||||
Example: |
> qsub -l naccesspolicy=singleuser bw.cmd
> bsub -ext naccesspolicy=singleuser lancer.cmd |
||||
NALLOCPOLICY
|
|||||
Format: | one of the valid settings for the parameter NODEALLOCATIONPOLICY | ||||
Default: | --- |
||||
Description: | Specifies how node resources should be selected and allocated to the job. (See Node Allocation Policies for more information.) | ||||
Example: |
> qsub -l nallocpolicy=minresource bw.cmd |
||||
NCPUS
|
|||||
Format: | <INTEGER> | ||||
Default: | --- |
||||
Description: | The number of processors in one task where a task
cannot span nodes. If NCPUS is used, then the resource manager's SUBMITPOLICY should be set to
NODECENTRIC to get correct behavior.
-l ncpus=<#> is equivalent to -l nodes=1:ppn=<#> when JOBNODEMATCHPOLICY is set to EXACTNODE.
NCPUS is used when submitting jobs to an SMP. When using GPUs to submit to an SMP, use -l ncpus=<#>:GPUs=<#>. |
||||
NMATCHPOLICY
|
|||||
Format: | one of the valid settings for the parameter JOBNODEMATCHPOLICY | ||||
Default: | --- |
||||
Description: | Specifies how node resources should be selected and allocated to the job. | ||||
Example: |
> qsub -l nodes=2 -W x=nmatchpolicy:exactnode bw.cmd |
||||
NODESET | |||||
Format: | <SETTYPE>:<SETATTR>[:<SETLIST>] | ||||
Default: | --- | ||||
Description: | Specifies nodeset constraints for job resource allocation. (See the NodeSet Overview for more information.) | ||||
Example: |
> qsub -l nodeset=ONEOF:PROCSPEED:350:400:450 bw.cmd |
||||
NODESETCOUNT | |||||
Format: | <INTEGER> | ||||
Default: | --- | ||||
Description: | Specifies how many node sets a job uses. | ||||
Example: |
> msub -l nodesetcount=2 |
||||
NODESETDELAY | |||||
Format: | [[[DD:]HH:]MM:]SS | ||||
Default: | --- | ||||
Description: |
Causes Moab to attempt to span a job evenly across nodesets unless doing so delays the job beyond the requested NODESETDELAY.
|
||||
Example: |
> qsub -l nodesetdelay=300,walltime=16000 bw.cmd |
||||
NODESETISOPTIONAL | |||||
Format: | <BOOLEAN> | ||||
Default: | --- | ||||
Description: | Specifies whether the nodeset constraint is optional. (See the NodeSet Overview for more information.)
|
||||
Example: |
> msub -l nodesetisoptional=true bw.cmd |
||||
OPSYS | |||||
Format: | <OperatingSystem> | ||||
Default: | --- | ||||
Description: | Specifies the job's required operating system. | ||||
Example: |
> qsub -l nodes=1,opsys=rh73 chem92.cmd |
||||
PARTITION | |||||
Format: | <STRING>[{,|:}<STRING>]... | ||||
Default: | --- | ||||
Description: | Specifies the partition (or partitions) in
which the job must run.
|
||||
Example: |
> qsub -l nodes=1,partition=math:geology |
||||
PLACEMENT | |||||
Format: | [numa=X][[:]sockets=Y][:usethreads] | ||||
Default: | --- | ||||
Description: | Specifies the task placement of jobs. | ||||
Example: |
> msub -l nodes=4:ppn=2,placement=numa=2
This places the job on 4 compute nodes with 2 processors per node, with 2 different NUMA nodes per compute node, and 1 processor per NUMA node. |
||||
PREF | |||||
Format: | [{feature|variable}:]<STRING>[:<STRING>]...
|
||||
Default: | --- | ||||
Description: | Specifies which node
features are preferred by the job and should be allocated if available. If
preferred node criteria are specified, Moab favors the allocation of matching
resources but is not bound to only consider these resources.
|
||||
Example: |
> qsub -l nodes=1,pref=bigmem
The job may run on any nodes but prefers to allocate nodes with the bigmem feature. |
||||
PROCS | |||||
Format: | <INTEGER> | ||||
Default: | --- | ||||
Description: |
Requests a specific number of processors for the job. Instead of determining how many nodes they need, users can decide how many processors they need, and Moab automatically requests the appropriate number of nodes from the RM. This also works with feature requests, such as procs=12[:feature1[:feature2[-]]].
|
||||
Example: |
msub -l procs=32 myjob.pl
Moab will request as many nodes as are necessary to meet the 32-processor requirement for the job. |
||||
PROLOGUE | |||||
Format: | <STRING> | ||||
Default: | --- | ||||
Description: | Specifies a user-owned prologue script which will be run after the system
prologue and prologue.user scripts at the beginning of a job. The syntax
is prologue=<file>. The file can be designated with an
absolute or relative path.
|
||||
Example: |
msub -l prologue=prologue_script.sh job.sh |
||||
QoS | |||||
Format: | <STRING> | ||||
Default: | --- | ||||
Description: | Requests the specified QoS for the job. | ||||
Example: |
> qsub -l walltime=1000,qos=highprio biojob.cmd |
||||
QUEUEJOB | |||||
Format: |
<BOOLEAN> |
||||
Default: | TRUE | ||||
Description: | Indicates whether or not the scheduler should queue the job if resources are not available to run the job immediately. | ||||
Example: |
msub -l nodes=1,queuejob=false test.cmd |
||||
REQATTR | |||||
Format: | Required node attributes with version number support: <ATTRIBUTE>[{>=|>|<=|<|=}<VERSION>] | ||||
Default: | --- | ||||
Description: | Indicates required node attributes. | ||||
Example: |
> qsub -l reqattr=matlab=7.1 testj.txt |
||||
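The REQATTR format pairs an attribute name with an optional comparison and version. A rough sketch of splitting such a value is shown below; `parse_reqattr` is a hypothetical name, and the sketch deliberately leaves the version as a string because the comparison rules for versions are not described here.

```python
import re
from typing import Optional


def parse_reqattr(spec: str) -> tuple[str, Optional[str], Optional[str]]:
    """Split '<ATTRIBUTE>[{>=|>|<=|<|=}<VERSION>]' into (name, operator, version)."""
    # Two-character operators are listed first so '>=' is not read as '>'.
    match = re.fullmatch(r"([^<>=]+)(?:(>=|<=|>|<|=)(.+))?", spec)
    if match is None:
        raise ValueError(f"malformed reqattr: {spec!r}")
    name, op, version = match.groups()
    return name, op, version
```

For the example above, `reqattr=matlab=7.1` splits into the attribute `matlab`, the operator `=`, and the version `7.1`.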
RESFAILPOLICY | |||||
Format: | one of CANCEL, HOLD, IGNORE, NOTIFY, or REQUEUE | ||||
Default: | --- | ||||
Description: | Specifies the action to take on an executing job if one or more allocated nodes fail. This setting overrides the global value specified with the NODEALLOCRESFAILUREPOLICY parameter. | ||||
Example: |
msub -l resfailpolicy=ignore |
||||
RMTYPE | |||||
Format: | <STRING> | ||||
Default: | --- | ||||
Description: | One of the resource manager types currently available within the cluster or grid. Typically, this is one of PBS, LSF, LL, SGE, SLURM, BProc, and so forth. | ||||
Example: |
msub -l rmtype=ll |
||||
SIGNAL | |||||
Format: | <INTEGER>[@<OFFSET>] | ||||
Default: | --- | ||||
Description: | Specifies the pre-termination signal to be sent to a job prior to it reaching its walltime limit or being terminated by Moab. The optional offset value specifies how long before job termination the signal should be sent. By default, the pre-termination signal is sent one minute before a job is terminated. | ||||
Example: |
> msub -l signal=32@120 bio45.cmd |
||||
SPRIORITY | |||||
Format: | <INTEGER> | ||||
Default: | 0 | ||||
Description: | Allows Moab administrators to set a system priority on a job (similar to setspri). This only works if the job submitter is an administrator. | ||||
Example: |
> qsub -l nodes=16,spriority=100 job.cmd |
||||
TEMPLATE | |||||
Format: | <STRING> | ||||
Default: | --- | ||||
Description: | Specifies a job template to be used as a set template. The requested template must have SELECT=TRUE. (See Job Templates.) | ||||
Example: |
> msub -l walltime=1000,nodes=16,template=biojob job.cmd |
||||
TERMTIME | |||||
Format: | <TIMESPEC> | ||||
Default: | 0 | ||||
Description: | Specifies the time at which Moab should cancel a queued or active job. (See Job Deadline Support.) | ||||
Example: |
> msub -l nodes=10,walltime=600,termtime=12:00_Jun/14 job.cmd |
||||
TPN | |||||
Format: | <INTEGER>[+] | ||||
Default: | 0 | ||||
Description: | Tasks per node allowed on allocated hosts. If the plus (+) character is
specified, the tasks per node value is interpreted as a minimum tasks per node
constraint; otherwise it is interpreted as an exact tasks per node constraint.
Note on differences between TPN and PPN: There are two key differences between the following:
(A) qsub -l nodes=12:ppn=3
(B) qsub -l nodes=12,tpn=3
The first difference is that ppn is interpreted as the minimum required tasks per node while tpn defaults to exact tasks per node; case (B) executes the job with exactly 3 tasks on each allocated node while case (A) executes the job with at least 3 tasks on each allocated node (for example, nodeA:4,nodeB:3,nodeC:5).
The second major difference is that the line nodes=X:ppn=Y actually requests X*Y tasks, whereas nodes=X,tpn=Y requests only X tasks. |
||||
Example: |
> msub -l nodes=10,walltime=600,tpn=4 job.cmd |
||||
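The two differences between ppn and tpn described above can be restated as a short sketch. Both function names are hypothetical; the first counts requested tasks, and the second checks a per-node task layout against an exact or minimum tasks-per-node constraint.

```python
def requested_tasks(nodes: int, ppn: int = 1) -> int:
    """Tasks requested by nodes=<nodes>[:ppn=<ppn>]; tpn does not change the count."""
    return nodes * ppn


def layout_ok(tasks_per_node: list[int], tpn: int, minimum: bool = False) -> bool:
    """Check a layout against tpn=<tpn> (exact) or tpn=<tpn>+ (minimum)."""
    if minimum:
        return all(count >= tpn for count in tasks_per_node)
    return all(count == tpn for count in tasks_per_node)
```

Case (A), `nodes=12:ppn=3`, requests 36 tasks, while case (B), `nodes=12,tpn=3`, requests 12 tasks that must be placed exactly 3 per node. A layout such as 4, 3, and 5 tasks on three nodes satisfies only the minimum form.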
TRIG | |||||
Format: | <TRIGSPEC> | ||||
Default: | --- | ||||
Description: | Adds trigger(s) to the job. (See the Trigger
Specification Page for specific syntax.)
|
||||
Example: |
> qsub -l trig=start:exec@/tmp/email.sh job.cmd |
||||
TRL (Format 1) | |||||
Format: | <INTEGER>[@<INTEGER>][:<INTEGER>[@<INTEGER>]]... | ||||
Default: | 0 | ||||
Description: | Specifies alternate task requests with their optional walltimes. (See Malleable Jobs.) | ||||
Example: |
> msub -l trl=2@500:4@250:8@125:16@62 job.cmd
or
> qsub -l trl=2:3:4
|
||||
TRL (Format 2) | |||||
Format: | <INTEGER>-<INTEGER> | ||||
Default: | 0 | ||||
Description: | Specifies a range of task requests that require the same walltime. (See Malleable Jobs.) | ||||
Example: |
> msub -l trl=32-64 job.cmd
|
||||
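Both TRL formats can be parsed with a few lines of code. The sketch below uses hypothetical function names; it returns walltimes as the bare integers given in the request and does not interpret their units.

```python
from typing import Optional


def parse_trl(spec: str) -> list[tuple[int, Optional[int]]]:
    """Format 1: '<tasks>[@<walltime>]' items separated by ':'."""
    requests = []
    for item in spec.split(":"):
        tasks, _, walltime = item.partition("@")
        # The walltime is optional; None means it was not given.
        requests.append((int(tasks), int(walltime) if walltime else None))
    return requests


def parse_trl_range(spec: str) -> tuple[int, int]:
    """Format 2: '<min>-<max>' range of task counts sharing one walltime."""
    low, sep, high = spec.partition("-")
    if not sep:
        raise ValueError(f"not a task range: {spec!r}")
    return int(low), int(high)
```

For the examples above, `trl=2@500:4@250:8@125:16@62` yields four task and walltime pairs, and `trl=32-64` yields the bounds 32 and 64.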
VAR | |||||
Format: | <ATTR>:<VALUE> | ||||
Default: | --- | ||||
Description: | Adds a generic variable or variables to the job. | ||||
Example: |
Single variable:
VAR=testvar1:testvalue1
Multiple variables:
VAR=testvar1:testvalue1+testvar2:testvalue2+testvar3:testvalue3 |
||||
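The multi-variable form of VAR joins `<ATTR>:<VALUE>` pairs with `+`. A minimal sketch of reading such a value back into a dictionary follows; `parse_var` is a hypothetical name, and the sketch assumes that neither names nor values contain `+`.

```python
def parse_var(spec: str) -> dict[str, str]:
    """Parse '<ATTR>:<VALUE>[+<ATTR>:<VALUE>]...' into a dictionary."""
    variables = {}
    for pair in spec.split("+"):
        name, sep, value = pair.partition(":")
        if not sep or not name:
            raise ValueError(f"malformed variable: {pair!r}")
        variables[name] = value
    return variables
```

Applied to the multiple-variable example, this produces three entries, `testvar1` through `testvar3`, each mapped to its value.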
VC | |||||
Format: | vc=<NAME> | ||||
Default: | --- | ||||
Description: | Submits the job or workflow to a virtual container (VC). | ||||
Example: |
vc=vc13 |
||||
If more than one extension is required in a given job, extensions can be concatenated with a semicolon separator using the format <ATTR>:<VALUE>[;<ATTR>:<VALUE>]...
Example 1
#@comment="HOSTLIST:node1,node2;QOS:special;SID:silverA"
Job must run on nodes node1 and node2 using the QoS special. The job is also associated with the system ID silverA, allowing the silver daemon to monitor and control the job.
Example 2
#PBS -W x=\"NODESET:ONEOF:NETWORK;DMEM:64\"
Job will have resources allocated subject to network based nodeset constraints. Further, each task will dedicate 64 MB of memory.
Example 3
> qsub -l nodes=4,walltime=1:00:00 -W x="FLAGS:ADVRES:john.1"
Job will be forced to run within the john.1 reservation.
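The semicolon-separated format used in the three examples above can be split back into individual extensions as sketched below. `parse_extensions` is a hypothetical name. The sketch splits each item at its first colon only, so values that contain colons themselves (such as ONEOF:NETWORK) stay intact, and it upper-cases attribute names on the assumption that they are case-insensitive.

```python
def parse_extensions(spec: str) -> dict[str, str]:
    """Parse '<ATTR>:<VALUE>[;<ATTR>:<VALUE>]...' into a dictionary."""
    extensions = {}
    for item in spec.split(";"):
        # Split at the first ':' only; the value may contain further colons.
        attr, sep, value = item.partition(":")
        if not sep or not attr:
            raise ValueError(f"malformed extension: {item!r}")
        extensions[attr.upper()] = value
    return extensions
```

For Example 1, this yields the entries HOSTLIST, QOS, and SID; for Example 3, the single entry FLAGS with the value ADVRES:john.1.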
Copyright © 2012 Adaptive Computing Enterprises, Inc.®