4.323.1 Synopsis
mjobctl -c -w attr=val
mjobctl -e jobid
mjobctl -h [User|System|Batch|Defer|All] jobexp
mjobctl -m attr{+=|=|-=}val jobexp
mjobctl -N [<SIGNO>] jobexp
mjobctl -n <JOBNAME>
mjobctl -p <PRIORITY> jobexp
mjobctl -q {diag|starttime|hostlist} jobexp
mjobctl -w attr{+=|=|-=}val jobexp
mjobctl -x [-w flags=val] jobexp
4.323.2 Overview
4.323.3 Format
-c - Cancel | |
---|---|
Format | JOBEXP |
Description |
Cancel a job. Use -w (following a -c flag) to specify job cancellation according to given credentials or job attributes. See -c -w for more information.
You can use mjobctl -c flags=follow-dependency <job_id> to cancel all jobs that the <job_id> depends on. If you wish to cancel all jobs that depend on this <job_id>, add FLAGS=CANCELFAILEDDEPENDENCYJOBS to your SCHEDCFG entry in the moab.cfg file. See CANCELFAILEDDEPENDENCYJOBS for more information. |
Example: |
> mjobctl -c job1045
Cancel job job1045. |
-c -w - Cancel Where | |
---|---|
Format | <ATTR>=<VALUE>
where <ATTR> = [ user | account | qos | class | reqreservation (RsvName) | state (JobState) | jobname (JobName, not job ID) | partition ] |
Description |
Cancel a job based on a given credential or job attribute. See Job States for a list of all valid job states. You can also cancel jobs from given partitions using -w partition=<PAR1>[<PAR2>...]; however, you must also either use another -w flag to specify a job or use the standard job expression. |
Example |
> mjobctl -c -w state=USERHOLD
Cancels all jobs that currently have a USERHOLD on them.
> mjobctl -c -w user=user1 -w acct=acct1
Cancels all jobs assigned to user1 or acct1. |
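The partition rule described above can be checked before a command is ever run. The following is a minimal Python sketch, not part of Moab: the helper name `build_cancel_command` is hypothetical, and it only assembles an argument list from the `-c`, `-w`, and job expression forms shown on this page. It does not execute anything.

```python
def build_cancel_command(filters, jobexp=None):
    """Assemble an argv list for 'mjobctl -c' from (attr, value) filters.

    Hypothetical helper: it enforces only the rule stated on this page,
    namely that a partition filter needs another -w filter or a job expression.
    """
    has_partition = any(attr == "partition" for attr, _ in filters)
    has_other = any(attr != "partition" for attr, _ in filters)
    if has_partition and not has_other and jobexp is None:
        raise ValueError("partition needs another -w filter or a job expression")

    argv = ["mjobctl", "-c"]
    for attr, value in filters:
        argv += ["-w", f"{attr}={value}"]  # one -w flag per filter
    if jobexp is not None:
        argv.append(jobexp)
    return argv


print(build_cancel_command([("state", "USERHOLD")]))
print(build_cancel_command([("partition", "p1"), ("user", "user1")]))
```

Returning a list rather than a single string keeps each argument separate, so the result could be passed to a process launcher without further shell quoting.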
-C - Checkpoint | |
---|---|
Format | JOBEXP |
Description | Checkpoint a job. See Checkpoint/Restart Facilities for more information. |
Example |
> mjobctl -C job1045
Checkpoint job job1045. |
-F - Force Cancel | |
---|---|
Format | JOBEXP |
Description | Forces a job to cancel and ignores previous cancellation attempts. |
Example |
> mjobctl -F job1045
Force cancel job job1045. |
-h - Hold | |
---|---|
Format | <HOLDTYPE> <JOBEXP>
<HOLDTYPE> = { user | batch | system | defer | ALL } |
Default | user |
Description | Set or release a job hold. See Job Holds for more information. |
Example |
> mjobctl -h user job1045
Set a user hold on job job1045.
> mjobctl -u all job1045
Unset all holds on job job1045. |
-m - Modify | |
---|---|
Format | <ATTR>{ += | = | -= } <VAL>
When using mjobctl -m with the hostlist attribute, only "=" is supported. If using Torque and mjobctl -m with the partition attribute, only "=" is supported. "+=", "-=", and "=" are supported with other resource managers (SLURM or Native).
<ATTR> = { account | advres | arraylimit | awduration | class | cpuclock | deadline | depend | eeduration | env | features | feature | flags | gres | group | hold | hostlist | jobdisk | jobmem | jobname | jobswap | loglevel | maxmem | messages | minstarttime | nodeaccess | nodecount | notificationaddress | partition | priority | queue | qos | reqreservation | rmxstring | reqattr | reqawduration | sysprio | tpn | trig | trigvar | user | userprio | var | wclimit } |
Description |
Modify a specific job attribute.
If an mjobctl -m attribute can affect how a job starts, then it generally cannot affect a job that is already running. For example, it is not feasible to change the hostlist of a job that is already running.
The userprio attribute allows you to specify user priority. For job priority, use the -p flag.
Modification of the job dependency is also communicated to the resource manager in the case of SLURM and PBS/Torque.
Adding --flags=warnifcompleted causes a warning message to print when a job completes.
To define values for awduration, eeduration, minstarttime, reqawduration, and wclimit, use the time spec format. The minstarttime attribute performs the same function as msub -a.
A non-active job's partition list can be modified. If using Torque, only "=" (set) is supported. If using SLURM or a Native resource manager, you can add or subtract partitions, even multiple partitions. When adding or subtracting multiple partitions, each partition must have its own -m partition{+= | = | -=}name on the command line. An example for adding multiple partitions is provided in the list of examples.
To modify a job's generic resources, use the following format: gres{ += | = | -= } <gresName>[:<count>]. <gresName> is a single resource, not a list. <count> is an integer that, if not specified, is assumed to be 1. Modifying a job's generic resources causes Moab to append the new gres (+=), subtract the specified gres (-=), or clear out all existing generic resources attached to the job and override them with the newly specified one (=). If <gresName> is an empty string, all generic resources will be removed from the job.
To modify the node access policy for a queued job, use nodeaccess=[<policy>]. See 4.411 Node Access Policies for a list of supported node access policies. |
Example |
> mjobctl -m messages+="Adding a message" --flags=completed 1664
Set the message on the job, even if the job is completed.
> mjobctl -m reqawduration+=600 1664
Add 10 minutes to the job walltime.
> mjobctl -m eeduration=-1 1664
Reset the job's effective queue time to when the job was submitted.
> mjobctl -m var=Flag1=TRUE 1664
Set the job variable Flag1 to TRUE.
> mjobctl -m notificationaddress="[email protected]"
Sets the notification e-mail address associated with a job to [email protected].
> mjobctl -m partition+=p3 -m partition+=p4 Moab.5
Adds multiple partitions (p3 and p4) to job Moab.5. Torque only supports "=". "+=", "-=", and "=" are supported with other resource managers (SLURM or Native).
> mjobctl -m arraylimit=10 sim.25
Changes the concurrently running sub-job limit to 10 for array sim.25.
> mjobctl -m gres=matlab:1 job0201
Overrides all generic resources applied to job job0201 and replaces them with 1 matlab.
> mjobctl -m user=user.job
Modifies the user of a job that was submitted directly to Moab (msub) and has not yet been migrated.
> mjobctl -m userprio-=100 Moab.4
Reduces the user priority of Moab.4 by 100.
> mjobctl -m tpn=2 Moab.128
Changes the requested "tasks per node" for job Moab.128 to 2.
> mjobctl -m maxmem=80mb 157
Modifies the total job memory of job 157. See MAXMEM for more information. |
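The three gres operators described above are easier to compare side by side. The following Python sketch models them on a plain dictionary of resource counts. It is an illustration only, not Moab code: the function name `modify_gres` is hypothetical, and two details are assumptions the page does not spell out, namely that += adds to an existing count and that -= drops a resource once its count reaches zero.

```python
def modify_gres(current, op, spec):
    """Apply a 'gres{+=|=|-=}<gresName>[:<count>]' change to a dict of counts.

    Hypothetical model of the documented behavior; 'current' is not mutated.
    """
    name, _, count_text = spec.partition(":")
    count = int(count_text) if count_text else 1  # count defaults to 1

    if not name:
        # Documented: an empty <gresName> removes all generic resources.
        # Assumption: this holds regardless of the operator used.
        return {}
    if op == "=":
        # Clear everything and keep only the newly specified resource.
        return {name: count}

    updated = dict(current)
    if op == "+=":
        updated[name] = updated.get(name, 0) + count  # assumed accumulation
    elif op == "-=":
        remaining = updated.get(name, 0) - count
        if remaining > 0:
            updated[name] = remaining
        else:
            updated.pop(name, None)  # assumed removal at zero
    else:
        raise ValueError(f"unsupported operator: {op}")
    return updated


job_gres = {"matlab": 1, "tape": 2}
print(modify_gres(job_gres, "+=", "dvd"))
print(modify_gres(job_gres, "-=", "tape:2"))
print(modify_gres(job_gres, "=", "matlab:3"))
```

The resource names `tape` and `dvd` are placeholders chosen for the illustration; only `matlab` appears in the examples on this page.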
-N - Notify | |
---|---|
Format | [signal=]<SIGID> JOBEXP |
Description | Send a signal to all jobs matching the job expression. |
Example |
> mjobctl -N INT 1664
Send an interrupt signal to job 1664.
> mjobctl -N 47 1664
Send signal 47 to job 1664. |
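The examples above give the signal once by name (INT) and once by number (47). As a rough illustration of how the two forms relate on the local system, the Python sketch below resolves a name through the standard `signal` module and passes numbers through unchanged. The helper name `resolve_signal` is hypothetical, and the sketch says nothing about which spellings Moab itself accepts.

```python
import signal


def resolve_signal(sig):
    """Return the numeric value for a signal given by name or by number.

    Hypothetical helper: names are looked up in Python's signal.Signals enum,
    which reflects the local platform, not Moab's own signal table.
    """
    text = str(sig).strip()
    if text.isdigit():
        return int(text)  # already numeric, e.g. 47
    name = text.upper()
    if not name.startswith("SIG"):
        name = "SIG" + name  # INT -> SIGINT
    return signal.Signals[name].value


print(resolve_signal("INT"))
print(resolve_signal(47))
```

Numbers are not validated against the enum because real-time signal numbers such as 47 generally have no named entry there.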
-n - Name | |
---|---|
Format | |
Description | Select jobs by job name. |
Example |
-r - Resume | |
---|---|
Format | JOBEXP |
Description | Resume a job. |
Example |
> mjobctl -r job1045
Resume job job1045. |
-R - Requeue | |
---|---|
Format | JOBEXP |
Description | Requeue a job. |
Example |
> mjobctl -R job1045
Requeue job job1045. |
-s - Suspend | |
---|---|
Format | JOBEXP |
Description | Suspend a job. For more information, see Suspend/Resume Handling. |
Example |
> mjobctl -s job1045
Suspend job job1045. |
-u - Unhold | |
---|---|
Format | [<TYPE>[,<TYPE>]] JOBEXP
<TYPE> = [ user | system | batch | defer | ALL ] |
Default | ALL |
Description | Release a hold on a job. See Job Holds for more information. |
Example |
> mjobctl -u user,system scrib.1045
Release user and system holds on job scrib.1045. |
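The -h and -u options have different defaults (user for -h, ALL for -u), which is easy to overlook. The Python sketch below models a job's holds as a set to make that difference concrete. It is a simplified illustration, not Moab's implementation, and the function names `set_hold` and `release_hold` are hypothetical.

```python
HOLD_TYPES = ("user", "system", "batch", "defer")


def _parse_types(spec):
    """Turn 'user,system' or 'ALL' into a set of hold type names."""
    requested = [t.strip().lower() for t in spec.split(",") if t.strip()]
    if "all" in requested:
        return set(HOLD_TYPES)
    unknown = [t for t in requested if t not in HOLD_TYPES]
    if unknown:
        raise ValueError(f"unknown hold type(s): {unknown}")
    return set(requested)


def set_hold(holds, hold_type="user"):
    """Model of -h: add a hold; the default type is user."""
    return set(holds) | _parse_types(hold_type)


def release_hold(holds, hold_types="all"):
    """Model of -u: remove the listed holds; the default is ALL."""
    return set(holds) - _parse_types(hold_types)


holds = set_hold(set())              # like: mjobctl -h <job>
holds = set_hold(holds, "system")    # like: mjobctl -h system <job>
print(sorted(holds))
print(sorted(release_hold(holds, "user")))
print(sorted(release_hold(holds)))   # like: mjobctl -u <job>
```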
-x - Execute | |
---|---|
Format | JOBEXP |
Description | Execute a job. The -w option allows flags to be set for the job. Allowable flags are ignorepolicies, ignorenodestate, and ignorersv. |
Example |
> mjobctl -x job1045
Execute job job1045.
> mjobctl -x -w flags=ignorepolicies job1046
Execute job job1046 and ignore policies, such as MaxJobPerUser. |
4.323.4 Parameters
JOB EXPRESSION | |
---|---|
Format | <STRING> |
Description | The name of a job or a regular expression for several jobs. The flags that support job expressions can use node expression syntax as described in Node Selection. Using x: indicates the following string is to be interpreted as a regular expression, and using r: indicates the following string is to be interpreted as a range.
Job expressions do not work for array sub-jobs.
Moab uses regular expressions conforming to the POSIX 1003.2 standard. This standard is somewhat different from the regular expressions commonly used for filename matching in Unix environments (see man 7 regex). To interpret a job expression as a regular expression, use x:. In most cases, it is necessary to quote the job expression (for example, "job13[5-9]") to prevent the shell from intercepting and interpreting the special characters. The mjobctl command accepts a comma-delimited list of job expressions, for example mjobctl -r job[1-2],job4 or mjobctl -c job1,job2,job4. |
Example: |
> mjobctl -c "x:80.*"
job '802' cancelled
job '803' cancelled
job '804' cancelled
job '805' cancelled
job '806' cancelled
job '807' cancelled
job '808' cancelled
job '809' cancelled
Cancel all jobs starting with 80.
> mjobctl -m priority+=200 "x:74[3-5]"
job '743' system priority modified
job '744' system priority modified
job '745' system priority modified
> mjobctl -h x:17.*
# This puts a hold on any job that has a 17 that is followed by an unlimited amount of any
# character and includes jobs 1701, 17mjk10, and 17DjN_JW-07
> mjobctl -h r:1-17
# This puts a hold on jobs 1 through 17. |
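To preview which job IDs an x: or r: expression would select, the matching can be approximated offline. The Python sketch below is an approximation under stated assumptions, not Moab's matcher: the function name `select_jobs` is hypothetical, Python's `re` module stands in for POSIX regular expressions (the patterns used here fall in the common subset), and the whole job ID is assumed to have to match the pattern. Moab's actual anchoring may differ.

```python
import re


def select_jobs(expression, job_ids):
    """Return the job IDs selected by a simplified x:/r:/literal expression."""
    if expression.startswith("x:"):
        # Assumption: the pattern must match the entire job ID.
        pattern = re.compile(expression[2:])
        return [j for j in job_ids if pattern.fullmatch(j)]
    if expression.startswith("r:"):
        # r:<low>-<high> selects numeric job IDs in an inclusive range.
        low, high = (int(n) for n in expression[2:].split("-"))
        return [j for j in job_ids if j.isdigit() and low <= int(j) <= high]
    # No prefix: treat the expression as a literal job name.
    return [j for j in job_ids if j == expression]


queue = ["5", "17", "18", "743", "744", "745", "746",
         "1701", "17mjk10", "17DjN_JW-07"]

print(select_jobs("x:74[3-5]", queue))
print(select_jobs("x:17.*", queue))
print(select_jobs("r:1-17", queue))
```

Because the expression is handled entirely inside Python here, no shell quoting is involved. On a real command line the quoting advice above still applies.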
mjobctl information can be reported as XML as well. This is done with the command mjobctl -q diag <JOB_ID>.
4.323.5.A XML Attributes
Name | Description |
---|---|
Account | The account assigned to the job |
AllocNodeList | The nodes allocated to the job |
Args | The job's executable arguments |
AWDuration | The active wall time consumed |
BlockReason | The block message index for the reason the job is not eligible |
Bypass | Number of times the job has been bypassed by other jobs |
Calendar | The job's timeframe constraint calendar |
Class | The class assigned to the job |
CmdFile | The command file path |
CompletionCode | The return code of the job as extracted from the RM |
CompletionTime | The time of the job's completion |
Cost | The cost of executing the job relative to an accounting manager |
CPULimit | The CPU limit for the job |
Depend | Any dependencies on the status of other jobs |
DRM | The master destination RM |
DRMJID | The master destination RM job ID |
EEDuration | The duration of time the job has been eligible for scheduling |
EFile | The stderr file |
Env | The job's environment variables set for execution |
EnvOverride | The job's overriding environment variables set for execution |
EState | The expected state of the job |
EstHistStartTime | The estimated historical start time |
EstPrioStartTime | The estimated priority start time |
EstRsvStartTime | The estimated reservation start time |
ExcHList | The excluded host list |
Flags | Comma-delimited list of Moab flags on the job |
GAttr | The requested generic attributes |
GJID | The global job ID |
Group | The group assigned to the job |
Hold | The hold list |
Holdtime | The time the job was put on hold |
HopCount | The hop count between the job's peers |
HostList | The requested host list |
IFlags | The internal flags for the job |
IsInteractive | If set, the job is interactive |
IsRestartable | If set, the job is restartable |
IsSuspendable | If set, the job is suspendable |
IWD | The directory where the job is executed |
JobID | The job's batch ID. |
JobName | The user-specified name for the job |
JobGroup | The job ID relative to its group |
LogLevel | The individual log level for the job |
MasterHost | The specified host to run primary tasks on |
Messages | Any messages reported by Moab regarding the job |
MinPreemptTime | The minimum amount of time the job must run before being eligible for preemption |
Notification | Any events generated to notify the job's user |
OFile | The stdout file |
OldMessages | Any messages reported by Moab in the old message style regarding the job |
OWCLimit | The original wallclock limit |
PAL | The partition access list relative to the job |
QueueStatus | The job's queue status as generated this iteration |
QOS | The QoS assigned to the job |
QOSReq | The requested QoS for the job |
ReqAWDuration | The requested active walltime duration |
ReqCMaxTime | The requested latest allowed completion time |
ReqMem | The total memory requested/dedicated to the job |
ReqNodes | The number of requested nodes for the job |
ReqProcs | The number of requested procs for the job |
ReqReservation | The required reservation for the job |
ReqRMType | The required RM type |
ReqSMinTime | The requested earliest start time |
RM | The master source resource manager |
RMXString | The resource manager extension string |
RsvAccess | The list of reservations accessible by the job |
RsvStartTime | The reservation start time |
RunPriority | The effective job priority |
Shell | The execution shell's output |
SID | The job's system ID (parent cluster) |
Size | The job's computational size |
STotCPU | The average CPU load tracked across all nodes |
SMaxCPU | The max CPU load tracked across all nodes |
STotMem | The average memory usage tracked across all nodes |
SMaxMem | The max memory usage tracked across all nodes |
SRMJID | The source RM's ID for the job |
StartCount | The number of times the job has tried to start |
StartPriority | The effective job priority |
StartTime | The most recent time the job started executing |
State | The state of the job as reported by Moab |
StatMSUtl | The total number of memory seconds utilized |
StatPSDed | The total number of processor seconds dedicated to the job |
StatPSUtl | The total number of processor seconds utilized by the job |
StdErr | The path to the stderr file |
StdIn | The path to the stdin file |
StdOut | The path to the stdout file |
StepID | StepID of the job (used with LoadLeveler systems) |
SubmitHost | The host where the job was submitted |
SubmitLanguage | The RM language in which the submission request was performed |
SubmitString | The string containing the entire submission request |
SubmissionTime | The time the job was submitted |
SuspendDuration | The amount of time the job has been suspended |
SysPrio | The admin specified job priority |
SysSMinTime | The system specified min. start time |
TaskMap | The allocation taskmap for the job |
TermTime | The time the job was terminated |
User | The user assigned to the job |
UserPrio | The user specified job priority |
UtlMem | The utilized memory of the job |
UtlProcs | The number of processors utilized by the job |
Variable | |
VWCTime | The virtual wallclock limit |
4.323.6 Examples
Example 4-99:
> mjobctl -q diag ALL --format=xml
<Data>
<job AWDuration="346" Class="batch" CmdFile="jobsleep.sh" EEDuration="0" EState="Running" Flags="RESTARTABLE" Group="test" IWD="/home/test" JobID="11578" QOS="high" RMJID="11578.lolo.icluster.org" ReqAWDuration="00:10:00" ReqNodes="1" ReqProcs="1" StartCount="1" StartPriority="1" StartTime="1083861225" StatMSUtl="903.570" StatPSDed="364.610" StatPSUtl="364.610" State="Running" SubmissionTime="1083861225" SuspendDuration="0" SysPrio="0" SysSMinTime="00:00:00" User="test">
<req AllocNodeList="hana" AllocPartition="access" ReqNodeFeature="[NONE]" ReqPartition="access"></req>
</job>
<job AWDuration="346" Class="batch" CmdFile="jobsleep.sh" EEDuration="0" EState="Running" Flags="RESTARTABLE" Group="test" IWD="/home/test" JobID="11579" QOS="high" RMJID="11579.lolo.icluster.org" ReqAWDuration="00:10:00" ReqNodes="1" ReqProcs="1" StartCount="1" StartPriority="1" StartTime="1083861225" StatMSUtl="602.380" StatPSDed="364.610" StatPSUtl="364.610" State="Running" SubmissionTime="1083861225" SuspendDuration="0" SysPrio="0" SysSMinTime="00:00:00" User="test">
<req AllocNodeList="lolo" AllocPartition="access" ReqNodeFeature="[NONE]" ReqPartition="access"></req>
</job>
</Data>
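XML output of this shape can be consumed with any standard XML parser. The Python sketch below reads a trimmed copy of the output above with `xml.etree.ElementTree` and pulls out a few attributes per job. The function name `summarize_jobs` and the dictionary keys it returns are hypothetical, and the sample keeps only a subset of the attributes shown in the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example output; only a few attributes are kept.
SAMPLE = (
    '<Data>'
    '<job AWDuration="346" Class="batch" JobID="11578" State="Running" User="test">'
    '<req AllocNodeList="hana" AllocPartition="access"></req></job>'
    '<job AWDuration="346" Class="batch" JobID="11579" State="Running" User="test">'
    '<req AllocNodeList="lolo" AllocPartition="access"></req></job>'
    '</Data>'
)


def summarize_jobs(xml_text):
    """Return one small dict per <job> element in the diag output."""
    root = ET.fromstring(xml_text)
    summaries = []
    for job in root.findall("job"):
        req = job.find("req")  # per-requirement details live in a child element
        summaries.append({
            "id": job.get("JobID"),
            "state": job.get("State"),
            "user": job.get("User"),
            "aw_duration": int(job.get("AWDuration", "0")),
            "nodes": req.get("AllocNodeList") if req is not None else None,
        })
    return summaries


for summary in summarize_jobs(SAMPLE):
    print(summary)
```

Attributes are read with `get` so that a job missing an optional attribute yields `None` instead of raising an error.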
Related Topics