8.4.1 Simple example of preemption

This section illustrates the process of setting up preemption on your system from beginning to end and contains examples of what actions to take and what you should see as you go.

Note: In this basic setup example, one user (john) can submit to either the test1 or the test2 QoS. The example uses the REQUEUE preemption type.

First, you will need to make some changes to the moab.cfg file.

Configuring moab.cfg

Here is an example of a moab.cfg file.

GUARANTEEDPREEMPTION TRUE
#should not be JOBNODEMATCHPOLICY EXACTNODE as it causes problems when starting jobs 

PREEMPTPOLICY REQUEUE

QOSCFG[test1] QFLAGS=PREEMPTEE JOBFLAGS=RESTARTABLE MEMBERULIST=john PRIORITY=100
QOSCFG[test2] QFLAGS=PREEMPTOR MEMBERULIST=john PRIORITY=10000

To configure the moab.cfg file

  1. Set GUARANTEEDPREEMPTION to TRUE. (This locks the job on a node and keeps trying to preempt.)
  2. Make sure that JOBNODEMATCHPOLICY is not set to EXACTNODE.
  3. Set the PREEMPTPOLICY type. In this example, PREEMPTPOLICY is set to REQUEUE. (For information about different PREEMPTPOLICY types, see Choosing a PREEMPTPOLICY type.)
  4. Set up QFLAGS to mark jobs as PREEMPTEE (a lower-priority job that can be preempted by a higher-priority job) or as PREEMPTOR (a higher-priority job that can preempt a lower-priority job), as in the example:

     QOSCFG[test1] QFLAGS=PREEMPTEE JOBFLAGS=RESTARTABLE MEMBERULIST=john PRIORITY=100
     QOSCFG[test2] QFLAGS=PREEMPTOR MEMBERULIST=john PRIORITY=10000

     Note: This example also sets JOBFLAGS=RESTARTABLE (because this example uses REQUEUE).
  5. Make sure that the PREEMPTEE job has a lower priority than the PREEMPTOR job, as in the example:

     QOSCFG[test1] QFLAGS=PREEMPTEE JOBFLAGS=RESTARTABLE MEMBERULIST=john PRIORITY=100
     QOSCFG[test2] QFLAGS=PREEMPTOR MEMBERULIST=john PRIORITY=10000
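
The configuration steps above can be checked mechanically before restarting Moab. The following is a minimal Python sketch, not a Moab tool: the function names parse_moab_cfg and check_preemption_setup are hypothetical, and the parser only understands the small subset of moab.cfg syntax used in this example (whitespace-separated parameters, QOSCFG[...] lines with KEY=VALUE attributes, and # comments).

```python
import re

# The example configuration from this section (an illustration, not a full moab.cfg).
EXAMPLE_CFG = """\
GUARANTEEDPREEMPTION TRUE
PREEMPTPOLICY REQUEUE
QOSCFG[test1] QFLAGS=PREEMPTEE JOBFLAGS=RESTARTABLE MEMBERULIST=john PRIORITY=100
QOSCFG[test2] QFLAGS=PREEMPTOR MEMBERULIST=john PRIORITY=10000
"""


def parse_moab_cfg(text):
    """Parse scalar parameters and QOSCFG[...] lines (hypothetical helper)."""
    params, qos = {}, {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        match = re.match(r"QOSCFG\[(\w+)\]\s+(.*)", line)
        if match:
            pairs = (item.partition("=") for item in match.group(2).split())
            qos[match.group(1)] = {key: value for key, _, value in pairs}
        else:
            parts = line.split(None, 1)
            params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params, qos


def _flags(attrs, name):
    """Split a comma-separated flag attribute into a list of upper-case flags."""
    return [flag for flag in attrs.get(name, "").upper().split(",") if flag]


def check_preemption_setup(text):
    """Return a list of problems; an empty list means every check passed."""
    params, qos = parse_moab_cfg(text)
    problems = []
    if params.get("GUARANTEEDPREEMPTION", "").upper() != "TRUE":
        problems.append("GUARANTEEDPREEMPTION is not TRUE")
    if params.get("JOBNODEMATCHPOLICY", "").upper() == "EXACTNODE":
        problems.append("JOBNODEMATCHPOLICY is EXACTNODE")
    policy = params.get("PREEMPTPOLICY", "").upper()
    if not policy:
        problems.append("PREEMPTPOLICY is not set")
    preemptees = {n: a for n, a in qos.items() if "PREEMPTEE" in _flags(a, "QFLAGS")}
    preemptors = {n: a for n, a in qos.items() if "PREEMPTOR" in _flags(a, "QFLAGS")}
    for name, attrs in preemptees.items():
        # This example pairs REQUEUE with JOBFLAGS=RESTARTABLE on the PREEMPTEE QoS.
        if policy == "REQUEUE" and "RESTARTABLE" not in _flags(attrs, "JOBFLAGS"):
            problems.append(f"QoS {name}: PREEMPTEE without JOBFLAGS=RESTARTABLE")
        for other, other_attrs in preemptors.items():
            if other != name and int(attrs.get("PRIORITY", 0)) >= int(other_attrs.get("PRIORITY", 0)):
                problems.append(f"QoS {name}: priority is not lower than PREEMPTOR {other}")
    return problems


print(check_preemption_setup(EXAMPLE_CFG))
```

For the example configuration the list of problems is empty. Lowering the test2 priority below 100 makes the sketch report that test1 is no longer lower in priority than the PREEMPTOR QoS.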

Now you can submit a job to the PREEMPTEE QoS (test1).

Submitting a job to the PREEMPTEE QoS

In this example, we will submit a job to the PREEMPTEE QoS (test1), requesting all processor cores in the cluster:

[john@g06]# echo sleep 600 | msub -l walltime=600 -l qos=test1 -l procs=128

Note the showq and checkjob output:

showq:

Moab.1
[john@g06]# showq

active jobs------------------------
JOBID     USERNAME    STATE      PROCS    REMAINING    STARTTIME
Moab.1    john        Running    128      00:09:59     Wed Nov  9 15:56:33

1 active job     128 of 128 processors in use by local jobs (100.00%)
                 2 of 2 nodes active (100.00%)

eligible jobs----------------------
JOBID     USERNAME    STATE      PROCS     WCLIMIT     QUEUETIME 
     
0 eligible jobs

blocked jobs-----------------------
JOBID     USERNAME    STATE      PROCS     WCLIMIT     QUEUETIME 

0 blocked jobs

Total job:  1

checkjob:

[john@g06]# checkjob Moab.1
job Moab.1
 
State: Running
Creds: user:john  group:john  qos:test1
WallTime: 00:00:00 of 00:10:00
SubmitTime: Wed Nov 9 15:56:33
(Time Queued Total: 00:00:00 Eligible: 00:00:00)
 
StartTime: Wed Nov  9 15:56:33
Total Requested Tasks: 128
 
Req[0] TaskCount: 128 Partition: licenses
 
Allocated Nodes:
node[01-02]*64
 
IWD: /opt/native/
SubmitDir: /opt/native/
Executable: /opt/native/spool/moab.job.zOyf1N
 
StartCount: 1
Flags: RESTARTABLE,PREEMPTEE,GLOBALQUEUE,PROCSPECIFIED
Attr: PREEMPTEE
StartPriority: 100
Reservation 'Moab.1' (-00:00:03 -> 00:09:57  Duration: 00:10:00)

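Note the following in the checkjob output: the Flags line includes RESTARTABLE and PREEMPTEE, and StartPriority is 100. Both come from the test1 QoS settings in moab.cfg.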
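
If you want to check these fields in a script rather than by eye, the checkjob text can be parsed. The following is a minimal Python sketch that assumes the "Key: value" line layout shown in the listing above. The function name parse_checkjob is hypothetical, and the sample text is an abbreviated copy of that listing.

```python
# Abbreviated copy of the checkjob listing shown above.
SAMPLE_CHECKJOB = """\
job Moab.1

State: Running
Creds: user:john  group:john  qos:test1
WallTime: 00:00:00 of 00:10:00
StartCount: 1
Flags: RESTARTABLE,PREEMPTEE,GLOBALQUEUE,PROCSPECIFIED
Attr: PREEMPTEE
StartPriority: 100
"""

WANTED = ("State", "Flags", "StartPriority")


def parse_checkjob(output):
    """Extract the state, flag list, and start priority (hypothetical helper)."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values such as times stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() in WANTED:
            fields[key.strip()] = value.strip()
    priority = fields.get("StartPriority")
    return {
        "state": fields.get("State"),
        "flags": [f.strip() for f in fields.get("Flags", "").split(",") if f.strip()],
        "start_priority": int(priority) if priority is not None else None,
    }


info = parse_checkjob(SAMPLE_CHECKJOB)
print(info["state"], info["flags"], info["start_priority"])
```

The same helper can be applied to the checkjob text of any other job to look for a particular flag.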

Submitting a job to the PREEMPTOR QoS

Next, submit a job to the PREEMPTOR QoS (test2) to preempt the first one:

[john@g06]# echo sleep 600 | msub -l walltime=600 -l qos=test2 -l procs=128        

Examine the following output for showq and checkjob:

showq:

Moab.2
[john@g06]# showq
 
active jobs------------------------
JOBID     USERNAME    STATE      PROCS    REMAINING    STARTTIME
Moab.2    john        Running    128      00:09:59     Wed Nov 9 15:56:47
 
1 active job     128 of 128 processors in use by local jobs (100.00%)
                 2 of 2 nodes active (100.00%)
 
eligible jobs---------------------- 
JOBID     USERNAME    STATE     PROCS     WCLIMIT      QUEUETIME
Moab.1    john        Idle      128       00:10:00     Wed Nov 9 15:56:33
 
1 eligible job
 
blocked jobs----------------------- 
JOBID     USERNAME    STATE     PROCS     WCLIMIT      QUEUETIME
 
0 blocked jobs
 
Total jobs: 2 

Note that the PREEMPTOR (Moab.2) moved to Running, while the PREEMPTEE (Moab.1) was requeued.
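
The same check can be scripted by grouping the showq rows by section. The following is a minimal Python sketch that assumes the three-section layout shown in the listing above (active, eligible, blocked) and treats any row whose fourth column is a processor count as a job row. The function name parse_showq is hypothetical, and the sample text is copied from that listing.

```python
import re

# Copy of the showq listing shown above, taken after the PREEMPTOR job started.
SAMPLE_SHOWQ = """\
active jobs------------------------
JOBID     USERNAME    STATE      PROCS    REMAINING    STARTTIME
Moab.2    john        Running    128      00:09:59     Wed Nov 9 15:56:47

1 active job 128 of 128 processors in use by local jobs (100.00%)
2 of 2 nodes active (100.00%)

eligible jobs----------------------
JOBID     USERNAME    STATE     PROCS     WCLIMIT      QUEUETIME
Moab.1    john        Idle      128       00:10:00     Wed Nov 9 15:56:33

1 eligible job

blocked jobs-----------------------
JOBID     USERNAME    STATE     PROCS     WCLIMIT      QUEUETIME

0 blocked jobs

Total jobs: 2
"""

SECTION_RE = re.compile(r"^(active|eligible|blocked) jobs-+$")


def parse_showq(output):
    """Return {section: [(job_id, state), ...]} (hypothetical helper)."""
    sections = {"active": [], "eligible": [], "blocked": []}
    current = None
    for line in output.splitlines():
        stripped = line.strip()
        match = SECTION_RE.match(stripped)
        if match:
            current = match.group(1)
            continue
        cols = stripped.split()
        # Job rows: JOBID USERNAME STATE PROCS ...; summary rows start with a count.
        if current and len(cols) >= 4 and cols[3].isdigit() and not cols[0].isdigit():
            sections[current].append((cols[0], cols[2]))
    return sections


queues = parse_showq(SAMPLE_SHOWQ)
print(queues)
```

For the sample listing, Moab.2 lands in the active section as Running and Moab.1 in the eligible section as Idle, which is the requeue behavior described above.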

checkjob:

[john@g06]# checkjob Moab.2
job Moab.2
 
State: Running
Creds: user:john group:john qos:test2
WallTime: 00:02:04 of 00:10:00
SubmitTime: Wed Nov 9 15:56:46
(Time Queued Total: 00:00:01 Eligible: 00:00:00)
 
StartTime: Wed Nov 9 15:56:47
Total Requested Tasks: 128
 
Req[0] TaskCount: 128 Partition: licenses
NodeCount: 2
 
Allocated Nodes:
node[01-02]*64
 
 
IWD: /opt/native/
SubmitDir: /opt/native/
Executable: /opt/native/spool/moab.job.ELoX5Q
 
StartCount: 1 
Flags: HASPREEMPTED,PREEMPTOR,GLOBALQUEUE,PROCSPECIFIED 
StartPriority: 10000 
Reservation 'Moab.2' (-00:02:21 -> 00:07:39 Duration: 00:10:00)

Note the Flags line: HASPREEMPTED,PREEMPTOR,GLOBALQUEUE,PROCSPECIFIED. HASPREEMPTED is set when the PREEMPTOR job has preempted the PREEMPTEE job.

Also note that the PREEMPTOR priority strongly affects preemption. Generally, you should assign the PREEMPTOR a higher priority than any other queued job so that it moves to (or near) the top of the eligible queue.


Copyright © 2012 Adaptive Computing Enterprises, Inc.®