Moab Adaptive HPC
9.0 Reservation Tracking

Reservation tracking lets you configure a dynamic job to be a certain size at a certain time. With reservations, you specify how many nodes (or which specific nodes) the job should use at each time. These are standing reservations that you configure in the moab.cfg file.

For example, consider an application that is heavily used during the afternoon and has three defined reservations:

Reservation 1: (12:00 a.m. - 12:00 p.m.)—node001, node002, node003, node007
Reservation 2: (12:00 p.m. - 5:00 p.m.)—node001, node002, node003, node007, node008, node009
Reservation 3: (5:00 p.m. - 12:00 a.m.)—node001, node002

The application is then configured to track the reservation.

At 12:00 a.m., it uses 4 nodes—node001, node002, node003, node007.
At 12:00 p.m., it expands to use 6 nodes—node001, node002, node003, node007, node008, node009.
At 5:00 p.m., it contracts to use only 2 nodes—node001, node002.

A dynamic partition can also be configured to use reservation tracking. This allows the system to change the operating system pools according to a calendar.

The following parameters are required. The examples are taken from our hybrid cluster.

Configure the job to use reservation tracking. For a dynamic partition, this is the dynamic job associated with the partition. The ADVRES flag specifies the name of the reservation group the job will track.

JOBCFG[win] FLAGS=ADVRES:RG1

In addition to normal parameters, each reservation must define the following:

  • PARTITION=ALL allows the reservation to span partitions. It is required for dynamic partitions.
  • RSVGROUP=<GROUP> specifies the reservation group (the group named in the job's ADVRES flag above).
  • FLAGS=BYNAME is also useful for non-partition dynamic jobs.

In addition, every reservation must have a HOSTLIST or TASKCOUNT, a STARTTIME, and a DURATION or ENDTIME.

SRCFG[RG1S1]        PARTITION=ALL
SRCFG[RG1S1]        RSVGROUP=RG1

SRCFG[RG1S1]        COMMENT="Resource Group 1 Step 1"
SRCFG[RG1S1]        RSVGROUP=RG1
SRCFG[RG1S1]        STARTTIME=1:00:00
SRCFG[RG1S1]        ENDTIME=12:00:00
SRCFG[RG1S1]        DAYS=MON,TUE,WED,THU,FRI
SRCFG[RG1S1]        HOSTLIST=CCS1,CCS2
SRCFG[RG1S1]        USERLIST=root,user1
SRCFG[RG1S1]        PARTITION=ALL
SRCFG[RG1S1]        FLAGS=BYNAME

SRCFG[RG1S2]        COMMENT="Resource Group 1 Step 2"
SRCFG[RG1S2]        RSVGROUP=RG1
SRCFG[RG1S2]        STARTTIME=12:00:00
SRCFG[RG1S2]        ENDTIME=17:00:00
SRCFG[RG1S2]        DAYS=MON,TUE,WED,THU,FRI
SRCFG[RG1S2]        HOSTLIST=CCS1,CCS2,LAB1
SRCFG[RG1S2]        TASKCOUNT=6
SRCFG[RG1S2]        USERLIST=root,user1
SRCFG[RG1S2]        PARTITION=ALL
SRCFG[RG1S2]        FLAGS=BYNAME

SRCFG[RG1S3]        COMMENT="Resource Group 1 Step 3"
SRCFG[RG1S3]        RSVGROUP=RG1
SRCFG[RG1S3]        STARTTIME=17:00:00
SRCFG[RG1S3]        ENDTIME=24:00:00
SRCFG[RG1S3]        DAYS=MON,TUE,WED,THU,FRI
SRCFG[RG1S3]        HOSTLIST=CCS1,CCS2
SRCFG[RG1S3]        TASKCOUNT=4
SRCFG[RG1S3]        USERLIST=root,user1
SRCFG[RG1S3]        PARTITION=ALL
SRCFG[RG1S3]        FLAGS=BYNAME
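
The three-reservation application example from the start of this section could be written the same way. The following is only a sketch under stated assumptions: the job template name app, the reservation group name APP, and the reservation names APPS1 through APPS3 are hypothetical. It also assumes that the reservations repeat daily by default and that 24:00:00 is accepted as the end of the day. Only parameters already shown above are used.

```
# Hypothetical job template; it tracks the hypothetical reservation group APP.
JOBCFG[app]         FLAGS=ADVRES:APP

# Reservation 1: 12:00 a.m. to 12:00 p.m., 4 nodes
SRCFG[APPS1]        RSVGROUP=APP
SRCFG[APPS1]        STARTTIME=0:00:00
SRCFG[APPS1]        ENDTIME=12:00:00
SRCFG[APPS1]        HOSTLIST=node001,node002,node003,node007
SRCFG[APPS1]        FLAGS=BYNAME

# Reservation 2: 12:00 p.m. to 5:00 p.m., 6 nodes
SRCFG[APPS2]        RSVGROUP=APP
SRCFG[APPS2]        STARTTIME=12:00:00
SRCFG[APPS2]        ENDTIME=17:00:00
SRCFG[APPS2]        HOSTLIST=node001,node002,node003,node007,node008,node009
SRCFG[APPS2]        FLAGS=BYNAME

# Reservation 3: 5:00 p.m. to 12:00 a.m., 2 nodes
# (ENDTIME=24:00:00 as end of day is an assumption)
SRCFG[APPS3]        RSVGROUP=APP
SRCFG[APPS3]        STARTTIME=17:00:00
SRCFG[APPS3]        ENDTIME=24:00:00
SRCFG[APPS3]        HOSTLIST=node001,node002
SRCFG[APPS3]        FLAGS=BYNAME
```

Each reservation supplies a HOSTLIST, a STARTTIME, and an ENDTIME, and all three share one RSVGROUP so that the job tracks them as a single schedule. PARTITION=ALL is left out because this sketch does not involve a dynamic partition.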