7.1.5 Configuring and Managing Reservations

7.1.5-A Reservation Attributes

All reservations possess a time frame of activity, an access control list (ACL), and a list of resources to be reserved. Reservations may also possess a number of extension attributes, including epilog/prolog specification, reservation ownership and accountability attributes, and special flags that modify the reservation's behavior.

Start/End Time

All reservations possess a start and an end time that define the reservation's active time. During this active time, the resources within the reservation may only be used as specified by the reservation access control list (ACL). This active time may be specified as either a start/end pair or a start/duration pair. Reservations exist and are visible from the time they are created until the active time ends, at which point they are automatically removed.

Access Control List (ACL)

For a reservation to be useful, it must be able to limit who or what can access the resources it has reserved. This is handled by way of an ACL. With reservations, ACLs can be based on credentials, resources requested, or performance metrics. In particular, with a standing reservation, the attributes USERLIST, GROUPLIST, ACCOUNTLIST, CLASSLIST, QOSLIST, JOBATTRLIST, PROCLIMIT, MAXTIME, or TIMELIMIT may be specified. (See Affinity and Modifiers.)

Reservation access can be adjusted based on a job's requested node features by mapping node feature requests to job attributes as in the following example:

NODECFG[DEFAULT]  FEATURES+=ia64
NODETOJOBATTRMAP  ia64,ia32
SRCFG[pgs]        JOBATTRLIST=ia32
> mrsvctl -c -a jattr=gpfs\! -h "r:13-500"

Selecting Resources

When specifying which resources to reserve, the administrator has a number of options. These options allow control over how many resources are reserved and where they are reserved. The following reservation attributes allow the administrator to define resources.

Task Description

Moab uses the task concept extensively for its job and reservation management. A task is simply an atomic collection of resources, such as processors, memory, or local disk, which must be found on the same node. For example, if a task requires 4 processors and 2 GB of memory, the scheduler must find all processors AND memory on the same node; it cannot allocate 3 processors and 1 GB on one node and 1 processor and 1 GB of memory on another node to satisfy this task. Tasks constrain how the scheduler must collect resources for use in a standing reservation; however, they do not constrain the way in which the scheduler makes these cumulative resources available to jobs. A job can use the resources covered by an accessible reservation in whatever way it needs. If reservation X allocates 6 tasks with 2 processors and 512 MB of memory each, it could support job Y which requires 10 tasks of 1 processor and 128 MB of memory or job Z which requires 2 tasks of 4 processors and 1 GB of memory each. The task constraints used to acquire a reservation's resources are transparent to a job requesting use of these resources.

Example 7-5:  

SRCFG[test] RESOURCES=PROCS:2,MEM:1024
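
The atomic nature of a task can be illustrated with a short Python sketch. This is not Moab code; the Resources class, the tasks_that_fit function, and the node sizes are hypothetical names and values chosen only to mirror the 4-processor, 2 GB example in the task description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Processors and memory (MB); used for both nodes and task descriptions."""
    procs: int
    mem_mb: int


def tasks_that_fit(node: Resources, task: Resources) -> int:
    """Count whole tasks that fit on one node.

    A task is atomic, so every resource it names must be found on the same
    node; the scarcest resource on the node determines the count.
    """
    return min(node.procs // task.procs, node.mem_mb // task.mem_mb)


# Task from the text: 4 processors and 2 GB of memory.
task = Resources(procs=4, mem_mb=2048)

# Hypothetical nodes. The two small nodes together hold 4 processors and
# 2 GB, but neither can host the task alone, so neither contributes a task.
small_a = Resources(procs=3, mem_mb=1024)
small_b = Resources(procs=1, mem_mb=1024)
big = Resources(procs=8, mem_mb=4096)

print(tasks_that_fit(small_a, task))  # 0
print(tasks_that_fit(small_b, task))  # 0
print(tasks_that_fit(big, task))      # 2
```

The sketch only models processors and memory; swap and disk would add further terms to the same minimum.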

Task Count

Using the task description, the TASKCOUNT attribute defines how many tasks must be allocated to satisfy the reservation request. To create a reservation, a task count and/or a host list must be specified.

Example 7-6:  

SRCFG[test] TASKCOUNT=256

Host List

A host list constrains the set of resources available to a reservation. If no task count is specified, the reservation attempts to reserve one task on each of the listed resources. If a task count is specified that requests fewer resources than listed in the host list, the scheduler reserves only the number of tasks from the host list specified by the task count attribute. If a task count is specified that requests more resources than listed in the host list, the scheduler reserves the host list nodes first and then seeks additional resources outside of this list.

Example 7-7:  

SRCFG[test] HOSTLIST=node01,node1[3-5]

Node Features

Node features can be specified to constrain which resources are considered.

Example 7-8:  

SRCFG[test] NODEFEATURES=fastos

Partition

A partition may be specified to constrain which resources are considered.

Example 7-9:  

SRCFG[test] PARTITION=core3

Flags

Reservation flags allow specification of special reservation attributes or behaviors. Supported flags are listed in the following table:

Flag Name Description
ACLOVERLAP

Deprecated (this is now a default flag). In addition to free or idle nodes, a reservation may also reserve resources that possess credentials that meet the reservation's ACL. To change this behavior, set the NOACLOVERLAP flag.

ADVRESJOBDESTROY All jobs that have an ADVRES matching this reservation are canceled when the reservation is destroyed.
ALLOWJOBOVERLAP A job is allowed to start in a reservation that may end before the job completes. When the reservation ends before the job completes, the job will not be canceled but will continue to run.
BYNAME Reservation only allows access to jobs that meet reservation ACLs and explicitly request the resources of this reservation using the job ADVRES flag. (See Job to Reservation Binding.)
DEDICATEDRESOURCE
(aka EXCLUSIVE)

Reservation only placed on resources that are not reserved by any other reservation including job, system, and user reservations (except when combined with IGNJOBRSV*).

The order that SRCFG reservations are listed in the configuration is important when using DEDICATEDRESOURCE, because reservations made afterwards can steal resources later. During configuration, list DEDICATEDRESOURCE reservations last to guarantee exclusiveness.

EVACVMS

Reservation will automatically evacuate virtual machines from the reservation nodelist.

The same action can be accomplished by using reservation profiles. For more information, see Optimizing Maintenance Reservations.

IGNIDLEJOBS*

Reservation can be placed on top of idle job reservations.

This flag is meant to be used in conjunction with DEDICATEDRESOURCE.

IGNJOBRSV*

Ignores existing job reservations, allowing the reservation to be forced onto available resources even if it conflicts with existing job reservations. User and system reservation conflicts are still valid. It functions the same as IGNIDLEJOBS plus allows a reservation to be placed on top of an existing running job's reservation.

This flag is meant to be used in conjunction with DEDICATEDRESOURCE.

IGNRSV*

Request ignores existing resource reservations allowing the reservation to be forced onto available resources even if this conflicts with other reservations. It functions the same as IGNJOBRSV plus allows the reservation to be placed on top of the system reservations.

This flag is meant to be used in conjunction with DEDICATEDRESOURCE.

IGNSTATE* Reservation ignores node state when assigning nodes. It functions the same as IGNRSV plus allows the reservation to be placed on nodes that are not currently available. Also ignores resource availability on nodes.
NOACLOVERLAP

All resources must be free or idle, with no existing reservations. Moab will not allocate in-use resources even if they match the reservation's ACL.

mrsvctl -c -t 12 -E -F noacloverlap -a user==john

Moab looks for resources that are exclusive (free). Without the flag, Moab would look for resources that are exclusive or that are already running john's jobs.

This flag is meant to be used in conjunction with DEDICATEDRESOURCE.

NOCHARGE By default, Moab charges the allocation manager for unused cycles in a standing reservation. Setting the NOCHARGE flag prevents Moab from charging the allocation manager for standing reservations.
NOVMMIGRATION If set on a reservation, this prevents VMs from being migrated away from the reservation. If there are multiple reservations on the hypervisor and at least one reservation does not have the NOVMMIGRATION flag, then VMs will be migrated.
OWNERPREEMPT Jobs by the reservation owner are allowed to preempt non-owner jobs using reservation resources.
OWNERPREEMPTIGNOREMINTIME

Allows the OWNERPREEMPT flag to "trump" the PREEMPTMINTIME setting for jobs already running on a reservation when the owner of the reservation submits a job. For example: without the OWNERPREEMPTIGNOREMINTIME flag set, a job submitted by the owner of a reservation will not preempt non-owner jobs already running on the reservation until the PREEMPTMINTIME setting (if set) for those jobs is passed.

With the OWNERPREEMPTIGNOREMINTIME flag set, a job submitted by the owner of a reservation immediately preempts non-owner jobs already running on the reservation, regardless of whether PREEMPTMINTIME is set for the non-owner jobs.

REQFULL Reservation is only created when all resources can be allocated.
SINGLEUSE Reservation is automatically removed after completion of the first job to use the reserved resources.
SPACEFLEX Deprecated (this is now a default flag). Reservation is allowed to adjust resources allocated over time in an attempt to optimize resource utilization.

* IGNIDLEJOBS, IGNJOBRSV, IGNRSV, and IGNSTATE flags are built on one another and form a hierarchy. IGNJOBRSV performs the function of IGNIDLEJOBS plus its own functions. IGNRSV performs the function of IGNJOBRSV and IGNIDLEJOBS plus its own functions. IGNSTATE performs the function of IGNRSV, IGNJOBRSV, and IGNIDLEJOBS plus its own functions. While you can use combinations of these flags, it is not necessary. If you set one flag, you do not need to set other flags that fall beneath it in the hierarchy.
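
The hierarchy described in the footnote can be pictured as an ordered list. The following Python sketch is only an illustration of that ordering; the names IGNORE_HIERARCHY, implied_flags, and minimal_flags are hypothetical and are not part of Moab.

```python
# Hierarchy from the footnote, weakest to strongest. Each flag performs the
# functions of every flag that precedes it in this list.
IGNORE_HIERARCHY = ["IGNIDLEJOBS", "IGNJOBRSV", "IGNRSV", "IGNSTATE"]


def implied_flags(flag: str) -> list[str]:
    """Return the flag plus every weaker flag whose behavior it includes."""
    index = IGNORE_HIERARCHY.index(flag.upper())
    return IGNORE_HIERARCHY[: index + 1]


def minimal_flags(flags: list[str]) -> list[str]:
    """Reduce a set of ignore-type flags to the single strongest one."""
    ranked = [f.upper() for f in flags if f.upper() in IGNORE_HIERARCHY]
    if not ranked:
        return []
    return [max(ranked, key=IGNORE_HIERARCHY.index)]


print(implied_flags("IGNRSV"))
# ['IGNIDLEJOBS', 'IGNJOBRSV', 'IGNRSV']
print(minimal_flags(["IGNIDLEJOBS", "IGNRSV", "IGNJOBRSV"]))
# ['IGNRSV']
```

In other words, a reservation that lists IGNIDLEJOBS, IGNJOBRSV, and IGNRSV together should behave the same as one that lists IGNRSV alone.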

Most flags can be associated with a reservation via the mrsvctl -c -F command or the SRCFG parameter.

7.1.5-B Configuring Standing Reservations

Standing reservations allow resources to be dedicated for particular uses. This dedication can be configured to be permanent or periodic, recurring at a regular time of day and/or time of week. There is extensive applicability of standing reservations for everything from daily dedicated job runs to improved use of resources on weekends. By default, standing reservations can overlap other reservations. Unless you set an ignore-type flag (ACLOVERLAP, DEDICATEDRESOURCE, IGNIDLEJOBS, or IGNJOBRSV), they are automatically given the IGNRSV flag. All standing reservation attributes are specified via the SRCFG parameter using the attributes listed in the table below.

Standing Reservation Attributes

ACCESS
Format DEDICATED or SHARED
Default ---
Description If set to SHARED, allows a standing reservation to use resources already allocated to other non-job reservations. Otherwise, these other reservations block resource access.
Example
SRCFG[test] ACCESS=SHARED

Standing reservation test may access resources allocated to existing standing and administrative reservations.

The order in which SRCFG reservations are listed in the configuration is important when using DEDICATED, because reservations made afterwards can steal resources later. During configuration, list DEDICATED reservations last to guarantee exclusiveness.

ACCOUNTLIST
Format List of valid, comma delimited account names (see ACL Modifiers).
Default ---
Description Specifies that jobs with the associated accounts may use the resources contained within this reservation.
Example
SRCFG[test] ACCOUNTLIST=ops,staff

Jobs using the account ops or staff are granted access to the resources in standing reservation test.

CHARGEACCOUNT
Format Any valid accountname.
Default ---
Description

Specifies the account to which Moab will charge all idle cycles within the reservation (via the allocation manager).

CHARGEACCOUNT must be used in conjunction with CHARGEUSER.

Example
SRCFG[sr_gold1] HOSTLIST=kula
SRCFG[sr_gold1] PERIOD=INFINITY
SRCFG[sr_gold1] OWNER=USER:admin
SRCFG[sr_gold1] CHARGEACCOUNT=math
SRCFG[sr_gold1] CHARGEUSER=john

Moab charges all idle cycles within reservations supporting standing reservation sr_gold1 to account math.

CHARGEUSER
Format Any valid username.
Default ---
Description

Specifies the user to which Moab will charge all idle cycles within the reservation (via the allocation manager).

CHARGEUSER must be used in conjunction with CHARGEACCOUNT.

Example
SRCFG[sr_gold1] HOSTLIST=kula
SRCFG[sr_gold1] PERIOD=INFINITY
SRCFG[sr_gold1] OWNER=USER:admin
SRCFG[sr_gold1] CHARGEACCOUNT=math
SRCFG[sr_gold1] CHARGEUSER=john

Moab charges all idle cycles within reservations supporting standing reservation sr_gold1 to user john.

CLASSLIST
Format List of valid, comma delimited classes/queues (see ACL Modifiers).
Default ---
Description Specifies that jobs with the associated classes/queues may use the resources contained within this reservation.
Example
SRCFG[test] CLASSLIST=!interactive

Jobs not using the class interactive are granted access to the resources in standing reservation test.

COMMENT
Format

<STRING>

If the string contains whitespace, it should be enclosed in single (') or double quotes (").

Default ---
Description Specifies a descriptive message associated with the standing reservation and all child reservations.
Example
SRCFG[test] COMMENT='rsv for network testing'

Moab annotates the standing reservation test and all child reservations with the specified message. These messages show up within Moab client commands, Moab web tools, and graphical administrator tools.

DAYS
Format

One or more of the following (comma-delimited):

  • Mon
  • Tue
  • Wed
  • Thu
  • Fri
  • Sat
  • Sun
  • [ALL]
Default [ALL]
Description Specifies which days of the week the standing reservation is active.
Example
SRCFG[test] DAYS=Mon,Tue,Wed,Thu,Fri

Standing reservation test is active Monday through Friday.

DEPTH
Format <INTEGER>
Default 2
Description

Specifies the depth of standing reservations to be created (one per period).

To satisfy the DEPTH, Moab creates new reservations at the beginning of the specified PERIOD. If your reservation ends at the same time that a new PERIOD begins, the number of reservations may not match the requested DEPTH. To prevent or resolve this issue, set the ENDTIME a couple minutes before the beginning of the next PERIOD. For example, set the ENDTIME to 23:58 instead of 00:00.

Example
SRCFG[test] PERIOD=DAY DEPTH=6

Specifies that six reservations will be created for standing reservation test.

DISABLE
Format <BOOLEAN>
Default FALSE
Description Specifies that the standing reservation should no longer spawn child reservations.
Example
SRCFG[test] PERIOD=DAY DEPTH=7 DISABLE=TRUE

Standing reservation test is disabled and spawns no further child reservations, even though PERIOD and DEPTH are still configured.

ENDTIME
Format [[[DD:]HH:]MM:]SS
Default 24:00:00
Description

Specifies the time of day the standing reservation period ends (end of day or end of week depending on PERIOD).

Example
SRCFG[test] STARTTIME=8:00:00 
SRCFG[test] ENDTIME=17:00:00
SRCFG[test] PERIOD=DAY

Standing reservation test is active from 8:00 AM until 5:00 PM.
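
The [[[DD:]HH:]MM:]SS format used by ENDTIME, STARTTIME, and several other attributes reads from right to left: seconds, then minutes, hours, and days. A minimal Python sketch of that reading follows; parse_timespec is a hypothetical helper, not a Moab function, and it does not handle suffixes such as the trailing + accepted by MAXTIME.

```python
def parse_timespec(spec: str) -> int:
    """Convert a [[[DD:]HH:]MM:]SS string to a number of seconds."""
    fields = spec.strip().split(":")
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"expected 1 to 4 colon-separated fields: {spec!r}")
    # The rightmost field is seconds, then minutes, hours, and days.
    multipliers = [1, 60, 3600, 86400]
    total = 0
    for value, multiplier in zip(reversed(fields), multipliers):
        total += int(value) * multiplier
    return total


print(parse_timespec("8:00:00"))      # 28800  (8:00 AM as a time of day)
print(parse_timespec("17:00:00"))     # 61200  (5:00 PM as a time of day)
print(parse_timespec("01:06:00:00"))  # 108000 (day 1, 6:00 AM, for PERIOD=WEEK)
```

With PERIOD=DAY the result is an offset from midnight; with PERIOD=WEEK the optional day field gives an offset from the start of the week.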

FLAGS
Format Comma-delimited list of zero or more flags listed in the reservation flags overview.
Default ---
Description Specifies special reservation attributes. See Managing Reservations - Flags for details.
Example
SRCFG[test] FLAGS=BYNAME,DEDICATEDRESOURCE

Jobs may only access the resources within this reservation if they explicitly request the reservation by name. Further, the reservation is created to not overlap with other reservations.

GROUPLIST
Format One or more comma-delimited group names.
Default [ALL]
Description Specifies the groups allowed access to this standing reservation (see ACL Modifiers).
Example
SRCFG[test] GROUPLIST=staff,ops,special
SRCFG[test] CLASSLIST=interactive

Moab allows jobs that belong to one of the listed groups, or that request the job class interactive, to use the resources covered by the standing reservation.

HOSTLIST
Format

One or more comma delimited host names or host expressions or the string "class:<classname>".

Default ---
Description

Specifies the set of hosts that the scheduler can search for resources to satisfy the reservation. If specified using the "class:X" format, Moab only selects hosts that support the specified class. If TASKCOUNT is also specified, only TASKCOUNT tasks are reserved. Otherwise, all matching hosts are reserved.

The HOSTLIST attribute is treated as a host regular expression, so foo10 will map to foo10, foo101, foo1006, and so forth. To request an exact host match, the expression can be bounded by the caret and dollar sign expression markers, as in ^foo10$.

Example
SRCFG[test] HOSTLIST=node001,node002,node003
SRCFG[test] RESOURCES=PROCS:2;MEM:512
SRCFG[test] TASKCOUNT=2

Moab reserves a total of two tasks with 2 processors and 512 MB each, using resources located on node001, node002, and/or node003.

SRCFG[test] HOSTLIST=node01,node1[3-5]

The reservation will consume all nodes that have "node01" somewhere in their names and all nodes that have both "node1" and either a "3," "4," or "5" in their names.

SRCFG[test] HOSTLIST=r:node[1-6]

The reservation will consume all nodes with names that begin with "node" and end with any number 1 through 6. In other words, it will reserve node1, node2, node3, node4, node5, and node6.
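
The difference between an unanchored and an anchored host expression can be tried out with Python's re module. This sketch assumes that Python's regular expression matching is close enough to Moab's host expression matching for these simple patterns; the host names and the matching_hosts helper are hypothetical.

```python
import re


def matching_hosts(expression: str, hosts: list[str]) -> list[str]:
    """Return the hosts whose names contain a match for the expression."""
    pattern = re.compile(expression)
    # search() succeeds when the pattern occurs anywhere in the host name,
    # which mirrors the unanchored behavior described for HOSTLIST.
    return [host for host in hosts if pattern.search(host)]


hosts = ["foo1", "foo10", "foo101", "foo1006", "bar10"]

print(matching_hosts("foo10", hosts))    # ['foo10', 'foo101', 'foo1006']
print(matching_hosts("^foo10$", hosts))  # ['foo10']

cluster = ["node01", "node13", "node14", "node16"]
print(matching_hosts("node1[3-5]", cluster))  # ['node13', 'node14']
```

Testing an expression this way before placing it in HOSTLIST can help catch patterns that match more hosts than intended.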

JOBATTRLIST
Format

Comma-delimited list of one or more of the following job attributes:

  • PREEMPTEE
  • INTERACTIVE
  • any generic attribute configured through NODECFG.
Default ---
Description

Specifies job attributes that grant a job access to the reservation.

Values can be specified with a "!=" assignment to allow only jobs NOT requesting a certain feature inside the reservation.

To enable/disable reservation access based on requested node features, use the parameter NODETOJOBATTRMAP.

Example
SRCFG[test] JOBATTRLIST=PREEMPTEE

Preemptible jobs can access the resources reserved within this reservation.

MAXJOB
Format <INTEGER>
Default ---
Description Specifies the maximum number of jobs that can run in the reservation.
Example
SRCFG[test] MAXJOB=1

Only one job will be allowed to run in this reservation.

MAXTIME
Format [[[DD:]HH:]MM:]SS[+]
Default ---
Description Specifies the maximum time allowed for jobs. Can be used with Affinity to attract jobs with the same MAXTIME.
Example
SRCFG[test] MAXTIME=1:00:00+

Jobs with a time of 1:00:00 are attracted to this reservation.

NODEFEATURES
Format Comma-delimited list of node features.
Default ---
Description Specifies the required node features for nodes that are part of the standing reservation.
Example
SRCFG[test] NODEFEATURES=wide,fddi

All nodes allocated to the standing reservation must have both the wide and fddi node attributes.

OWNER
Format

<CREDTYPE>:<CREDID>

Where <CREDTYPE> is one of USER, GROUP, ACCT, QoS, CLASS, or CLUSTER and <CREDID> is a valid credential ID of that type.

Default ---
Description

Specifies the owner of the reservation. Setting ownership for a reservation grants the user management privileges, including the power to release it.

Setting a USER as the OWNER of a reservation gives that user privileges to query and release the reservation.

For sandbox reservations, sandboxes are applied to a specific peer only if OWNER is set to CLUSTER:<PEERNAME>.

Example
SRCFG[test] OWNER=ACCT:jupiter

Account jupiter owns the reservation and may be granted special privileges associated with that ownership.

PARTITION
Format Valid partition name.
Default [ALL]
Description Specifies the partition in which to create the standing reservation.
Example
SRCFG[test] PARTITION=OLD

The standing reservation will only select resources from partition OLD.

PERIOD
Format One of DAY, WEEK, or INFINITY.
Default DAY
Description Specifies the period of the standing reservation.
Example
SRCFG[test] PERIOD=WEEK

Each standing reservation covers a one week period.

PROCLIMIT
Format

<QUALIFIER><INTEGER>

<QUALIFIER> may be one of the following <, <=, ==, >=, >

Default ---
Description Specifies the processor limit for jobs requesting access to this standing reservation.
Example
SRCFG[test] PROCLIMIT<=4

Jobs requesting 4 or fewer processors are allowed to run.
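
The <QUALIFIER><INTEGER> form shared by PROCLIMIT, PSLIMIT, and REQUIREDTPN can be read as a comparison applied to the value the job requests. The Python sketch below illustrates that reading only; parse_limit and job_allowed are hypothetical helpers and do not reflect how Moab evaluates the limit internally.

```python
import operator
import re

# The five qualifiers listed in the Format field, mapped to comparisons.
_QUALIFIERS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def parse_limit(text: str):
    """Split a string such as '<=4' into (comparison function, integer)."""
    # Two-character qualifiers are listed first so '<=' is not read as '<'.
    match = re.fullmatch(r"(<=|>=|==|<|>)(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a <QUALIFIER><INTEGER> value: {text!r}")
    qualifier, value = match.groups()
    return _QUALIFIERS[qualifier], int(value)


def job_allowed(limit: str, requested: int) -> bool:
    """Return True if the requested amount satisfies the limit."""
    compare, bound = parse_limit(limit)
    return compare(requested, bound)


print(job_allowed("<=4", 4))  # True
print(job_allowed("<=4", 8))  # False
print(job_allowed("==4", 4))  # True
```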

PSLIMIT
Format

<QUALIFIER><INTEGER>

<QUALIFIER> may be one of the following <, <=, ==, >=, >

Default ---
Description Specifies the processor-second limit for jobs requesting access to this standing reservation.
Example
SRCFG[test] PSLIMIT<=40000

Jobs requesting 40000 or fewer processor-seconds are allowed to run.

QOSLIST
Format Zero or more valid, comma-delimited QoS names.
Default ---
Description Specifies that jobs with the listed QoS names can access the reserved resources.
Example
SRCFG[test] QOSLIST=hi,low,special

Moab allows jobs using the listed QoSes access to the reserved resources.

REQUIREDTPN
Format

<QUALIFIER><INTEGER>

<QUALIFIER> may be one of the following <, <=, ==, >=, >

Default ---
Description Restricts access to reservations based on the job's TPN (tasks per node).
Example
SRCFG[test] REQUIREDTPN==4

Jobs with tpn=4 or ppn=4 would be allowed within the reservation, but any other TPN value would not. (For more information, see TPN (Exact Tasks Per Node).)

RESOURCES
Format Semicolon delimited <ATTR>:<VALUE> pairs where <ATTR> may be one of PROCS, MEM, SWAP, or DISK.
Default PROCS:-1 (All processors available on node)
Description

Specifies what resources constitute a single standing reservation task. (Each task must be able to obtain all of its resources as an atomic unit on a single node.) Supported resources currently include the following:

  • PROCS (number of processors)
  • MEM (real memory in MB)
  • DISK (local disk in MB)
  • SWAP (virtual memory in MB)
Example
SRCFG[test] RESOURCES=PROCS:1;MEM:512

Each standing reservation task reserves one processor and 512 MB of real memory.

ROLLBACKOFFSET
Format [[[DD:]HH:]MM:]SS
Default ---
Description

Specifies the minimum time in the future at which the reservation may start. This offset is rolling, meaning the start time of the reservation continuously rolls back into the future to maintain this offset. Rollback offsets are a good way of providing guaranteed resource access to users under the condition that they must commit their resources in the future or lose dedicated access. See QoS for more information about quality of service and service level agreements; also see Rollback Reservation Overview.

Neither credlock nor advres is compatible with jobs submitted for this reservation.

Example
SRCFG[ajax] ROLLBACKOFFSET=24:00:00 TASKCOUNT=32
SRCFG[ajax] PERIOD=INFINITY ACCOUNTLIST=ajax

The standing reservation guarantees access to up to 32 processors within 24 hours to jobs from the ajax account.

Adding an asterisk to the ROLLBACKOFFSET value pins rollback reservation start times when an idle reservation is created in the rollback reservation. For example:

SRCFG[staff] ROLLBACKOFFSET=18:00:00* PERIOD=INFINITY

RSVACCESSLIST
Format <RESERVATION>[,...]
Default ---
Description A list of reservations to which the specified reservation has access.
Example
SRCFG[test] RSVACCESSLIST=rsv1,rsv2,rsv3

RSVGROUP
Format <STRING>
Default ---
Description See section Reservation Group for a detailed description.
Example
SRCFG[test] RSVGROUP=rsvgrp1
SRCFG[ajax] RSVGROUP=rsvgrp1

STARTTIME
Format [[[DD:]HH:]MM:]SS
Default 00:00:00:00 (midnight)
Description

Specifies the time of day/week the standing reservation becomes active. Whether this indicates a time of day or time of week depends on the setting of the PERIOD attribute.

If specified within a reservation profile, a value of 0 indicates the reservation should start at the earliest opportunity.

Example
SRCFG[test] STARTTIME=08:00:00
SRCFG[test] ENDTIME=17:00:00
SRCFG[test] PERIOD=DAY

The standing reservation will be active from 8:00 a.m. until 5:00 p.m. each day.

TASKCOUNT
Format <INTEGER>
Default 0 (unlimited tasks)
Description Specifies how many tasks should be reserved for the reservation.
Example
SRCFG[test] RESOURCES=PROCS:1;MEM:256
SRCFG[test] TASKCOUNT=16

Standing reservation test reserves 16 tasks worth of resources; in this case, 16 processors and 4 GB of real memory.

TIMELIMIT
Format [[[DD:]HH:]MM:]SS
Default -1 (no time based access)
Description Specifies the maximum allowed overlap between the standing reservation and a job requesting resource access.
Example
SRCFG[test] TIMELIMIT=1:00:00

Moab allows jobs to access up to one hour of resources in the standing reservation.

TPN (Exact Tasks Per Node)
Format <INTEGER>
Default 0 (no TPN constraint)
Description Specifies the exact number of tasks per node that must be available on eligible nodes.
Example
SRCFG[2] TPN=4
SRCFG[2] RESOURCES=PROCS:2;MEM:256

Moab must locate four tasks on each node that is to be part of the reservation. That is, each node included in standing reservation 2 must have 8 processors and 1 GB of memory available.

TRIGGER
Format See Creating a trigger for syntax.
Default N/A
Description Specifies event triggers to be launched by the scheduler under the scheduler's ID. These triggers can be used to conditionally cancel reservations, modify resources, or launch various actions at specified event offsets. See About object triggers for more information.
Example
SRCFG[fast] TRIGGER=EType=start,Offset=5:00:00,AType=exec,Action="/usr/local/domail.pl"

Moab launches the domail.pl script 5 hours after any fast reservation starts.

USERLIST
Format Comma-delimited list of users.
Default ---
Description Specifies which users have access to the resources reserved by this reservation (see ACL Modifiers).
Example
SRCFG[test] USERLIST=bob,joe,mary

Users bob, joe and mary can all access the resources reserved within this reservation.

Standing Reservation Overview

A standing reservation is similar to a normal administrative reservation in that it also places an access control list on a specified set of resources. Resources are specified on a per-task basis and currently include processors, local disk, real memory, and swap. The access control list supported for standing reservations includes users, groups, accounts, job classes, and QoS levels. Standing reservations can be configured to be permanent or periodic on a daily or weekly basis and can accept a daily or weekly start and end time. Regardless of whether permanent or recurring on a daily or weekly basis, standing reservations are enforced using a series of reservations, extending a number of periods into the future as controlled by the DEPTH attribute of the SRCFG parameter.

The following examples demonstrate possible configurations specified with the SRCFG parameter.

Example 7-10: Basic Business Hour Standing Reservation

SRCFG[interactive] TASKCOUNT=6 RESOURCES=PROCS:1,MEM:512
SRCFG[interactive] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI
SRCFG[interactive] STARTTIME=9:00:00 ENDTIME=17:00:00
SRCFG[interactive] CLASSLIST=interactive

When using the SRCFG parameter, attribute lists must be delimited using the comma (,), pipe (|), or colon (:) characters; they cannot be space delimited. For example, to specify a multi-class ACL, specify:

SRCFG[test] CLASSLIST=classA,classB

Only one STARTTIME and one ENDTIME value can be specified per reservation. If varied start and end times are desired throughout the week, complementary standing reservations should be created. For example, to establish a reservation from 8:00 p.m. until 6:00 a.m. the next day during business days, two reservations should be created: one from 8:00 p.m. until midnight, and the other from midnight until 6:00 a.m. Jobs can run across reservation boundaries, allowing these two reservations to function as a single reservation that spans the night. The following example demonstrates how to span a reservation across 2 days on the same nodes:

SRCFG[Sun] PERIOD=WEEK
SRCFG[Sun] STARTTIME=00:20:00:00 ENDTIME=01:00:00:00
SRCFG[Sun] HOSTLIST=node01,node02,node03

SRCFG[Mon] PERIOD=WEEK
SRCFG[Mon] STARTTIME=01:00:00:00 ENDTIME=01:06:00:00
SRCFG[Mon] HOSTLIST=node01,node02,node03

Example 7-10 fully specifies a reservation, including the quantity of resources requested using the TASKCOUNT and RESOURCES attributes. In all cases, resources are allocated to a reservation in units called tasks, where a task is a collection of resources that must be allocated together on a single node. The TASKCOUNT attribute specifies the number of these tasks that should be reserved by the reservation. In conjunction with this attribute, the RESOURCES attribute defines the reservation task by indicating what resources must be included in each task. In this case, the scheduler must locate and reserve 1 processor and 512 MB of memory together on the same node for each task requested.

As mentioned previously, a standing reservation reserves resources over a given time frame. The PERIOD attribute may be set to a value of DAY, WEEK, or INFINITY to indicate the period over which this reservation should recur. If not specified, a standing reservation recurs on a daily basis. If a standing reservation is configured to recur daily, the attribute DAYS may be specified to indicate which days of the week the reservation should exist. This attribute takes a comma-delimited list of days where each day is specified as the first three letters of the day in all capital letters, for example MON or FRI. The preceding example specifies that this reservation is periodic on a daily basis and should only exist on business days.

The time of day during which the requested tasks are to be reserved is specified using the STARTTIME and ENDTIME attributes. These attributes are specified in standard 24-hour (military) HH:MM:SS format. Both STARTTIME and ENDTIME are optional, defaulting to midnight at the beginning and end of the day, respectively. In the preceding example, resources are reserved from 9:00 a.m. until 5:00 p.m. on business days.

The final aspect of any reservation is the access control list indicating who or what can use the reserved resources. In the preceding example, the CLASSLIST attribute is used to indicate that jobs requesting the class "interactive" should be allowed to use this reservation.

Specifying Reservation Resources

In most cases, only a small subset of standing reservation attributes must be specified in any given case. For example, by default, RESOURCES is set to PROCS:-1, which indicates that each task should reserve all of the processors on the node on which it is located. This, in essence, creates a one-task-per-node mapping. In many cases, particularly on uniprocessor systems, this default behavior may be easiest to work with. However, in SMP environments, the RESOURCES attribute provides a powerful means of specifying an exact, multi-dimensional resource set.

An examination of the parameters documentation shows that the default value of PERIOD is DAY. Thus, specifying this parameter in the preceding example was unnecessary. It was used only to introduce this parameter and indicate that other options exist beyond daily standing reservations.

Example 7-11: Host Constrained Standing Reservation

Although the first example did specify a quantity of resources to reserve, it did not specify where the needed tasks were to be located. If this information is not specified, Moab attempts to locate the needed resources anywhere it can find them. The Example 7-10 reservation essentially discovers hosts where the needed resources can be found. If the SPACEFLEX reservation flag is set, then the reservation continues to float to the best hosts over the life of the reservation. Otherwise, it will be locked to the initial set of allocated hosts.

If a site wanted to constrain a reservation to a subset of available resources, this could be accomplished using the HOSTLIST attribute. The HOSTLIST attribute is specified as a comma-separated list of host names and constrains the scheduler to only select tasks from the specified list. This attribute can exactly specify hosts or specify them using host regular expressions. The following example demonstrates a possible use of the HOSTLIST attribute:

SRCFG[interactive] DAYS=MON,TUE,WED,THU,FRI
SRCFG[interactive] PERIOD=DAY
SRCFG[interactive] STARTTIME=10:00:00 ENDTIME=15:00:00
SRCFG[interactive] RESOURCES=PROCS:2,MEM:256
SRCFG[interactive] HOSTLIST=node001,node002,node005,node020
SRCFG[interactive] TASKCOUNT=6
SRCFG[interactive] CLASSLIST=interactive

Note that the HOSTLIST attribute specifies a non-contiguous list of hosts. Any combination of hosts may be specified and hosts may be specified in any order. In this example, the TASKCOUNT attribute is also specified. These two attributes both apply constraints on the scheduler with HOSTLIST specifying where the tasks can be located and TASKCOUNT indicating how many total tasks may be allocated. In this example, six tasks are requested but only four hosts are specified. To handle this, if adequate resources are available, the scheduler may attempt to allocate more than one task per host. For example, assume that each host is a quad-processor system with 1 GB of memory. In such a case, the scheduler could allocate up to two tasks per host and even satisfy the TASKCOUNT constraint without using all of the hosts in the host list.

It is important to note that even if there is a one to one mapping between the value of TASKCOUNT and the number of hosts in HOSTLIST, the scheduler will not necessarily place one task on each host. If, for example, node001 and node002 were 8 processor SMP hosts with 1 GB of memory, the scheduler could locate up to four tasks on each of these hosts fully satisfying the reservation taskcount without even partially using the remaining hosts. (Moab will place tasks on hosts according to the policy specified with the NODEALLOCATIONPOLICY parameter.) If the host list provides more resources than what is required by the reservation as specified via TASKCOUNT, the scheduler will simply select the needed resources within the set of hosts listed.
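The task arithmetic in the two preceding paragraphs can be checked with a short Python sketch. This is an illustrative model only, not Moab code: the function names are hypothetical, and the in-order greedy placement is an assumption standing in for whatever NODEALLOCATIONPOLICY would actually select.

```python
# Hypothetical model of task packing for RESOURCES=PROCS:2,MEM:256.
# Not Moab's algorithm; the real placement follows NODEALLOCATIONPOLICY.

def tasks_per_host(host_procs, host_mem_mb, task_procs=2, task_mem_mb=256):
    """Whole tasks that fit on one host; every resource must fit on that host."""
    return min(host_procs // task_procs, host_mem_mb // task_mem_mb)

def pack(hosts, taskcount):
    """Place tasks on hosts in list order (assumed policy) until taskcount is met."""
    placement = {}
    remaining = taskcount
    for name, procs, mem_mb in hosts:
        if remaining == 0:
            break
        n = min(tasks_per_host(procs, mem_mb), remaining)
        if n:
            placement[name] = n
            remaining -= n
    return placement

names = ("node001", "node002", "node005", "node020")

# Four quad-processor hosts with 1 GB each: two tasks per host, three hosts used.
quad = [(h, 4, 1024) for h in names]
print(pack(quad, 6))

# node001 and node002 as 8 processor hosts: the other two hosts stay unused.
mixed = [("node001", 8, 1024), ("node002", 8, 1024),
         ("node005", 4, 1024), ("node020", 4, 1024)]
print(pack(mixed, 6))
```

In both cases the six requested tasks fit without touching every host in the list, which is the behavior the text describes.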

Enforcing Policies Via Multiple Reservations

Single reservations enable multiple capabilities. Combinations of reservations can further extend a site's capabilities to impose specific policies.

Example 7-12: Reservation Stacking

If HOSTLIST is specified but TASKCOUNT is not, the scheduler will pack as many tasks as possible onto all of the listed hosts. For example, assume the site added a second standing reservation named debug to its configuration that reserved resources for use by certain members of its staff using the following configuration:

SRCFG[interactive] DAYS=MON,TUE,WED,THU,FRI
SRCFG[interactive] PERIOD=DAY
SRCFG[interactive] STARTTIME=10:00:00 ENDTIME=15:00:00
SRCFG[interactive] RESOURCES=PROCS:2,MEM:256
SRCFG[interactive] HOSTLIST=node001,node002,node005,node020
SRCFG[interactive] TASKCOUNT=6
SRCFG[interactive] CLASSLIST=interactive
SRCFG[debug]       HOSTLIST=node001,node002,node003,node004
SRCFG[debug]       USERLIST=helpdesk
SRCFG[debug]       GROUPLIST=operations,sysadmin
SRCFG[debug]       PERIOD=INFINITY

The new standing reservation is quite simple. Since RESOURCES is not specified, it will allocate all processors on each host that is allocated. Since TASKCOUNT is not specified, it will allocate every host listed in HOSTLIST. Since PERIOD is set to INFINITY, the reservation is always in force and there is no need to specify STARTTIME, ENDTIME, or DAYS.

The standing reservation has two access parameters set using the attributes USERLIST and GROUPLIST. This configuration indicates that the reservation can be accessed if any one of the access lists specified is satisfied by the job. In essence, reservation access is logically ORed allowing access if the requester meets any of the access constraints specified. In this example, jobs submitted by either user helpdesk or any member of the groups operations or sysadmin can use the reserved resources. (See ACL Modifiers.)

Unless ACL Modifiers are specified, access is granted to the logical OR of access lists specified within a standing reservation and granted to the logical AND of access lists across different standing reservations. A comparison of the standing reservations interactive and debug in the preceding example indicates that they both can allocate hosts node001 and node002. Suppose node001 had both of these reservations in place simultaneously and a job attempted to access this host during business hours, when standing reservation interactive was active. The job could only use the doubly reserved resources if it requests the run class interactive and it meets the constraints of reservation debug; that is, it is submitted by user helpdesk or by a member of the group operations or sysadmin.
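The OR-within, AND-across rule can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: a job carries a single value per credential type, no ACL modifiers are in use, and the names satisfies and can_use_stacked are hypothetical.

```python
# Hypothetical model of default ACL evaluation (no ACL modifiers).
# Each reservation maps a credential type to the set of values it admits.

interactive = {"class": {"interactive"}}
debug = {"user": {"helpdesk"}, "group": {"operations", "sysadmin"}}

def satisfies(rsv_acl, job):
    """Logical OR: any one access list of the reservation admits the job."""
    return any(job.get(cred) in allowed for cred, allowed in rsv_acl.items())

def can_use_stacked(reservations, job):
    """Logical AND: the job must satisfy every reservation stacked on the host."""
    return all(satisfies(acl, job) for acl in reservations)

job_a = {"user": "alice", "group": "sysadmin", "class": "interactive"}
job_b = {"user": "helpdesk", "group": "support", "class": "batch"}

print(can_use_stacked([interactive, debug], job_a))  # meets both reservations
print(can_use_stacked([interactive, debug], job_b))  # meets debug only
```

Here job_a can use the doubly reserved resources on node001, while job_b is admitted by debug alone and therefore cannot.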

As a rule, the scheduler does not stack reservations unless it must. If adequate resources exist, it can allocate reserved resources side by side in a single SMP host rather than on top of each other. In the case of a 16 processor SMP host with two 8 processor standing reservations, 8 of the processors on this host will be allocated to the first reservation, and 8 to the next. Any configuration is possible. The same 16 processor host could instead have 4 processors reserved for user "John," 10 processors reserved for group "Staff," with the remaining 2 processors available for use by any job.

Stacking reservations is not usually required but some site administrators choose to do it to enforce elaborate policies. There is no problem with doing so as long as you can keep things straight. It really is not too difficult a concept; it just takes a little getting used to. See the Reservation Overview section for a more detailed description of reservation use and constraints.

As mentioned earlier, by default the scheduler enforces standing reservations by creating a number of reservations where the number created is controlled by the DEPTH attribute. Each night at midnight, the scheduler updates its periodic non-floating standing reservations. By default, DEPTH is set to 2, meaning when the scheduler starts up, it will create two 24-hour reservations covering a total of two days' worth of time: a reservation for today and one for tomorrow. For daily reservations, at midnight, the reservations roll, meaning today's reservation expires and is removed, tomorrow's reservation becomes today's, and the scheduler creates a new reservation for the next day.

With this model, the scheduler continues creating new reservations in the future as time moves forward. Each day, the needed resources are always reserved. At first, all appears automatic but the standing reservation DEPTH attribute is in fact an important aspect of reservation rolling, which helps address certain site specific environmental factors. This attribute remedies a situation that might occur when a job is submitted and cannot run immediately because the system is backlogged with jobs. In such a case, available resources may not exist for several days out and the scheduler must reserve these future resources for this job. With the default DEPTH setting of two, when midnight arrives, the scheduler attempts to roll its standing reservations but a problem arises in that the job has now allocated the resources needed for the standing reservation two days out. Moab cannot reserve the resources for the standing reservation because they are already claimed by the job. The standing reservation reserves what it can but because all needed resources are not available, the resulting reservation is now smaller than it should be, or is possibly even empty.

If a standing reservation is smaller than it should be, the scheduler will attempt to add resources each iteration until it is fully populated. However, in the case of this job, the job is not going to release its reserved resources until it completes and the standing reservation cannot claim them until this time. The DEPTH attribute allows a site to specify how deep into the future a standing reservation should reserve its resources allowing it to claim the resources first and prevent this problem. If a partial standing reservation is detected on a system, it may be an indication that the reservation's DEPTH attribute should be increased.
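The effect of DEPTH on this race can be sketched in Python. The model is deliberately coarse and rests on assumptions: days are numbered integers, a daily standing reservation with DEPTH=N is taken to cover today plus the following N-1 days, and the function names are hypothetical.

```python
# Hypothetical day-granularity model of standing reservation depth.
# Assumes a daily PERIOD and that DEPTH=N instantiates N consecutive days.

def covered_days(today, depth=2):
    """Days for which a standing reservation instance already exists."""
    return list(range(today, today + depth))

def job_claims_first(job_day, submit_day, depth=2):
    """True if a backlogged job reserves a day before the standing reservation does."""
    return job_day not in covered_days(submit_day, depth)

# A job submitted on day 0 whose resources only free up on day 2:
print(job_claims_first(job_day=2, submit_day=0, depth=2))  # job gets there first
print(job_claims_first(job_day=2, submit_day=0, depth=4))  # reservation already holds day 2
```

Under these assumptions, raising DEPTH so that it reaches past the typical backlog lets the standing reservation claim its resources before queued jobs can.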

In Example 7-12, the PERIOD attribute is set to INFINITY. With this setting, a single, permanent standing reservation is created and the issues of resource contention do not exist. While this eliminates the contention issue, infinite length standing reservations cannot be made periodic.

Example 7-13: Multiple ACL Types

In most cases, access lists within a reservation are logically ORed together to determine reservation access. However, exceptions to this rule can be specified by using the required ACL marker, the asterisk (*). Any ACL marked with this symbol is required and a job is only allowed to use a reservation if it meets all required ACLs and at least one non-required ACL (if specified). A common use for this facility is in conjunction with the TIMELIMIT attribute. This attribute controls the length of time a job may use the resources within a standing reservation. This access mechanism can be ANDed or ORed to the cumulative set of all other access lists as specified by the required ACL marker. Consider the following example configuration:

SRCFG[special] TASKCOUNT=32
SRCFG[special] PERIOD=WEEK
SRCFG[special] STARTTIME=1:08:00:00
SRCFG[special] ENDTIME=5:17:00:00
SRCFG[special] NODEFEATURES=largememory
SRCFG[special] TIMELIMIT=1:00:00*
SRCFG[special] QOSLIST=high,low,special-
SRCFG[special] ACCOUNTLIST=!projectX,!projectY

The above configuration requests 32 tasks which translate to 32 nodes. The PERIOD attribute makes this reservation periodic on a weekly basis while the attributes STARTTIME and ENDTIME specify the week offsets when this reservation is to start and end. (Note that the specification format has changed to DD:HH:MM:SS.) In this case, the reservation starts on Monday at 8:00 a.m. and runs until Friday at 5:00 p.m. The reservation is enforced as a series of weekly reservations that only cover the specified time frame. The NODEFEATURES attribute indicates that each of the reserved nodes must have the node feature "largememory" configured.

As described earlier, TIMELIMIT indicates that jobs using this reservation can only use it for one hour. This means the job and the reservation can only overlap for one hour. Clearly jobs requiring an hour or less of wallclock time meet this constraint. However, a four-hour job that starts on Monday at 5:00 a.m. or a 12-hour job that starts on Friday at 4:00 p.m. also satisfies this constraint. Also, note the TIMELIMIT required ACL marker, *; it is set, indicating that jobs must not only meet the TIMELIMIT access constraint but must also meet one or more of the other access constraints. In this example, the job can use this reservation if it can use the access specified via QOSLIST or ACCOUNTLIST; that is, it is assigned a QoS of high, low, or special, or the submitter of the job has an account that satisfies the !projectX and !projectY criteria. See the QoS Overview for more information about QoS configuration and usage.
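The overlap rule can be verified with a short Python sketch. The calendar week used (Monday, January 6, 2014 through Friday, January 10, 2014) is an arbitrary choice for illustration, and the function names are hypothetical rather than part of Moab.

```python
from datetime import datetime, timedelta

# One weekly instance of SRCFG[special]; the specific week is arbitrary.
RSV_START = datetime(2014, 1, 6, 8, 0)    # Monday 8:00 a.m.
RSV_END = datetime(2014, 1, 10, 17, 0)    # Friday 5:00 p.m.
TIMELIMIT = timedelta(hours=1)            # TIMELIMIT=1:00:00

def overlap(job_start, walltime):
    """Time during which the job and the reservation are both active."""
    job_end = job_start + walltime
    span = min(job_end, RSV_END) - max(job_start, RSV_START)
    return max(span, timedelta(0))

def meets_timelimit(job_start, walltime):
    return overlap(job_start, walltime) <= TIMELIMIT

# Four-hour job starting Monday 5:00 a.m. overlaps 8:00 to 9:00 only.
print(meets_timelimit(datetime(2014, 1, 6, 5, 0), timedelta(hours=4)))
# 12-hour job starting Friday 4:00 p.m. overlaps 4:00 to 5:00 p.m. only.
print(meets_timelimit(datetime(2014, 1, 10, 16, 0), timedelta(hours=12)))
# Two-hour job starting Tuesday 10:00 a.m. overlaps for its full two hours.
print(meets_timelimit(datetime(2014, 1, 7, 10, 0), timedelta(hours=2)))
```

The first two jobs satisfy the constraint because only one hour of each falls inside the reservation window; the third does not.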

Affinity

Reservation ACLs allow or deny access to reserved resources but they may be configured to also impact a job's affinity for a particular reservation. By default, jobs gravitate toward reservations through a mechanism known as positive affinity. This mechanism allows jobs to run on the most constrained resources leaving other, unreserved resources free for use by other jobs that may not be able to access the reserved resources. Normally this is a desired behavior. However, sometimes it is desirable to reserve resources for use only as a last resort, using the reserved resources only when there are no other resources available. This last resort behavior is known as negative affinity. Note the '-' (hyphen or negative sign) following special in the QOSLIST values of Example 7-13. This special mark indicates that QoS special should be granted access to this reservation but should be assigned negative affinity. Thus, the QOSLIST attribute specifies that QoS high and low should be granted access with positive affinity (use the reservation first where possible) and QoS special granted access with negative affinity (use the reservation only when no other resources are available).

Affinity status is granted on a per access object basis rather than a per access list basis and always defaults to positive affinity. In addition to negative affinity, neutral affinity can also be specified using the equal sign (=) as in QOSLIST[0] normal= high debug= low-.

When a job matches multiple ACLs for a reservation, the final node affinity for the node, job, and reservation combination is based on the last matching ACL entry found in the configuration file.

For example, given the following reservation ACLs, a job matching both will receive a negative affinity:

SRCFG[res1] USERLIST=joe+ MAXTIME<=4:00:00-

With the following reservation ACLs, a job matching both will receive a positive affinity:

SRCFG[res1] MAXTIME<=4:00:00- USERLIST=joe+
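The last-match rule shown in the two res1 lines can be modeled in a few lines of Python. This is a sketch under assumptions: ACL entries are represented as (label, marker) pairs in configuration order, and final_affinity is a hypothetical helper, not a Moab function.

```python
# Hypothetical model: the last matching ACL entry decides the affinity.
# Markers follow the text: '+' positive, '-' negative, '=' neutral.

def final_affinity(acl_entries, matched_labels):
    """Return the marker of the last entry, in config order, that the job matches."""
    affinity = None
    for label, marker in acl_entries:
        if label in matched_labels:
            affinity = marker
    return affinity

job_matches = {"USERLIST=joe", "MAXTIME<=4:00:00"}

user_first = [("USERLIST=joe", "+"), ("MAXTIME<=4:00:00", "-")]
maxtime_first = [("MAXTIME<=4:00:00", "-"), ("USERLIST=joe", "+")]

print(final_affinity(user_first, job_matches))     # negative affinity
print(final_affinity(maxtime_first, job_matches))  # positive affinity
```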

ACL Modifiers

ACL modifiers allow a site to change the default behavior of ACL processing. By default, a reservation can be accessed if one or more of its ACLs can be met by the requestor. This behavior can be changed using the "deny" or "required" ACL modifier, as in the following tables:

Not
Symbol: ! (exclamation point)
Description: If the attribute is met, the requestor is denied access regardless of any other satisfied ACLs.
Example:
SRCFG[test] GROUPLIST=staff USERLIST=!steve

Allow access to all staff members other than steve.

Required
Symbol: * (asterisk)
Description: All required ACLs must be satisfied for requestor access to be granted.
Example:
SRCFG[test] QOSLIST=*high MAXTIME=*2:00:00

Only jobs in QoS high that request less than 2 hours of walltime are granted access.

XOR
Symbol: ^ (caret)
Description: All attributes of the type specified other than the ones listed in the ACL satisfy the ACL.
Example:
SRCFG[test] QOSLIST=^high

All jobs other than those requesting QoS high are granted access.

CredLock
Symbol: & (ampersand)
Description: Matching jobs will be required to run on the resources reserved by this reservation. You can use this modifier on accounts, classes, groups, qualities of service, and users.
Example:
SRCFG[test] USERLIST=&john

All of user john's jobs must run in this reservation.

HPEnable (hard policy enable)
Symbol: ~ (tilde)
Description: ACLs marked with this modifier are ignored during soft policy scheduling and are only considered for hard policy scheduling once all eligible soft policy jobs start.
Example:
SRCFG[johnspace] USERLIST=john CLASSLIST=~debug

All of user john's jobs are allowed to run in the reservation at any time. Debug jobs are also allowed to run in this reservation but are only considered after all of john's jobs are given an opportunity to start. User john's jobs are considered before debug jobs regardless of job priority.

If the HPEnable and Not markers are used in conjunction, the specified credentials are blocked out of the reservation during soft policy scheduling.

Note that the ACCOUNTLIST values in Example 7-13 are preceded with an exclamation point, or NOT symbol. This indicates that all jobs with accounts other than projectX and projectY meet the account ACL. Note that if a !<X> value (!projectX) appears in an ACL line, that ACL is satisfied by any object not explicitly listed by a NOT entry. Also, if an object matches a NOT entry, the associated job is excluded from the reservation even if it meets other ACL requirements. For example, a QoS high job requesting account projectX is denied access to the reservation even though the job QoS matches the QoS ACL.
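The combined effect of the required and NOT markers in Example 7-13 can be sketched in Python. This models only the rules as described in this section; it is not Moab's implementation, the function and constant names are hypothetical, and the time limit is simplified to a number of overlap hours.

```python
# Hypothetical model of the SRCFG[special] ACLs from Example 7-13.
ALLOWED_QOS = {"high", "low", "special"}     # QOSLIST=high,low,special-
DENIED_ACCOUNTS = {"projectX", "projectY"}   # ACCOUNTLIST=!projectX,!projectY
TIMELIMIT_HOURS = 1.0                        # TIMELIMIT=1:00:00* (required)

def may_access(qos, account, overlap_hours):
    # A match on a NOT entry excludes the job regardless of other ACLs.
    if account in DENIED_ACCOUNTS:
        return False
    # Required ACLs must all be met.
    if overlap_hours > TIMELIMIT_HOURS:
        return False
    # At least one non-required ACL must be met. Any account not named by a
    # NOT entry satisfies the account ACL, so account_ok is always True here.
    qos_ok = qos in ALLOWED_QOS
    account_ok = account not in DENIED_ACCOUNTS
    return qos_ok or account_ok

print(may_access("high", "projectX", 0.5))    # denied by the NOT entry
print(may_access("high", "projectZ", 0.5))    # allowed
print(may_access("normal", "projectZ", 0.5))  # allowed through the account ACL
print(may_access("high", "projectZ", 2.0))    # denied by the required TIMELIMIT
```

Under this reading, the account ACL admits every job outside projectX and projectY, so the QoS list matters mainly for the affinity it assigns.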

Example 7-14: Binding Users to Reservations at Reservation Creation

# create a 4 node reservation for john and bind all of john's jobs to that reservation
> mrsvctl -c -a 'user=&john' -t 4

Reservation Ownership

Reservation ownership allows a site to control who owns the reserved resources during the reservation time frame. Depending on needs, this ownership may be identical to, a subset of, or completely distinct from the reservation ACL. By default, reservation ownership implies resource accountability and resources not consumed by jobs are accounted against the reservation owner. In addition, ownership can also be associated with special privileges within the reservation.

Ownership is specified using the OWNER attribute in the format <CREDTYPE>:<CREDID>, as in OWNER=USER:john. To enable john's jobs to preempt other jobs using resources within the reservation, the SRCFG attribute FLAGS should be set to OWNERPREEMPT. In the example below, the jupiter project chooses to share resources with the saturn project but only when it does not currently need them.

Example 7-15: Limited Shared Access

ACCOUNTCFG[jupiter] PRIORITY=10000
SRCFG[jupiter] HOSTLIST=node0[1-9]
SRCFG[jupiter] PERIOD=INFINITY
SRCFG[jupiter] ACCOUNTLIST=jupiter,saturn-
SRCFG[jupiter] OWNER=ACCT:jupiter
SRCFG[jupiter] FLAGS=OWNERPREEMPT

Partitions

A reservation can be used in conjunction with a partition. Configuring a standing reservation on a partition allows constraints to be (indirectly) applied to a partition.

Example 7-16: Time Constraints by Partition

The following example places a 3-day wall-clock limit on two partitions and a 64 processor-hour limit on jobs running on partition small.

SRCFG[smallrsv] PARTITION=small MAXTIME=3:00:00:00 PSLIMIT<=230400 HOSTLIST=ALL
SRCFG[bigrsv] PARTITION=big MAXTIME=3:00:00:00 HOSTLIST=ALL
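The two limit values in this example can be cross-checked with a small Python sketch. The helper to_seconds is hypothetical and assumes the colon-separated [[[DD:]HH:]MM:]SS time format used throughout this section.

```python
# Hypothetical helper for checking the limits in Example 7-16.

def to_seconds(spec):
    """Convert a [[[DD:]HH:]MM:]SS time specification to seconds."""
    units = (1, 60, 3600, 86400)  # seconds, minutes, hours, days
    parts = [int(p) for p in spec.split(":")]
    return sum(value * unit for value, unit in zip(reversed(parts), units))

def proc_hours_to_proc_seconds(proc_hours):
    return proc_hours * 3600

print(to_seconds("3:00:00:00"))         # MAXTIME: 3 days in seconds
print(proc_hours_to_proc_seconds(64))   # PSLIMIT: 64 processor-hours
```

The second value is 230400, matching the PSLIMIT in the configuration, so the limit is expressed in processor-seconds.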

Resource Allocation Behavior

As mentioned, standing reservations can operate in one of two modes: floating or non-floating (essentially node-locked). A floating reservation is created when the flag SPACEFLEX is specified. If a reservation is non-floating, the scheduler allocates all resources specified by the HOSTLIST parameter regardless of node state, job load, or even the presence of other standing reservations. Moab interprets the request for a non-floating reservation as, "I want a reservation on these exact nodes, no matter what!"

If a reservation is configured to be floating, the scheduler takes a more relaxed stand, searching through all possible nodes to find resources meeting standing reservation constraints. Only Idle, Running, or Busy nodes are considered, and then only if no reservation conflict is detected. The reservation attribute ACCESS modifies this behavior slightly and allows the reservation to allocate resources even if reservation conflicts exist.

If a TASKCOUNT is specified with or without a HOSTEXPRESSION, Moab will, by default, only consider "up" nodes for allocation. To change this behavior, the reservation flag IGNSTATE can be specified as in the following example:

SRCFG[nettest] GROUPLIST=sysadm
SRCFG[nettest] FLAGS=IGNSTATE
SRCFG[nettest] HOSTLIST=node1[3-8]
SRCFG[nettest] STARTTIME=9:00:00
SRCFG[nettest] ENDTIME=17:00:00

Access to existing reservations can be controlled using the reservation flag IGNRSV.

Other standing reservation attributes not covered here include PARTITION and CHARGEACCOUNT. These parameters are described in some detail in the parameters documentation.

Example 7-17: Using Reservations to Guarantee Turnover

In some cases, it is desirable to make certain a portion of a cluster's resources are available within a specific time frame. The following example creates a floating reservation belonging to the jupiter account that guarantees 16 tasks for use by jobs requesting up to one hour.

SRCFG[shortpool] OWNER=ACCT:jupiter
SRCFG[shortpool] FLAGS=SPACEFLEX
SRCFG[shortpool] MAXTIME=1:00:00
SRCFG[shortpool] TASKCOUNT=16
SRCFG[shortpool] STARTTIME=9:00:00
SRCFG[shortpool] ENDTIME=17:00:00
SRCFG[shortpool] DAYS=Mon,Tue,Wed,Thu,Fri

This reservation enables a capability similar to what was known in early Maui releases as "shortpool." The reservation covers every weekday from 9:00 a.m. to 5:00 p.m., reserving 16 tasks and allowing jobs to overlap the reservation for up to one hour. The SPACEFLEX flag indicates that the reservation may be dynamically modified over time to relocate to more optimal resources. In the case of a reservation with the MAXTIME ACL, this would include migrating to resources that are in use but that free up within the MAXTIME time frame. Additionally, because the MAXTIME ACL defaults to positive affinity, any jobs that fit the ACL attempt to use available reserved resources first before looking elsewhere.

Rolling Reservations

Rolling reservations are enabled using the ROLLBACKOFFSET attribute and can be used to allow users guaranteed access to resources, but the guaranteed access is limited to a time-window in the future. This functionality forces users to commit their resources in the future or lose access.

Image 7-2: Rolling reservation over 3 iterations

Example 7-18: Rollback Reservations

SRCFG[ajax] ROLLBACKOFFSET=24:00:00 TASKCOUNT=32
SRCFG[ajax] PERIOD=INFINITY ACCOUNTLIST=ajax

Adding an asterisk to the ROLLBACKOFFSET value pins rollback reservation start times when an idle reservation is created in the rollback reservation. For example: SRCFG[staff] ROLLBACKOFFSET=18:00:00* PERIOD=INFINITY.

Modifying Resources with Standing Reservations

Moab can customize compute resources associated with a reservation during the life of the reservation. This can be done generally using the TRIGGER attribute, or it can be done for operating systems using the shortcut attribute OS. If set, Moab dynamically reprovisions allocated reservation nodes to the requested operating system as shown in the following example:

SRCFG[provision] PERIOD=DAY DAYS=MON,WED,FRI STARTTIME=7:00:00 ENDTIME=10:00:00
SRCFG[provision] OS=rhel4  # provision nodes to use redhat during reservation, restore when done

7.1.5-C Managing Administrative Reservations

A default reservation with no ACL is termed an administrative reservation, but is occasionally referred to as a system reservation. It blocks access to all jobs because it possesses an empty access control list. It is often useful when performing administrative tasks but cannot be used for enforcing resource usage policies.

Administrative reservations are created and managed using the mrsvctl command. With this command, all aspects of reservation time frame, resource selection, and access control can be dynamically modified. The mdiag -r command can be used to view configuration, state, and allocated resource information, as well as to identify any potential problems with the reservation. The following table briefly summarizes commands used for common actions. More detailed information is available in the command summaries.

Action                            Command
create reservation                mrsvctl -c <RSV_DESCRIPTION>
list reservations                 mrsvctl -l
release reservation               mrsvctl -r <RSVID>
modify reservation                mrsvctl -m <ATTR>=<VAL> <RSVID>
query reservation configuration   mdiag -r <RSVID>
display reservation hostlist      mrsvctl -q resources <RSVID>

© 2014 Adaptive Computing