Every reservation consists of 3 major components: (1) a set of resources, (2) a time frame, and (3) an access control list. A reservation may also have a number of optional attributes controlling its behavior and interaction with other aspects of scheduling. Reservation attribute descriptions follow.
A reservation consists of one or more tasks. In attempting to locate the resources required for a particular reservation, Moab examines all feasible resources and locates the needed resources in groups specified by the task description. An example may help clarify this concept:
Reservation A requires four tasks. Each task is defined as 1 processor and 1 GB of memory.
Node X has 2 processors and 3 GB of memory available.
Node Y has 2 processors and 1 GB of memory available.
Node Z has 2 processors and 2 GB of memory available.
When collecting the resources needed for the reservation, Moab examines each node in turn. Moab finds that Node X can support 2 of the 4 tasks needed by reserving 2 processors and 2 GB of memory, leaving 1 GB of memory unreserved. Analysis of Node Y shows that it can support only 1 task, reserving 1 processor and 1 GB of memory and leaving 1 processor unreserved. Note that the unreserved memory on Node X cannot be combined with the unreserved processor on Node Y to satisfy the needs of another task, because a task requires all of its resources to be located on the same node. Finally, analysis finds that Node Z can support 2 tasks, fully reserving all of its resources.
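The per-node arithmetic in this example can be sketched in Python. This is an illustration of the "all resources of a task on one node" rule only, not Moab's allocation code; the Node class and the tasks_supported and leftover helpers are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Available resources on one node (hypothetical model)."""
    name: str
    procs: int
    mem_gb: int


def tasks_supported(node, task_procs, task_mem_gb):
    """Whole tasks that fit on a single node.

    Each task must take all of its resources from the same node, so the
    scarcer resource on that node limits the count.
    """
    return min(node.procs // task_procs, node.mem_gb // task_mem_gb)


def leftover(node, task_procs, task_mem_gb):
    """(processors, GB of memory) left unreserved after packing whole tasks."""
    count = tasks_supported(node, task_procs, task_mem_gb)
    return (node.procs - count * task_procs, node.mem_gb - count * task_mem_gb)


# Nodes and task definition (1 processor, 1 GB) from the example above.
nodes = [Node("X", 2, 3), Node("Y", 2, 1), Node("Z", 2, 2)]
capacity = {n.name: tasks_supported(n, 1, 1) for n in nodes}
unreserved = {n.name: leftover(n, 1, 1) for n in nodes}

print(capacity)    # {'X': 2, 'Y': 1, 'Z': 2}
print(unreserved)  # {'X': (0, 1), 'Y': (1, 0), 'Z': (0, 0)}
```

The leftover 1 GB on Node X and the leftover processor on Node Y appear in separate entries and are never summed, which mirrors the rule that a task cannot span nodes.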
Both reservations and jobs use the concept of a task description in specifying how resources should be allocated. It is important to note that although a task description is used to allocate resources to a reservation, this description does not in any way constrain the use of those resources by a job. In the above example, a job requesting resources simply sees 4 processors and 4 GB of memory available in reservation A. If the job has access to the reserved resources and the resources meet the other requirements of the job, the job could use these resources according to its own task description and needs.
Currently, the resources that can be associated with reservations include processors, memory, swap, local disk, initiator classes, and any number of arbitrary resources. Arbitrary resources may include peripherals such as tape drives, software licenses, or any other site-specific resource.
Associated with each reservation is a time frame. This specifies when the resources will be reserved or dedicated to jobs that meet the reservation's access control list (ACL). The time frame simply consists of a start time and an end time. When configuring a reservation, this information may be specified as a start time together with either an end time or a duration.
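As a rough sketch of how the two ways of specifying a time frame reduce to the same start and end pair, consider the following Python helper. The function name resolve_time_frame and its validation rules (exactly one of end or duration, and an end later than the start) are assumptions made for illustration; they are not taken from Moab.

```python
from datetime import datetime, timedelta


def resolve_time_frame(start, end=None, duration=None):
    """Return (start, end) given a start plus either an end or a duration."""
    # Assumption for this sketch: exactly one of the two must be supplied.
    if (end is None) == (duration is None):
        raise ValueError("specify exactly one of end or duration")
    if duration is not None:
        end = start + duration
    if end <= start:
        raise ValueError("end time must be later than start time")
    return start, end


start = datetime(2024, 1, 8, 9, 0)
by_duration = resolve_time_frame(start, duration=timedelta(hours=8))
by_end = resolve_time_frame(start, end=datetime(2024, 1, 8, 17, 0))

# Both forms describe the same 09:00 to 17:00 window.
print(by_duration == by_end)  # True
```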
A reservation's access control list specifies which jobs can use a reservation. Only jobs that meet one or more of a reservation's access criteria are allowed to use the reserved resources during the reservation time frame. Currently, the reservation access criteria include the following: users, groups, accounts, classes, QOS, job attributes, job duration, and job templates.
While a reservation's ACL will allow particular jobs to use reserved resources, it does not force any job to use these resources. With each job, Moab attempts to locate the best possible combination of available resources whether these are reserved or unreserved. For example, in the following figure, note that job X, which meets access criteria for both reservation A and B, allocates a portion of its resources from each reservation and the remainder from resources outside of both reservations.
Image 6-1: Job X uses resources from reservations A and B
Although reservations by default simply make resources available to jobs that meet particular criteria, Moab can be configured to constrain jobs to run only within accessible reservations. This can be requested by the user on a job-by-job basis using a resource manager extension flag, or it can be enabled administratively via a QoS flag. For example, assume two reservations were created as follows:
> mrsvctl -c -a GROUP==staff -d 8:00:00 -h 'node[1-4]'
reservation staff.1 created
> mrsvctl -c -a USER==john -t 2
reservation john.2 created
If the user "john," who happened to also be a member of the group "staff," wanted to force a job to run within a particular reservation, "john" could do so using the FLAGS resource manager extension. Specifically, in the case of a PBS job, the following submission would force the job to run within the "staff.1" reservation.
> msub -l nodes=1,walltime=1:00:00,flags=ADVRES:staff.1 testjob.cmd
Note that for this to work, PBS needs to have resource manager extensions enabled as described in the PBS Resource Manager Extension Overview. (Torque has resource manager extensions enabled by default.) If the user wants the job to run on reserved resources but does not care which, the user could submit the job with the following:
> msub -l nodes=1,walltime=1:00:00,flags=ADVRES testjob.cmd
Use the reservation BYNAME flag to require explicit binding for reservation access.
To lock jobs linked to a particular QoS into a reservation or reservation group, use the REQRID attribute.
There are two main types of reservations that sites typically deal with. The first, administrative reservations, are typically one-time reservations created for special purposes and projects. They are created using the mrsvctl or setres commands and provide an integrated mechanism for gracefully managing unexpected system maintenance, temporary projects, and time-critical demonstrations. These commands allow an administrator to select a particular set of resources or to specify only the quantity of resources needed. For example, an administrator could use a regular expression to request that a reservation be created on the nodes "blue0[1-9]" or could simply request that the reservation locate the needed resources by specifying a quantity-based request such as "TASKS==20".
The second type of reservation is called a standing reservation. It is specified using the SRCFG parameter and is useful when there is a recurring need for a particular type of resource distribution. Standing reservations are a powerful, flexible, and efficient means of enabling persistent or periodic policies such as those often enabled using classes or queues. For example, a site could use a standing reservation to reserve a subset of its compute resources for quick-turnaround jobs during business hours Monday through Friday. The Standing Reservation Overview provides more information about configuring and using these reservations.
As previously mentioned, a given reservation may have one or more access criteria. A job can use the reserved resources if it meets at least one of these access criteria. It is possible to stack multiple reservations on the same node. In such a situation, a job can only use the given node if it has access to each active reservation on the node.
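The two rules in this paragraph (any one criterion within a reservation, but every reservation on a stacked node) can be sketched as follows. This is a simplified model, not Moab's implementation: the Reservation class, the dictionary layout for jobs and ACLs, and the assumption that the two sample reservations are stacked on the same node are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Reservation:
    """Hypothetical model: ACL maps a criterion type to its allowed values."""
    name: str
    acl: dict = field(default_factory=dict)


def meets_acl(job, rsv):
    """A job has access if it meets at least one access criterion."""
    return any(job.get(kind) in allowed for kind, allowed in rsv.acl.items())


def can_use_node(job, active_reservations):
    """On a stacked node, the job needs access to every active reservation."""
    return all(meets_acl(job, rsv) for rsv in active_reservations)


staff_rsv = Reservation("staff.1", {"GROUP": {"staff"}})
john_rsv = Reservation("john.2", {"USER": {"john"}})

john_job = {"USER": "john", "GROUP": "staff"}
mary_job = {"USER": "mary", "GROUP": "staff"}

# Assume, for illustration, that both reservations cover the same node.
stacked = [staff_rsv, john_rsv]
print(can_use_node(john_job, stacked))      # True
print(can_use_node(mary_job, stacked))      # False: no access to john.2
print(can_use_node(mary_job, [staff_rsv]))  # True
```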
Reservation groups are a way of associating multiple reservations. This association is useful for variable namespaces and reservation requests. The reservations in a group inherit variables from the reservation group head, but if the same variable is set locally on a reservation in the group, the local value overrides the inherited one. Variable inheritance is useful for triggers because it provides greater flexibility in automating certain tasks and system behaviors.
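The override rule for group variables behaves like a layered lookup, which the following Python sketch models with collections.ChainMap. The variable names and values are hypothetical, and the sketch says nothing about how Moab stores variables internally.

```python
from collections import ChainMap

# Variables set on the reservation group head (hypothetical names and values).
head_vars = {"PROJECT": "alpha", "NOTIFY": "admin@example.com"}

# Variables set locally on one reservation in the group.
local_vars = {"NOTIFY": "ops@example.com"}

# ChainMap searches the first mapping first, so local values win and
# anything not set locally falls through to the group head.
effective = ChainMap(local_vars, head_vars)

print(effective["PROJECT"])  # alpha (inherited from the group head)
print(effective["NOTIFY"])   # ops@example.com (local value overrides)
```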
Jobs may be bound to a reservation group (instead of a single reservation) by using the resource manager extension ADVRES.
To allow infinite walltime jobs, you must have the following scheduler flag set:
You can submit an infinite job by running:
msub -l walltime=INFINITY
Or create an infinite reservation by running:
mrsvctl -c -d INFINITY
Infinite jobs can run in infinite reservations. Infinite walltime also works with job templates and ADVRES.
XML output for infinite jobs prints "INFINITY" in ReqAWDuration, and XML output for infinite reservations prints "INFINITY" in duration and endtime.
<Data>
  <rsv AUser="jgardner" AllocNodeCount="1" AllocNodeList="n5" AllocProcCount="4"
       AllocTaskCount="1" HostExp="n5" LastChargeTime="0" Name="jgardner.1"
       Partition="base" ReqNodeList="n5:1" Resources="PROCS=[ALL]" StatCAPS="0"
       StatCIPS="0" StatTAPS="0" StatTIPS="0" SubType="Other" Type="User"
       cost="0.000000" ctime="1302127058" duration="INFINITY" endtime="INFINITY"
       starttime="1302127058">
    <ACL aff="neutral" cmp="%=" name="jgardner.1" type="RSV"></ACL>
    <ACL cmp="%=" name="jgardner" type="USER"></ACL>
    <ACL cmp="%=" name="company" type="GROUP"></ACL>
    <ACL aff="neutral" cmp="%=" name="jgardner.1" type="RSV"></ACL>
    <History>
      <event state="PROCS=4" time="1302127058"></event>
    </History>
  </rsv>
</Data>
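A script that consumes this XML must accept the literal string "INFINITY" where it would otherwise expect a number. The following Python sketch shows one way to do that with the standard library; the SAMPLE string is a trimmed copy of the output above, and the parse_time helper and the choice to represent infinity as math.inf are assumptions of this example.

```python
import math
import xml.etree.ElementTree as ET

# Trimmed copy of the reservation XML shown above.
SAMPLE = (
    '<Data><rsv Name="jgardner.1" starttime="1302127058" '
    'duration="INFINITY" endtime="INFINITY"></rsv></Data>'
)


def parse_time(value):
    """Map the literal INFINITY to math.inf; otherwise parse an integer."""
    return math.inf if value == "INFINITY" else int(value)


rsv = ET.fromstring(SAMPLE).find("rsv")
times = {key: parse_time(rsv.get(key)) for key in ("starttime", "duration", "endtime")}

print(times["starttime"])             # 1302127058
print(math.isinf(times["duration"]))  # True
print(math.isinf(times["endtime"]))   # True
```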