The concept of cluster fairness varies widely from person to person and site to site. While some interpret it as giving all users equal access to compute resources, more complicated concepts incorporating historical resource usage, political issues, and job value are equally valid. No scheduler can address every possible definition of fair, but Moab provides one of the industry's most comprehensive and flexible sets of tools, allowing most sites to address their many and varied fairness management needs.
Under Moab, most fairness policies are addressed by a combination of the facilities described in the following table:
Job Prioritization | |
---|---|
Description: | Specifies what is most important to the scheduler. Using service-based priority factors allows a site to balance job turnaround time, expansion factor, or other scheduling performance metrics. |
Example: | `SERVICEWEIGHT 1`<br>`QUEUETIMEWEIGHT 10`<br>Causes jobs to increase in priority by 10 points for every minute they remain in the queue. |
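
The arithmetic behind this example can be sketched in Python. This is a minimal illustration, assuming the queue-time contribution is the product of the service component weight, the queue-time subcomponent weight, and the number of minutes queued. The function name `queue_time_priority` is hypothetical and is not part of Moab.

```python
def queue_time_priority(minutes_queued, service_weight=1, queuetime_weight=10):
    """Queue-time contribution to job priority (illustrative model only).

    Defaults mirror SERVICEWEIGHT 1 and QUEUETIMEWEIGHT 10 from the example.
    """
    return service_weight * queuetime_weight * minutes_queued


# Show how the contribution grows while a job waits in the queue.
for minutes in (0, 1, 30, 60):
    print(minutes, queue_time_priority(minutes))
```

Under this model, doubling either weight doubles the rate at which a waiting job gains priority.
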
Usage Limits (Throttling Policies) | |
---|---|
Description: | Specifies limits on exactly what resources can be used at any given instant. |
Example: | `USERCFG[john] MAXJOB=3`<br>`GROUPCFG[DEFAULT] MAXPROC=64`<br>`GROUPCFG[staff] MAXPROC=128`<br>Allows john to run at most 3 jobs at a time. Allows the group staff to use up to 128 total processors and all other groups to use up to 64 processors each. |
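
The way these limits combine can be sketched as a simple admission check. This is an illustrative Python model, not Moab's implementation: the function `may_start` and the two dictionaries are hypothetical names, and the sketch assumes a job may start only if it keeps both the per-user job count and the per-group processor count within their limits.

```python
# Limits mirroring the example: a per-user job cap and per-group processor caps.
USER_MAXJOB = {"john": 3}
GROUP_MAXPROC = {"DEFAULT": 64, "staff": 128}


def may_start(user, group, req_procs, user_running_jobs, group_procs_in_use):
    """Return True if starting the job stays within both limits (sketch)."""
    max_jobs = USER_MAXJOB.get(user)  # None means no per-user job limit
    if max_jobs is not None and user_running_jobs + 1 > max_jobs:
        return False
    # Groups without an explicit entry fall back to the DEFAULT limit.
    max_procs = GROUP_MAXPROC.get(group, GROUP_MAXPROC["DEFAULT"])
    return group_procs_in_use + req_procs <= max_procs


# john already runs 3 jobs, so a fourth is held back.
print(may_start("john", "staff", 8, user_running_jobs=3, group_procs_in_use=0))
# A group with no explicit entry is capped at 64 processors.
print(may_start("amy", "research", 8, user_running_jobs=0, group_procs_in_use=60))
```
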
Fairshare | |
---|---|
Description: | Specifies usage targets to limit resource access or adjust priority based on historical cluster resource usage. |
Example: | `USERCFG[steve] FSTARGET=25.0+`<br>`FSWEIGHT 1`<br>`FSUSERWEIGHT 10`<br>Enables priority-based fairshare and specifies a fairshare target for user steve such that his jobs are favored in an attempt to keep them using at least 25.0% of delivered compute cycles. |
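
The effect of a floor target such as `25.0+` can be sketched as follows. This is a simplified Python model under stated assumptions: the fairshare contribution is taken to be the weighted difference between the target and the user's historical usage percentage, and a floor target is assumed to contribute only while usage is below the target. The function name `fairshare_priority` is hypothetical.

```python
def fairshare_priority(usage_pct, target_pct=25.0, fs_weight=1, fs_user_weight=10):
    """Priority adjustment for a floor ("+") fairshare target (sketch only).

    Defaults mirror FSTARGET=25.0+, FSWEIGHT 1, and FSUSERWEIGHT 10.
    """
    shortfall = target_pct - usage_pct
    if shortfall <= 0:
        return 0.0  # at or above a floor target: no adjustment in this model
    return fs_weight * fs_user_weight * shortfall


# The boost shrinks as historical usage approaches the target.
for usage in (5.0, 15.0, 25.0, 40.0):
    print(usage, fairshare_priority(usage))
```
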
Allocation Management | |
---|---|
Description: | Specifies long-term, credential-based resource usage limits. |
Example: | `AMCFG[mam] TYPE=MAM HOST=server.sys.net`<br>Enables the Moab Accounting Manager allocation management interface. Within the allocation manager, project- or account-based allocations may be configured. These allocations may, for example, allow project X to use up to 100,000 processor-hours per quarter, provide various QoS-sensitive charge rates, and share allocation access. |
Quality of Service | |
---|---|
Description: | Specifies additional resource and service access for particular users, groups, and accounts. QoS facilities can provide special priorities, policy exemptions, reservation access, and other benefits (as well as special charge rates). |
Example: | `QOSCFG[orion] PRIORITY=1000 XFTARGET=1.2`<br>`QOSCFG[orion] QFLAGS=PREEMPTOR,IGNSYSTEM,RESERVEALWAYS`<br>Gives jobs requesting the orion QoS a priority increase, an expansion factor target to improve response time, the ability to preempt other jobs, an exemption from system-level job size policies, and the ability to always reserve needed resources if they cannot start immediately. |
Standing Reservations | |
---|---|
Description: | Reserves blocks of resources within the cluster for specific, periodic time frames under the constraints of a flexible access control list. |
Example: | `SRCFG[jupiter] HOSTLIST=node01[1-4]`<br>`SRCFG[jupiter] STARTTIME=9:00:00 ENDTIME=17:00:00`<br>`SRCFG[jupiter] USERLIST=john,steve ACCOUNTLIST=jupiter`<br>Reserves nodes node011 through node014 from 9:00 AM until 5:00 PM for use by jobs from user john or steve or from the project jupiter. |
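
To make the host expression and time window in this example concrete, the following Python sketch expands a bracketed host pattern and tests whether a time of day falls inside the reservation. It assumes each bracket holds a single-digit range such as `[1-4]` and that the end time is exclusive; `expand_hosts` and `in_window` are hypothetical helpers, not Moab functions.

```python
import itertools
import re
from datetime import time


def expand_hosts(expr):
    """Expand single-digit bracket ranges, e.g. 'node01[1-4]', into names."""
    # re.split with two capture groups yields: literal, low, high, literal, ...
    parts = re.split(r"\[(\d)-(\d)\]", expr)
    literals, lows, highs = parts[0::3], parts[1::3], parts[2::3]
    ranges = [[str(n) for n in range(int(lo), int(hi) + 1)]
              for lo, hi in zip(lows, highs)]
    hosts = []
    for combo in itertools.product(*ranges):
        name = literals[0]
        for digit, literal in zip(combo, literals[1:]):
            name += digit + literal
        hosts.append(name)
    return hosts


def in_window(t, start=time(9, 0), end=time(17, 0)):
    """True if t lies in the daily window (end treated as exclusive)."""
    return start <= t < end


print(expand_hosts("node01[1-4]"))
print(in_window(time(12, 30)), in_window(time(18, 0)))
```
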
Class/Queue Constraints | |
---|---|
Description: | Associates users, resources, priorities, and limits with cluster classes or cluster queues that can be assigned to or selected by end-users. |
Example: | `CLASSCFG[long] MIN.WCLIMIT=24:00:00`<br>`SRCFG[jupiter] PRIORITY=10000`<br>`SRCFG[jupiter] HOSTLIST=acn[1-4][0-9]`<br>Assigns long jobs a high priority but only allows them to run on certain nodes. |
Selecting the Correct Policy Approach
Moab supports a rich set of policy controls, in some cases allowing a particular policy to be enforced in more than one way. For example, cycle distribution can be controlled using usage limits, fairshare, or even queue definitions. Selecting the most appropriate policy depends on site objectives and needs; consider the following when making such a decision: