Maui Scheduler

Maui Scheduler Administrator's Guide

version 3.2

Copyright © 1999-2014 Adaptive Computing Enterprises, Inc. All Rights Reserved.
Distribution of this document for commercial purposes in either
hard or soft copy form is strictly prohibited
without prior written consent from Adaptive Computing, Inc.

Overview

The Maui Scheduler is a policy engine that gives sites control over when, where, and how resources such as processors, memory, and disk are allocated to jobs. Beyond this control, it provides mechanisms that help optimize the use of these resources, monitor system performance, diagnose problems, and manage the system in general.


Table of Contents:

1.0 Philosophy and Goals of the Maui Scheduler

2.0 Installation and Initial Configuration
2.1 Building and Installing Maui
2.2 Initial Configuration
2.3 Initial Testing

3.0 Scheduler Basics
3.1 Layout of Scheduler Components
3.2 Scheduling Environment and Objects
3.3 Scheduling Iterations and Job Flow
3.4 Configuring the Scheduler

4.0 Scheduler Commands
4.1 Client Overview
4.2 Monitoring System Status
4.3 Managing Jobs
4.4 Managing Reservations
4.5 Configuring Policies
4.6 End User Commands
4.7 Miscellaneous Commands

5.0 Prioritizing Jobs and Allocating Resources
5.1 Job Priority
5.2 Node Allocation
5.3 Node Access
5.4 Node Availability
5.5 Task Distribution

6.0 Managing Fairness - Throttling Policies, Fairshare, and Allocation Management
6.1 Fairness Overview
6.2 Throttling Policies
6.3 Fairshare
6.4 Allocation Management

7.0 Controlling Resource Access - Reservations, Partitions, and QoS Facilities
7.1 Advance Reservations
7.2 Partitions
7.3 QoS Facilities

8.0 Optimizing Scheduling Behavior - Backfill, Node Sets, and Preemption
8.1 Optimization Overview
8.2 Backfill
8.3 Node Sets
8.4 Preemption

9.0 Evaluating System Performance - Statistics, Profiling, Testing, and Simulation
9.1 Scheduler Performance Evaluation Overview
9.2 Accounting - Job and System Statistics
9.3 Profiling Current and Historical Usage
9.4 Testing New Versions and Configurations
9.5 Answering 'What If?' Questions with the Simulator

10.0 Managing Shared Resources - SMP Issues and Policies
10.1 Consumable Resource Handling
10.2 Load Balancing Features
10.3 Resource Usage Tracking
10.4 Resource Usage Limits

11.0 General Job Administration
11.1 Deferred Jobs and Job Holds
11.2 Job Priority Management
11.3 Suspend/Resume Handling
11.4 Checkpoint/Restart
11.5 Job Dependencies
11.6 Setting Job Defaults and Per Job Limits
11.7 General Job Policies
11.8 Using a Local Queue

12.0 General Node Administration
12.1 Node Location (Partitions, Frames, Queues, etc.)
12.2 Node Attributes (Node Features, Speed, etc.)
12.3 Node Specific Policies (MaxJobPerNode, etc.)
12.4 Configuring Node-Locked Consumable Generic Resources (tape drives, node-locked licenses, etc.)

13.0 Resource Managers and Interfaces
13.1 Resource Manager Overview
13.2 Resource Manager Configuration
13.3 Resource Manager Extensions
13.4 Adding Resource Manager Interfaces

14.0 Troubleshooting and System Maintenance
14.1 Internal Diagnostics
14.2 Logging Facilities
14.3 Using the Message Buffer
14.4 Handling Events with the Notification Routine
14.5 Issues with Client Commands
14.6 Tracking System Failures
14.7 Problems with Individual Jobs

15.0 Improving User Effectiveness
15.1 User Feedback Loops
15.2 User Level Statistics
15.3 Enhancing Wallclock Limit Estimates
15.4 Providing Resource Availability Information
15.5 Job Start Time Estimates
15.6 Collecting Performance Information on Individual Jobs

16.0 Simulations
16.1 Simulation Overview
16.2 Resource Traces
16.3 Workload Traces
16.4 Simulation Specific Configuration

17.0 Miscellaneous
17.1 User Feedback
17.2 Grid Scheduling
17.3 Enabling High Availability Features
17.4 Using the Application Scheduling Library

Appendices
Appendix A: Case Studies
Appendix B: Extension Interface
Appendix C: Adding New Algorithms
Appendix D: Structure Limits
Appendix E: Security Configuration
Appendix F: Parameters Overview
Appendix G: Commands Overview
Appendix H: Interfacing to Maui
Appendix I: Considerations for Large Clusters
Appendix J: Differences Guide
Appendix K: Maui-Moab Comparison