Moab can operate under several models, either honoring the node state set by an
external service or determining and setting the node state locally. This
section covers the following:
- identifying meanings of particular node states
- specifying node states within locally developed services and resource
managers
- adjusting node state within Moab based on load, policies, and events
12.5.1 Node State Definitions
State | Definition
Down | Node is either not reporting status, is reporting status but failures are detected, or is reporting status but has been marked down by an administrator.
Idle | Node is reporting status, currently is not executing any workload, and is ready to accept additional workload.
Busy | Node is reporting status, currently is executing workload, and cannot accept additional workload due to load.
Running | Node is reporting status, currently is executing workload, and can accept additional workload.
Drained | Node is reporting status, currently is not executing workload, and cannot accept additional workload due to administrative action.
Draining | Node is reporting status, currently is executing workload, and cannot accept additional workload due to administrative action.
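The definitions above amount to a small decision table. The following Python sketch is one way to express it. It is illustrative only and is not Moab code: the function name `classify_node` and its boolean parameters are hypothetical names chosen for this example.

```python
from enum import Enum


class NodeState(Enum):
    DOWN = "Down"
    IDLE = "Idle"
    BUSY = "Busy"
    RUNNING = "Running"
    DRAINED = "Drained"
    DRAINING = "Draining"


def classify_node(reporting, failures_detected, marked_down,
                  has_workload, admin_blocked, load_blocked):
    """Map observed conditions to a node state per the definitions table.

    All parameter names are hypothetical; they mirror the wording of the
    table rather than any actual Moab attribute.
    """
    # Down: not reporting, reporting with failures, or marked down by an admin.
    if not reporting or failures_detected or marked_down:
        return NodeState.DOWN
    # Administrative action blocks new workload: Draining if work is still
    # running, Drained once the node is empty.
    if admin_blocked:
        return NodeState.DRAINING if has_workload else NodeState.DRAINED
    # Executing workload: Busy if load prevents more work, otherwise Running.
    if has_workload:
        return NodeState.BUSY if load_blocked else NodeState.RUNNING
    # No workload and nothing blocking. The table does not define a state for
    # an empty node that is load-blocked, so this sketch treats it as Idle.
    return NodeState.IDLE
```

For example, a healthy node that is executing workload while an administrator has closed it to new work classifies as Draining.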
12.5.2 Specifying Node States within Native Resource Managers
Native resource managers can report node state implicitly and explicitly, using
NODESTATE, LOAD, and other attributes. See
Managing Resources Directly with the Native Interface for more information.
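As a rough illustration of explicit reporting, the Python sketch below prints one line per node with a state and a load value. The attribute keys `NODESTATE` and `LOAD` are taken from the text above, but the `<node> KEY=value` line layout, the node names, and the helper `format_node_line` are assumptions made for this example. Check the native interface documentation for the exact format that Moab expects.

```python
def format_node_line(node_id, attrs):
    """Render one node report as '<node_id> KEY=value KEY=value ...'.

    The line layout is an assumption for illustration, not a confirmed
    Moab format.
    """
    pairs = " ".join(f"{key}={value}" for key, value in attrs.items())
    return f"{node_id} {pairs}"


def main():
    # Hypothetical node data; a real script would query the nodes themselves.
    nodes = {
        "node01": {"NODESTATE": "Idle", "LOAD": 0.0},
        "node02": {"NODESTATE": "Busy", "LOAD": 3.75},
    }
    for node_id, attrs in nodes.items():
        print(format_node_line(node_id, attrs))


if __name__ == "__main__":
    main()
```

A script that reports only a load value and omits the state would correspond to implicit reporting, leaving the scheduler to derive the state itself.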
12.5.3 Moab Based Node State Adjustment
Node state can be adjusted based on reported processor, memory, or other load
factors. It can also be adjusted based on reports of one or more resource managers in a
multi-resource manager configuration. Also, both generic events and generic metrics can
be used to adjust node state.
- TORQUE health scripts (allow compute nodes to detect and report site-specific failures).
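The load-based part of this adjustment can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Moab's actual logic: the function `adjust_for_load` and the `max_load` threshold are hypothetical, and only the Busy and Running states are switched, since the state definitions distinguish those two by load.

```python
def adjust_for_load(reported_state, load, max_load):
    """Return a node state adjusted for reported load.

    Hypothetical helper: a Running node at or above max_load becomes Busy,
    and a Busy node below max_load becomes Running. Every other state is
    returned unchanged, because the definitions do not tie it to load.
    """
    if reported_state == "Running" and load >= max_load:
        return "Busy"
    if reported_state == "Busy" and load < max_load:
        return "Running"
    return reported_state
```

With a threshold of 4.0, a node reported as Running with a load of 4.2 would be treated as Busy, while a Drained node keeps its state regardless of load.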
12.5.4 Adjusting Scheduling Behavior Based on Reported Node State
Based on reported node state, Moab can support various policies to make better use
of available resources.
12.5.4.1 Down State
See Also