Adjusting Node State Based on the Health Check Output

If the health check reports an error, the node's "message" attribute is set to the error string that the check returned. Cluster schedulers can be configured to adjust a node's state based on this attribute. For example, by default Moab sets a node's state to down when it detects a node error message. The node health script keeps running at the configured interval (see Configuring MOMs to Launch a Health Check for more information). If a later run does not produce the error message, Moab detects this at the beginning of its next iteration and restores the node to an online state.
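
As an illustration of the script side of this cycle, the following is a minimal sketch of a health check, not a definitive implementation. It assumes the usual TORQUE convention that a failing check writes a single line beginning with the keyword ERROR to stdout and that a healthy check prints nothing. The function name check_free_space, the path "/", and the 1 GiB threshold are hypothetical choices made only for this example.

```python
import shutil


def check_free_space(path="/", min_free_bytes=1 << 30, usage=shutil.disk_usage):
    """Return an error line if free space on `path` is too low, else None.

    Hypothetical helper: the path and threshold are illustrative only.
    `usage` is injectable so the check can be exercised without a real disk.
    """
    free = usage(path).free
    if free < min_free_bytes:
        # Assumption: a line starting with ERROR on stdout marks a failed check.
        return f"ERROR low disk space on {path}: {free} bytes free"
    return None


def main():
    line = check_free_space()
    # A healthy run prints nothing, so no error string is reported for the node.
    if line is not None:
        print(line)


if __name__ == "__main__":
    main()
```

Under these assumptions, a run that prints the ERROR line would populate the node's "message" attribute, and a later run that prints nothing would let the scheduler bring the node back online.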

© 2015 Adaptive Computing