CHECKPOINT is one of the PREEMPTPOLICY types (for more information, see PREEMPTPOLICY types). For systems that allow checkpointing, the CHECKPOINT value allows a job to save its current state and either terminate or continue running. A checkpointed job may restart at any time and resume execution from its most recent checkpoint.
You can tune checkpointing behavior on a per-resource-manager basis by setting the CHECKPOINTSIG and CHECKPOINTTIMEOUT attributes of the RMCFG parameter.
For information about the PREEMPTEE and PREEMPTOR flags, see Preemption flags.
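As a rough illustration of this per-resource-manager tuning, the following moab.cfg fragment sets both attributes on one resource manager. The resource manager name (base), the signal (SIGUSR1), and the timeout (5:00) are assumptions chosen for the example, not recommended values. Check the RMCFG parameter reference for the accepted formats and defaults on your installation.

```
# Hypothetical example: "base" is a placeholder resource manager name.
# CHECKPOINTSIG: signal sent to a job to ask it to checkpoint (value assumed).
# CHECKPOINTTIMEOUT: how long to wait for the checkpoint to finish (value assumed).
RMCFG[base] CHECKPOINTSIG=SIGUSR1 CHECKPOINTTIMEOUT=5:00
```

Because both attributes belong to RMCFG, each resource manager can use a different signal and timeout if its jobs checkpoint differently.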
The following outlines some benefits of using CHECKPOINT and also lists some things you should be aware of if you choose to use it.
Advantages:
This attribute allows you to restart a job from its last checkpoint.
Cautions:
Jobs tend to take longer to complete when you use CHECKPOINT.
To preempt jobs using CHECKPOINT
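Make the following configuration changes in the moab.cfg file: set GUARANTEEDPREEMPTION to TRUE, set PREEMPTPOLICY to CHECKPOINT, and assign the PREEMPTEE and PREEMPTOR flags to the appropriate QoS.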
For example:
GUARANTEEDPREEMPTION TRUE
PREEMPTPOLICY CHECKPOINT
QOSCFG[test1] QFLAGS=PREEMPTEE MEMBERULIST=john PRIORITY=100
QOSCFG[test2] QFLAGS=PREEMPTOR MEMBERULIST=john PRIORITY=10000
Related topics