5.656.1 Introduction
This test determines if the job can be restarted from a previous checkpoint image.
5.656.2 Test Steps
Start the job with the option -c enabled,periodic,interval=1 and watch the checkpoint directory; a new checkpoint image should be generated about every minute. Use qhold to stop the job. Use qalter to change the job's checkpoint_name attribute to the name of an earlier checkpoint image. Then use qrls to restart the job.
> qsub -c enabled,periodic,interval=1 test.sh
999.xxx.yyy
> qhold 999
> qalter -W checkpoint_name=ckpt.999.xxx.yyy.1234567 999
> qrls 999
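Picking the image name to give to qalter can be scripted. The following is a minimal sketch, assuming checkpoint images are files named ckpt.<jobid>.<timestamp> inside a directory that you pass in. The function name oldest_checkpoint and its directory argument are hypothetical, and the actual location of the checkpoint directory depends on your installation.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the oldest checkpoint image
# found in the directory given as the first argument.
# Assumption: images are named ckpt.<jobid>.<timestamp>, so the numeric
# suffix after the last dot orders them from oldest to newest.
oldest_checkpoint() {
    ckpt_dir=$1
    for f in "$ckpt_dir"/ckpt.*; do
        # Skip the unexpanded pattern when no image exists.
        [ -e "$f" ] || continue
        basename "$f"
    done |
        awk -F. '{ print $NF "\t" $0 }' |  # prefix each name with its suffix
        sort -n |                          # oldest (smallest suffix) first
        head -n 1 |
        cut -f2-                           # drop the prefix again
}
```

The printed name is the value that would be supplied as checkpoint_name in the qalter step. If the directory contains no images, the function prints nothing.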
5.656.3 Successful Results
The job output file should be truncated back to the point saved in the selected checkpoint image, and the count should resume from an earlier number.
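One way to check this result is to record the last count before the hold and compare it with the last count shortly after the release. The following is a minimal sketch, assuming test.sh writes one integer per line to the job output file. The helper names last_count and count_went_back are hypothetical.

```shell
#!/bin/sh
# Print the last line of the given file that consists only of digits.
last_count() {
    grep -E '^[0-9]+$' "$1" | tail -n 1
}

# Return success (exit status 0) when the count seen after the restart
# is lower than the count recorded before the hold.
count_went_back() {
    before=$1
    after=$2
    [ "$after" -lt "$before" ]
}
```

For example, save the value of last_count on the output file just before qhold, read it again soon after qrls, and pass both values to count_went_back. A successful return suggests the job resumed from the earlier image.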