5.656 Test 6 - Restart from Previous Image

5.656.1 Introduction

This test determines whether the job can be restarted from a previous checkpoint image.

5.656.2 Test Steps

Start the job with the option -c enabled,periodic,interval=1 and watch the checkpoint directory; a new checkpoint image should appear about every minute. Place a hold on the job with qhold to stop it. Use the qalter command to set the checkpoint_name attribute to the name of an earlier image. Then release the job with qrls to restart it.

> qsub -c enabled,periodic,interval=1 test.sh

999.xxx.yyy

> qhold 999

> qalter -W checkpoint_name=ckpt.999.xxx.yyy.1234567 999

> qrls 999
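
Choosing the image name to pass to qalter can be scripted. The following is a minimal sketch, not part of the documented procedure: pick_earlier_image is a hypothetical helper, and it assumes that image names end in a numeric suffix that grows with each new image (as ckpt.999.xxx.yyy.1234567 suggests). The location of the checkpoint directory is site-specific and is left as the variable CKPT_DIR.

```shell
# Hypothetical helper: print the name of the second-newest checkpoint image
# found in the directory given as $1, or nothing if fewer than two exist.
# Assumption: names look like ckpt.<jobid>.<number> and <number> increases
# with every new image.
pick_earlier_image() {
    dir=$1
    for f in "$dir"/ckpt.*; do
        [ -e "$f" ] || continue          # skip the unexpanded glob if no images exist
        name=${f##*/}                    # strip the directory part
        printf '%s %s\n' "${name##*.}" "$name"   # "<suffix> <name>"
    done | sort -n | awk '{ names[NR] = $2 } END { if (NR >= 2) print names[NR - 1] }'
}

# Possible use while the job is held (CKPT_DIR must be set for your site):
#   image=$(pick_earlier_image "$CKPT_DIR")
#   qalter -W checkpoint_name="$image" 999
```

Sorting on the numeric suffix rather than on file modification time keeps the choice stable if the images are copied or touched after they are written.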

5.656.3 Successful Results

After the job restarts, its output file should be truncated back to its state at the selected checkpoint image, and the count should resume at an earlier number.
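
This result can also be checked from the command line. The sketch below is illustrative only: resumed_earlier is a hypothetical helper, and it assumes that the job writes one integer count per line to its output file, which this section does not state. Record the last count before the qhold, then compare it with the last line of the output file shortly after the qrls.

```shell
# Hypothetical check: succeed if the last count in the output file ($2)
# is lower than the count recorded before the hold ($1).
# Assumption: the job writes one integer per line to its output file.
resumed_earlier() {
    before=$1
    now=$(tail -n 1 "$2")
    [ "$now" -lt "$before" ]
}

# Possible use (OUTPUT_FILE is a placeholder for the job's output file):
#   before=$(tail -n 1 "$OUTPUT_FILE")     # run just before qhold
#   resumed_earlier "$before" "$OUTPUT_FILE" && echo "count went back"
```

The comparison is only meaningful soon after the restart; once the job has run long enough, the count passes the recorded value again.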

© 2016 Adaptive Computing