If the prologue script executes successfully, it should exit with a zero status. Otherwise, it should exit with the appropriate error code from the table below. The pbs_mom reports the script's exit status to pbs_server, which in turn takes the associated action. The following table describes each exit code for the prologue scripts and the action taken.
Error | Description | Action |
---|---|---|
-4 | The script timed out | Job will be requeued |
-3 | The wait(2) call returned an error | Job will be requeued |
-2 | Input file could not be opened | Job will be requeued |
-1 | Permission error (script is not owned by root, or is writable by others) | Job will be requeued |
0 | Successful completion | Job will run |
1 | Abort exit code | Job will be aborted |
>1 | Other | Job will be requeued |
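As a concrete illustration of the exit codes above, a minimal prologue might refuse to let a job run when some precondition fails. This is only a sketch: the free-space check, the `enough_tmp_space` helper, and the 100000 KB threshold are assumptions for illustration, not part of the pbs_mom interface.

```shell
#!/bin/sh
# Hypothetical prologue sketch: exit 1 (abort) if /tmp is low on free
# space, exit 0 (job will run) otherwise. The check and threshold are
# illustrative assumptions, not part of the pbs_mom interface.

# enough_tmp_space KB : succeed if /tmp has at least KB kilobytes free
enough_tmp_space() {
    avail_kb=$(df -kP /tmp | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$1" ]
}

if enough_tmp_space 100000; then
    status=0    # successful completion: job will run
else
    status=1    # abort exit code: job will be aborted
fi
echo "prologue would exit $status"
```

A real prologue would end with `exit $status` so that pbs_mom sees the code from the table above.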
Example G-1:

The following are example prologue and epilogue scripts that write the arguments passed to them to the job's standard output file:
prologue

Script:

```shell
#!/bin/sh
echo "Prologue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo ""
exit 0
```

stdout:

```
Prologue Args:
Job ID: 13724.node01
User ID: user1
Group ID: user1
```
epilogue

Script:

```shell
#!/bin/sh
echo "Epilogue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo "Job Name: $4"
echo "Session ID: $5"
echo "Resource List: $6"
echo "Resources Used: $7"
echo "Queue Name: $8"
echo "Account String: $9"
echo ""
exit 0
```

stdout:

```
Epilogue Args:
Job ID: 13724.node01
User ID: user1
Group ID: user1
Job Name: script.sh
Session ID: 28244
Resource List: neednodes=node01,nodes=1,walltime=00:01:00
Resources Used: cput=00:00:00,mem=0kb,vmem=0kb,walltime=00:00:07
Queue Name: batch
Account String:
```
Example G-2:
The Ohio Supercomputer Center contributed the following scripts:
prologue creates a unique temporary directory on each node assigned to a job before the job begins to run, and epilogue deletes that directory after the job completes.

Having a separate temporary directory on each node is probably not as good as a high-performance parallel file system.
prologue

```shell
#!/bin/sh
#
# Create TMPDIR on all the nodes
# Copyright 1999, 2000, 2001 Ohio Supercomputer Center
#
# prologue gets 3 arguments:
# 1 -- jobid
# 2 -- userid
# 3 -- grpid
#
jobid=$1
user=$2
group=$3
nodefile=/var/spool/pbs/aux/$jobid
if [ -r $nodefile ] ; then
    nodes=$(sort $nodefile | uniq)
else
    nodes=localhost
fi
tmp=/tmp/pbstmp.$jobid
for i in $nodes ; do
    ssh $i mkdir -m 700 $tmp \&\& chown $user.$group $tmp
done
exit 0
```
epilogue

```shell
#!/bin/sh
#
# Clear out TMPDIR
# Copyright 1999, 2000, 2001 Ohio Supercomputer Center
#
# epilogue gets 9 arguments:
# 1 -- jobid
# 2 -- userid
# 3 -- grpid
# 4 -- job name
# 5 -- sessionid
# 6 -- resource limits
# 7 -- resources used
# 8 -- queue
# 9 -- account
#
jobid=$1
nodefile=/var/spool/pbs/aux/$jobid
if [ -r $nodefile ] ; then
    nodes=$(sort $nodefile | uniq)
else
    nodes=localhost
fi
tmp=/tmp/pbstmp.$jobid
for i in $nodes ; do
    ssh $i rm -rf $tmp
done
exit 0
```
If written improperly, prologue, prologue.user, and prologue.parallel scripts can have dramatic effects on job scheduling.
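One common failure mode behind that warning is a prologue that blocks on a slow or unreachable node until the script times out (exit -4), repeatedly requeuing the job. A sketch of bounding per-node work is below; the `run_bounded` helper, the 10-second limit, and the use of GNU coreutils `timeout` are all assumptions for illustration, not part of the pbs_mom interface.

```shell
#!/bin/sh
# Sketch: bound each remote operation so the prologue returns promptly
# instead of hanging until pbs_mom's prologue timeout requeues the job.
# run_bounded, the 10-second limit, and GNU coreutils `timeout` are
# illustrative assumptions.

# run_bounded CMD... : run CMD, giving up after 10 seconds
run_bounded() {
    timeout 10 "$@"
}

status=0
for i in localhost; do    # stand-in for the job's node list
    # A real script would run, e.g.: run_bounded ssh $i mkdir -m 700 $tmp
    if ! run_bounded true; then
        status=2          # >1: job will be requeued
    fi
done
echo "prologue would exit $status"
```

Failing fast with a code greater than 1 lets pbs_server requeue the job on its own schedule rather than tying up the node until the script is killed.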
© 2012 Adaptive Computing