5.644 Prologue Error Processing

If the prologue script executes successfully, it should exit with a zero status. Otherwise, it should return the appropriate error code from the table below. pbs_mom reports the script's exit status to pbs_server, which then takes the associated action. The table lists each prologue exit code and the resulting action.

Error	Description	Action
-4	The script timed out	Job will be requeued
-3	The wait(2) call returned an error	Job will be requeued
-2	Input file could not be opened	Job will be requeued
-1	Permission error (script is not owned by root, or is writable by others)	Job will be requeued
0	Successful completion	Job will run
1	Abort exit code	Job will be aborted
>1	Other	Job will be requeued
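
As a minimal sketch of how a prologue might choose between these exit codes, the function below returns 1 (abort) when the job owner is unknown on the node and 2 (requeue) when a scratch directory is missing. The function name prologue_status, the /scratch path, and the policy itself are illustrative assumptions, not Torque defaults.

```shell
#!/bin/sh
# Sketch: choose a prologue exit status based on the exit code table.
# The policy and the scratch path are illustrative assumptions.

# prologue_status USER DIR
#   returns 0 if the job may run, 1 to abort it, 2 to requeue it
prologue_status() {
    user=$1
    dir=$2
    # Unknown user on this node: retrying will not help, so abort.
    if ! id -u "$user" >/dev/null 2>&1; then
        return 1
    fi
    # Scratch area missing (for example, not yet mounted): requeue.
    if [ ! -d "$dir" ]; then
        return 2
    fi
    return 0
}

# A real prologue receives the job owner's user name as $2 and
# would end with something like:
#   prologue_status "$2" /scratch
#   exit $?
```

Keeping the decision in a function that only returns a status makes it possible to try the policy by hand before installing the script on a compute node.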

Example 5-354:

The following example prologue and epilogue scripts write the arguments passed to them to the job's standard output file:

prologue
Script:
#!/bin/sh
echo "Prologue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo ""
exit 0

stdout:
Prologue Args:
Job ID: 13724.node01
User ID: user1
Group ID: user1

epilogue
Script:
#!/bin/sh
echo "Epilogue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo "Job Name: $4"
echo "Session ID: $5"
echo "Resource List: $6"
echo "Resources Used: $7"
echo "Queue Name: $8"
echo "Account String: $9"
echo ""
exit 0

stdout:
Epilogue Args:
Job ID: 13724.node01
User ID: user1
Group ID: user1
Job Name: script.sh
Session ID: 28244
Resource List: neednodes=node01,nodes=1,walltime=00:01:00
Resources Used: cput=00:00:00,mem=0kb,vmem=0kb,walltime=00:00:07
Queue Name: batch
Account String:
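
The resource list and resources used arguments arrive as comma-separated name=value pairs, as the sample output shows. An epilogue that needs a single value, such as the walltime, could split the string itself. The helper below is a sketch only: get_resource is a hypothetical name, and it assumes that values contain no commas or shell glob characters.

```shell
#!/bin/sh
# Sketch: extract one value from a comma-separated name=value list,
# such as the "resources used" string passed to the epilogue as $7.
# get_resource is a hypothetical helper, not part of Torque.

# get_resource LIST NAME
#   prints the value for NAME and returns 0, or returns 1 if NAME is absent
get_resource() {
    list=$1
    key=$2
    old_ifs=$IFS
    IFS=,
    for pair in $list; do
        case $pair in
            "$key"=*)
                IFS=$old_ifs
                # Strip everything up to and including the first "=".
                printf '%s\n' "${pair#*=}"
                return 0
                ;;
        esac
    done
    IFS=$old_ifs
    return 1
}
```

Inside an epilogue this might be called as get_resource "$7" walltime. Because each pair must match the name exactly up to the "=", asking for nodes does not match neednodes.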

Example 5-355:

The Ohio Supercomputer Center contributed the following scripts:

"prologue creates a unique temporary directory on each node assigned to a job before the job begins to run, and epilogue deletes that directory after the job completes.

Having a separate temporary directory on each node is probably not as good as having a good, high performance parallel filesystem.

prologue

#!/bin/sh
# Create TMPDIR on all the nodes
# prologue gets 3 arguments:
# 1 -- jobid
# 2 -- userid
# 3 -- grpid

jobid=$1
user=$2
group=$3

# The node file lists the nodes assigned to the job
nodefile=/var/spool/pbs/aux/$jobid
if [ -r "$nodefile" ] ; then
    nodes=$(sort "$nodefile" | uniq)
else
    nodes=localhost
fi

tmp=/tmp/pbstmp.$jobid
for i in $nodes ; do
    # \&\& is escaped so that chown also runs on the remote node
    ssh $i mkdir -m 700 $tmp \&\& chown $user:$group $tmp
done

exit 0

epilogue

#!/bin/sh
# Clear out TMPDIR
# epilogue gets 9 arguments:
# 1 -- jobid
# 2 -- userid
# 3 -- grpid
# 4 -- job name
# 5 -- sessionid
# 6 -- resource limits
# 7 -- resources used
# 8 -- queue
# 9 -- account