
G.5 Prologue error processing

If the prologue script executes successfully, it should exit with a zero status. Otherwise, the script should return the appropriate error code as defined in the table below. pbs_mom reports the script's exit status to pbs_server, which in turn takes the associated action. The following table describes each exit code for the prologue scripts and the action taken.

Error  Description                                      Action
-4     The script timed out                             Job will be requeued
-3     The wait(2) call returned an error               Job will be requeued
-2     Input file could not be opened                   Job will be requeued
-1     Permission error (script is not owned by root,   Job will be requeued
       or is writable by others)
 0     Successful completion                            Job will run
 1     Abort exit code                                  Job will be aborted
>1     Other                                            Job will be requeued
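
As a hedged illustration of how these exit codes map to actions, the following minimal prologue sketch aborts the job (exit 1) when a required filesystem is absent and requeues it (exit status greater than 1) on a transient check failure. The /scratch path and the node_healthcheck command are assumptions for this example only, not part of the TORQUE distribution:

#!/bin/sh
# Sketch only: /scratch and node_healthcheck are hypothetical;
# substitute site-specific checks as appropriate.

# Abort the job (exit 1) if the required filesystem is absent;
# requeueing elsewhere will not help a permanent failure.
if [ ! -d /scratch ] ; then
    echo "prologue: /scratch missing, aborting job" >&2
    exit 1
fi

# Requeue the job (any exit status > 1) on a transient failure
# that may clear up by the time the job is rescheduled.
if ! /usr/local/sbin/node_healthcheck ; then
    echo "prologue: health check failed, requeueing job" >&2
    exit 2
fi

exit 0

Returning 1 for permanent problems and a value greater than 1 for transient ones keeps unrecoverable jobs from cycling through the queue indefinitely.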

Example G-1:

The following example prologue and epilogue scripts write the arguments passed to them to the job's standard output file:

prologue script:

#!/bin/sh
echo "Prologue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo ""

exit 0

stdout:

Prologue Args:
Job ID: 13724.node01
User ID: user1
Group ID: user1
epilogue script:

#!/bin/sh
echo "Epilogue Args:"
echo "Job ID: $1"
echo "User ID: $2"
echo "Group ID: $3"
echo "Job Name: $4"
echo "Session ID: $5"
echo "Resource List: $6"
echo "Resources Used: $7"
echo "Queue Name: $8"
echo "Account String: $9"
echo ""

exit 0

stdout:

Epilogue Args:
Job ID: 13724.node01
User ID: user1
Group ID: user1
Job Name: script.sh
Session ID: 28244
Resource List: neednodes=node01,nodes=1,walltime=00:01:00
Resources Used: cput=00:00:00,mem=0kb,vmem=0kb,walltime=00:00:07
Queue Name: batch
Account String:

Example G-2:  

The Ohio Supercomputer Center contributed the following scripts:

prologue creates a unique temporary directory on each node assigned to a job before the job begins to run, and epilogue deletes that directory after the job completes.

Having a separate temporary directory on each node is probably not as good as having a good, high-performance parallel filesystem.

prologue

#!/bin/sh
# Create TMPDIR on all the nodes
# Copyright 1999, 2000, 2001 Ohio Supercomputer Center
# prologue gets 3 arguments:
# 1 -- jobid
# 2 -- userid
# 3 -- grpid
#
jobid=$1
user=$2
group=$3
nodefile=/var/spool/pbs/aux/$jobid
if [ -r $nodefile ] ; then
    nodes=$(sort $nodefile | uniq)
else
    nodes=localhost
fi
tmp=/tmp/pbstmp.$jobid
for i in $nodes ; do
    # The escaped && is passed through to the remote shell, so
    # chown runs on node $i rather than on the local host.
    ssh $i mkdir -m 700 $tmp \&\& chown $user.$group $tmp
done
exit 0

epilogue

#!/bin/sh
# Clear out TMPDIR
# Copyright 1999, 2000, 2001 Ohio Supercomputer Center
# epilogue gets 9 arguments:
# 1 -- jobid
# 2 -- userid
# 3 -- grpid
# 4 -- job name
# 5 -- sessionid
# 6 -- resource limits
# 7 -- resources used
# 8 -- queue
# 9 -- account
#
jobid=$1
nodefile=/var/spool/pbs/aux/$jobid
if [ -r $nodefile ] ; then
    nodes=$(sort $nodefile | uniq)
else
    nodes=localhost
fi
tmp=/tmp/pbstmp.$jobid
for i in $nodes ; do
    ssh $i rm -rf $tmp
done
exit 0

Improperly written prologue, prologue.user, and prologue.parallel scripts can have dramatic effects on job scheduling.
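
For example, a prologue that blocks on a slow external service delays every job start on the node. As a minimal sketch, assuming the GNU coreutils timeout(1) utility is available and using a hypothetical check_license command, a slow check can be bounded so the script returns well before pbs_mom's own timeout (exit code -4 above) requeues the job:

#!/bin/sh
# Sketch only: check_license is a hypothetical site command.
# Bound the check at 10 seconds so the prologue finishes quickly
# instead of hanging until pbs_mom times the script out (-4).
if ! timeout 10 /usr/local/sbin/check_license ; then
    echo "prologue: license check failed or timed out" >&2
    exit 2    # exit status >1 requeues the job
fi
exit 0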
