Maui Scheduler

16.3 Workload Traces

Workload traces fully describe all scheduling-relevant aspects of batch jobs, including resources requested and utilized, the time of all major scheduling events (e.g., submission time, start time), the job credentials used, and the job execution environment. Each job trace is a single line consisting of 44 whitespace-delimited fields, as shown in the table below.
Note: Maui 3.2.6 and higher can be configured to provide this information in XML format conforming to the SSS 1.0 job description specification.

16.3.1 Workload Trace Format

Field Name | Field Index | Data Format | Default Value | Details
JobID | 1 | <STRING> | [NO DEFAULT] | Name of job, must be unique
Nodes Requested | 2 | <INTEGER> | 0 | Number of nodes requested (0 = no node request count specified)
Tasks Requested | 3 | <INTEGER> | 1 | Number of tasks requested
User Name | 4 | <STRING> | [NO DEFAULT] | Name of user submitting job
Group Name | 5 | <STRING> | [NO DEFAULT] | Primary group of user submitting job
Wallclock Limit | 6 | <INTEGER> | 1 | Maximum allowed job duration in seconds
Job Completion State | 7 | <STRING> | Completed | One of Completed, Removed, NotRun
Required Class | 8 | <STRING> | [DEFAULT:1] | Class/queue required by job, specified as a square bracket list of <QUEUE>[:<QUEUE INSTANCE>] requirements (e.g., [batch:1])
Submission Time | 9 | <INTEGER> | 0 | Epoch time when job was submitted
Dispatch Time | 10 | <INTEGER> | 0 | Epoch time when scheduler requested job begin executing
Start Time | 11 | <INTEGER> | 0 | Epoch time when job began executing (NOTE: usually identical to 'Dispatch Time')
Completion Time | 12 | <INTEGER> | 0 | Epoch time when job completed execution
Required Network Adapter | 13 | <STRING> | [NONE] | Name of required network adapter if specified
Required Node Architecture | 14 | <STRING> | [NONE] | Required node architecture if specified
Required Node Operating System | 15 | <STRING> | [NONE] | Required node operating system if specified
Required Node Memory Comparison | 16 | one of >, >=, =, <=, < | >= | Comparison for determining compliance with required node memory
Required Node Memory | 17 | <INTEGER> | 0 | Amount of required configured RAM (in MB) on each node
Required Node Disk Comparison | 18 | one of >, >=, =, <=, < | >= | Comparison for determining compliance with required node disk
Required Node Disk | 19 | <INTEGER> | 0 | Amount of required configured local disk (in MB) on each node
Required Node Features | 20 | <STRING> | [NONE] | Square bracket enclosed list of node features required by job if specified (e.g., '[fast][ethernet]')
System Queue Time | 21 | <INTEGER> | 0 | Epoch time when job met all fairness policies
Tasks Allocated | 22 | <INTEGER> | <TASKS REQUESTED> | Number of tasks actually allocated to job (NOTE: in most cases, this field is identical to field #3, Tasks Requested)
Required Tasks Per Node | 23 | <INTEGER> | -1 | Number of tasks per node required by job, or '-1' if no requirement specified
QOS | 24 | <STRING>[:<STRING>] | [NONE] | QOS requested/delivered using the format <QOS_REQUESTED>[:<QOS_DELIVERED>] (e.g., 'hipriority:bottomfeeder')
JobFlags | 25 | <STRING>[:<STRING>]... | [NONE] | Square bracket delimited list of job attributes (e.g., [BACKFILL][BENCHMARK][PREEMPTEE])
Account Name | 26 | <STRING> | [NONE] | Name of account associated with job if specified
Executable | 27 | <STRING> | [NONE] | Name of job executable if specified
Comment | 28 | <STRING> | [NONE] | Resource manager specific list of job attributes if specified. See the Resource Manager Extension Overview for more info.
Bypass Count | 29 | <INTEGER> | -1 | Number of times job was bypassed by lower priority jobs via backfill, or '-1' if not specified
Proc-Seconds Utilized | 30 | <DOUBLE> | 0 | Number of processor seconds actually utilized by job
Partition Name | 31 | <STRING> | [DEFAULT] | Name of partition in which job ran
Dedicated Processors per Task | 32 | <INTEGER> | 1 | Number of processors required per task
Dedicated Memory per Task | 33 | <INTEGER> | 0 | Amount of RAM (in MB) required per task
Dedicated Disk per Task | 34 | <INTEGER> | 0 | Amount of local disk (in MB) required per task
Dedicated Swap per Task | 35 | <INTEGER> | 0 | Amount of virtual memory (in MB) required per task
Start Date | 36 | <INTEGER> | 0 | Epoch time indicating earliest time job can start
End Date | 37 | <INTEGER> | 0 | Epoch time indicating latest time by which job must complete
Allocated Host List | 38 | <STRING>[:<STRING>]... | [NONE] | Colon delimited list of hosts allocated to job (e.g., node001:node004). NOTE: In Maui 3.0, this field only lists the job's master host.
Resource Manager Name | 39 | <STRING> | [NONE] | Name of resource manager if specified
Required Host Mask | 40 | <STRING>[<STRING>]... | [NONE] | List of hosts required by job (if taskcount > #hosts, scheduler must use these nodes in addition to others; if taskcount < #hosts, scheduler must select needed hosts from this list)
Reservation | 41 | <STRING> | [NONE] | Name of reservation required by job if specified
Set Description | 42 | <STRING>:<STRING>[:<STRING>] | [NONE] | Set constraints required by node in the form <SetConstraint>:<SetType>[:<SetList>], where SetConstraint is one of ONEOF, FIRSTOF, or ANYOF, SetType is one of PROCSPEED, FEATURE, or NETWORK, and SetList is an optional colon delimited list of allowed set attributes (e.g., 'ONEOF:PROCSPEED:350:450:500')
Application Simulator Data | 43 | <STRING>[:<STRING>] | [NONE] | Name of application simulator module and associated configuration data (e.g., 'HSM:IN=infile.txt:140000;OUT=outfile.txt:500000')
Note: If no applicable value is specified, the exact string '[NONE]' should be entered.
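
Several fields in the table pack more than one value into a single token (square bracket lists, colon delimited lists, and the requested/delivered QOS pair). The following is a minimal sketch of how such tokens might be unpacked in Python. The function names are hypothetical helpers introduced for illustration and are not part of Maui.

```python
import re

NONE = "[NONE]"  # exact placeholder string used when no value is specified


def bracket_list(value):
    """Split a square bracket list such as '[fast][ethernet]' into its items."""
    if value == NONE:
        return []
    return re.findall(r"\[([^\]]*)\]", value)


def parse_qos(value):
    """Split '<QOS_REQUESTED>[:<QOS_DELIVERED>]' into (requested, delivered)."""
    if value == NONE:
        return (None, None)
    requested, _, delivered = value.partition(":")
    return (requested, delivered or None)


def parse_host_list(value):
    """Split a colon delimited host list such as 'node001:node004'."""
    return [] if value == NONE else value.split(":")


print(bracket_list("[fast][ethernet]"))
print(parse_qos("hipriority:bottomfeeder"))
print(parse_host_list("node001:node004"))
```

Checking for '[NONE]' before applying the bracket pattern matters, because the placeholder itself is written in square brackets and would otherwise be read as a one-item list.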

Sample Workload Trace:

'SP02.2343.0 20 20 570 519 86400 Removed [batch:1] 887343658 889585185 889585185 889585411 ethernet R6000 AIX43 >= 256 >= 0 [NONE] 889584538 20 0 0 2 0 test.cmd 1001 6 678.08 0 1 0 0 0 0 0 [NONE] 0 [NONE] [NONE] [NONE] [NONE] [NONE]'
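
As an illustration of the format, the sketch below splits the sample trace on whitespace and converts a handful of fields by their 1-based Field Index. It is only an example under stated assumptions: the dictionary keys are hypothetical Python names chosen here, only a subset of fields is decoded, and the expected count of 44 tokens is taken from the description at the top of this section.

```python
SAMPLE = (
    "SP02.2343.0 20 20 570 519 86400 Removed [batch:1] 887343658 "
    "889585185 889585185 889585411 ethernet R6000 AIX43 >= 256 >= 0 "
    "[NONE] 889584538 20 0 0 2 0 test.cmd 1001 6 678.08 0 1 0 0 0 0 0 "
    "[NONE] 0 [NONE] [NONE] [NONE] [NONE] [NONE]"
)

# Subset of fields, keyed by the 1-based Field Index from the table.
# The names on the right are illustrative, not defined by Maui.
FIELDS = {
    1: ("job_id", str),
    2: ("nodes_requested", int),
    3: ("tasks_requested", int),
    4: ("user_name", str),
    5: ("group_name", str),
    6: ("wallclock_limit", int),
    7: ("completion_state", str),
    9: ("submission_time", int),
    10: ("dispatch_time", int),
    11: ("start_time", int),
    12: ("completion_time", int),
}


def parse_trace(line, expected_fields=44):
    """Split one trace line on whitespace and decode the fields in FIELDS."""
    tokens = line.split()
    if len(tokens) != expected_fields:
        raise ValueError(
            f"expected {expected_fields} fields, found {len(tokens)}"
        )
    return {
        name: convert(tokens[index - 1])
        for index, (name, convert) in FIELDS.items()
    }


job = parse_trace(SAMPLE)
print(job["job_id"], job["completion_state"], job["wallclock_limit"])
```

User and group names are kept as strings even when they look numeric, as they do in the sample, because the table declares them as <STRING>.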

16.3.2 Creating New Workload Traces

Because workload traces and workload statistics use the same format, some trace fields provide information that is valuable for statistical analysis of historical system performance but is not needed to run a simulation.

In particular, the time based fields are easy to overspecify. Which time based fields are important depends on the setting of the JOBSUBMISSIONPOLICY parameter.

JOBSUBMISSIONPOLICY Value | Critical Time Based Fields
NORMAL | WallClock Limit, Submission Time, Completion Time
 | WallClock Limit, Completion Time
Note: Dispatch Time should always be identical to Start Time.
Note: In all cases, the difference 'Completion Time - Start Time' is used to determine actual job run time.
Note: System Queue Time and Proc-Seconds Utilized are only used for statistics gathering purposes and will not alter the behavior of the simulation.
Note: In all cases, relative time values are important, i.e., Start Time must be greater than or equal to Submission Time and less than Completion Time.
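
The constraints in the notes above can be checked mechanically before a hand-built trace is fed to a simulation. The following Python sketch shows one way to do so. The function names and message strings are hypothetical and introduced only for this example.

```python
def check_times(submission, dispatch, start, completion):
    """Return a list of problems in a job's time fields (empty if consistent)."""
    problems = []
    # Dispatch Time should always be identical to Start Time.
    if dispatch != start:
        problems.append("Dispatch Time differs from Start Time")
    # Start Time must be greater than or equal to Submission Time...
    if start < submission:
        problems.append("Start Time precedes Submission Time")
    # ...and less than Completion Time.
    if start >= completion:
        problems.append("Start Time is not less than Completion Time")
    return problems


def run_time(start, completion):
    """Actual job run time in seconds: Completion Time - Start Time."""
    return completion - start


# Time fields 9 through 12 of the sample trace in 16.3.1.
print(check_times(887343658, 889585185, 889585185, 889585411))
print(run_time(889585185, 889585411))
```

Returning a list of problems rather than raising on the first one makes it easier to report every inconsistency in a trace file in a single pass.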