16.3.3 Workload Accounting Records

Moab workload accounting records fully describe all scheduling-relevant aspects of batch jobs, including resources requested and used, the time of all major scheduling events (such as submission time and start time), the job credentials used, and the job execution environment. Each job trace is a single line of whitespace-delimited fields, as shown in the following table.

Note: Moab can be configured to provide this information in flat text tabular form or in XML format conforming to the SSS 1.0 job description specification.

16.3.3.1 Workload Event Record Format (v 5.0.0)

All job events (JOBSUBMIT, JOBSTART, JOBEND, and so forth) provide job data in a standard format as described in the following table:

Field Name Field Index Data Format Default Value Details
1 HH:MM:SS - Specifies time event occurred.
2 <epochtime> - Specifies time event occurred.
3 job - Specifies record object type.
4 <STRING> - Unique object identifier.
5 one of jobcancel, jobcheckpoint, jobend, jobfailure, jobhold, jobmigrate, jobpreempt, jobreject, jobresume, jobstart or jobsubmit - Specifies record event type.
6 <INTEGER> 0 Number of nodes requested (0 = no node request count specified).
7 <INTEGER> 1 Number of tasks requested.
8 <STRING> - Name of user submitting job.
9 <STRING> - Primary group of user submitting job.
10 <INTEGER> 1 Maximum allowed job duration (in seconds).
11 <STRING> - Job state at time of event.
12 <STRING> [DEFAULT:1] Class/queue required by job specified as square bracket list of <QUEUE>[:<QUEUEINSTANCE>] requirements. (For example: [batch:1])
13 <INTEGER> 0 Epoch time when job was submitted.
14 <INTEGER> 0 Epoch time when scheduler requested job begin executing.
15 <INTEGER> 0 Epoch time when job began executing. This is usually identical to Dispatch Time.
16 <INTEGER> 0 Epoch time when job completed execution.
17 <STRING> - Name of required network adapter if specified.
18 <STRING> - Required node architecture if specified.
19 <STRING> - Required node operating system if specified.
20 one of >, >=, =, <=, < >= Comparison for determining compliance with required node memory.
21 <INTEGER> 0 Amount of required configured RAM (in MB) on each node.
22 one of >, >=, =, <=, < >= Comparison for determining compliance with required node disk.
23 <INTEGER> 0 Amount of required configured local disk (in MB) on each node.
24 <STRING> - Square bracket enclosed list of node features required by job if specified. (For example: [fast][ethernet])
25 <INTEGER> 0 Epoch time when job met all fairness policies.
26 <INTEGER> <TASKS REQUESTED> Number of tasks actually allocated to job.
Note: In most cases, this field is identical to field #7, Tasks Requested.
27 <INTEGER> -1 Number of Tasks Per Node required by job or '-1' if no requirement specified.
28 <STRING>[:<STRING>] - QoS requested/assigned using the format <QOS_REQUESTED>[:<QOS_DELIVERED>]. (For example: hipriority:bottomfeeder)
29 <STRING>[:<STRING>]... - Square bracket delimited list of job attributes. (For example: [BACKFILL][PREEMPTEE])
30 <STRING> - Name of account associated with job if specified.
31 <STRING> - Name of job executable if specified.
32 <STRING> - Resource manager specific list of job attributes if specified. See the Resource Manager Extension Overview for more information.
33 <INTEGER> -1 Number of times job was bypassed by lower priority jobs via backfill or '-1' if not specified.
34 <DOUBLE> 0 Number of processor seconds actually used by job.
35 <STRING> [DEFAULT] Name of partition in which job ran.
36 <INTEGER> 1 Number of processors required per task.
37 <INTEGER> 0 Amount of RAM (in MB) required per task.
38 <INTEGER> 0 Amount of local disk (in MB) required per task.
39 <INTEGER> 0 Amount of virtual memory (in MB) required per task.
40 <INTEGER> 0 Epoch time indicating earliest time job can start.
41 <INTEGER> 0 Epoch time indicating latest time by which job must complete.
42 <hostname>[,<hostname>]... - Comma delimited list of hosts allocated to job. (For example: node001,node004)
43 <STRING> - Name of resource manager if specified.
44 <hostname>[,<hostname>]... - List of hosts required by job. (If the job's taskcount is greater than the specified number of hosts, the scheduler must use these nodes in addition to others; if the job's taskcount is less than the specified number of hosts, the scheduler must select needed hosts from this list.)
45 <STRING> - Name of reservation required by job if specified.
46 <STRING>[:<STRING>] - Name of application simulator module and associated configuration data. (For example: HSM:IN=infile.txt:140000;OUT=outfile.txt:500000)
47 <STRING>:<STRING>[:<STRING>] - Set constraints required by node in the form <SetConstraint>:<SetType>[:<SetList>] where SetConstraint is one of ONEOF, FIRSTOF, or ANYOF, SetType is one of PROCSPEED, FEATURE, or NETWORK, and SetList is an optional colon delimited list of allowed set attributes. (For example: ONEOF:PROCSPEED:350:450:500)
48 <STRING> - Job messages including resource manager, scheduler, and administrator messages if specified.
49 <DOUBLE> 0.0 Cost of executing job incorporating resource consumption metric, resource quantity consumed, and credential, allocated resource, and delivered QoS charge rates.
50 <STRING> - List of job events impacting resource allocation (XML).
Note: History information is only reported in Moab 5.1.0 and higher.
51 Comma-delimited list of one or more <ATTR>=<VALUE> pairs, where <VALUE> is a double and <ATTR> is one of the following: network (in MB transferred), license (in license-seconds), storage (in MB-seconds stored), or gmetric:<TYPE>. - Cumulative resources used over life of job.
52 <STRING> - List of job estimate usage.
53 <INTEGER> - Job exit status/completion code.
54 <STRING> - Extended memory usage statistics (max, mem, avg, and so forth).
55 <STRING> - Extended CPU usage statistics (max, mem, avg, and so forth).
56 <STRING> -1 Generic metric averages.
57 <INTEGER> -1 The amount of time, in seconds, that the job was eligible for scheduling.
Note: If no applicable value is specified, the exact string - should be entered.
Note: Fields that contain a description string, such as Job Message, use a packed string format. The packed string format replaces white space characters such as spaces and carriage returns with a hex character representation. For example, a blank space is represented as \20. Because fields in the event record are space delimited, this preserves the correct order and spacing of fields in the record.
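
The packed string format described in the note above can be sketched in Python as follows. This is a minimal illustration, not Moab's own code: it assumes every packed character is a backslash followed by exactly two hexadecimal digits (as in \20), and the function names unpack_string and pack_string are hypothetical.

```python
import re

# Assumption: a packed character is a backslash followed by exactly two
# hexadecimal digits (for example \20 for a space, \0a for a newline).
_PACKED = re.compile(r"\\([0-9A-Fa-f]{2})")
_WHITESPACE = re.compile(r"[ \t\r\n\f\v]")


def unpack_string(packed: str) -> str:
    """Replace each \\XX hex escape with the character it stands for."""
    return _PACKED.sub(lambda m: chr(int(m.group(1), 16)), packed)


def pack_string(text: str) -> str:
    """Replace each white space character with a \\XX hex escape."""
    return _WHITESPACE.sub(lambda m: "\\%02x" % ord(m.group(0)), text)


if __name__ == "__main__":
    print(unpack_string(r"checkpoint\20record\20not\20found"))
```

Because a packed value contains no white space, it survives a plain whitespace split of the record and can be unpacked afterwards.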

Sample Workload Trace

13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 Removed [batch:1] 887343658 889585185 \
889585185 889585411 ethernet R6000 AIX53 >= 256 >= 0 - 889584538 20 0 0 2 0 test.cmd \
1001 6 678.08 0 1 0 0 0 0 0 - 0 - - - - - - - - 0.0 - - - 0 - -
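
As a rough sketch of how such a trace line can be read, the following Python splits a record on white space and picks out a few fields by the 1-based indexes given in the table above. The dictionary keys are illustrative names chosen for this example, not official Moab field names, and only the first 16 fields of the sample trace are reproduced.

```python
# Field positions are the 1-based indexes from the v5.0.0 table.
# The keys are hypothetical labels used only in this sketch.
FIELDS = {
    "object_id": 4,
    "event": 5,
    "user": 8,
    "group": 9,
    "wallclock_limit": 10,
    "submit_time": 13,
    "dispatch_time": 14,
    "start_time": 15,
    "completion_time": 16,
}
INT_FIELDS = {"wallclock_limit", "submit_time", "dispatch_time",
              "start_time", "completion_time"}


def parse_job_record(line: str) -> dict:
    """Return selected fields of one whitespace-delimited job record."""
    # Join lines that were wrapped with a trailing backslash.
    tokens = line.replace("\\\n", " ").split()
    record = {}
    for name, index in FIELDS.items():
        value = tokens[index - 1]
        if value == "-":          # "-" marks an unspecified value
            value = None
        elif name in INT_FIELDS:
            value = int(value)
        record[name] = value
    return record


# First 16 fields of the sample trace above.
SAMPLE = ("13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 Removed "
          "[batch:1] 887343658 889585185 889585185 889585411")

if __name__ == "__main__":
    rec = parse_job_record(SAMPLE)
    print(rec["user"], rec["completion_time"] - rec["start_time"])
```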

16.3.3.2 Workload Event Record Format (v 6.0.0.0)

All job events (JOBSUBMIT, JOBSTART, JOBEND, and so forth) provide job data in the native wiki format (ATTR=VALUE). This is to make events more readable and to allow format flexibility.

Examples

09:26:40 1288279600:1 sched    Moab         SCHEDSTART   -
09:26:40 1288279600:2 rm       pbs          RMUP         initialized
09:26:40 1288279600:3 sched    Moab         RMPOLLSTART  -
09:26:40 1288279600:4 job      58           JOBSUBMIT    58   REQUESTEDNC=1 REQUESTEDTC=3 UNAME=wightman 
GNAME=wightman WCLIMIT=60   STATE=Completed RCLASS=[batch:1] SUBMITTIME=1288279493 RMEMCMP=>= RDISKCMP=>= 
RFEATURES=[NONE]   SYSTEMQUEUETIME=1288279493 TASKS=1 FLAGS=RESTARTABLE PARTITION=pbs   DPROCS=1 
ENDDATE=2140000000 TASKMAP=proxy,GLOBAL SRM=pbs   MESSAGE="\STARTLabel\20\20\20CreateTime\20ExpireTime
\20\20\20\20Owner\20Prio\20Num\20Message\0a,\STARTcheckpoint\20record\20not\20found"   EXITCODE=0 SID=2357 
NODEALLOCATIONPOLICY=SHARED
09:26:40 1288279600:5 job      58           JOBEND       58   REQUESTEDNC=1 REQUESTEDTC=3 UNAME=wightman 
GNAME=wightman WCLIMIT=60   STATE=Completed RCLASS=[batch:1] SUBMITTIME=1288279493 RMEMCMP=>= RDISKCMP=>= 
RFEATURES=[NONE]   SYSTEMQUEUETIME=1288279493 TASKS=1 FLAGS=RESTARTABLE PARTITION=pbs   DPROCS=1 
ENDDATE=2140000000 TASKMAP=proxy,GLOBAL SRM=pbs EXITCODE=0   SID=2357 NODEALLOCATIONPOLICY=SHARED 
EFFECTIVEQUEUEDURATION=107
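
A record in this ATTR=VALUE layout can be read with a few lines of Python. The sketch below is illustrative only: it assumes each record has already been joined onto a single line, that the first five tokens are the time, a timestamp token, the object type, the object ID, and the event type (as the examples above suggest), and that any later token without an = sign can be skipped. The function name parse_event is hypothetical.

```python
def parse_event(line: str) -> dict:
    """Split one event record into its header tokens and ATTR=VALUE pairs."""
    tokens = line.split()
    time_str, stamp, obj_type, obj_id, event = tokens[:5]
    attrs = {}
    for token in tokens[5:]:
        # Split on the first "=" only, so RMEMCMP=>= yields the value ">=".
        key, sep, value = token.partition("=")
        if sep:  # bare tokens (such as "-" or a repeated job ID) are skipped
            attrs[key] = value
    return {
        "time": time_str,
        "stamp": stamp,
        "type": obj_type,
        "id": obj_id,
        "event": event,
        "attrs": attrs,
    }


if __name__ == "__main__":
    # Shortened copy of the JOBEND example above.
    line = ("09:26:40 1288279600:5 job 58 JOBEND 58 REQUESTEDNC=1 "
            "REQUESTEDTC=3 UNAME=wightman RMEMCMP=>= EXITCODE=0 "
            "EFFECTIVEQUEUEDURATION=107")
    event = parse_event(line)
    print(event["event"], event["attrs"]["UNAME"])
```

All values are kept as strings here; a caller would convert the attributes it needs.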

16.3.3.3 Creating New Workload Simulation Traces

Because workload event records and simulation workload traces use the same format, these event records can be used as a starting point for generating a new simulation trace. In the simplest case, an event record or collection of event records can be used directly as the value of the SIMWORKLOADTRACEFILE parameter, as in the following example:

# collect all job records for December
> cat /opt/moab/stats/events.*Dec*2006 | grep JOBEND > /opt/moab/DecJobs.txt
# edit moab.cfg to use the job records
> vi /opt/moab/etc/moab.cfg
  (add 'SIMWORKLOADTRACEFILE /opt/moab/DecJobs.txt')
  (set SCHEDCFG[] MODE and other simulation parameters as described in the Simulation Overview)

# start the simulation
> moab
Note: In the preceding example, all non-JOBEND events were filtered out. This step is not required, but only JOBEND events are used in a simulation; Moab ignores all other events.

Modifying Existing Job Event Records

When creating a new simulation workload, it is often valuable to start with workload traces representing a well-known or even local workload. These traces preserve distribution information about job submission times, durations, processor count, users, groups, projects, special resource requests, and numerous other factors that effectively represent an industry, user base, or organization.

When modifying records, a field or combination of fields can be altered, new jobs inserted, or certain jobs filtered out.
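
As one possible illustration of altering a field and filtering out jobs, the Python sketch below works on records in the whitespace-delimited v5.0.0 layout, where field 8 is the user name and field 10 is the maximum allowed job duration. The function name edit_trace, its parameters, and the user name amy are hypothetical.

```python
def edit_trace(lines, drop_users=(), wallclock_scale=1.0):
    """Yield edited copies of v5.0.0 job records.

    Jobs owned by a user in drop_users are filtered out, and the maximum
    allowed job duration (field 10) is multiplied by wallclock_scale.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 10:
            continue                      # too short to be a job record
        if fields[7] in drop_users:       # field 8: submitting user
            continue
        fields[9] = str(int(int(fields[9]) * wallclock_scale))  # field 10
        yield " ".join(fields)


if __name__ == "__main__":
    records = [
        "13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 Removed",
        "13:22:10 110244420 job 1414 JOBEND 4 4 amy staff 3600 Completed",
    ]
    for record in edit_trace(records, drop_users={"amy"}, wallclock_scale=0.5):
        print(record)
```

Rejoining the fields with single spaces is safe because the record format is whitespace delimited.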

Note: Because job event records are used for multiple purposes, some of the fields are valuable for statistics or auditing purposes but are ignored in simulations. For the most part, fields representing resource utilization information are ignored while fields representing resource requests are not.

Modifying Time Distribution Factors of a Workload Trace

In some cases, simulations focus on determining the effects of changing the quantities or types of jobs, or of changing policies or job ownership, on system performance and resource utilization. At other times, simulations focus on response-time metrics as the job submission and job duration aspects of the workload are modified. Which time-based fields are important to modify depends on the purpose of the simulation and the setting of the JOBSUBMISSIONPOLICY parameter.

JOBSUBMISSIONPOLICY Value Critical Time Based Fields
WallClock Limit, Submission Time, Start Time, Completion Time
WallClock Limit, Start Time, Completion Time

Note 1: Dispatch Time should always be identical to Start Time
Note 2: In all cases, the difference of 'Completion Time - Start Time' is used to determine actual job run time.
Note 3: System Queue Time and Proc-Seconds Utilized are only used for statistics gathering purposes and will not alter the behavior of the simulation.
Note 4: In all cases, relative time values are important, i.e., Start Time must be greater than or equal to Submission Time and less than Completion Time.
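
The relationships in Notes 1, 2, and 4 can be checked mechanically before a trace is used. The following Python is a minimal sketch that assumes the v5.0.0 layout, in which fields 13 through 16 hold the submission, dispatch, start, and completion times; the function names are hypothetical.

```python
def _times(line: str):
    """Return (submit, dispatch, start, completion) from a v5.0.0 record."""
    fields = line.split()
    # Fields 13 to 16 (1-based) are list positions 12 to 15.
    return tuple(int(fields[i]) for i in (12, 13, 14, 15))


def check_times(line: str) -> list:
    """Return a list of problems found in the time fields of one record."""
    submit, dispatch, start, completion = _times(line)
    problems = []
    if dispatch != start:
        problems.append("dispatch time differs from start time")
    if start < submit:
        problems.append("start time is earlier than submission time")
    if start >= completion:
        problems.append("start time is not earlier than completion time")
    return problems


def run_time(line: str) -> int:
    """Actual job run time in seconds: completion time minus start time."""
    _, _, start, completion = _times(line)
    return completion - start


if __name__ == "__main__":
    sample = ("13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 "
              "Removed [batch:1] 887343658 889585185 889585185 889585411")
    print(check_times(sample), run_time(sample))
```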

Creating Workload Traces From Scratch

Nothing prevents a completely new workload trace from being created from scratch. To do this, create a file with fields matching the format described in the Workload Event Record Format section.


16.3.3.4 Reservation Records/Traces

All reservation events provide reservation data in a standard format as described in the following table:

Field Name Field Index Data Format Default Value Details
0 [HH:MM:SS] - Specifies time event occurred.
1 <epochtime> - Specifies time event occurred.
2 rsv - Specifies record object type.
3 <STRING> - Unique object identifier.
4 one of rsvcreate, rsvstart, rsvmodify, rsvfail or rsvend - Specifies record event type.
5 <EPOCHTIME> - Specifies epoch time of reservation start date.
6 <EPOCHTIME> - Specifies epoch time of reservation start date.
7 <EPOCHTIME> - Specifies epoch time of reservation end date.
8 <INTEGER> - Specifies number of tasks allocated to reservation at event time.
9 <INTEGER> - Specifies number of nodes allocated to reservation at event time.
10 <INTEGER> - Specifies proc-seconds reserved resources were dedicated to one or more jobs at event time.
11 <INTEGER> - Specifies proc-seconds resources were reserved at event time.
12 <comma-delimited list of hostnames> - Specifies list of hosts reserved at event time.
13 <STRING> - Specifies reservation ownership credentials.
14 <STRING> - Specifies reservation access control list.
15 <STRING> - Specifies associated node category assigned to reservation.
16 <STRING> - Specifies general human readable event message.
17 <STRING> - Displays the command line arguments used to create the reservation (only shows on the rsvcreate event).
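
A reservation record can be read the same way as a job record, using the 0-based indexes from the table above. The Python below is a sketch under stated assumptions: the sample line is invented for illustration (it is not real Moab output), unspecified fields are assumed to be written as -, and the derived utilization value (field 10 divided by field 11) is a convenience of this example rather than a field of the record.

```python
def parse_rsv_record(line: str) -> dict:
    """Return selected fields of one whitespace-delimited reservation record."""
    f = line.split()
    dedicated = int(f[10])   # proc-seconds dedicated to jobs
    reserved = int(f[11])    # proc-seconds reserved
    return {
        "id": f[3],
        "event": f[4],
        "start": int(f[6]),
        "end": int(f[7]),
        "tasks": int(f[8]),
        "nodes": int(f[9]),
        "hosts": [] if f[12] == "-" else f[12].split(","),
        # Derived value, not part of the record itself.
        "utilization": dedicated / reserved if reserved else 0.0,
    }


# Hypothetical record written for this example only.
SAMPLE_RSV = ("10:00:00 1288279600 rsv system.1 rsvend 1288270000 1288270000 "
              "1288279600 4 2 19200 38400 node001,node002 - - - - -")

if __name__ == "__main__":
    rsv = parse_rsv_record(SAMPLE_RSV)
    print(rsv["id"], rsv["hosts"], rsv["utilization"])
```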

16.3.3.5 Recording Job Events

Job events occur when a job undergoes a definitive change in state. Job events include submission, starting, cancellation, migration, and completion. Some site administrators do not want to use an external accounting system and use these logged events to determine their clusters' accounting statistics. Moab can be configured to record these events in the appropriate event file found in the Moab stats/ directory. To enable job event recording for both local and remotely staged jobs, use the RECORDEVENTLIST parameter. For example:

RECORDEVENTLIST JOBCANCEL,JOBCOMPLETE,JOBSTART,JOBSUBMIT
...
				

This configuration records an event each time a local or remote job is canceled, runs to completion, starts, or is submitted. The Event Logs section details the format of these records.
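
For sites that derive statistics directly from the recorded events, a first step is often a simple tally of job events by type. The Python sketch below assumes each record starts with the time, a timestamp token, the object type, the object ID, and the event type, as in the event record formats described earlier; the function name count_job_events is hypothetical.

```python
from collections import Counter


def count_job_events(lines) -> Counter:
    """Count job events by type, ignoring records for other object types."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        # The third token is the object type, the fifth the event type.
        if len(tokens) >= 5 and tokens[2] == "job":
            counts[tokens[4].upper()] += 1
    return counts


if __name__ == "__main__":
    events = [
        "09:26:40 1288279600:1 sched Moab SCHEDSTART -",
        "09:26:40 1288279600:4 job 58 JOBSUBMIT 58 REQUESTEDNC=1",
        "09:26:40 1288279600:5 job 58 JOBEND 58 REQUESTEDNC=1",
    ]
    for event, total in sorted(count_job_events(events).items()):
        print(event, total)
```

To process a real event file, the same function can be given an open file object, since iterating over a file yields its lines.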

Copyright © 2012 Adaptive Computing Enterprises, Inc.®