Moab workload accounting records fully describe all scheduling-relevant aspects of batch jobs, including resources requested and used, the time of all major scheduling events (such as submission time and start time), the job credentials used, and the job execution environment. Each job trace is a single line of whitespace-delimited fields, as shown in the following table.
Moab can be configured to provide this information in flat text tabular form or in XML format conforming to the SSS 1.0 job description specification.
All job events (JOBSUBMIT, JOBSTART, JOBEND, and so forth) provide job data in a standard format as described in the following table:
Field Name | Field Index | Data Format | Default Value | Details | ||
---|---|---|---|---|---|---|
Event Time (Human Readable) | 1 | HH:MM:SS | - | Specifies time event occurred. | ||
Event Time (Epoch) | 2 | <epochtime> | - | Specifies time event occurred. | ||
Object Type | 3 | job | - | Specifies record object type. | ||
Object ID | 4 | <STRING> | - | Unique object identifier. | ||
Object Event | 5 | one of jobcancel, jobcheckpoint, jobend, jobfailure, jobhold, jobmigrate, jobpreempt, jobreject, jobresume, jobstart or jobsubmit | - | Specifies record event type. | ||
Nodes Requested | 6 | <INTEGER> | 0 | Number of nodes requested (0 = no node request count specified). | ||
Tasks Requested | 7 | <INTEGER> | 1 | Number of tasks requested. | ||
User Name | 8 | <STRING> | - | Name of user submitting job. | ||
Group Name | 9 | <STRING> | - | Primary group of user submitting job. | ||
Wallclock Limit | 10 | <INTEGER> | 1 | Maximum allowed job duration (in seconds). | ||
Job Event State | 11 | <STRING> | - | Job state at time of event. | ||
Required Class | 12 | <STRING> | [DEFAULT:1] | Class/queue required by job specified as square bracket list of <QUEUE>[:<QUEUEINSTANCE>] requirements. (For example: [batch:1]). | ||
Submission Time | 13 | <INTEGER> | 0 | Epoch time when job was submitted. | ||
Dispatch Time | 14 | <INTEGER> | 0 | Epoch time when scheduler requested job begin executing. | ||
Start Time | 15 | <INTEGER> | 0 | Epoch time when job began executing. This is usually identical to Dispatch Time. | ||
Completion Time | 16 | <INTEGER> | 0 | Epoch time when job completed execution. | ||
Required Network Adapter | 17 | <STRING> | - | Name of required network adapter if specified. | ||
Required Node Architecture | 18 | <STRING> | - | Required node architecture if specified. | ||
Required Node Operating System | 19 | <STRING> | - | Required node operating system if specified. | ||
Required Node Memory Comparison | 20 | one of >, >=, =, <=, < | >= | Comparison for determining compliance with required node memory. | ||
Required Node Memory | 21 | <INTEGER> | 0 | Amount of required configured RAM (in MB) on each node. | ||
Required Node Disk Comparison | 22 | one of >, >=, =, <=, < | >= | Comparison for determining compliance with required node disk. | ||
Required Node Disk | 23 | <INTEGER> | 0 | Amount of required configured local disk (in MB) on each node. | ||
Required Node Attributes/Features | 24 | <STRING> | - | Square bracket enclosed list of node features required by job if specified. (For example: [fast][ethernet]) | ||
System Queue Time | 25 | <INTEGER> | 0 | Epoch time when job met all fairness policies. | ||
Tasks Allocated | 26 | <INTEGER> | <TASKS REQUESTED> | Number of tasks actually allocated to job. | ||
Required Tasks Per Node | 27 | <INTEGER> | -1 | Number of Tasks Per Node required by job or '-1' if no requirement specified. | ||
QOS | 28 | <STRING>[:<STRING>] | - | QoS requested/assigned using the format <QOS_REQUESTED>[:<QOS_DELIVERED>]. (For example: hipriority:bottomfeeder) | ||
JobFlags | 29 | <STRING>[:<STRING>]... | - | Square bracket delimited list of job attributes. (For example: [BACKFILL][BENCHMARK][PREEMPTEE]) | ||
Account Name | 30 | <STRING> | - | Name of account associated with job if specified. | ||
Executable | 31 | <STRING> | - | Name of job executable if specified. | ||
Resource Manager Extension String | 32 | <STRING> | - | Resource manager specific list of job attributes if specified. See the Resource Manager Extension Overview for more information. | ||
Bypass Count | 33 | <INTEGER> | -1 | Number of times job was bypassed by lower priority jobs via backfill or '-1' if not specified. | ||
ProcSeconds Utilized | 34 | <DOUBLE> | 0 | Number of processor seconds actually used by job. | ||
Partition Name | 35 | <STRING> | [DEFAULT] | Name of partition in which job ran. | ||
Dedicated Processors per Task | 36 | <INTEGER> | 1 | Number of processors required per task. | ||
Dedicated Memory per Task | 37 | <INTEGER> | 0 | Amount of RAM (in MB) required per task. | ||
Dedicated Disk per Task | 38 | <INTEGER> | 0 | Amount of local disk (in MB) required per task. | ||
Dedicated Swap per Task | 39 | <INTEGER> | 0 | Amount of virtual memory (in MB) required per task. | ||
Start Date | 40 | <INTEGER> | 0 | Epoch time indicating earliest time job can start. | ||
End Date | 41 | <INTEGER> | 0 | Epoch time indicating latest time by which job must complete. | ||
Allocated Host List | 42 | <hostname>[,<hostname>]... | - | Comma delimited list of hosts allocated to job. (For example: node001,node004) | ||
Resource Manager Name | 43 | <STRING> | - | Name of resource manager if specified. | ||
Required Host List | 44 | <hostname>[,<hostname>]... | - | List of hosts required by job. (If the job's taskcount is greater than the specified number of hosts, the scheduler must use these nodes in addition to others; if the job's taskcount is less than the specified number of hosts, the scheduler must select needed hosts from this list.) | ||
Reservation | 45 | <STRING> | - | Name of reservation required by job if specified. | ||
Application Simulator Data | 46 | <STRING>[:<STRING>] | - | Name of application simulator module and associated configuration data. (For example: HSM:IN=infile.txt:140000;OUT=outfile.txt:500000) | ||
Set Description | 47 | <STRING>:<STRING>[:<STRING>] | - | Set constraints required by node in the form <SetConstraint>:<SetType>[:<SetList>] where SetConstraint is one of ONEOF, FIRSTOF, or ANYOF, SetType is one of PROCSPEED, FEATURE, or NETWORK, and SetList is an optional colon delimited list of allowed set attributes. (For example: ONEOF:PROCSPEED:350:450:500) | ||
Job Message | 48 | <STRING> | - | Job messages including resource manager, scheduler, and administrator messages if specified. | ||
Job Cost | 49 | <DOUBLE> | 0.0 | Cost of executing job incorporating resource consumption metric, resource quantity consumed, and credential, allocated resource, and delivered QoS charge rates. | ||
History | 50 | <STRING> | - | List of job events impacting resource allocation (XML). | ||
Utilization | 51 | Comma delimited list of one or more of the following: <ATTR>=<VALUE> pairs where <VALUE> is a double and <ATTR> is one of the following: network (in MB transferred), license (in license-seconds), storage (in MB-seconds stored), or gmetric:<TYPE>. | - | Cumulative resources used over life of job. | ||
Estimate Data | 52 | <STRING> | - | List of job estimate usage. | ||
Completion Code | 53 | <INTEGER> | - | Job exit status/completion code. | ||
Extended Memory Load Information | 54 | <STRING> | - | Extended memory usage statistics (max, mem, avg, and so forth). | ||
Extended CPU Load Information | 55 | <STRING> | - | Extended CPU usage statistics (max, mem, avg, and so forth). | ||
Generic Metric Averages | 56 | <STRING> | -1 | Generic metric averages. | ||
Effective Queue Duration | 57 | <INTEGER> | -1 | The amount of time, in seconds, that the job was eligible for scheduling. |
If no applicable value is specified, the exact string - should be entered.
Fields that contain a description string, such as Job Message, use a packed string format. The packed string format replaces white space characters such as spaces and carriage returns with a hex character representation. For example, a blank space is represented as \20. Since fields in the event record are space delimited, this preserves the correct order and spacing of fields in the record.
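The packed string format described above can be sketched in Python as follows. This is a minimal illustration, not Moab's own implementation: the function names `unpack_field` and `pack_field` are hypothetical, and it assumes every escape is a backslash followed by exactly two hex digits, as in the `\20` example.

```python
import re

# Assumption: an escape is a backslash followed by exactly two hex digits.
_HEX_ESCAPE = re.compile(r"\\([0-9A-Fa-f]{2})")


def unpack_field(packed: str) -> str:
    """Replace each two-digit hex escape with the character it encodes."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), packed)


def pack_field(text: str) -> str:
    """Replace each ASCII whitespace character with a two-digit hex escape."""
    return re.sub(r"[ \t\r\n\f\v]", lambda m: "\\%02x" % ord(m.group(0)), text)


print(unpack_field(r"checkpoint\20record\20not\20found"))
print(pack_field("record not found"))
```

A backslash that is not followed by two hex digits is left untouched by `unpack_field`, so text that happens to contain a literal backslash followed by hex digits would be decoded incorrectly by this sketch.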
Sample Workload Trace
13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 Removed [batch:1] 887343658 889585185 \
889585185 889585411 ethernet R6000 AIX53 >= 256 >= 0 - 889584538 20 0 0 2 0 test.cmd \
1001 6 678.08 0 1 0 0 0 0 0 - 0 - - - - - - - - 0.0 - - - 0 - -
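As a rough illustration of how the field indexes map onto a record, the following Python sketch names the first 16 fields of a flat-format line and treats `-` as "no value". The helper name `parse_leading_fields` is hypothetical. It assumes the backslashes in the sample are line-continuation markers rather than data, and it stops at field 16 because the later columns of this particular sample do not obviously line up with the table.

```python
# Names of fields 1-16, in table order.
LEADING_FIELDS = [
    "Event Time (Human Readable)", "Event Time (Epoch)", "Object Type",
    "Object ID", "Object Event", "Nodes Requested", "Tasks Requested",
    "User Name", "Group Name", "Wallclock Limit", "Job Event State",
    "Required Class", "Submission Time", "Dispatch Time", "Start Time",
    "Completion Time",
]


def parse_leading_fields(line: str) -> dict:
    """Map the first 16 whitespace-delimited tokens to their field names."""
    # Assumption: a standalone backslash is a line-continuation marker.
    tokens = [tok for tok in line.split() if tok != "\\"]
    if len(tokens) < len(LEADING_FIELDS):
        raise ValueError("record has fewer than 16 fields")
    # "-" means no applicable value was specified.
    return {
        name: (None if tok == "-" else tok)
        for name, tok in zip(LEADING_FIELDS, tokens)
    }


sample = ("13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 "
          "Removed [batch:1] 887343658 889585185 889585185 889585411")
record = parse_leading_fields(sample)
run_time = int(record["Completion Time"]) - int(record["Start Time"])
print(record["User Name"], record["Object Event"], run_time)  # josh JOBEND 226
```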
All job events (JOBSUBMIT, JOBSTART, JOBEND, and so forth) provide job data in the native wiki format (ATTR=VALUE). This is to make events more readable and to allow format flexibility. This is not the default format for Moab 6.0 and higher; it is enabled by setting the WIKIEVENTS parameter to TRUE in the Moab configuration file.
Examples
09:26:40 1288279600:1 sched Moab SCHEDSTART -
09:26:40 1288279600:2 rm pbs RMUP initialized
09:26:40 1288279600:3 sched Moab RMPOLLSTART -
09:26:40 1288279600:4 job 58 JOBSUBMIT 58 REQUESTEDNC=1 REQUESTEDTC=3 UNAME=wightman GNAME=wightman WCLIMIT=60 STATE=Completed RCLASS=[batch:1] SUBMITTIME=1288279493 RMEMCMP=>= RDISKCMP=>= RFEATURES=[NONE] SYSTEMQUEUETIME=1288279493 TASKS=1 FLAGS=RESTARTABLE PARTITION=pbs DPROCS=1 ENDDATE=2140000000 TASKMAP=proxy,GLOBAL SRM=pbs MESSAGE="\STARTLabel\20\20\20CreateTime\20ExpireTime \20\20\20\20Owner\20Prio\20Num\20Message\0a,\STARTcheckpoint\20record\20not\20found" EXITCODE=0 SID=2357 NODEALLOCATIONPOLICY=SHARED
09:26:40 1288279600:5 job 58 JOBEND 58 REQUESTEDNC=1 REQUESTEDTC=3 UNAME=wightman GNAME=wightman WCLIMIT=60 STATE=Completed RCLASS=[batch:1] SUBMITTIME=1288279493 RMEMCMP=>= RDISKCMP=>= RFEATURES=[NONE] SYSTEMQUEUETIME=1288279493 TASKS=1 FLAGS=RESTARTABLE PARTITION=pbs DPROCS=1 ENDDATE=2140000000 TASKMAP=proxy,GLOBAL SRM=pbs EXITCODE=0 SID=2357 NODEALLOCATIONPOLICY=SHARED EFFECTIVEQUEUEDURATION=107
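A wiki-format record like the ones above can be split into a five-token header and a set of ATTR=VALUE pairs. The Python sketch below is one way to do that; `parse_wiki_event` and its dictionary keys are hypothetical names. It assumes values contain no whitespace (packed strings satisfy this) and keeps any token without an `=` sign, such as the repeated job ID, in a separate list.

```python
def parse_wiki_event(line: str) -> dict:
    """Split a wiki-format event line into header tokens and ATTR=VALUE pairs."""
    tokens = line.split()
    if len(tokens) < 5:
        raise ValueError("expected at least five header tokens")
    record = {
        "time": tokens[0],
        "epoch": tokens[1],        # kept verbatim, including any ':N' suffix
        "object_type": tokens[2],
        "object_id": tokens[3],
        "event": tokens[4],
        "attrs": {},
        "extra": [],               # tokens that are not ATTR=VALUE pairs
    }
    for token in tokens[5:]:
        # Split on the first '=' only, so values such as '>=' survive intact.
        name, sep, value = token.partition("=")
        if sep:
            record["attrs"][name] = value.strip('"')
        else:
            record["extra"].append(token)
    return record


line = ("09:26:40 1288279600:5 job 58 JOBEND 58 REQUESTEDNC=1 UNAME=wightman "
        "RCLASS=[batch:1] RMEMCMP=>= EXITCODE=0 EFFECTIVEQUEUEDURATION=107")
event = parse_wiki_event(line)
print(event["event"], event["attrs"]["UNAME"], event["attrs"]["RMEMCMP"])
```

Splitting on the first `=` matters because comparison attributes such as RMEMCMP carry `=` inside their value.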
Because workload event records and simulation workload traces use the same format, these event records can be used as a starting point for generating a new simulation trace. In the simplest case, an event record or collection of event records can be used directly as the value for the SIMWORKLOADTRACEFILE parameter, as in the following example:
# collect all job records for December
> cat /opt/moab/stats/events.*Dec*2006 | grep JOBEND > /opt/moab/DecJobs.txt
# edit moab.cfg to use the job records
> vi /opt/moab/etc/moab.cfg
(add 'SIMWORKLOADTRACEFILE /opt/moab/DecJobs.txt')
(set SIMRESOURCETRACEFILE, SCHEDCFG[] MODE and other simulation parameters as described in the Simulation Overview)
# start the simulation
> moab
In the preceding example, all non-JOBEND events were filtered out. This step is not required, but only JOBEND events are used in a simulation; other events are ignored by Moab.
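A `grep JOBEND` filter matches the string anywhere on the line, including inside a job message. Where that matters, the filter can test the object type and event fields instead. The following Python sketch shows the idea; `keep_jobend` is a hypothetical helper and the input lines are shortened illustrations, not complete records.

```python
def keep_jobend(lines):
    """Yield only job records whose event field (field 5) is JOBEND."""
    for line in lines:
        fields = line.split()
        # Field 3 is the object type and field 5 is the event name.
        if len(fields) >= 5 and fields[2] == "job" and fields[4].upper() == "JOBEND":
            yield line


events = [
    "09:26:40 1288279600:3 sched Moab RMPOLLSTART -",
    "09:26:40 1288279600:4 job 58 JOBSUBMIT 58 MESSAGE=waiting\\20for\\20JOBEND",
    "09:26:40 1288279600:5 job 58 JOBEND 58 EXITCODE=0",
]
for line in keep_jobend(events):
    print(line)
```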
Modifying Existing Job Event Records
When creating a new simulation workload, it is often valuable to start with workload traces representing a well-known or even local workload. These traces preserve distribution information about job submission times, durations, processor count, users, groups, projects, special resource requests, and numerous other factors that effectively represent an industry, user base, or organization.
When modifying records, a field or combination of fields can be altered, new jobs inserted, or certain jobs filtered out.
Because job event records are used for multiple purposes, some of the fields are valuable for statistics or auditing purposes but are ignored in simulations. For the most part, fields representing resource utilization information are ignored while fields representing resource requests are not.
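As one example of altering a single field, the Python sketch below scales the Wallclock Limit (field 10 of the flat format) by a constant factor and leaves every other field untouched. The helper name `scale_wallclock_limit` is hypothetical, the input is a shortened record, and the sketch assumes flat-format records rather than ATTR=VALUE records.

```python
WALLCLOCK_LIMIT = 10  # 1-based field index from the job record table


def scale_wallclock_limit(line: str, factor: float) -> str:
    """Return the record with its Wallclock Limit multiplied by factor."""
    fields = line.split()
    idx = WALLCLOCK_LIMIT - 1
    # Keep the limit a positive whole number of seconds.
    fields[idx] = str(max(1, round(int(fields[idx]) * factor)))
    # Fields are whitespace delimited, so a single space is enough on output.
    return " ".join(fields)


original = "13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 Removed [batch:1]"
print(scale_wallclock_limit(original, 0.5))
```

The same pattern (split, change one index, join) applies to other request fields such as User Name or Tasks Requested.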
Modifying Time Distribution Factors of a Workload Trace
In some cases, simulations focus on determining the effects of changing the quantities or types of jobs, or of changing policies or job ownership, on system performance and resource utilization. In other cases, simulations focus on response-time metrics as the job submission and job duration aspects of the workload are modified. Which time-based fields are important to modify depends on the simulation purpose and the setting of the JOBSUBMISSIONPOLICY parameter.
JOBSUBMISSIONPOLICY Value | Critical Time Based Fields |
---|---|
NORMAL | Wallclock Limit, Submission Time, Start Time, Completion Time |
CONSTANTJOBDEPTH, CONSTANTPSDEPTH | Wallclock Limit, Start Time, Completion Time |
Note 1: Dispatch Time should always be identical to Start Time.
Note 2: In all cases, the difference 'Completion Time - Start Time' is used to determine actual job run time.
Note 3: System Queue Time and ProcSeconds Utilized are only used for statistics gathering purposes and will not alter the behavior of the simulation.
Note 4: In all cases, relative time values are important; that is, Start Time must be greater than or equal to Submission Time and less than Completion Time.
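The constraints in these notes are easy to check mechanically before a modified trace is fed to a simulation. The following Python sketch does so for a flat-format record; `check_job_times` and `check_record` are hypothetical helpers, and the field positions (13 through 16) are taken from the job record table.

```python
def check_job_times(submit: int, dispatch: int, start: int, completion: int) -> list:
    """Return a list of violated constraints (empty if the times are consistent)."""
    problems = []
    if dispatch != start:
        problems.append("Dispatch Time differs from Start Time")
    if start < submit:
        problems.append("Start Time is earlier than Submission Time")
    if start >= completion:
        problems.append("Start Time is not earlier than Completion Time")
    return problems


def check_record(line: str):
    """Check fields 13-16 of a flat record and return (problems, run_time)."""
    fields = line.split()
    submit, dispatch, start, completion = (int(v) for v in fields[12:16])
    # Actual run time is Completion Time - Start Time.
    return check_job_times(submit, dispatch, start, completion), completion - start


sample = ("13:21:05 110244355 job 1413 JOBEND 20 20 josh staff 86400 "
          "Removed [batch:1] 887343658 889585185 889585185 889585411")
print(check_record(sample))  # ([], 226)
```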
Creating Workload Traces From Scratch
A completely new workload trace can also be created from scratch. To do this, create a file in which each line contains fields matching the format described in the Workload Event Record Format section.
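As a sketch of what generating such a file might look like, the Python code below builds one 57-field JOBEND line, filling unspecified fields with the default values listed in the job record table. The function name `make_job_record`, the choice of UTC for the human-readable time, and the `Completed` state string are assumptions made for illustration; whether Moab accepts a given synthetic record still needs to be verified against a real simulation run.

```python
import time

FIELD_COUNT = 57  # number of fields in the job record table

# Default values copied from the table, keyed by 1-based field index.
# Every field not listed here falls back to "-".
DEFAULTS = {
    6: "0", 7: "1", 10: "1", 12: "[DEFAULT:1]",
    13: "0", 14: "0", 15: "0", 16: "0",
    20: ">=", 21: "0", 22: ">=", 23: "0", 25: "0",
    27: "-1", 33: "-1", 34: "0", 35: "[DEFAULT]", 36: "1",
    37: "0", 38: "0", 39: "0", 40: "0", 41: "0",
    49: "0.0", 56: "-1", 57: "-1",
}


def make_job_record(job_id, user, group, tasks, wallclock,
                    submit, start, completion):
    """Build one flat-format JOBEND line from a handful of job attributes."""
    if not (submit <= start < completion):
        raise ValueError("need Submission <= Start < Completion")
    fields = {i: DEFAULTS.get(i, "-") for i in range(1, FIELD_COUNT + 1)}
    fields.update({
        # Assumption: the event time is the completion time, rendered in UTC.
        1: time.strftime("%H:%M:%S", time.gmtime(completion)),
        2: completion, 3: "job", 4: job_id, 5: "JOBEND",
        7: tasks, 8: user, 9: group, 10: wallclock,
        11: "Completed",                # assumed state string
        13: submit, 14: start, 15: start, 16: completion,
        26: tasks,                      # Tasks Allocated defaults to Tasks Requested
    })
    return " ".join(str(fields[i]) for i in range(1, FIELD_COUNT + 1))


line = make_job_record("sim1", "alice", "staff", 4, 3600, 1000, 1100, 1500)
print(line)
```

Dispatch Time is set equal to Start Time here, which keeps the two fields identical as simulations expect.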
All reservation events provide reservation data in a standard format as described in the following table:
Field Name | Field Index | Data Format | Default Value | Details |
---|---|---|---|---|
Event Time (Human) | 0 | [HH:MM:SS] | - | Specifies time event occurred. |
Event Time (Epoch) | 1 | <epochtime> | - | Specifies time event occurred. |
Object Type | 2 | rsv | - | Specifies record object type. |
Object ID | 3 | <STRING> | - | Unique object identifier. |
Object Event | 4 | one of rsvcreate, rsvstart, rsvmodify, rsvfail or rsvend | - | Specifies record event type. |
Creation Time | 5 | <EPOCHTIME> | - | Specifies epoch time of reservation creation. |
Start Time | 6 | <EPOCHTIME> | - | Specifies epoch time of reservation start date. |
End Time | 7 | <EPOCHTIME> | - | Specifies epoch time of reservation end date. |
Tasks Allocated | 8 | <INTEGER> | - | Specifies number of tasks allocated to reservation at event time. |
Nodes Allocated | 9 | <INTEGER> | - | Specifies number of nodes allocated to reservation at event time. |
Total Active Proc-Seconds | 10 | <INTEGER> | - | Specifies proc-seconds reserved resources were dedicated to one or more jobs at event time. |
Total Proc-Seconds | 11 | <INTEGER> | - | Specifies proc-seconds resources were reserved at event time. |
Hostlist | 12 | <comma delimited list of hostnames> | - | Specifies list of hosts reserved at event time. |
Owner | 13 | <STRING> | - | Specifies reservation ownership credentials. |
ACL | 14 | <STRING> | - | Specifies reservation access control list. |
Category | 15 | <STRING> | - | Specifies associated node category assigned to reservation. |
Comment | 16 | <STRING> | - | Specifies general human readable event message. |
Command Line | 17 | <STRING> | - | Displays the command line arguments used to create the reservation (only shows on the rsvcreate event). |
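The reservation table can be read the same way as the job table, with the difference that its field indexes start at 0. The Python sketch below names the fields of a reservation record and computes the share of reserved proc-seconds that were actually dedicated to jobs. The helper names are hypothetical, and the sample record is invented purely for illustration; it is not taken from a real Moab event file.

```python
RSV_FIELDS = [
    "Event Time (Human)", "Event Time (Epoch)", "Object Type", "Object ID",
    "Object Event", "Creation Time", "Start Time", "End Time",
    "Tasks Allocated", "Nodes Allocated", "Total Active Proc-Seconds",
    "Total Proc-Seconds", "Hostlist", "Owner", "ACL", "Category",
    "Comment", "Command Line",
]


def parse_rsv_event(line: str) -> dict:
    """Map whitespace-delimited tokens to reservation field names (index 0 first)."""
    tokens = line.split()
    # zip stops at the shorter sequence, so short records simply omit later fields.
    return {name: (None if tok == "-" else tok)
            for name, tok in zip(RSV_FIELDS, tokens)}


def active_fraction(record: dict) -> float:
    """Fraction of reserved proc-seconds that were dedicated to jobs."""
    total = int(record.get("Total Proc-Seconds") or 0)
    active = int(record.get("Total Active Proc-Seconds") or 0)
    return active / total if total else 0.0


# Invented sample record, for illustration only.
sample = ("10:00:00 1288279600 rsv system.1 rsvend 1288270000 1288272000 "
          "1288279600 4 2 15200 30400 node001,node002 admin - - - -")
rsv = parse_rsv_event(sample)
print(rsv["Object Event"], rsv["Hostlist"].split(","), active_fraction(rsv))
```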
Job events occur when a job undergoes a definitive change in state. Job events include submission, starting, cancellation, migration, and completion. Some site administrators do not want to use an external accounting system and use these logged events to determine their clusters' accounting statistics. Moab can be configured to record these events in the appropriate event file found in the Moab stats/ directory. To enable job event recording for both local and remotely staged jobs, use the RECORDEVENTLIST parameter. For example:
RECORDEVENTLIST JOBCANCEL,JOBCOMPLETE,JOBSTART,JOBSUBMIT ...
This configuration records an event each time a local or remote job is canceled, runs to completion, is started, or is submitted. The Event Logs section details the format of these records.
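One quick way to confirm that the configured events are actually being recorded is to tally an event file by object type and event name. The Python sketch below does this for a list of lines; `count_events` is a hypothetical helper, and the shortened input lines are illustrative rather than complete records.

```python
from collections import Counter


def count_events(lines) -> Counter:
    """Count records per (object type, event name) pair."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 5:
            # Field 3 is the object type and field 5 is the event name.
            counts[(fields[2], fields[4].upper())] += 1
    return counts


events = [
    "09:26:40 1288279600:1 sched Moab SCHEDSTART -",
    "09:26:40 1288279600:4 job 58 JOBSUBMIT 58 UNAME=wightman",
    "09:26:40 1288279600:5 job 58 JOBEND 58 UNAME=wightman",
    "09:27:10 1288279630:6 job 59 JOBSUBMIT 59 UNAME=wightman",
]
for key, n in sorted(count_events(events).items()):
    print(key, n)
```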