(Quick Reference)

5 Reporting Framework

5.1 Overview

5.1.1 Concepts

5.1.2 Capabilities

5.2 Example Report (CPU Utilization)

5.2.1 Creating A Report

5.2.2 Adding Samples

5.2.3 Consolidating Data

5.2.4 Retrieving Report Data

5.2.5 Possible Configurations

5 Reporting Framework

The reporting framework is a set of tools for building time-based reports from numerical data. The following sections (1) provide an overview of the framework and its concepts, and (2) work through an example report (CPU Utilization), detailing which web services to use and what data to send.

The REST API reference is located in the Report Resource section.

5.1 Overview

5.1.1 Concepts

The reporting framework uses three core concepts: reports, datapoints, and samples.

Report - A report is a time-based view of numerical data.
Datapoint - A datapoint is a consolidated set of data for a certain time period.
Sample - A sample is a snapshot of a certain set of data at a particular point in time.

To illustrate, consider the memory utilization of a virtual machine: at any given point in time, you can get the memory utilization by using your operating system's performance utilities (top for Linux, Task Manager for Windows):

2400/12040 MB

By repeatedly recording the time and the memory utilization over about one minute, you could gather the following data:

Time	Memory Utilization
3:53:55 PM	2400/12040 MB
3:54:13 PM	2410/12040 MB
3:54:27 PM	2406/12040 MB
3:54:39 PM	2402/12040 MB
3:54:50 PM	2409/12040 MB

Each row in the table above represents a sample of data. By averaging the rows that fall within each time period, we can consolidate them into one or more datapoints:

Start Time	End Time	Memory Utilization
3:53:30 PM	3:54:00 PM	2400/12040 MB
3:54:00 PM	3:54:30 PM	2408/12040 MB
3:54:30 PM	3:55:00 PM	2406/12040 MB

Note that each datapoint covers exactly the same amount of time, and averages all samples within that period of time.
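The windowing and averaging described above can be sketched in Python. This is a minimal illustration, not the framework's implementation: the helper name consolidate_by_window, the arbitrary date, and the alignment of windows to multiples of the duration since midnight are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def consolidate_by_window(samples, duration_seconds):
    """Average (timestamp, value) samples into fixed-width windows.

    Hypothetical helper: windows are aligned to multiples of
    duration_seconds since midnight (an assumption of this sketch).
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        midnight = timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
        elapsed = int((timestamp - midnight).total_seconds())
        start = midnight + timedelta(seconds=elapsed - elapsed % duration_seconds)
        buckets[start].append(value)
    # One datapoint per window: (start, end, average of the window's samples).
    return [
        (start, start + timedelta(seconds=duration_seconds), sum(vals) / len(vals))
        for start, vals in sorted(buckets.items())
    ]


# Used memory in MB from the sample table above; the date is arbitrary.
samples = [
    (datetime(2012, 1, 1, 15, 53, 55), 2400),
    (datetime(2012, 1, 1, 15, 54, 13), 2410),
    (datetime(2012, 1, 1, 15, 54, 27), 2406),
    (datetime(2012, 1, 1, 15, 54, 39), 2402),
    (datetime(2012, 1, 1, 15, 54, 50), 2409),
]
datapoints = consolidate_by_window(samples, 30)
for start, end, average in datapoints:
    print(start.time(), end.time(), average)
```

The last window averages to 2405.5, which the datapoint table shows rounded to 2406.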

A report, then, is simply a list of datapoints with some additional configuration information:

Field	Value
Name	Memory Utilization Report
Datapoint Duration	30 seconds
Report Size	3 datapoints

Datapoints:

Start Time	End Time	Memory Utilization
3:53:30 PM	3:54:00 PM	2400/12040 MB
3:54:00 PM	3:54:30 PM	2408/12040 MB
3:54:30 PM	3:55:00 PM	2406/12040 MB

5.1.2 Capabilities

While storing simple information like memory utilization is nice, the reporting framework is built to automatically handle much more complex information.

Consolidating Samples

Samples are JSON documents which are pushed into the report using the samples API. Samples are then stored until the consolidation operation creates a datapoint out of them. The table below shows how different data types are handled in this operation:

Type	Consolidation Function Handling
Numbers	Numerical data is averaged
Strings	Strings are aggregated into an array
Objects	The consolidation function recursively consolidates sub-objects
Lists	Lists are combined into a single flat list containing all elements
Mixed	If samples have different types of data for the same field, the values are aggregated into an array.
Null	These values will be ignored unless all values for a sample field are set to null, resulting in a null result.

However, if the mixed values for a field include at least one number, the field is treated as numerical data: the non-numerical values are ignored and the numbers are averaged.

Below is an example of how the consolidation function works:

Samples:

Time	NumberEx	StringEx	ListEx	MixedEx	MixedNumberEx
3:53:55 PM	2400	"str1"	["elem1"]	"str1"	"str1"
3:54:13 PM	2410	"str2"	["elem2", "elem3"]	["elem1"]	["elem1"]
3:54:27 PM	2405	"str3"	["elem4"]	null	5

Resulting datapoint after consolidation:

Time	NumberEx	StringEx	ListEx	MixedEx	MixedNumberEx
3:55:00 PM	2405	["str1", "str2", "str3"]	["elem1", "elem2", "elem3", "elem4"]	["str1", "elem1"]	5
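The handling rules in this section can be approximated with a short Python function. This is a sketch of the rules as described here, not the code MWS runs; the function name consolidate and the treatment of edge cases the text does not cover (such as objects mixed with strings) are assumptions.

```python
def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def consolidate(values):
    """Consolidate the values of one field across several samples."""
    present = [v for v in values if v is not None]  # nulls are ignored
    if not present:
        return None  # every value was null
    numbers = [v for v in present if is_number(v)]
    if numbers:
        # At least one number: average the numbers and ignore the rest.
        return sum(numbers) / len(numbers)
    if all(isinstance(v, dict) for v in present):
        # Objects: consolidate each sub-field recursively.
        keys = []
        for obj in present:
            for key in obj:
                if key not in keys:
                    keys.append(key)
        return {key: consolidate([obj.get(key) for obj in present]) for key in keys}
    # Strings, lists, and mixed non-numeric values: one flat list.
    result = []
    for v in present:
        if isinstance(v, list):
            result.extend(v)
        else:
            result.append(v)
    return result


# The columns of the example table, one list of values per field.
number_ex = consolidate([2400, 2410, 2405])                    # 2405.0
string_ex = consolidate(["str1", "str2", "str3"])
list_ex = consolidate([["elem1"], ["elem2", "elem3"], ["elem4"]])
mixed_ex = consolidate(["str1", ["elem1"], None])              # ["str1", "elem1"]
mixed_number_ex = consolidate(["str1", ["elem1"], 5])          # 5.0
```

Checking for numbers before anything else mirrors the rule that a single numeric value makes the whole field numerical.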

Minimum Number of Samples

If your dataset is highly variable (i.e. values contained in samples are not very close together), converting a single sample into a datapoint may provide misleading information. It may be better to have a datapoint with an "Unknown" value. This can be accomplished by setting the minimum number of samples for a datapoint in the report.

The minimumSampleSize field in the Report API sets this threshold. If fewer samples than the specified number are available when the consolidation function runs, the datapoint is considered "null" and no data is available for it. When this occurs, the sample data is discarded and the data field of the datapoint is set to "null".

For information on how to set this option, see the REST API Report Resource section in the documentation.
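As a rough illustration of the minimum sample size rule, the following Python sketch returns no data when too few samples are present. The function name and the use of a plain average are assumptions for the example; only the minimumSampleSize behavior itself comes from the text above.

```python
def consolidate_with_minimum(values, minimum_sample_size=1):
    """Average values, or return None when too few samples are available."""
    if not values or len(values) < minimum_sample_size:
        # The datapoint's data is "null" and the samples are discarded.
        return None
    return sum(values) / len(values)


# A single outlying sample is not enough when three are required.
too_few = consolidate_with_minimum([95.0], minimum_sample_size=3)
enough = consolidate_with_minimum([10.0, 20.0, 30.0], minimum_sample_size=3)
print(too_few, enough)
```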

Report Size

Reports have a predetermined number of datapoints, or size, which sets a limit on the amount of data that can be stored. After the report size has been reached, as newly created datapoints are pushed into the report, the oldest datapoints will automatically be deleted. This is to aid in managing the storage capacity of the server hosting MWS.
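This eviction behaves like a fixed-size buffer. The Python snippet below is an analogy only, assuming a report size of 3; it uses a standard-library deque and says nothing about how MWS stores datapoints internally.

```python
from collections import deque

report_size = 3
datapoints = deque(maxlen=report_size)

for value in [2400, 2408, 2406, 2411]:
    # Once the deque is full, appending drops the oldest entry.
    datapoints.append(value)

print(list(datapoints))  # [2408, 2406, 2411]
```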

On report creation, a Mongo collection will be initialized that is the maximum size of a single entry (currently 16 MB) multiplied by the report size. Be careful in setting a large report size as this will quickly allocate the entire disk if many reports with large report sizes are created.
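To get a feel for the numbers, the allocation can be estimated from the formula in the previous paragraph. The sketch below assumes the 16 MB entry limit stated there; the function name is made up for the example.

```python
MAX_ENTRY_MB = 16  # maximum size of a single entry, as stated above


def allocated_mb(report_size):
    """Estimated allocation for one report, in MB."""
    return MAX_ENTRY_MB * report_size


one_report = allocated_mb(288)
print(one_report)              # 4608
print(one_report / 1024)       # 4.5 (GB, using 1024 MB per GB)
print(10 * one_report / 1024)  # 45.0 for ten such reports
```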

5.2 Example Report (CPU Utilization)

To illustrate the behavior and usage of the reporting framework, this section walks through a sample report covering CPU utilization. It does not cover how to gather or display data for reports, but it does cover some basic operations that Moab Web Services provides to facilitate reporting.

5.2.1 Creating A Report

Before any data is sent to Moab Web Services, a report must first be created. To do this, send a JSON request body using the HTTP POST method.

POST /rest/reports

{
  "name":"cpu-util",
  "description":"An example report for cpu utilization",
  "consolidationFunction":"average",
  "datapointDuration":600,
  "reportSize":288
}

This creates a report, which can then be retrieved by sending a GET request to /rest/reports/cpu-util. The datapointDuration of 600 signifies that datapoint consolidation should occur once every 10 minutes, while the reportSize (i.e. the number of datapoints) of 288 means that the report will retain up to 2 days' worth of the latest datapoints (288 datapoints of 10 minutes each, or 48 hours).

GET /rest/reports/cpu-util

{
    "consolidationFunction": "average",
    "datapointDuration": 600,
    "datapoints": [],
    "description": "An example report for cpu utilization",
    "id": "aef6f6a3a0bz7bf6449537c9d",
    "keepSamples": false,
    "minimumSampleSize": 1,
    "name": "cpu-util",
    "reportSize": 288,
    "version": 0
}

Note that an ID has been generated automatically and that no datapoints are associated with the report.
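The creation request in this section could be prepared from a script. The Python sketch below only builds the request using the standard library; it does not send it. The base URL is hypothetical, and authentication, which your installation may require, is not shown.

```python
import json
import urllib.request

# Hypothetical address; replace with the location of your MWS installation.
BASE_URL = "http://localhost:8080/mws"


def build_create_report_request(base_url, report):
    """Build (but do not send) the POST /rest/reports request."""
    return urllib.request.Request(
        url=base_url + "/rest/reports",
        data=json.dumps(report).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


report = {
    "name": "cpu-util",
    "description": "An example report for cpu utilization",
    "consolidationFunction": "average",
    "datapointDuration": 600,
    "reportSize": 288,
}
request = build_create_report_request(BASE_URL, report)

# Retention implied by the configuration: 288 datapoints of 600 seconds each.
retention_days = report["reportSize"] * report["datapointDuration"] / 86400
print(request.get_method(), request.full_url, retention_days)

# To send the request: urllib.request.urlopen(request)
```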

5.2.2 Adding Samples

Until samples are added and associated with the report, datapoint consolidation will generate datapoints with a data field equal to null. Once samples are added, however, they will be averaged and inserted into the next datapoint.

Create samples for the cpu-util report by sending a POST request as follows:

POST /rest/reports/cpu-util/samples

[
  {
    "agent": "cpu-monitor",
    "timestamp":"2012-01-01 12:00:00 UTC",
    "data": {
      "minutes1": 0.5,
      "minutes5": 0,
      "minutes15": 0
    }
  },
  {
    "agent": "cpu-monitor",
    "timestamp":"2012-01-01 12:01:00 UTC",
    "data": {
      "minutes1": 1,
      "minutes5": 0.5,
      "minutes15": 0.05
    }
  },
  {
    "agent": "cpu-monitor",
    "timestamp":"2012-01-01 12:02:00 UTC",
    "data": {
      "minutes1": 1,
      "minutes5": 0.5,
      "minutes15": 0.1
    }
  },
  {
    "agent": "cpu-monitor",
    "timestamp":"2012-01-01 12:03:00 UTC",
    "data": {
      "minutes1": 0.75,
      "minutes5": 1,
      "minutes15": 0.25
    }
  },
  {
    "agent": "cpu-monitor",
    "timestamp":"2012-01-01 12:04:00 UTC",
    "data": {
      "minutes1": 0,
      "minutes5": 1,
      "minutes15": 0.85
    }
  }
]

This sample data contains average load for the last 1, 5, and 15 minute intervals. The samples were recorded at one-minute intervals starting at noon on January 1st, 2012.
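A monitoring agent could assemble this request body programmatically. The following Python sketch reproduces the five samples above; the helper name make_sample is an assumption, and the timestamp format is copied from the example rather than taken from a specification.

```python
import json
from datetime import datetime, timedelta, timezone


def make_sample(agent, timestamp, data):
    """Build one sample document in the shape used in the request above."""
    return {
        "agent": agent,
        "timestamp": timestamp.astimezone(timezone.utc).strftime(
            "%Y-%m-%d %H:%M:%S UTC"
        ),
        "data": data,
    }


start = datetime(2012, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
# (1-minute, 5-minute, 15-minute) load values from the example above.
loads = [(0.5, 0, 0), (1, 0.5, 0.05), (1, 0.5, 0.1), (0.75, 1, 0.25), (0, 1, 0.85)]

samples = [
    make_sample(
        "cpu-monitor",
        start + timedelta(minutes=i),
        {"minutes1": m1, "minutes5": m5, "minutes15": m15},
    )
    for i, (m1, m5, m15) in enumerate(loads)
]
body = json.dumps(samples, indent=2)  # request body for the samples POST
print(body)
```

On Unix-like systems, Python's os.getloadavg() returns three load averages of this kind, which an agent could use in place of the fixed values.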

5.2.3 Consolidating Data

A consolidation function must run to generate datapoints from the given samples. This scheduled consolidation will occur at intervals of datapointDuration seconds. For each field in the data object in samples, all values will be averaged. If non-numeric values are included, the following strategies will be followed:

All fields which contain at least one numeric value in any included sample will be averaged and the non-numeric or null values will be ignored.
All fields which contain a list will be consolidated into a single, flat list.
All fields which contain only non-numeric or null values will be consolidated into a single, flat list.

If no historical datapoints are provided when the report is created, as in this example, the next consolidation will be scheduled for the current time plus the datapointDuration. In this example, the scheduled consolidation is 10 minutes after the creation date. If historical datapoints are included in the report creation, the latest datapoint's endDate plus the datapointDuration will be used as the scheduled time. If this date is in the past, the next scheduled consolidation will occur at the appropriate interval from the last endDate.
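The scheduling rules can be sketched as a small Python function. The function name is hypothetical, and the handling of a scheduled time in the past (stepping forward in whole intervals from the last endDate until the time is in the future) is this sketch's reading of "the appropriate interval", not confirmed behavior. The creation time of 11:49:00 is likewise assumed for illustration.

```python
from datetime import datetime, timedelta


def next_consolidation(now, datapoint_duration, last_end_date=None):
    """Return the time of the next scheduled consolidation."""
    duration = timedelta(seconds=datapoint_duration)
    if last_end_date is None:
        # No historical datapoints: current time plus the duration.
        return now + duration
    scheduled = last_end_date + duration
    # Assumed interpretation: keep stepping by whole intervals from the
    # last endDate until the scheduled time is in the future.
    while scheduled <= now:
        scheduled += duration
    return scheduled


created = datetime(2012, 1, 1, 11, 49, 0)
first_run = next_consolidation(created, 600)
catch_up = next_consolidation(created, 600, last_end_date=datetime(2012, 1, 1, 11, 0, 0))
print(first_run, catch_up)
```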

5.2.4 Retrieving Report Data

To retrieve the consolidated datapoints, simply perform a GET request on the report once again. Alternatively, the GET for a report's datapoints may be used.

GET /rest/reports/cpu-util

{
    "consolidationFunction": "average",
    "datapointDuration": 600,
    "datapoints": [
        {
            "firstSampleDate": null,
            "lastSampleDate": null,
            "data": null,
            "startDate": "2012-01-01 11:49:00 UTC",
            "endDate": "2012-01-01 11:59:00 UTC"
        },
        {
            "firstSampleDate": "2012-01-01 12:00:00 UTC",
            "lastSampleDate": "2012-01-01 12:04:00 UTC",
            "data": {
                "minutes1": 0.65,
                "minutes15": 0.25,
                "minutes5": 0.6
            },
            "startDate": "2012-01-01 11:59:00 UTC",
            "endDate": "2012-01-01 12:09:00 UTC"
        }
    ],
    "description": "An example report for cpu utilization",
    "id": "aef6f6a3a0bz7bf6449537c9d",
    "keepSamples": false,
    "minimumSampleSize": 1,
    "name": "cpu-util",
    "reportSize": 288,
    "version": 0
}

Note that of the two datapoints above, only the second actually contains data, while the data of the first is set to null. Only samples lying within the datapoint's duration, that is, from the startDate to the endDate, are included in the consolidation. Therefore the first datapoint, which covers the 10 minute period just before the samples' recorded timestamps, contains no data. The second, which covers the 10 minute period matching that of the samples, contains the averaged sample data. This data could be used to display consolidated report data in a custom interface.
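The values in the second datapoint can be reproduced by hand. The Python sketch below selects the samples that fall inside each datapoint's window and averages every field. Whether the window boundaries are inclusive or exclusive is not stated in this section, so the half-open comparison is an assumption, as is rounding to two decimals for display.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S UTC"  # timestamp format used in the examples


def parse(timestamp):
    return datetime.strptime(timestamp, FORMAT)


def average_in_window(samples, start_date, end_date):
    """Average each data field over samples with start <= timestamp < end."""
    start, end = parse(start_date), parse(end_date)
    selected = [s["data"] for s in samples if start <= parse(s["timestamp"]) < end]
    if not selected:
        return None  # no samples in the window: data is null
    return {
        key: round(sum(d[key] for d in selected) / len(selected), 2)
        for key in selected[0]
    }


loads = [(0.5, 0, 0), (1, 0.5, 0.05), (1, 0.5, 0.1), (0.75, 1, 0.25), (0, 1, 0.85)]
samples = [
    {
        "timestamp": "2012-01-01 12:0%d:00 UTC" % i,
        "data": {"minutes1": m1, "minutes5": m5, "minutes15": m15},
    }
    for i, (m1, m5, m15) in enumerate(loads)
]

first = average_in_window(samples, "2012-01-01 11:49:00 UTC", "2012-01-01 11:59:00 UTC")
second = average_in_window(samples, "2012-01-01 11:59:00 UTC", "2012-01-01 12:09:00 UTC")
print(first)
print(second)
```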

5.2.5 Possible Configurations

Configuration options may be changed to affect the process of report generation. These are documented in the API for the Report object and the Sample object.
