
1.2 Workload Solutions and Use Cases

This topic provides an operational overview of the workload solutions available with Nitro. Use cases are also provided to showcase some of the solutions and benefits of using Nitro.


1.2.1 Workloads

Nitro easily schedules and executes typical HTC workloads.

1.2.2 Use Cases

This section contains use cases of Nitro workload solutions.


1.2.2.A Many Independent Short Workloads

Let's say a user wants to submit a workload of 50,000 HTC jobs to execute on a system; this means submitting each job separately to the system's scheduler. Submitting all 50,000 jobs at once to the work queue slows the scheduler down enough that other users complain of reduced job response times (the turnaround time between job submission and job completion). This reduced response time is due to the overhead the scheduler incurs scheduling so many HTC jobs and the overhead of starting and managing so many individual short jobs. In other words, the shorter the HTC job, the greater the percentage of its response time that is consumed by job scheduling, startup, and management overhead.
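The overhead effect above can be made concrete with a small calculation. The per-job overhead value below is an assumption for illustration only, not a Nitro or scheduler measurement:

```python
# Illustrative arithmetic (the overhead value is assumed, not measured):
# the shorter the job, the larger the share of its response time that is
# consumed by scheduling, startup, and management overhead.
overhead_s = 2.0  # assumed per-job scheduling + startup overhead

for runtime_s in (600.0, 60.0, 6.0):
    response_s = runtime_s + overhead_s
    overhead_pct = 100.0 * overhead_s / response_s
    print(f"{runtime_s:5.0f}s job: {overhead_pct:4.1f}% of response time is overhead")
```

With these assumed numbers, a 10-minute job loses well under 1% of its response time to overhead, while a 6-second job loses 25%.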

Using Nitro, a user can submit a single "Nitro job" with the 50,000 HTC jobs (now referred to as Nitro tasks) to the system's scheduler. Nitro will quickly execute this workload using its very low scheduling overhead and very quick workload management. In other words, Nitro's low scheduling overhead provides improved response time for executing many short jobs when compared to a normal scheduler.

For example, let's look at a Nitro demonstration with an investment trading enterprise. The investment trading enterprise normally submitted a workload of 10,000 of its own HTC jobs to a commercial scheduler, which took 110 seconds to execute the 10,000 jobs on ten 12-core hosts. Submitted as a single Nitro job to the same commercial scheduler, a very early version of Nitro took only 9 seconds to execute the same 10,000 jobs (again, now called tasks) on the same ten 12-core hosts. This resulted in a response time speedup of roughly 12x!
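The speedup figure follows directly from the two timings in the demonstration:

```python
# Recomputing the speedup from the demonstration's numbers above.
scheduler_s = 110  # 10,000 HTC jobs via the commercial scheduler alone
nitro_s = 9        # the same 10,000 tasks submitted as one Nitro job
speedup = scheduler_s / nitro_s
print(f"speedup: {speedup:.1f}x")  # about 12.2x, reported as 12x
```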

1.2.2.B Large Queues

Let's say an HPC cluster or a datacenter has a large job queue where a significant portion of the job queue consists of HTC workloads (thousands to millions of short jobs taking seconds to minutes to complete). This causes reduced job response times from the system scheduler.

However, by using Nitro, users can consolidate these workloads into a few Nitro jobs (tens or hundreds) to improve the system scheduler's job response times. Those Nitro jobs will execute the HTC workloads on the hosts the scheduler allocates to the Nitro jobs, thereby speeding their own workload turnarounds as well as improving the other users' job turnaround times due to a large reduction in the queue size.

For example, let's say a scheduler has a job queue containing 100,000 jobs, and of those, 95,000 can be consolidated into 95 Nitro jobs (each executing 1,000 tasks). By consolidating, the scheduler's job queue drops from 100,000 jobs to 5,095 jobs, which the scheduler can process in roughly 5% of the time required for processing 100,000 jobs, for a nearly 20x scheduling cycle speedup. This speedup also benefits all the other jobs in the queue.
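The consolidation arithmetic from this example works out as follows:

```python
# The queue-consolidation arithmetic from the example above.
total_jobs = 100_000
htc_jobs = 95_000            # short jobs that can be consolidated
tasks_per_nitro_job = 1_000

nitro_jobs = htc_jobs // tasks_per_nitro_job      # 95 Nitro jobs
new_queue = total_jobs - htc_jobs + nitro_jobs    # 5,095 jobs remain
speedup = total_jobs / new_queue                  # roughly 20x
print(new_queue, f"{speedup:.1f}x")
```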

1.2.2.C Multi-threaded HTC Applications

Let's say a user has a multi-threaded application that runs on a single host and wants to run the same application many times, each time with different parameters or datasets. The user will either submit a single job that executes the application one instance at a time with the different parameters or datasets, or submit many jobs that each execute the application once with different parameters or datasets. The former method does not affect the system scheduler's scheduling time, but it also does not take advantage of the many available hosts on which application instances could execute simultaneously for a shorter overall response time. The latter method has greater potential for a shorter overall response time, but at the cost of the system scheduler's job throughput.

Using Nitro, the user can run multiple instances of an application on the hosts the system scheduler allocates to the Nitro job without affecting the system scheduler's scheduling time or throughput. In addition, the Nitro job will usually execute all instances in a shorter time than the system scheduler could, because Nitro's much lower overhead lets it start the next application instance as soon as an existing instance completes.

1.2.2.D Regression Testing

Let's say an institution performs large quantities of regression tests each night for the applications it develops. Many regression tests are similar and can execute independently of each other. The regression test framework submits the tests as individual jobs to the system scheduler, which means many jobs for it to schedule.

Using Nitro can increase regression test throughput because of its much lower workload management overhead: Nitro immediately starts the next regression test as soon as an existing test completes, avoiding the overhead a system scheduler incurs communicating job completion, scheduling the next job to start, and then starting it.
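A rough wall-clock model shows how per-job overhead accumulates over a nightly run. All numbers below are assumptions for illustration, not measurements of Nitro or any particular scheduler:

```python
# Illustrative throughput model (all numbers assumed): wall-clock time
# for a nightly regression run when every test pays per-job scheduler
# overhead, versus a near-immediate next-task handoff.
tests = 10_000
test_runtime_s = 3.0
scheduler_overhead_s = 2.0  # assumed per-job completion/schedule/start cost
handoff_overhead_s = 0.01   # assumed near-immediate next-task start
slots = 120                 # e.g. ten 12-core hosts

wall_scheduler_s = tests * (test_runtime_s + scheduler_overhead_s) / slots
wall_handoff_s = tests * (test_runtime_s + handoff_overhead_s) / slots
print(f"{wall_scheduler_s:.0f}s vs {wall_handoff_s:.0f}s")
```

Under these assumptions, the same test suite finishes in roughly 60% of the wall-clock time when per-test overhead is nearly eliminated.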

© 2016 Adaptive Computing