Moab Adaptive Computing Suite Administrator's Guide 5.4

2.1.1 Architecting a New Utility Computing Hosting Center

The initial steps in setting up a typical hosting center involve assessing who the target customer is, what the customer needs, and what combination of services and resources will be offered. The importance of this step should not be overlooked or underestimated. In most cases, for utility computing to be successful, its usage must be completely automated. If possible, the end-user should not need to care where the resources or compute services come from, and should be able to continue using these offerings in an already familiar manner. Delivering on this promise requires a thorough understanding of the customer's needs and a tightly integrated, complete, and fully customized utility computing solution.

2.1.1.1 The Purpose of a Utility Computing Center

Utility computing hosting centers exist because they have a valuable resource or service customers need. These customers may be internal or external to an organization and may potentially be geographically distributed. Resources may include raw processing power, unique or large data sets, special applications, or access to other advanced computing infrastructure or services. Services may include specific searches, reports, calculations, or other operations based on customer input or customer data. A utility computing hosting center provides a partial or complete environment in which the customer can accomplish tasks that would otherwise be difficult or impossible to accomplish independently.

2.1.1.2 Define Hosting Center Objectives

With an understanding of the unique value and purpose of a utility computing hosting center, a center's mission objectives can now be set. Objectives consist of determining what customer tasks will be supported, how the customer will interact with the center, and what level of service will be delivered. In essence, these objectives declare the following:

  • This is what we do.
  • This is how customers will use it.
  • This is what we will promise.

2.1.1.3 Determine Customer Environment Needs

The first consideration in architecting a utility computing center is firmly defining which customer tasks will be supported. Will general HPC batch jobs be supported? Will the center enable data mining of a particular data set? Will automated regression tests be performed on a customer application across a series of hardware architectures? The center's supported task or tasks must be defined and appropriately constrained.

An important consideration in a new utility computing service offering is selecting the correct customers and providing a unique and valuable environment that empowers them to accomplish their goals more effectively and in a timely manner. Because it is difficult to provide a complete, seamless solution that is immediately useful to the customer, it is important not to attempt to support too diverse a customer base. Initially, the center should target customers who operate in a well-defined space with well-understood needs in the areas of resources, services, interconnectivity, and operational environment.

With customers selected, the next step is to determine the collection of compute resources, software, and environmental configuration required to allow the customer to successfully use the utility computing service. Knowing these factors allows the center to identify what a complete customer compute environment must look like and, at a high level, what is required to dynamically provision it.

However, allocating and provisioning resources may only be part of what is required. In many utility computing environments, a major aspect of providing the service may be custom-tailoring the compute resources to make them immediately useful to the customer. This may entail installing customer-specific node builds, creating various users and groups, enabling key applications, launching particular services, or setting environment variables. Exactly what must be customized depends on the level of access provided to the customer, be it login access, batch system access, or application level access.

2.1.1.4 Determine Resource Integration Methodology

Once customer environment needs are understood, the next step involves determining how the user will interact with utility computing resources. There are three primary models: OS access, batch access, and application access. For all models, both ease of use and security are critical.

OS Level Access

Providing operating system level access to the compute resources is relatively straightforward and often involves creating a new node build (that can be either real or virtual), enabling host security to grant access only to the customer (often via SSH), and allowing the customer to further customize the resource or launch services as needed.

Batch Level Access

In the batch access model, the utility computing resources should be made available as a direct extension of the customer's existing compute resources and local batch system. In this model, the existing customer batch system should communicate with Moab to indicate the nature of resources required; the local system should then reserve, allocate, and customize these resources, and the local and remote batch systems should automatically coordinate job and data migration tasks to optimize the use of these resources.

Application/Service Level Access

With application or service level access, a customer would often use utility computing resources directly via a portal or indirectly by using a local service that has spilled over onto the utility computing hosting center's resources. In many regards, enabling application or service level access can be much simpler in that resource access does not need to be customized on a per request basis. A customer may determine that a particular application should be allowed to spill over, and to enable this, the utility computing center must only determine that a secure and adequately powerful set of resources is available and that the application is launched with the configuration required to support requests directly or to connect with a distributed application management system.

2.1.1.5 Determine Customer Service Agreement Needs

After determining the environment and interconnect needed to support usage, the next step entails understanding customer business practices as they affect the use of the hosting center. This requires answering questions such as the following:

  • What type of business arrangements will be required to make the hosting center's resources useful?
  • What charging model makes the most sense?
  • What guarantees will be required and what levels of service should be offered?

Often, many of these questions can only partially be answered at initial configuration time as the center and its customers will evolve relationships that make mutual business sense. As these relationships evolve, Moab Service Level Agreement facilities can be employed to deliver resource quantity, prioritization, and response time based guarantees to allow customers to comfortably use utility computing resources and services in mission critical aspects of their business.

2.1.1.6 Identify Resource Monitoring Requirements

Resource monitoring needs in a utility computing environment are unique because of the partitioned security and dynamic resource configuration requirements. The task of monitoring a utility computing environment is compounded by the need to identify failures in network, storage, compute, and application resources, as well as monitor resource utilization of many diverse types of shared resources.

Moab's unique ability to obtain holistic world-view information from multiple sources and support conflict resolution policies across these sources allows it to properly track a resource even as it is being rebooted or re-installed. Moab does this by tracking configuration databases, provisioning managers, hardware monitors, node monitoring tools, and even locally created scripts and information stores to determine a node's true health and status regardless of the mechanism used to provision it and regardless of its current state.
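The idea of resolving conflicts between several information sources can be illustrated with a minimal Python sketch. This is not Moab's implementation; the source names, the state strings, and the priority-ordered policy below are all assumptions chosen only to show how one node's reports might be merged into a single state while the node is being re-installed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeReport:
    source: str            # hypothetical source name, e.g. "hardware_monitor"
    state: Optional[str]   # None means the source has nothing to report


# Hypothetical policy: sources listed earlier override sources listed later.
SOURCE_PRIORITY = ["provisioning_manager", "hardware_monitor", "node_monitor"]


def resolve_state(reports: List[NodeReport]) -> str:
    """Return the state from the highest-priority source that reported one."""
    by_source = {r.source: r.state for r in reports if r.state is not None}
    for source in SOURCE_PRIORITY:
        if source in by_source:
            return by_source[source]
    return "Unknown"


if __name__ == "__main__":
    # While a node is re-installed, its node monitor is silent, but the
    # provisioning manager and the hardware monitor still report.
    reports = [
        NodeReport("node_monitor", None),
        NodeReport("hardware_monitor", "PoweredOn"),
        NodeReport("provisioning_manager", "Provisioning"),
    ]
    print(resolve_state(reports))
```

A fixed priority order is only one possible policy; a site might instead prefer the most recent report or require agreement between sources.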

In selecting the correct tool(s) needed to monitor the health of a compute resource, the following considerations should be taken into account:

  • Monitoring tools should be able to detect current configuration, load, state, and health information.
  • Existing services often provide a good source of resource information.
  • Moab can integrate information from multiple sources.
  • Monitoring should report information for compute, network, storage, and application resources.
  • Static information that is not available via a monitoring tool can be maintained in a text file and imported via Moab's Native resource manager interface.
  • The scalability and performance impact of monitoring tools should be considered early when architecting the system.
  • If provisioning steps will entail power-cycling one or more resources, a method of determining node state should be available that provides information even when the node is powered off.
  • If provisioning steps will entail re-installing customer-specific images, dynamically changing the operating system of a compute node, or using virtual spaces such as Xen or VMware, the monitoring tool may need to be dynamically installed or pre-installed into each of these spaces.
  • Partitioned security creating firewalls around each utility computing space may prevent many tools from reporting needed information.

There is no single answer to all of these issues, and in many cases the best solution may be a collection of industry standard tools, local configuration files, and site-specific tools. In other cases, particularly with simpler or application level access based centers, monitoring can be much simpler.

Note: With the native interface, sites can monitor, reserve, and dynamically provision virtually any type of resource including network, storage or compute resources, and software licenses. In many cases, these native interfaces already exist or are available from Adaptive Computing. In other cases, these interfaces can be developed using any scripting or compiled language using the flat file, script, SQL, or web service based Moab interfaces.
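As a rough illustration of a script-based or flat-file source of node information, the Python sketch below renders static node attributes as one line per node. The line layout (a node name followed by whitespace-separated ATTR=VALUE pairs) and the attribute names STATE and CPROC are assumptions made for this example; the exact format and attribute names accepted by the native interface should be confirmed in the native resource manager documentation.

```python
from typing import Dict

# Hypothetical static inventory that no monitoring tool reports.
STATIC_NODES: Dict[str, Dict[str, object]] = {
    "node01": {"STATE": "Idle", "CPROC": 4},
    "node02": {"STATE": "Idle", "CPROC": 8},
}


def format_node_line(node_id: str, attrs: Dict[str, object]) -> str:
    """Render one node as '<node> ATTR=VALUE ATTR=VALUE ...' (assumed layout)."""
    pairs = " ".join(f"{key}={value}" for key, value in attrs.items())
    return f"{node_id} {pairs}".rstrip()


def render(nodes: Dict[str, Dict[str, object]]) -> str:
    """Render all nodes, one per line, in sorted node-name order."""
    return "\n".join(format_node_line(n, a) for n, a in sorted(nodes.items()))


if __name__ == "__main__":
    # Printing to standard output suits a script-style source; redirecting
    # the output to a file gives the flat-file variant.
    print(render(STATIC_NODES))
```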

2.1.1.7 Identify Resource Provisioning Requirements

Resource provisioning entails customizing and configuring available resources to meet specific customer needs. In the case of compute hosts, this step may include installing a new operating system or new application software. For networks, it may include creating a request specific VLAN and configuring guaranteed bandwidth. For data servers, it may include creating virtual data sandboxes with disk quota and security policies in place.

Regardless of the specific needs, enabling resource provisioning consists of providing a mechanism whereby Moab can dynamically modify or provision these resources to meet the needs of a new request specific utility computing space or virtual cluster. The process of enabling each resource includes providing an interface whereby Moab can request the specific change and detect potential failures in fulfilling the request. The provisioning interface can be wrapped as needed and can use resource command line interfaces, APIs, expect scripts, or whatever method is needed to perform and monitor the change. This process is covered in detail in the Provisioning Resources and Managing Dynamic Security section.
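A wrapper of the kind described above, one that requests a change and detects failures, might look like the following Python sketch. It is an illustration under stated assumptions, not a Moab-supplied interface: the function name, the timeout value, and the convention of returning a success flag plus a detail string are all hypothetical, and the wrapped command stands in for whatever site-specific provisioning tool is used.

```python
import subprocess
import sys
from typing import List, Tuple


def run_provisioning_step(command: List[str],
                          timeout_seconds: int = 600) -> Tuple[bool, str]:
    """Run one provisioning command and report (succeeded, detail)."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except FileNotFoundError:
        return False, f"command not found: {command[0]}"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_seconds}s"
    if result.returncode != 0:
        # Prefer the tool's own error text; fall back to the exit code.
        return False, result.stderr.strip() or f"exit code {result.returncode}"
    return True, result.stdout.strip()


if __name__ == "__main__":
    # Stand-in for a real provisioning tool: a child Python process.
    ok, detail = run_provisioning_step([sys.executable, "-c", "print('done')"])
    print(ok, detail)
```

Reporting a missing tool, a timeout, and a non-zero exit status separately gives the caller enough detail to decide whether to retry the step or mark the resource as failed.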

2.1.1.8 Identify Complete Virtual Cluster Packages

The final step in architecting a new utility computing hosting center is identifying the full environment resource consumption associated with each request. This entails determining not only what compute resources are required by the request, but also what network, storage, and application resources are required. Further, this process also involves isolating what resources are required to provision an environment for the request. The ultimate purpose of this exercise is to identify both resource dependencies and potential bottlenecks.

This process should not be taken to an extreme, but if there is a resource that may potentially bottleneck, or may potentially fail, it may be valuable to have the hosting center management system monitor and schedule this resource both to avoid failures and to prevent over-subscription.

Once the list of resources associated with a request is determined, these resources should be mapped out over time to identify which resources are needed when. This will allow the scheduler to release unused resources early and improve overall resource utilization. Cluster Resources staff can assist in this effort to find the best way to manage scheduling of per request resources.
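Mapping resources over time can be sketched in a few lines of Python. The resource names, the minute-based intervals, and the helper functions below are hypothetical and exist only to show how a per-request timeline reveals when each resource can be released.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Need:
    resource: str  # hypothetical resource name
    start: int     # minutes after the request begins (inclusive)
    end: int       # minutes after the request begins (exclusive)


def release_times(needs: List[Need]) -> Dict[str, int]:
    """Latest end per resource, the earliest point it can be released."""
    latest: Dict[str, int] = {}
    for need in needs:
        latest[need.resource] = max(latest.get(need.resource, 0), need.end)
    return latest


def active_at(needs: List[Need], minute: int) -> List[str]:
    """Resources in use at the given minute, in sorted order."""
    return sorted({n.resource for n in needs if n.start <= minute < n.end})


if __name__ == "__main__":
    request = [
        Need("provisioning_server", 0, 15),   # only needed while imaging nodes
        Need("vlan", 0, 120),
        Need("compute_nodes", 15, 120),
        Need("app_license", 30, 90),
    ]
    print(release_times(request))
    print(active_at(request, 100))
```

In this example the provisioning server is free after 15 minutes and the license after 90, even though the request as a whole runs for 120 minutes.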

Later sections will show how to take this information and map it into a virtual cluster. Within Moab, a virtual cluster is an object that can consist of many components, including but not limited to a Virtual LAN, file space, software licenses, compute resources, and/or host based authentication. These requirements can be site specific and even query specific. Moab allows administrators to define pre-packaged virtual clusters using the VCPROFILE parameter.
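Before committing a package to configuration, it can help to write down its components explicitly. The Python sketch below models the components listed above as a plain data structure. It does not represent VCPROFILE syntax or any Moab object; the class name, field names, and example values are assumptions for planning purposes only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VirtualClusterPackage:
    name: str
    compute_nodes: int
    vlan: bool = False
    file_space_gb: int = 0
    licenses: Dict[str, int] = field(default_factory=dict)  # license name -> count
    host_based_auth: bool = False

    def required_components(self) -> List[str]:
        """List the resource types that must be provisioned for this package."""
        components = ["compute"]
        if self.vlan:
            components.append("vlan")
        if self.file_space_gb > 0:
            components.append("file_space")
        if self.licenses:
            components.append("licenses")
        if self.host_based_auth:
            components.append("host_auth")
        return components


if __name__ == "__main__":
    package = VirtualClusterPackage(
        name="regression_test",  # hypothetical package name
        compute_nodes=16,
        vlan=True,
        file_space_gb=200,
        licenses={"compiler": 2},
    )
    print(package.required_components())
```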

See Also