Moab Adaptive Computing Suite Administrator's Guide 5.4

2.1.3 Hosting Center Configuration

The steps associated with setting up a typical hosting center include the following:

2.1.3.1 Defining Hosting Center Mission

Virtual hosting centers exist because they have a valuable resource that is needed by geographically distributed customers. This resource may include raw processing power, unique or large data sets, special applications, or access to other advanced computing infrastructure or services. In all cases, a hosting center provides a partial or complete environment in which the customer can accomplish tasks that would otherwise be difficult or impossible to accomplish independently.

With this understanding, a site's mission objectives should be tailored around selecting the correct customers and providing a unique and valuable environment that empowers the customer to be more effective in accomplishing goals in a timely manner.

2.1.3.1.1 Determine Customer Environment Needs

A key element of any hosting center is determining what collection of compute resources, software, and environment is required to allow the customer to successfully use the hosting center. Knowing these factors allows the center to identify what resulting virtual clusters must look like and (at a high level) what is required to dynamically provision them.

2.1.3.1.2 Determine Customer Service Agreement Needs

After determining the environment, the next step entails understanding customer business practices as they affect the use of the hosting center. This requires answering questions such as the following: What type of business arrangements will be required to make the hosting center's resources useful? What charging model makes the most sense? What guarantees will be required, and what levels of service should be offered?

Often, many of these questions can be only partially answered at initial configuration time, because the center and its customers will evolve relationships that make mutual business sense.

2.1.3.2 Integrate Resources

Integrating resources consists of two primary steps: monitoring resources and provisioning resources.

2.1.3.2.1 Enable Resource Monitoring

Because hosting centers typically require both a high level of security and dynamic compute host provisioning, standard resource management tools such as LSF, PBS, and LoadLeveler may not provide workable options. Consequently, alternative resource monitoring approaches may be required, using the Moab Native Resource Manager interface.

With the native interface, sites can monitor, reserve, and dynamically provision virtually any type of compute resource including networks, file or data storage systems, compute resources, and software licenses. In many cases, these interfaces already exist and are available from Adaptive Computing. In other cases, these interfaces can be developed using any scripting or compiled language using the flat file, script, or web service based Moab interfaces.

At a high level, resource monitoring includes determining a resource's health or state, its configuration and attributes, and the type and quantity of consumable resources it has installed and currently available. In a very simple case, a compute node may report a state of active (resource health), an architecture of Opteron (configuration and attributes), and one of two processors and 256 of 512 MB of RAM in use (type and quantity of consumable resources).

For more information, see Resource Management in a Hosted Environment.

2.1.3.2.1.1 Monitoring Standard Compute Hosts

Using the native interface, state information regarding standard compute hosts can be extracted in any manner convenient using locally generated tools or other system services (such as ganglia, hardware monitors, provisioning managers, and so forth). In an environment where only serial channel data is available, a configuration such as the following might be used:

RMCFG[core] TYPE=NATIVE:AGFULL CLUSTERQUERYURL=exec:///opt/hosting/bin/resquery.pl
...

The RMCFG line indicates that the script /opt/hosting/bin/resquery.pl should be executed to generate resource data. This example script converts the output of the xCAT hardware monitor interface to WIKI text data providing information regarding basic node state and configuration. With all exec based queries, Moab reads the script stdout as source data for processing. Moab allows significant flexibility in how resource data is organized. A sample output of the Full Extension data model follows:

node001 STATE=Idle CPROC=2 CMEM=512
node002 STATE=Idle CPROC=2 CMEM=512
node010 STATE=Down CPROC=2 CMEM=1024
node011 STATE=Idle CPROC=2 CMEM=1024
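
A script in the role of resquery.pl can be quite small. The following Python sketch is not the script referenced above; it is a hypothetical stand-in that assumes node data has already been collected (the NODES sample data and the format_node and main names are illustrative only) and prints one line per node in the same layout as the sample output.

```python
#!/usr/bin/env python3
"""Hypothetical cluster query script: writes one node per line to stdout."""
import sys

# Assumed sample data. A real script would gather these values from a
# hardware monitor or another site-specific source.
NODES = [
    {"name": "node001", "STATE": "Idle", "CPROC": 2, "CMEM": 512},
    {"name": "node002", "STATE": "Idle", "CPROC": 2, "CMEM": 512},
    {"name": "node010", "STATE": "Down", "CPROC": 2, "CMEM": 1024},
    {"name": "node011", "STATE": "Idle", "CPROC": 2, "CMEM": 1024},
]


def format_node(node):
    """Return '<name> KEY=VALUE ...' for one node record."""
    attrs = " ".join(
        f"{key}={value}" for key, value in node.items() if key != "name"
    )
    return f"{node['name']} {attrs}"


def main(out=sys.stdout):
    """Write every node record to the given stream, one per line."""
    for node in NODES:
        out.write(format_node(node) + "\n")


if __name__ == "__main__":
    main()
```

Because Moab reads the script's stdout, anything the script prints other than node lines (for example, debug messages) would presumably need to go to stderr or a log file instead.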

2.1.3.2.1.2 Monitoring Non-Compute Resources

In the case of license managers, storage managers, and network managers, the interface developed need only support the services required by the particular hosting center. For example, in the case of a hosting center with tight security requirements and supporting a compute intensive workload, there may not be a need for bandwidth guarantees but there will be a need for basic network health monitoring and VLAN/VPN support. Consequently, the interface to the network manager may do nothing more than perform a simple health check against the router and monitor per-customer network activity. In such cases, a simple Perl or shell script of a few dozen lines may be adequate.
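
As a rough illustration of how small such a script can be, the following Python sketch performs only the router health check. It rests on several assumptions that do not come from this guide: that a successful TCP connection is an adequate health signal, that port 22 is the right port to probe, and that the result is reported as a single NAME STATE=... line modeled on the compute host sample output. The function names are hypothetical.

```python
import socket


def router_is_reachable(host, port=22, timeout=2.0,
                        connect=socket.create_connection):
    """Return True if a TCP connection to host:port succeeds in time.

    The connect argument exists only so the check can be exercised
    without a real network; by default it is socket.create_connection.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
    conn.close()
    return True


def report_line(name, reachable):
    """Format the health result; the state names are an assumption."""
    state = "Idle" if reachable else "Down"
    return f"{name} STATE={state}"
```

Per-customer network activity monitoring is site specific and is deliberately left out of this sketch.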

2.1.3.2.2 Enable Resource Provisioning

Resource provisioning entails customizing and configuring available resources to meet specific customer needs. In the case of compute hosts, this step may include installing a new operating system or new application software. For networks, it may include creating a request-specific VLAN and configuring guaranteed bandwidth. For data servers, it may include creating virtual data sandboxes with disk quota and security policies in place. In many cases, the site may also require the dynamic creation of users and groups to support workload that will use these resources.

Regardless of the specific needs, enabling resource provisioning consists of providing a mechanism whereby Moab can dynamically modify or provision these resources to meet the needs of a new virtual cluster. The process of enabling each resource includes providing an interface whereby Moab can request the specific change and detect potential failures in fulfilling the request. The provisioning interface can be wrapped as needed and can use resource command line interfaces, APIs, expect scripts, or whatever method is needed to perform and monitor the change.
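
The two requirements above (request a change, detect failure) can be sketched as a thin wrapper around whatever command line tool performs the change. The Python below is a minimal sketch under stated assumptions: the wrapped tool signals failure through a nonzero exit code, and run_provisioning_step is a hypothetical name, not part of Moab.

```python
import subprocess


def run_provisioning_step(argv, timeout=600):
    """Run one provisioning command and return (ok, detail).

    ok is False if the command cannot be started, exceeds the timeout,
    or exits with a nonzero status (assumed to mean failure).
    """
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except OSError as exc:
        # Command missing, not executable, and similar start-up errors.
        return False, str(exc)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    if result.returncode != 0:
        detail = result.stderr.strip() or f"exit code {result.returncode}"
        return False, detail
    return True, result.stdout.strip()
```

Returning the failure detail alongside the status lets the caller log why a request could not be fulfilled instead of only noting that it failed.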

Once these interfaces are in place, they will be called by triggers from within Moab with timings and dependencies orchestrated to maximize resource efficiency and minimize customer response time.

2.1.3.2.3 Identify Complete Virtual Cluster Packages

A virtual cluster can consist of many components, including but not limited to a virtual LAN, file space, software licenses, compute resources, and host based authentication. These requirements can be site specific and even query specific. Moab allows administrators to define pre-packaged virtual clusters using the VCPROFILE parameter within the moab.cfg file. This is documented in detail in the VPC Overview.

2.1.3.3 Provide User Interface

The user interface presents the virtual cluster to the world. To be successful, the interface must be simple, distilling the options to a minimal yet complete set of needed resource options. To be scalable, it will need to empower customer self-service. With Moab, Adaptive Computing provides an easily skinned hosting center portal allowing customers to modify their user profile, request cluster access, view status of current cluster requests, and view statistics, account balances, and other historical information.

Creation of the user portal can be handled entirely by Adaptive Computing. Alternatively, site administrators can choose to modify the provided portal or create their own using the Moab command line, C-API, or Java API interface.

2.1.3.3.1 Enable Customer Registration

Before users can be allowed to use the customer portal, they should be registered with the host organization. This can be an automated process, or it can be fully manual or semi-manual. Regardless, the process of registration should gather initial user profile information such as email address, service levels authorized, billing rates, and so forth. This information can be entered directly into Moab configuration or Moab can be configured to extract this information directly from an allocation management system, an identity management system, or a relational database.

2.1.3.3.2 Enable Self-Service Web Site

The customer self-service web site should be secure, requiring either password based or another form of authentication. It is often desirable to have the web site obtain customer profile information from the Moab server to limit the service and resource access options to only those specifically subscribed to by the customer during the registration. At a minimum, the interface should also allow the user to create, monitor, modify, and cancel existing virtual clusters. This includes a virtual cluster overview page wherein customers can see all of their currently active virtual clusters with associated access information such as IP address, login, and password.

As mentioned earlier, there is further value in enabling additional self-service options including direct manipulation of customer contact information and default resource requirement templates. Other beneficial options include direct subscription to new services, policy and general usage documentation, access to statistical reports, and account usage and balance information.

2.1.3.3.3 Enable Email Notifications/Alerts

In addition to the web based virtual cluster overview page, customers can also receive email notifications when a requested virtual cluster is about to be available or when hardware or system failures have resulted in adjustments to the original virtual cluster start date. Further email notifications are also often desirable to notify customers when a virtual cluster is about to expire, or if an option to extend the reservation exists.
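
A notification of this kind could be assembled as follows. This Python sketch uses only the standard library and is not a Moab facility; the sender address, the message wording, and the build_notification name are assumptions made for illustration.

```python
from email.message import EmailMessage

# Hypothetical sender address for the hosting center.
SENDER = "hosting-center@example.com"


def build_notification(to_addr, cluster_name, event, when):
    """Build a plain-text notification about one virtual cluster event."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = to_addr
    msg["Subject"] = f"Virtual cluster {cluster_name}: {event}"
    msg.set_content(
        f"Your virtual cluster '{cluster_name}' is {event}.\n"
        f"Effective time: {when}\n"
    )
    return msg
```

The resulting message can be handed to the send_message method of smtplib.SMTP, pointed at whatever mail relay the site uses.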

2.1.3.4 Enable Service Policies

2.1.3.4.1 Enable General Fairness Policies

Fairness policies in a virtual hosting center allow sites to manage the distribution of resources amongst competing customers. The purpose of these policies is to control both current and historical usage in a balanced manner, delivering fair access and responsiveness.

2.1.3.4.2 Enable Quality of Service Packages

The Moab QoS facility allows an organization to bundle service and resource access together with target performance levels, policy adjustments, and charging rates. Customers can be granted access to QoS objects on a per user or per account basis and may be locked into a single QoS or allowed to choose from a suite of QoS offerings. The QoS overview pages provide details on configuring and using this facility.

2.1.3.4.3 Setting Utility Computing Access Controls

Utility computing access controls can be enabled on a 'per QoS' or 'per resource manager' basis.

QOS Based Access Control

Per QoS access to utility computing resources can be set using the UTILITY QoS flag. By default, all QoS's can access utility computing resources. However, if this flag is set on any QoS, then only QoS's with this flag may use utility computing resources even if jobs satisfy other utility computing access constraints. Further, if the QoS flag UTILITY is set, then only jobs that are associated with a utility QoS will be considered when determining the effective cluster backlog that will be evaluated against any utility computing thresholds.

The following example allows jobs associated with the sthomas and biochem QoS's to allocate and use utility computing resources.

QOSCFG[basic]    PRIORITY=10
QOSCFG[sthomas]  PRIORITY=1000 FLAGS=UTILITY
QOSCFG[biochem]  PRIORITY=2200 FLAGS=UTILITY,USERESERVED
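
The access rule described above can be restated as a small Python model. This is only an illustration of the rule as worded in this section (open by default, restricted to flagged QoS's once any QoS carries the flag); it is not Moab code, it ignores the other utility computing access constraints, and the qos_may_use_utility name is hypothetical.

```python
def qos_may_use_utility(qos_name, qos_flags):
    """Return True if the named QoS may use utility computing resources.

    qos_flags maps each QoS name to the set of flags configured for it.
    """
    any_utility = any("UTILITY" in flags for flags in qos_flags.values())
    if not any_utility:
        # Default case: no QoS has the flag, so every QoS has access.
        return True
    # Restricted case: only QoS's that carry the flag have access.
    return "UTILITY" in qos_flags.get(qos_name, set())


# Flags taken from the example configuration above.
QOS_FLAGS = {
    "basic": set(),
    "sthomas": {"UTILITY"},
    "biochem": {"UTILITY", "USERESERVED"},
}
```

With these flags, sthomas and biochem are allowed and basic is not. If the UTILITY flag were removed from both, all three would be allowed again.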

Resource Manager Based Access Control

Resource managers with the HOSTINGCENTER flag can be configured with user, group, account, QoS, and class based access control lists. These access constraints are applied to all resources allocated through these resource managers.

The following example allows jobs associated with the users steve or john or jobs submitted to the classes hiprio or medical to allocate and use utility computing resources.

RMCFG[odmaster]  FLAGS=hostingcenter AUTHULIST=steve,john
RMCFG[odmaster]  AUTHCLIST=hiprio,medical
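
To make the combined effect of the two lines concrete, the following Python sketch merges the attributes given for one resource manager and applies the "listed user or listed class" check described above. It is an illustration, not Moab's parser: parse_rmcfg and job_allowed are hypothetical names, only the KEY=VALUE token form shown in the example is handled, and the behavior when no lists are configured is not covered by this section (the sketch simply returns False).

```python
import re

# Matches lines of the form: RMCFG[name]  KEY=VALUE KEY=VALUE ...
RMCFG_LINE = re.compile(r"^RMCFG\[(?P<name>[^\]]+)\]\s+(?P<attrs>.*)$")


def parse_rmcfg(lines):
    """Collect attributes per resource manager across multiple lines."""
    managers = {}
    for line in lines:
        match = RMCFG_LINE.match(line.strip())
        if match is None:
            continue
        attrs = managers.setdefault(match.group("name"), {})
        for token in match.group("attrs").split():
            key, _, value = token.partition("=")
            attrs[key.upper()] = value.split(",")
    return managers


def job_allowed(rm_attrs, user, job_class):
    """True if the user or the class appears in the corresponding list."""
    return (user in rm_attrs.get("AUTHULIST", [])
            or job_class in rm_attrs.get("AUTHCLIST", []))
```

For the example configuration, a job from user steve in any class passes, as does a job from any user submitted to class medical.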

2.1.3.5 Automating Customer Management

There are several steps associated with fully automating customer management; follow a link for more information on respective topics.

2.1.3.6 Evaluating Center Effectiveness

Standard cluster analysis techniques, as described in the Cluster Analysis section, can be applied to determine efficiency, utilization, and bottlenecks.

To view current resource reservations, use the showres or mdiag -r command. The showres command without any arguments provides a high-level summary of all reservations while the -n flag displays reservations on a per node basis. The mdiag -r command provides a detailed overview of reservation configuration and status and also performs diagnostics on the reservations and reserved resources.
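
Where these reports are collected regularly, the three invocations named above can be wrapped in a small helper. The Python below is a sketch that assumes the Moab client commands are on the PATH of the account running it; the RESERVATION_COMMANDS table and the run_report name are hypothetical.

```python
import shutil
import subprocess

# The three invocations described in this section.
RESERVATION_COMMANDS = {
    "summary": ["showres"],
    "per_node": ["showres", "-n"],
    "diagnostics": ["mdiag", "-r"],
}


def run_report(kind):
    """Run one reservation report and return its stdout as text.

    Returns None if the underlying command is not installed.
    """
    argv = RESERVATION_COMMANDS[kind]
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout
```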