Moab Workload Manager

13.9 Managing Networks

13.9.1 Network Management Overview

Network resources can be tightly integrated with the rest of a compute cluster using the Moab multi-resource manager management interface. This interface has the following capabilities:

  • Dynamic per job and per partition VLAN creation and management
  • Monitoring and reporting of network health and failure events
  • Monitoring and reporting of network load
  • Creation of subnets with guaranteed performance criteria
  • Automated workload-aware configuration and router maintenance
  • Intelligent network-aware scheduling algorithms

13.9.2 Dynamic VLAN Creation

Most sites using dynamic VLANs operate under the following assumptions:

  • Each compute node has access to two or more networks, one of which is the compute network and another of which is the administrator network.
  • Each compute node may only access other compute nodes via the compute network.
  • Each compute node may only communicate with the head node via the administrator network.
  • Logins on the head node may not be requested from a compute node.

In this environment, organizations may choose to have VLANs automatically configured that encapsulate individual jobs or VPC requests. These VLANs essentially isolate the job from incoming and outgoing communication with other compute nodes.

13.9.2.1 Configuring VLANs

Automated VLAN management can be enabled by setting up a network resource manager that supports dynamic VLAN configuration and a QoS to request this feature. The following example configuration illustrates this setup:

...
RMCFG[cisco] TYPE=NATIVE RESOURCETYPE=NETWORK FLAGS=VLAN
RMCFG[cisco] CLUSTERQUERYURL=exec://$TOOLSDIR/node.query.cisco.pl
RMCFG[cisco] SYSTEMMODIFYURL=exec://$TOOLSDIR/system.modify.cisco.pl

QOSCFG[netsecure] SECURITY=VLAN

13.9.2.2 Requesting a VLAN

VLANs can be requested on a per-job basis, either directly using the associated resource manager extension or indirectly by requesting a QoS with a VLAN security requirement.

> qsub -l nodes=256,walltime=24:00:00,qos=netsecure biojob.cmd

143325.umc.com submitted

13.9.3 Network Load and Health Monitoring

Network-level load and health monitoring is enabled by supporting the cluster query action in the network resource manager and specifying the appropriate CLUSTERQUERYURL attribute in the associated resource manager interface. Node (virtual node) query commands (mnodectl, checknode) can be used to view this load and health information, which is also correlated with associated workload and written to persistent accounting records. Network load and health event information can also be fed into generic events and used to drive appropriate event-based triggers.

At present, load and health attributes such as fan speed, temperature, port failures, and various core switch failures can be monitored and reported. Additional failure events are monitored and reported as support is added within the network management system.

13.9.5 Providing Per-QoS and Per-Job Bandwidth Guarantees

Intra-job bandwidth guarantees can be requested on a per-job and/or per-QoS basis using the BANDWIDTH resource manager extension (for jobs) and the MINBANDWIDTH QoS attribute (for QoS limits). If specified, Moab does not allow a job to start unless these criteria can be satisfied via proper resource allocation or dynamic network partitions. As needed, Moab makes future resource reservations to guarantee the required allocations.

Example

> qsub -l nodes=24,walltime=8:00:00,bandwidth=1000 hex3chem.cmd

job 44362.qjc submitted

Note: If dynamic network partitions are enabled, a NODEMODIFYURL attribute must be properly configured to drive the network resource manager. See Native Resource Manager Overview for details.

13.9.6 Enabling Workload-Aware Network Maintenance

Network-aware maintenance is enabled by supporting the modify action in the network resource manager and specifying the appropriate NODEMODIFYURL attribute in the associated resource manager interface. Administrator resource management commands (mnodectl and mrmctl) are then routed directly through the resource manager to the network management system. In addition, reservation and real-time generic event and generic metric triggers can be configured to intelligently drive these facilities for maintenance and auto-recovery purposes.

Maintenance actions can include powering on and off the switch as well as rebooting/recycling all or part of the network. Additional operations are enabled as supported by the underlying networks.

13.9.7 Creating a Resource Management Interface for a New Network

Many popular networks are supported using interfaces provided in the Moab tools directory. If a required network interface is not available, a new one can be created using the following guidelines:

General Requirements

In all cases, a network resource manager should respond to a cluster query request by reporting a single node with a node name that does not conflict with any existing compute node. At a minimum, this node should report the state attribute.

Monitoring Load

Network load is reported to Moab using the generic resource bandwidth. For greatest value, both configured and used bandwidth (in megabytes per second) should be reported as in the following example:

force10 state=idle ares=bandwidth:5466 cres=bandwidth:10000
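
A cluster query script can build this line from whatever counters the switch exposes. The following is a minimal Python sketch under stated assumptions, not one of the tools shipped in the Moab tools directory: the function name format_load_line is hypothetical, the values are hard-coded to match the example above, and the step that actually reads bandwidth figures from the switch is omitted.

```python
def format_load_line(node, state, ares_bw, cres_bw):
    """Build one cluster query line with the ares and cres bandwidth values.

    ares_bw and cres_bw are the numbers placed after 'ares=bandwidth:' and
    'cres=bandwidth:' respectively, as in the example line above.
    """
    return f"{node} state={state} ares=bandwidth:{ares_bw} cres=bandwidth:{cres_bw}"


if __name__ == "__main__":
    # Hypothetical values; a real script would query the switch here.
    print(format_load_line("force10", "idle", 5466, 10000))
```

A script like this would be referenced from the CLUSTERQUERYURL attribute of the network resource manager.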

Monitoring Failures

Network warning and failure events can be reported to Moab using the gevent metric. If automated responses are enabled, embedded epochtime information should be included.

force10 state=idle gevent[checksum]='ECC failure detected on port 13'
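
The same cluster query script could attach a generic event to the node line when it detects a problem. The sketch below is illustrative only: format_gevent_line is a hypothetical helper, and dropping embedded single quotes is an assumption made to keep the quoted message intact, not a documented Moab escaping rule. The epochtime embedding mentioned above is not shown because its format is not described here.

```python
def format_gevent_line(node, state, event_name, message):
    """Build one cluster query line carrying a generic event."""
    # Assumption: the message is wrapped in single quotes as in the example,
    # so embedded single quotes are dropped rather than escaped.
    safe_message = message.replace("'", "")
    return f"{node} state={state} gevent[{event_name}]='{safe_message}'"


if __name__ == "__main__":
    print(format_gevent_line("force10", "idle", "checksum",
                             "ECC failure detected on port 13"))
```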

Controlling Router State

Router power state can be controlled once a system modify interface is created that supports the commands on, off, and reset.
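
One way to structure such an interface is a small dispatcher that maps each supported command to a vendor-specific action. The following Python sketch is hypothetical throughout: the function names, the placeholder return strings, and the assumption that the command arrives as the first command-line argument are all illustrative. The actual arguments Moab passes to a system modify script are described in the native resource manager documentation and should be checked there.

```python
import sys


def power_on():
    return "switch powered on"    # placeholder for a vendor-specific call


def power_off():
    return "switch powered off"   # placeholder for a vendor-specific call


def reset():
    return "switch reset"         # placeholder for a vendor-specific call


# The three commands named above, mapped to their handlers.
ACTIONS = {"on": power_on, "off": power_off, "reset": reset}


def dispatch(command):
    """Run the handler for a supported command, or raise ValueError."""
    try:
        action = ACTIONS[command]
    except KeyError:
        raise ValueError(f"unsupported command: {command!r}") from None
    return action()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the command is passed as the first argument.
    print(dispatch(sys.argv[1]))
```

Rejecting unknown commands explicitly keeps a misrouted request from silently changing switch state.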

Creating VLANs

VLAN creation, management, and reporting are more advanced, requiring persistent VLAN ID tracking, global pool creation, and other features. Use of existing routing interface tools as templates is highly advised. VLAN management requires use of both the cluster query interface and the system modify interface.

13.9.8 Per-Job Network Monitoring

It is possible to gather network usage on a per-job basis using the native interface. When the native interface has been configured to report netin and netout, Moab automatically gathers this data throughout the life of a job and reports total usage statistics upon job completion.

...
node99  netin=78658 netout=1256  
...
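
A native interface script could produce lines of this form for each node it reports. The Python sketch below is a hedged illustration: format_net_lines and the counters dictionary are hypothetical, and the source of the counters (and their units) is left to the site, since neither is specified here.

```python
def format_net_lines(counters):
    """Return one 'node netin=... netout=...' line per node.

    counters maps a node name to a (netin, netout) pair.
    """
    return [
        f"{node} netin={netin} netout={netout}"
        for node, (netin, netout) in sorted(counters.items())
    ]


if __name__ == "__main__":
    # Hypothetical sample matching the example above.
    for line in format_net_lines({"node99": (78658, 1256)}):
        print(line)
```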

This information is visible to users and administrators via command-line utilities, the web portal, and the desktop graphical interfaces.
