In many instances, a site may have distinct resources controlled by different resource managers. For example, a site may use one resource manager to license software for jobs, another to manage file systems, another for job control, and another for node monitoring. Moab can be configured to communicate with each of these resource managers, gathering their data and incorporating it into scheduling decisions. With a more distributed approach to resource handling, failures are more contained and scheduling decisions can be more intelligent.
Moab must know how to communicate with each resource manager. In most instances, this is simply done by configuring a query command.
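For example, a native interface can be pointed at a site-provided query script (a minimal sketch; the resource manager name and script path are illustrative):

RMCFG[storage] TYPE=NATIVE
RMCFG[storage] CLUSTERQUERYURL=/tmp/storage.sh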
With multi-resource manager support, a job may be submitted either to a local resource manager queue or to the Moab global queue. In most cases, submitting a job to a resource manager queue constrains the job to run only on resources controlled by that resource manager. However, if the job is submitted to the Moab global queue, it can use the resources of any active resource manager. This is accomplished through job translation and staging.
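For example, with a TORQUE/PBS resource manager, submitting through the resource manager's own command typically binds the job to that resource manager's resources, while submitting through Moab's msub places it in the global queue (the job script name is illustrative):

> qsub batch.job    # queued directly to PBS; constrained to PBS-managed resources
> msub batch.job    # queued to the Moab global queue; eligible for any active resource manager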
When Moab evaluates resource availability, it determines the cost in terms of both data and job staging. If staging a job's executable or input data requires a significant amount of time, Moab integrates data and compute resource availability to determine a job's earliest potential start time on a per-resource-manager basis and makes an optimal scheduling decision accordingly. If the optimal decision requires a data stage operation, Moab reserves the required compute resources, stages the data, and then starts the job when the required data and compute resources are available.
Using the native interface, Moab can perform most of these functions itself, without the need for an external resource manager. First, configure the native resource managers:
RESOURCELIST node01,node02
...
RMCFG[base]    TYPE=PBS
RMCFG[network] TYPE=NATIVE:AGFULL
RMCFG[network] CLUSTERQUERYURL=/tmp/network.sh
RMCFG[fs]      TYPE=NATIVE:AGFULL
RMCFG[fs]      CLUSTERQUERYURL=/tmp/fs.sh
The network script can be as simple as the following:
> _RX=`/sbin/ifconfig eth0 | grep "RX by" | cut -d: -f2 | cut -d' ' -f1`; \
> _TX=`/sbin/ifconfig eth0 | grep "TX by" | cut -d: -f3 | cut -d' ' -f1`; \
> echo `hostname` NETUSAGE=`echo "$_RX + $_TX" | bc`;
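Deployed as the file /tmp/network.sh, the same logic might look like the following (a minimal sketch, assuming the classic net-tools ifconfig output with "RX bytes:" and "TX bytes:" fields on eth0):

#!/bin/sh
# Report this node's cumulative network traffic (RX + TX bytes) on eth0.
RX=`/sbin/ifconfig eth0 | grep "RX by" | cut -d: -f2 | cut -d' ' -f1`
TX=`/sbin/ifconfig eth0 | grep "TX by" | cut -d: -f3 | cut -d' ' -f1`
echo `hostname` NETUSAGE=`echo "$RX + $TX" | bc`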
The preceding script would output something like the following:
node01 NETUSAGE=10928374
Moab gathers the information from each resource manager and merges it into its final view of the node.
> checknode node01
node node01

State:      Running  (in current state for 00:00:20)
Configured Resources: PROCS: 2  MEM: 949M  SWAP: 2000M  disk: 1000000
Utilized   Resources: SWAP: 9M
Dedicated  Resources: PROCS: 1  disk: 1000
Opsys:      Linux-2.6.5-1.358  Arch:      linux
Speed:      1.00  CPULoad:  0.320
Location:   Partition: DEFAULT  Rack/Slot: NA
Network Load: 464.11 b/s
Network:    DEFAULT
Features:   fast
Classes:    [batch 1:2][serial 2:2]

Total Time: 00:30:39  Up: 00:30:39 (100.00%)  Active: 00:09:57 (32.46%)

Reservations:
  Job '5452'(x1)  -00:00:20 -> 00:09:40 (00:10:00)
JobList:  5452
Notice that the Network Load is now being reported along with disk usage.
Example File System Utilization Tracker (per user)
The following configuration can be used to track file system usage on a per user basis:
.....
RMCFG[file] POLLINTERVAL=24:00:00
RMCFG[file] POLLTIMEISRIGID=TRUE
RMCFG[file] TYPE=NATIVE:AGFULL
RMCFG[file] RESOURCETYPE=FS
RMCFG[file] CLUSTERQUERYURL=/tmp/fs.pl
.....
Assuming that /tmp/fs.pl produces output in the following format:
DEFAULT STATE=idle AFS=<fs id="user1" size="789456"></fs><fs id="user2" size="123456"></fs>
This will track disk usage for users user1 and user2 every 24 hours.
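A minimal sketch of such a query script follows (written in shell for illustration, though the configuration names a Perl script; the /home layout, user list, and kilobyte sizes are assumptions):

#!/bin/sh
# Emit per-user file system usage in the format shown above:
#   DEFAULT STATE=idle AFS=<fs id="user" size="..."></fs>...
printf "DEFAULT STATE=idle AFS="
for U in user1 user2; do
  SIZE=`du -sk /home/$U | cut -f1`    # usage in KB (illustrative units)
  printf '<fs id="%s" size="%s"></fs>' "$U" "$SIZE"
done
echo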