4.3 Provisioning & Load Balancing

You must define a provisioning resource manager for Moab to be able to change operating systems on nodes.

Create a provisioning resource manager by adding RMCFG lines to the moab.cfg file. The only attribute it requires is NODEMODIFYURL, which Moab calls to change the operating system on a given node. You can also set PROVDURATION to tell Moab how long provisioning takes to finish, so that Moab can schedule around it.

RMCFG[prov]               TYPE=NATIVE RESOURCETYPE=PROV
RMCFG[prov]               PROVDURATION=5:00
RMCFG[prov]               NODEMODIFYURL=exec://$TOOLSDIR/os.switch.pl

The script named in NODEMODIFYURL should contain the logic needed to switch the compute node's OS based on the arguments it receives. Moab calls this script with the following arguments:

$NODEMODIFYURL <node id> --set OS=<os>
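
As an illustration of the calling convention above, the following minimal sketch shows how a NODEMODIFYURL script might validate and unpack its arguments. It is written in Python for brevity and is not the os.switch.pl script itself; the function name parse_node_modify_args is hypothetical, and the actual provisioning step is site-specific and omitted.

```python
def parse_node_modify_args(argv):
    """Parse '<node id> --set OS=<os>' and return (node_id, os_name).

    argv is the argument list without the script name.
    Hypothetical helper: not part of Moab or os.switch.pl.
    """
    if len(argv) != 3 or argv[1] != "--set":
        raise ValueError("usage: <node id> --set OS=<os>")
    node_id, _, assignment = argv
    key, sep, value = assignment.partition("=")
    if key != "OS" or not sep or not value:
        raise ValueError("expected OS=<os>, got %r" % assignment)
    return node_id, value


# Example call, using a node name and OS from this section.
node, target_os = parse_node_modify_args(["compute004", "--set", "OS=windows"])
print(node, target_os)
```

A real script would pass sys.argv[1:] to the parser and then hand the node and target OS to whatever provisioning mechanism the site uses.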

If you use external cluster management software (such as xCAT) rather than a local unmanaged DHCP/TFTP server, you must ensure that the NODEMODIFYURL script obeys the following algorithm:

Specify node configuration information on the NODECFG lines in the moab.cfg file. The NODECFG line includes the host name of a given compute node and OSLIST, a comma-separated list of the operating systems supported by that particular node. For example:

NODECFG[compute000] OSLIST=windows PARTITION=local FEATURES=compute000
NODECFG[compute001] OSLIST=linux PARTITION=local FEATURES=compute001
NODECFG[compute002] OSLIST=linux,windows PARTITION=local FEATURES=compute002
NODECFG[compute003] OSLIST=linux,windows PARTITION=local FEATURES=compute003
NODECFG[compute004] OSLIST=linux,windows PARTITION=local FEATURES=compute004
NODECFG[compute005] OSLIST=linux,windows PARTITION=local FEATURES=compute005
NODECFG[compute006] OSLIST=linux,windows PARTITION=local FEATURES=compute006
NODECFG[compute007] OSLIST=linux,windows PARTITION=local FEATURES=compute007
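
When auditing a larger configuration, it can help to list which nodes are configured for more than one operating system. The sketch below is one way to do that, assuming NODECFG lines follow the single-line form shown above; parse_oslist is a hypothetical helper, not a Moab tool.

```python
import re

# Matches lines of the form: NODECFG[<node>] <ATTR>=<value> ...
NODECFG_RE = re.compile(r"^NODECFG\[(?P<node>[^\]]+)\]\s+(?P<attrs>.*)$")


def parse_oslist(cfg_text):
    """Return {node: [os, ...]} for every NODECFG line that sets OSLIST."""
    result = {}
    for line in cfg_text.splitlines():
        match = NODECFG_RE.match(line.strip())
        if not match:
            continue
        for token in match.group("attrs").split():
            key, sep, value = token.partition("=")
            if key == "OSLIST" and sep:
                result[match.group("node")] = value.split(",")
    return result


sample = """\
NODECFG[compute000] OSLIST=windows PARTITION=local FEATURES=compute000
NODECFG[compute001] OSLIST=linux PARTITION=local FEATURES=compute001
NODECFG[compute002] OSLIST=linux,windows PARTITION=local FEATURES=compute002
"""
oslists = parse_oslist(sample)
# Nodes whose OSLIST names more than one operating system.
dual_os = sorted(n for n, oses in oslists.items() if len(oses) > 1)
print(dual_os)  # prints ['compute002']
```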

4.3.1 Switching from Dual to Single OS Provisioning

If you no longer want Moab to provision multiple operating systems to a compute node, it is not enough to just change the OSLIST parameter in the Moab configuration file. You must prevent the operating system resource manager from directing Moab to provision multiple operating systems.

4.3.1.1 Removing Linux OS Provisioning

If you are running TORQUE, use the following steps to remove Linux operating system provisioning:

  1. Open the moab.cfg file and edit the appropriate NODECFG line. For example, if compute004 is the node you want to run Windows only, remove "linux" from the line so that it reads as follows:
    NODECFG[compute004] OSLIST=windows PARTITION=local FEATURES=compute004
  2. Use the qterm command to terminate pbs_server (TORQUE).
    > qterm
  3. Remove the node from the nodes file, which is commonly located in /var/spool/torque/server_priv/nodes.
  4. Restart pbs_server (TORQUE).
    > pbs_server
  5. Restart Moab.
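
Step 3 above can be scripted if many nodes are involved. The following sketch filters a node's entry out of the text of a TORQUE nodes file, assuming each entry starts with the host name as its first whitespace-separated field. The function name remove_node and the np=8 values in the sample are illustrative assumptions; run anything like this only while pbs_server is stopped, and keep a backup of the file.

```python
def remove_node(nodes_text, node_name):
    """Return nodes-file text without the entry whose first field is node_name.

    Comment lines, blank lines, and all other entries are kept unchanged.
    """
    kept = []
    for line in nodes_text.splitlines(keepends=True):
        fields = line.split()
        if fields and fields[0] == node_name:
            continue  # drop the entry for the node being removed
        kept.append(line)
    return "".join(kept)


# Illustrative nodes-file content (the np values are assumptions).
sample = "compute003 np=8\ncompute004 np=8\ncompute005 np=8\n"
print(remove_node(sample, "compute004"), end="")
```

Matching on the whole first field, rather than on a substring, avoids accidentally removing a node such as compute0040 when compute004 is requested.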

4.3.1.2 Removing Windows OS Provisioning

If you are running HPC 2008 R2, use the following steps to remove Windows operating system provisioning:

  1. Open the moab.cfg file and edit the appropriate NODECFG line. For example, if compute003 is the node you want to run Linux only, remove "windows" from the line so that it reads as follows:
    NODECFG[compute003] OSLIST=linux PARTITION=local FEATURES=compute003
  2. Open HPC Cluster Manager and click Node Management.
  3. Right-click the specified node in the node list and choose Take Offline.
  4. After taking the node offline, right-click the node again and choose Delete.
  5. Launch Moab Services for Microsoft Windows HPC 2008 R2 and click Configure.
  6. Click Flush DBs to make sure the changes made to the HPC cluster manager are immediately recognized by the integration service.

4.3.2 Configuring Multiple Operating Systems in Windows

Moab can support multiple Windows operating systems by using the OSSTRING environment variable to configure the cluster.query.hpc.pl and os.switch.pl scripts. To do so:

  1. Ensure that Moab can identify the different operating systems that each resource manager reports. By default, the MSMHPC service reports OS=windows for all the nodes it manages.
  2. Customize your cluster.query.hpc.pl script. For instance, replace the value of the OS= wiki parameter that Moab supports with the value of the OSSTRING environment variable.
  3. Modify your moab.cfg file:
    RMCFG[HPC] TYPE=NATIVE:MSMHPC
    RMCFG[HPC] PARTITION=local
    RMCFG[HPC] NODESTATEPOLICY=OPTIMISTIC
    RMCFG[HPC] DEFOS=windowsA
    RMCFG[HPC] FLAGS=USERSPACEISSEPARATE
    RMCFG[HPC] ADMINEXEC=jobsubmit
    RMCFG[HPC] ENV=OSSTRING=windowsA;RMNAME=MSMHPC;PUBKEY=mypubkey;DOMAIN=yourdomain;PROXY=http://winhead:5343/MSMHPC
    RMCFG[HPC] CLUSTERQUERYURL=exec://$TOOLSDIR/cluster.query.hpc.pl
    RMCFG[HPC] WORKLOADQUERYURL=exec://$TOOLSDIR/workload.query.hpc.pl
    RMCFG[HPC] JOBSUBMITURL=exec://$TOOLSDIR/job.submit.hpc.pl
    RMCFG[HPC] JOBSTARTURL=exec://$TOOLSDIR/job.start.hpc.pl
    RMCFG[HPC] JOBCANCELURL=exec://$TOOLSDIR/job.cancel.hpc.pl
    RMCFG[HPC] JOBREQUEUEURL=exec://$TOOLSDIR/job.requeue.hpc.pl
    RMCFG[HPC2] TYPE=NATIVE:MSMHPC
    RMCFG[HPC2] PARTITION=local
    RMCFG[HPC2] NODESTATEPOLICY=OPTIMISTIC
    RMCFG[HPC2] DEFOS=windowsB
    RMCFG[HPC2] FLAGS=USERSPACEISSEPARATE
    RMCFG[HPC2] ADMINEXEC=jobsubmit
    RMCFG[HPC2] ENV=OSSTRING=windowsB;RMNAME=MSMHPC;PUBKEY=mypubkey;DOMAIN=yourdomain;PROXY=http://SOMEWHEREELSE:5343/MSMHPC
    RMCFG[HPC2] CLUSTERQUERYURL=exec://$TOOLSDIR/cluster.query.hpc.pl
    RMCFG[HPC2] WORKLOADQUERYURL=exec://$TOOLSDIR/workload.query.hpc.pl
    RMCFG[HPC2] JOBSUBMITURL=exec://$TOOLSDIR/job.submit.hpc.pl
    RMCFG[HPC2] JOBSTARTURL=exec://$TOOLSDIR/job.start.hpc.pl
    RMCFG[HPC2] JOBCANCELURL=exec://$TOOLSDIR/job.cancel.hpc.pl
    RMCFG[HPC2] JOBREQUEUEURL=exec://$TOOLSDIR/job.requeue.hpc.pl
    Doing so passes all the configuration values through environment variables, so both resource managers can use the same set of scripts.
  4. Customize your os.switch.pl script to reflect your environment's setup. You need a script that can change a node's OS to a DESTINATION_OS. To find a node's source operating system, run mdiag -n, which Moab answers from its cache.
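
The remapping described in step 2 might look like the following sketch. It is written in Python for illustration, whereas the shipped script is Perl, and it assumes each node is reported as one line of whitespace-separated ATTR=value pairs after the node name. The function remap_os, the sample line, and the "windows" fallback are hypothetical.

```python
import os


def remap_os(wiki_line, os_string):
    """Replace the value of every OS= attribute in one wiki line.

    Lines without an OS= attribute are returned with their tokens unchanged.
    """
    tokens = wiki_line.split()
    for i, token in enumerate(tokens):
        if token.startswith("OS="):
            tokens[i] = "OS=" + os_string
    return " ".join(tokens)


# OSSTRING is supplied through the ENV attribute of the RMCFG line;
# fall back to the MSMHPC default when it is not set (assumed behavior).
os_string = os.environ.get("OSSTRING", "windows")
print(remap_os("compute003 STATE=Idle OS=windows", os_string))
```

With OSSTRING=windowsA in the environment, the sample line would be reported with OS=windowsA, which matches the DEFOS value configured for RMCFG[HPC].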

Copyright © 2011 Adaptive Computing Enterprises, Inc.®