11.2 Node Attributes

11.2-A Configurable Node Attributes

Nodes can possess a large number of attributes describing their configuration, which are specified using the NODECFG parameter. The majority of these attributes, such as operating system or configured network interfaces, can only be specified by the direct resource manager interface. However, the number and detail of node attributes vary widely from resource manager to resource manager, and sites often want to make scheduling decisions based on node attributes not directly supplied by the resource manager. Configurable node attributes are listed in the following table:

ACCESS
ARCH
CHARGERATE
COMMENT
ENABLEPROFILING
FEATURES
FLAGS
GRES
LOGLEVEL
MAXIOIN
MAXJOB
MAXJOBPERUSER
MAXPE
MAXPEPERJOB
MAXPROC

NETWORK
NODEINDEX
NODETYPE
OS
OSLIST
OVERCOMMIT
PARTITION
POWERPOLICY
PREEMPTMAXCPULOAD
PREEMPTMINMEMAVAIL
PREEMPTPOLICY
PRIORITY
PRIORITYF
PROCSPEED

PROVRM
RACK
RADISK
RCDISK
RCMEM
RCPROC
RCSWAP
SIZE
SLOT
SPEED
TRIGGER
VARIABLE
VMOCTHRESHOLD

Attribute Description
ACCESS

Specifies the node access policy, which can be one of SHARED, SHAREDONLY, SINGLEJOB, SINGLETASK, or SINGLEUSER. See Node Access Policies for more details.

NODECFG[node013] ACCESS=singlejob
ARCH Specifies the node's processor architecture.
NODECFG[node013] ARCH=opteron
CHARGERATE

Allows a site to assign specific charging rates to the usage of particular resources. The CHARGERATE value may be specified as a floating point value and is integrated into a job's total charge (as documented in the Charging and Allocation Management section).

This feature can only be used in conjunction with the AMCFG[] LOCALCOST flag, which limits its use to cases where Moab calculates the full charge to be used by Moab Accounting Manager.

NODECFG[DEFAULT] CHARGERATE=1.0
NODECFG[node003] CHARGERATE=1.5
NODECFG[node022] CHARGERATE=2.5
COMMENT

Allows an organization to annotate a node via the configuration file to indicate special information regarding this node to both users and administrators. The COMMENT value may be specified as a quote-delimited string, as shown in the example that follows. Comment information is visible using checknode, mdiag, Moab Cluster Manager, and Moab Access Portal.

NODECFG[node013] COMMENT="Login Node"
ENABLEPROFILING

Allows an organization to track node state over time. This information is available using showstats -n.

NODECFG[DEFAULT] ENABLEPROFILING=TRUE
FEATURES

Not all resource managers allow specification of opaque node features (also known as node properties). For these systems, the NODECFG parameter can be used to directly assign a list of node features to individual nodes. To set/overwrite a node's features, use FEATURES=<X>; to append node features, use FEATURES+=<X>.

NODECFG[node013] FEATURES+=gpfs,fastio

The total number of supported node features is limited as described in the Adjusting Default Limits section.

If supported by the resource manager, the resource manager's own syntax for requesting node features/properties within a job may be used. (Within TORQUE, use qsub -l nodes=<NODECOUNT>:<NODEFEATURE>.) However, if the resource manager does not support feature requests, or supports them only partially, the Moab feature resource manager extension may be used.
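The difference between FEATURES=<X> and FEATURES+=<X> can be illustrated with a short Python sketch. This is not Moab code: the function name apply_features is hypothetical, and skipping features that are already present when appending is an assumption made for the illustration.

```python
def apply_features(current, setting):
    """Return the node's feature list after applying one FEATURES setting.

    `setting` is the attribute text from a NODECFG line, for example
    "FEATURES=gpfs" (set/overwrite) or "FEATURES+=gpfs,fastio" (append).
    """
    name, sep, value = setting.partition("=")
    if not sep or name not in ("FEATURES", "FEATURES+"):
        raise ValueError(f"not a FEATURES setting: {setting!r}")
    requested = [f for f in value.split(",") if f]
    if name == "FEATURES+":
        # Append: keep the existing features and add the new ones.
        # Skipping duplicates is an assumption of this sketch.
        return current + [f for f in requested if f not in current]
    # Plain "=": replace whatever the node had before.
    return requested
```

For a node that already carries a feature such as bigmem (a made-up name), the += form would yield bigmem, gpfs, fastio, while the = form would leave only the features named in the setting.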

FLAGS

Specifies various flags that should be set on the given node. Node flags must be set using the mschedctl -m config command. Do not set node flags in the moab.cfg file. Flags set in moab.cfg may conflict with settings controlled automatically by resource managers, Moab Web Services, or Viewpoint.

  • globalvars - The node has variables that may be used by triggers.
  • novmmigrations - Excludes this hypervisor from VM auto-migrations. This means that VMs cannot automatically migrate to or from this hypervisor while this flag is set.

    NODECFG[node1] FLAGS=NoVMMigrations

    To allow VMs to resume migrating, remove this flag using mschedctl -m config 'NODECFG[node1] FLAGS-=NoVMMigrations' or use a resource manager to unset the flag. Because both Moab and the RM report the novmmigrations flag and the RM's setting always overrides the Moab setting, you cannot remove the flag via the Moab command when the RM is reporting it.

GRES

Many resource managers do not allow specification of consumable generic node resources. For these systems, the NODECFG parameter can be used to directly assign a list of consumable generic attributes to individual nodes or to the special pseudo-node global, which provides shared cluster (floating) consumable resources. To set/overwrite a node's generic resources, use GRES=<NAME>[:<COUNT>]. (See Managing Consumable Generic Resources.)

NODECFG[node013] GRES=quickcalc:20
LOGLEVEL Node specific loglevel allowing targeted log facility verbosity.
MAXIOIN Maximum input allowed on node before it is marked busy.
MAXJOB See Node Policies for details.
MAXJOBPERUSER See Node Policies for details.
MAXPE See Node Policies for details.
MAXPEPERJOB

Maximum allowed Processor Equivalent per job on this node. A job will not be allowed to run on this node if its PE exceeds this number.

NODECFG[node024] MAXPEPERJOB=10000
...
MAXPROC

Maximum dedicated processors allowed on this node. No jobs are scheduled on this node when this number is reached. See Node Policies for more information.

NODECFG[node024] MAXPROC=8
...
NETWORK

The ability to specify which networks are available to a given node is limited to only a few resource managers. Using the NETWORK attribute, administrators can establish this node-to-network connection directly through the scheduler. The NODECFG parameter allows the networks to be specified as a comma-delimited list.

NODECFG[node024] NETWORK=GigE
...
NODEINDEX The node's index. See Node Location for details.
NODETYPE

The NODETYPE attribute is most commonly used in conjunction with an allocation management system such as Moab Accounting Manager. In these cases, each node is assigned a node type and within the allocation management system, each node type is assigned a charge rate. For example, a site administrator may want to charge users more for using large memory nodes and may assign a node type of BIGMEM to these nodes. The allocation management system would then charge a premium rate for jobs using BIGMEM nodes. (See the Allocation Manager Overview for more information.)

Node types are specified as simple strings. If no node type is explicitly set, the node will possess the default node type of DEFAULT. Node type information can be specified directly using NODECFG or through use of the FEATURENODETYPEHEADER parameter.

NODECFG[node024] NODETYPE=BIGMEM
OS

This attribute specifies the node's operating system.

NODECFG[node013] OS=suse10

Because the operating system reported by TORQUE overwrites the one configured in Moab, if you are using TORQUE, change the operating system with opsys instead of OS.

OSLIST

This attribute specifies the list of operating systems the node can run.

NODECFG[compute002] OSLIST=linux,windows
OVERCOMMIT

Specifies the high-water limit for over-allocation of processors or memory on a hypervisor. This setting is used to protect hypervisors from having too many VMs placed on them, regardless of the utilization level of those VMs. Possible attributes include DISK, MEM, PROC, and SWAP. Usage is <attr>:<integer>.

NODECFG[node012] OVERCOMMIT=PROC:2,MEM:4
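As an illustration of the <attr>:<integer> usage, the following Python sketch parses an OVERCOMMIT value into a dictionary and rejects attribute names outside the four listed above. The function name parse_overcommit is hypothetical, and the sketch only checks the syntax; it does not model how Moab applies the limits.

```python
VALID_OVERCOMMIT_ATTRS = {"DISK", "MEM", "PROC", "SWAP"}


def parse_overcommit(value):
    """Parse a string such as "PROC:2,MEM:4" into {"PROC": 2, "MEM": 4}."""
    limits = {}
    for item in value.split(","):
        attr, sep, number = item.partition(":")
        attr = attr.strip().upper()
        if not sep or attr not in VALID_OVERCOMMIT_ATTRS:
            raise ValueError(f"invalid OVERCOMMIT entry: {item!r}")
        # int() raises ValueError if the limit is not an integer.
        limits[attr] = int(number)
    return limits
```

Applied to the value in the example above, parse_overcommit("PROC:2,MEM:4") returns a dictionary with PROC mapped to 2 and MEM mapped to 4.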
PARTITION See Node Location for details.
POWERPOLICY The POWERPOLICY can be set to OnDemand or STATIC. It defaults to STATIC if not set. If set to STATIC, Moab will never automatically change the power status of a node. If set to OnDemand, Moab will turn the machine off and on based on workload and global settings. See Green Computing for further details.
PREEMPTMAXCPULOAD

If the node CPU load exceeds the specified value, any batch jobs running on the node are preempted using the preemption policy specified with the node's PREEMPTPOLICY attribute. If the node's PREEMPTPOLICY attribute is not specified, the global default policy specified with the PREEMPTPOLICY parameter is used. See Sharing Server Resources for further details.

NODECFG[node024] PRIORITY=-150 COMMENT="NFS Server Node"
NODECFG[node024] PREEMPTPOLICY=CANCEL PREEMPTMAXCPULOAD=1.2
...
PREEMPTMINMEMAVAIL

If the available node memory drops below the specified value, any batch jobs running on the node are preempted using the preemption policy specified with the node's PREEMPTPOLICY attribute. If the node's PREEMPTPOLICY attribute is not specified, the global default policy specified with the PREEMPTPOLICY parameter is used. See Sharing Server Resources for further details.

NODECFG[node024] PRIORITY=-150 COMMENT="NFS Server Node"
NODECFG[node024] PREEMPTPOLICY=CANCEL PREEMPTMINMEMAVAIL=256
...
PREEMPTPOLICY

If any node preemption policies are triggered (such as PREEMPTMAXCPULOAD or PREEMPTMINMEMAVAIL), any batch jobs running on the node are preempted using this preemption policy, if specified. If it is not specified, the global default preemption policy specified with the PREEMPTPOLICY parameter is used. See Sharing Server Resources for further details.

NODECFG[node024] PRIORITY=-150 COMMENT="NFS Server Node"
NODECFG[node024] PREEMPTPOLICY=CANCEL PREEMPTMAXCPULOAD=1.2
...
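The interaction between the three preemption attributes described above can be summarized in a small Python sketch. It is an illustration of the documented rules rather than Moab's implementation; the function name preemption_action, the dictionary representation of a node's attributes, and the REQUEUE value used as a stand-in global policy in the usage note are assumptions.

```python
def preemption_action(cpu_load, mem_avail, node_cfg, global_policy):
    """Return the policy to apply to the node's batch jobs, or None.

    node_cfg holds the node's NODECFG attributes, for example
    {"PREEMPTPOLICY": "CANCEL", "PREEMPTMAXCPULOAD": 1.2}.
    """
    max_load = node_cfg.get("PREEMPTMAXCPULOAD")
    min_mem = node_cfg.get("PREEMPTMINMEMAVAIL")
    triggered = (
        # CPU load above the configured ceiling.
        (max_load is not None and cpu_load > max_load)
        # Available memory below the configured floor.
        or (min_mem is not None and mem_avail < min_mem)
    )
    if not triggered:
        return None
    # The node-level policy wins; otherwise fall back to the global default.
    return node_cfg.get("PREEMPTPOLICY", global_policy)
```

With the node024 settings from the example above, a CPU load of 1.5 returns CANCEL, while a load of 0.8 returns None. A node that sets only PREEMPTMINMEMAVAIL=256 and has 128 available falls back to whatever global policy is passed in, such as REQUEUE.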
PRIORITY

The PRIORITY attribute specifies the fixed node priority relative to other nodes. It is only used if NODEALLOCATIONPOLICY is set to PRIORITY. The default node priority is 0. A default cluster-wide node priority may be set by configuring the PRIORITY attribute of the DEFAULT node. See Priority Node Allocation for more details.

NODEALLOCATIONPOLICY  PRIORITY
NODECFG[node024] PRIORITY=120
...
PRIORITYF

The PRIORITYF attribute specifies the function to use when calculating a node's allocation priority specific to a particular job. It is only used if NODEALLOCATIONPOLICY is set to PRIORITY. The default node priority function sets a node's priority exactly equal to the configured node priority. The priority function allows a site to indicate that various environmental considerations, such as node load, reservation affinity, and ownership, be taken into account as well, using the following format:

<COEFFICIENT> * <ATTRIBUTE> [ + <COEFFICIENT> * <ATTRIBUTE> ]...

<ATTRIBUTE> is an attribute from the table found in the Priority Node Allocation section.

A default cluster-wide node priority function may be set by configuring the PRIORITYF attribute of the DEFAULT node. See Priority Node Allocation for more details.

NODEALLOCATIONPOLICY  PRIORITY
NODECFG[node024] PRIORITYF='APROCS + .01 * AMEM - 10 * JOBCOUNT'
...
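To make the format concrete, the following Python sketch evaluates a linear expression of this shape against a set of attribute values. It is a simplified illustration, not Moab's parser: the function name evaluate_priorityf is hypothetical, only uppercase attribute names and the + and - operators are handled, and the attribute values in the usage note are invented.

```python
import re

# One term: optional sign, optional "<coefficient> *", then an attribute name.
_TERM = re.compile(r"\s*([+-]?)\s*(?:(\d*\.?\d+)\s*\*\s*)?([A-Z]+)")


def evaluate_priorityf(expression, attrs):
    """Sum coefficient * attribute over every term in the expression."""
    expression = expression.strip()
    total, pos = 0.0, 0
    while pos < len(expression):
        match = _TERM.match(expression, pos)
        if match is None:
            raise ValueError(f"cannot parse term at offset {pos}")
        sign, coefficient, name = match.groups()
        if pos > 0 and not sign:
            raise ValueError("terms after the first need a leading + or -")
        # A term without an explicit coefficient counts once.
        weight = float(coefficient) if coefficient else 1.0
        if sign == "-":
            weight = -weight
        total += weight * attrs[name]
        pos = match.end()
    return total
```

For the function in the example above and invented values APROCS=8, AMEM=4096, and JOBCOUNT=2, the sketch computes 8 + 0.01 * 4096 - 10 * 2, or approximately 28.96.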
PROCSPEED

Knowing a node's processor speed can help the scheduler improve intra-job efficiencies by allocating nodes of similar speeds together. This helps reduce losses due to poor internal job load balancing. Moab's Node Set scheduling policies allow a site to control processor speed based allocation behavior.

Processor speed information is specified in MHz and can be indicated directly using NODECFG or through use of the FEATUREPROCSPEEDHEADER parameter.

PROVRM

Provisioning resource managers can be specified on a per node basis. This allows flexibility in mixed environments. If the node does not have a provisioning resource manager, the default provisioning resource manager will be used. The default is always the first one listed in moab.cfg.

RMCFG[prov] TYPE=NATIVE RESOURCETYPE=PROV
RMCFG[prov] PROVDURATION=10:00
RMCFG[prov] NODEMODIFYURL=exec://$HOME/tools/os.switch.pl
...
NODECFG[node024] PROVRM=prov
RACK The rack associated with the node's physical location. Valid values range from 1 to 400. See Node Location for details.
RADISK Jobs can request a certain amount of disk space through the RM Extension String's DDISK parameter. When done this way, Moab can track the amount of disk space available for other jobs. To set the total amount of disk space available, use the RADISK attribute.
RCDISK Jobs can request a certain amount of disk space (in MB) through the RM Extension String's DDISK parameter. When done this way, Moab can track the amount of disk space available for other jobs. The RCDISK attribute constrains the amount of disk reported by a resource manager while the RADISK attribute specifies the amount of disk available to jobs. If the resource manager does not report available disk, the RADISK attribute should be used.
RCMEM

Jobs can request a certain amount of real memory (RAM) in MB through the RM Extension String's DMEM parameter. When done this way, Moab can track the amount of memory available for other jobs. The RCMEM attribute constrains the amount of RAM reported by a resource manager while the RAMEM attribute specifies the amount of RAM available to jobs. If the resource manager does not report available memory, the RAMEM attribute should be used.

Please note that memory reported by the resource manager will override the configured value unless a trailing caret (^) is used.

NODECFG[node024] RCMEM=2048
...

If the resource manager does not report any memory, then Moab will assign node024 2048 MB of memory.

NODECFG[node024] RCMEM=2048^
...

Moab will assign 2048 MB of memory to node024 regardless of what the resource manager reports.

RCPROC

The RCPROC attribute specifies the number of processors available on a compute node.

NODECFG[node024] RCPROC=8
...
RCSWAP

Jobs can request a certain amount of swap space in MB.

RCSWAP works similarly to RCMEM. Setting RCSWAP on a node will set the swap but can be overridden by swap reported by the resource manager. If the trailing caret (^) is used, Moab will ignore the swap reported by the resource manager and use the configured amount.

NODECFG[node024] RCSWAP=2048
...

If the resource manager does not report any swap, Moab will assign node024 2048 MB of swap.

NODECFG[node024] RCSWAP=2048^
...

Moab will assign 2048 MB of swap to node024 regardless of what the resource manager reports.
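The override rule described for RCMEM and RCSWAP can be expressed as a short Python sketch. The function name effective_amount is hypothetical, and representing "nothing reported" as None is an assumption of the illustration.

```python
def effective_amount(configured, reported=None):
    """Return the amount (in MB) that ends up assigned to the node.

    configured: the NODECFG value as written, e.g. "2048" or "2048^".
    reported:   the amount reported by the resource manager, or None
                if it reports nothing.
    """
    forced = configured.endswith("^")
    amount = int(configured.rstrip("^"))
    if forced or reported is None:
        # A trailing caret, or a silent resource manager, means the
        # configured value is used.
        return amount
    # Otherwise the resource manager's report overrides the configuration.
    return reported
```

With a configured value of 2048 and a reported value of 4096, the result is 4096; with 2048^ it is 2048 regardless of the report, and with no report at all it is 2048 in either form.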

SIZE

The number of slots or size units consumed by the node. This value is used in graphically representing the cluster using showstate or Moab Cluster Manager. See Node Location for details. For display purposes, legal size values include 1, 2, 3, 4, 6, 8, 12, and 16.

NODECFG[node024] SIZE=2
...
SLOT The first slot in the rack associated with the node's physical location. Valid values range from 1 to MMAX_RACKSIZE (default=64). See Node Location for details.
SPEED

Because today's processors have multiple cores and adjustable clock frequency, this feature has no meaning and will be deprecated.

TRIGGER See Object Triggers for details.
VARIABLE

Variables associated with the given node, which can be used in job scheduling. See -l PREF.

NODECFG[node024] VARIABLE=var1
...
VMOCTHRESHOLD

Specifies the high-water threshold for utilization of resources on a server (i.e., processor and memory). This setting is used to protect hypervisors from becoming too highly utilized and thus negatively impacting the performance of VMs running on the hypervisor. Possible attributes include PROC and MEM.

NODECFG[node024] VMOCTHRESHOLD=PROC=2,MEM=2

11.2-B Node Features/Node Properties

A node feature (or node property) is an opaque string label that is associated with a compute node. Each compute node may have any number of node features assigned to it, and jobs may request allocation of nodes that have specific features assigned. Node features are labels and their association with a compute node is not conditional, meaning they cannot be consumed or exhausted.

Node features may be assigned by the resource manager and imported by Moab, or they may be specified within Moab directly. As a convenience, certain node attributes can be specified via node features using the parameters listed in the following table:

PARAMETER DESCRIPTION
FEATURENODETYPEHEADER Set Node Type
FEATUREPARTITIONHEADER Set Partition
FEATUREPROCSPEEDHEADER Set Processor Speed
FEATURERACKHEADER Set Rack
FEATURESLOTHEADER Set Slot

Example 11-2:  

FEATUREPARTITIONHEADER  par
FEATUREPROCSPEEDHEADER  cpu
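One way to picture the header parameters is as prefixes that are stripped from matching feature names. The Python sketch below works under that assumption, which the text above does not spell out: a feature that begins with a configured header is taken to carry the attribute value in the remainder of its name. The function name attributes_from_features and the sample feature names are hypothetical.

```python
def attributes_from_features(features, headers):
    """Derive node attributes from feature names that start with a header.

    headers maps an attribute name to its header string, for example
    {"PARTITION": "par", "PROCSPEED": "cpu"}.
    """
    attrs = {}
    for feature in features:
        for attr, header in headers.items():
            suffix = feature[len(header):]
            # Require a non-empty remainder so a bare "par" is ignored.
            if feature.startswith(header) and suffix:
                attrs[attr] = suffix
    return attrs
```

Under this assumption, and with the headers from Example 11-2, a node with the features par3, cpu900, and gpfs would yield PARTITION 3 and PROCSPEED 900, while gpfs would remain an ordinary feature. The sketch matches on the prefix alone, so any feature name that happens to begin with a header would also be picked up.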
