5.30 Adding Nodes

Torque can add and remove nodes either dynamically with qmgr or by manually editing the TORQUE_HOME/server_priv/nodes file. See Initializing/Configuring Torque on the Server (pbs_server).

Be aware of the following:

  • Nodes cannot be added or deleted dynamically if there is a mom_hierarchy file in the server_priv directory.
  • When you make changes to nodes by directly editing the nodes file, you must restart pbs_server for those changes to take effect. Changes made using qmgr do not require a restart.
  • When you change a node's IP address, you must clear the pbs_server cache. Either restart pbs_server or delete the changed node and then re-add it.
  • Before a newly added node can move to the free state, the cluster must be informed that the node is valid and can be trusted to run jobs. Once this is done, the node transitions to free automatically.
  • Adding or changing a hostname on a node requires a pbs_server restart in order to add the new hostname as a node.
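
The mom_hierarchy and IP address points above can be sketched as a pair of shell helpers. This is a minimal sketch, not part of Torque: the function names dynamic_nodes_allowed and readd_node_cmds are hypothetical, and the fallback path /var/spool/torque for TORQUE_HOME is an assumption about the installation. The second helper only prints the qmgr commands so that they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helpers; names and the default TORQUE_HOME are assumptions.

# Succeeds (exit 0) when no mom_hierarchy file exists under server_priv,
# i.e. when nodes can be added or deleted dynamically.
# $1 = optional TORQUE_HOME override.
dynamic_nodes_allowed() {
    torque_home=${1:-${TORQUE_HOME:-/var/spool/torque}}
    [ ! -e "$torque_home/server_priv/mom_hierarchy" ]
}

# Prints the qmgr commands that delete a node and add it again, which
# clears the cached address after an IP change without a server restart.
# $1 = node name, $2 = np value to restore.
readd_node_cmds() {
    printf "qmgr -c 'delete node %s'\n" "$1"
    printf "qmgr -c 'create node %s np=%s'\n" "$1" "$2"
}
```

For example, running `dynamic_nodes_allowed && readd_node_cmds node003 6` prints the two commands for node003 only when no mom_hierarchy file is present; appending `| sh` would run them on the server host.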

5.30.1 Run-time Node Changes

Torque can dynamically add nodes with the qmgr command. For example, the following command adds node node003 and, optionally, further nodes. Items in square brackets are optional:

$ qmgr -c 'create node node003[,node004,node005...] [np=n][,[TTL=YYYY-MM-DDThh:mm:ssZ],[acl=user:user1[:user2:user3...]],[requestid=n]]'
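
As a concrete instance of the syntax above, the following sketch assembles a create node command from a node list and any optional attributes. The helper name create_node_cmd is hypothetical and the attribute values are made up for illustration. It prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: prints a 'create node' command.
# $1 = comma-separated node list; remaining arguments = optional attributes
# (np=..., TTL=..., acl=..., requestid=...), joined with commas as in the
# syntax line above.
create_node_cmd() {
    nodes=$1
    shift
    # Join the remaining arguments with commas (IFS change is confined
    # to the command substitution's subshell).
    attrs=$(IFS=,; printf '%s' "$*")
    printf "qmgr -c 'create node %s%s'\n" "$nodes" "${attrs:+ $attrs}"
}
```

Calling `create_node_cmd node003,node004 np=4 requestid=3210` prints `qmgr -c 'create node node003,node004 np=4,requestid=3210'`. With a node list only, no attribute part is emitted.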

The optional parameters are used as follows:

You can alter node parameters by following these examples:

qmgr -c 'set node node003 np=6'
qmgr -c 'set node node003 TTL=2020-12-31T23:59:59Z'
qmgr -c 'set node node003 requestid=23234'
qmgr -c 'set node node003 acl="user:user10:user11:user12"'
qmgr -c 'set node node003 acl=""'

Torque does not use the TTL, acl, and requestid parameters. Information for those parameters is simply passed to Moab.

The set node subcommand of qmgr supports the += and -= syntax, but this syntax has known problems when used to alter the acl parameter. Do not use += or -= with acl. Instead, set the full user list again, as shown in the examples above.
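
Because += and -= are unreliable for acl, one approach is to rebuild the whole user list and set it in a single command. The sketch below does that. The helper name acl_reset_cmd is hypothetical, and it prints the command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: prints a 'set node' command that replaces the
# entire acl of a node.
# $1 = node name; remaining arguments = the complete list of users.
# With no users, the acl is cleared (acl="").
acl_reset_cmd() {
    node=$1
    shift
    # Join the users with colons, as in acl="user:user10:user11:user12".
    users=$(IFS=:; printf '%s' "$*")
    printf "qmgr -c 'set node %s acl=\"%s\"'\n" "$node" "${users:+user:$users}"
}
```

To add user12 to a node that already allows user10 and user11, pass all three names: `acl_reset_cmd node003 user10 user11 user12`. Calling it with the node name alone yields the acl="" form that clears the list.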

The create node and set node commands shown above append lines of the following form to the end of the TORQUE_HOME/server_priv/nodes file:

node003 np=6 TTL=2020-12-31T23:59:59Z acl=user1:user2:user3 requestid=3210
node004 ...
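
To check what ended up in the nodes file after such changes, a short awk filter can list each node with its np value. This is a sketch under assumptions: the helper name list_nodes is hypothetical, the file path is passed in explicitly, and entries without an np attribute are shown with a dash instead of guessing a default.

```shell
#!/bin/sh
# Hypothetical helper: prints "<node> <np>" for each entry in a nodes file.
# $1 = path to the nodes file (for example TORQUE_HOME/server_priv/nodes).
# Blank lines are skipped; lines starting with '#' are skipped as well,
# in case the file carries comments.
list_nodes() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }
        {
            np = "-"                      # shown when no np= is present
            for (i = 2; i <= NF; i++)
                if ($i ~ /^np=/) np = substr($i, 4)
            print $1, np
        }
    ' "$1"
}
```

Run against a file containing the node003 line above, `list_nodes` prints `node003 6` for that entry.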

Nodes can also be removed with a similar command:

$ qmgr -c 'delete node node003[,node004,node005...]'

© 2017 Adaptive Computing