pbsnodes

PBS node manipulation.

5.120.1 Synopsis

pbsnodes [-{a|x}] [-q] [-s server] [node|:property]
pbsnodes -l [-q] [-s server] [state] [nodename|:property ...]
pbsnodes -m <running|standby|suspend|hibernate|shutdown> <host list>
pbsnodes [-{c|d|o|r}] [-q] [-s server] [-n -l] [-N "note"] [-A "append note"] [node|:property]

5.120.2 Description

The pbsnodes command is used to mark nodes down, free or offline. It can also be used to list nodes and their state. Node information is obtained by sending a request to the PBS job server. Sets of nodes can be operated on at once by specifying a node property prefixed by a colon. (For more information, see Node States.)

Nodes do not exist in a single state, but actually have a set of states. For example, a node can be simultaneously "busy" and "offline". The "free" state is the absence of all other states and so is never combined with other states.
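
Because a node's state is a set, a script that inspects it is safer testing for membership than comparing the whole string. The following is a minimal sketch under one assumption: that multiple states appear joined by commas (for example "down,offline") in the state attribute. The helper name has_state is hypothetical and is not part of Torque.

```shell
# has_state STATES WANTED
# Succeeds if WANTED is one element of the comma-separated STATES string.
# Assumption: multiple states are joined with commas, e.g. "down,offline".
has_state() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example: a node that is both busy and offline.
if has_state "busy,offline" "offline"; then
    echo "node is offline"
fi
```

Wrapping both strings in commas keeps a partial name such as "job" from matching "job-exclusive".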

To run pbsnodes with any option other than -a or -l, the user must have PBS Manager or Operator privilege.

5.120.3 NUMA-Awareness

When Torque is configured with NUMA-awareness and with --enable-cgroups, the total and available numbers of sockets, numachips (NUMA nodes), cores, and threads are returned when the status of nodes is queried by Moab (a call is made to pbsnodes).

See 5.306 pbsnodes with NUMA-Awareness for additional information and examples.

5.120.4 Options

Option Description
-A Append a note attribute to existing note attributes. The -N note option overwrites existing note attributes; -A appends a new note attribute to the existing ones, delimited by a ',' and a space.
-a All attributes of a node or all nodes are listed. This is the default if no flag is given.
-c Clear OFFLINE from listed nodes.
-d Print MOM diagnosis on the listed nodes. Not yet implemented. Use momctl instead.
-m

Set the hosts in the specified host list to the requested power state. If a compute node does not support the energy-saving power state you request, the command returns an error and leaves the state unchanged.

In order for the command to wake a node from a low-power state, Wake-on-LAN (WOL) must be enabled for the node and it must support the g WOL packet. For more information, see Changing Node Power States.

The allowable power states are:

  • Running: The node is up and running.
  • Standby: CPU is halted but still powered. Moderate power savings but low latency entering and leaving this state.
  • Suspend: Also known as Suspend-to-RAM. Machine state is saved to RAM. RAM is put into self-refresh mode. Much more significant power savings with longer latency entering and leaving state.
  • Hibernate: Also known as Suspend-to-disk. Machine state is saved to disk and then powered down. Significant power savings but very long latency entering and leaving state.
  • Shutdown: Equivalent to shutdown now command as root.

The host list is a space-delimited list of node host names. See 5.120.4.A Examples.

-o Add the OFFLINE state. This is different from being marked DOWN. OFFLINE prevents new jobs from running on the specified nodes. This gives the administrator a tool to hold a node out of service without changing anything else. The OFFLINE state will never be set or cleared automatically by pbs_server; it is purely for the manager or operator.
-p Purge the node record from pbs_server. Not yet implemented.
-r Reset the listed nodes by clearing OFFLINE and adding DOWN state. pbs_server will ping the node and, if they communicate correctly, free the node.
-l

List node names and their state. If no state is specified, only nodes in the DOWN, OFFLINE, or UNKNOWN states are listed. Specifying a state string acts as an output filter. Valid state strings are "active", "all", "busy", "down", "free", "job-exclusive", "job-sharing", "offline", "reserve", "state-unknown", "time-shared", and "up".

  • Using all displays all nodes and their attributes.
  • Using active displays all nodes which are job-exclusive, job-sharing, or busy.
  • Using up displays all nodes in an "up state". Up states include job-exclusive, job-sharing, reserve, free, busy and time-shared.
  • All other strings display the nodes which are currently in the state indicated by the string.
-N Specify a "note" attribute. This allows an administrator to add an arbitrary annotation to the listed nodes. To clear a note, use -N "" or -N n.
-n Show the "note" attribute for nodes that are DOWN, OFFLINE, or UNKNOWN. This option requires -l.
-q Suppress all error messages.
-s Specify the PBS server's hostname or IP address.
-x Same as -a, but the output has an XML-like format.
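
The grouping behind the -l state filters can be sketched as a small lookup. This is only an illustration of the descriptions given for "all", "active", and "up" above; it is not how pbs_server implements the filter, it checks a single state rather than a node's full set of states, and the function name state_matches_filter is hypothetical.

```shell
# state_matches_filter FILTER STATE
# Succeeds if a single node STATE would be selected by the -l FILTER,
# following the documented meaning of "all", "active", and "up".
state_matches_filter() {
    case "$1" in
        all)    return 0 ;;
        active) group="job-exclusive job-sharing busy" ;;
        up)     group="job-exclusive job-sharing reserve free busy time-shared" ;;
        *)      group="$1" ;;   # any other filter matches only itself
    esac
    case " $group " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

for s in free busy down offline; do
    if state_matches_filter up "$s"; then
        echo "$s: up"
    else
        echo "$s: not up"
    fi
done
```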

5.120.4.A Examples

Example 5-48: host list

pbsnodes -m shutdown node01 node02 node03 node04

With this command, pbs_server tells the pbs_mom associated with each of node01 through node04 to shut down its node.
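
When such a command is generated by a script, it can help to check the power state name before calling pbsnodes. The sketch below is a dry run: it only prints the command it would run, and it accepts the five state names from the synopsis. The function name build_power_cmd is hypothetical, and the check does not tell you whether a given node actually supports the requested state.

```shell
# build_power_cmd STATE HOST...
# Prints the pbsnodes -m command for the given state and hosts.
# Rejects state names that are not listed in the synopsis.
build_power_cmd() {
    state=$1
    shift
    case "$state" in
        running|standby|suspend|hibernate|shutdown) ;;
        *) echo "unknown power state: $state" >&2; return 1 ;;
    esac
    if [ "$#" -eq 0 ]; then
        echo "no hosts given" >&2
        return 1
    fi
    # The host list is space-delimited, so joining the arguments is enough.
    echo "pbsnodes -m $state $*"
}

build_power_cmd shutdown node01 node02 node03 node04
```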

The pbsnodes output shows the current power state of nodes. In this example, note that pbsnodes returns the MAC addresses of the nodes.

pbsnodes
nuc1
    state = free
    power_state = Running
    np = 4
    ntype = cluster
    status = rectime=1395765676,macaddr=0b:25:22:92:7b:26,cpuclock=Fixed,varattr=,jobs=,state=free,netload=1242652020,gres=,loadave=0.16,ncpus=6,physmem=16435852kb,availmem=24709056kb,totmem=33211016kb,idletime=4636,nusers=3,nsessions=12,sessions=2758 998 1469 2708 2797 2845 2881 2946 4087 4154 4373 6385,uname=Linux bdaw 3.2.0-60-generic #91-Ubuntu SMP Wed Feb 19 03:54:44 UTC 2014 x86_64,opsys=linux
    note = This is a node note
    mom_service_port = 15002
    mom_manager_port = 15003
 
nuc2
    state = free
    power_state = Running
    np = 4
    ntype = cluster
    status = rectime=1395765678,macaddr=2c:a8:6b:f4:b9:35,cpuclock=OnDemand:800MHz,varattr=,jobs=,state=free,netload=12082362,gres=,loadave=0.00,ncpus=4,physmem=16300576kb,availmem=17561808kb,totmem=17861144kb,idletime=67538,nusers=2,nsessions=7,sessions=2189 2193 2194 2220 2222 2248 2351,uname=Linux nuc2 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64,opsys=linux
    mom_service_port = 15002
    mom_manager_port = 15003
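
Output in this layout can be summarized with a short awk filter. The sketch below assumes the layout shown above: an unindented node name followed by indented "attribute = value" lines. It reads an abridged copy of the sample (the status lines are shortened here for readability) rather than calling pbsnodes, and the function names sample_output and power_summary are hypothetical.

```shell
# Abridged copy of the sample output above, used in place of a live pbsnodes call.
sample_output() {
cat <<'EOF'
nuc1
    state = free
    power_state = Running
    status = rectime=1395765676,macaddr=0b:25:22:92:7b:26,state=free

nuc2
    state = free
    power_state = Running
    status = rectime=1395765678,macaddr=2c:a8:6b:f4:b9:35,state=free
EOF
}

# power_summary: print "node power_state macaddr" for each node read from stdin.
power_summary() {
    awk '
        # A node name is a line holding a single unindented word.
        NF == 1 && $0 == $1 { node = $1; order[++n] = node; next }
        $1 == "power_state" && $2 == "=" { power[node] = $3 }
        # "macaddr=" is 8 characters long; keep only the address after it.
        $1 == "status" && match($0, /macaddr=[0-9A-Fa-f:]+/) {
            mac[node] = substr($0, RSTART + 8, RLENGTH - 8)
        }
        END { for (i = 1; i <= n; i++) print order[i], power[order[i]], mac[order[i]] }
    '
}

sample_output | power_summary
```

On a live system the same filter could read from pbsnodes directly, for example pbsnodes -a | power_summary.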

© 2017 Adaptive Computing