TORQUE Resource Manager

momctl

(PBS Mom Control)

Synopsis

momctl -c { <JOBID> | all }
momctl -C
momctl -d { <INTEGER> | <JOBID> }
momctl -f <FILE>
momctl -h <HOST>[,<HOST>]...
momctl -p <PORT_NUMBER>
momctl -q <ATTRIBUTE>
momctl -r { <FILE> | LOCAL:<FILE> }
momctl -s

Overview

The momctl command allows remote shutdown, reconfiguration, diagnostics, and querying of the pbs_mom daemon.

Format

-c — Clear
Format: { <JOBID> | all }
Default: ---
Description: Clear stale job information
Example: momctl -h node1 -c 15406
   
-C — Cycle
Format: ---
Default: ---
Description: Cycle pbs_mom(s)
Example: momctl -h node1 -C
  Cycle pbs_mom on node1
   
-d — Diagnose
Format: { <INTEGER> | <JOBID> }
Default: 0
Description: Diagnose MOM(s). See the Diagnose Detail table below for more information.
Example: momctl -h node1 -d 2
  Print level 2 and lower diagnostic information for the MOM on node1
   
-f — Host File
Format: <FILE>
Default: ---
Description: A file containing only hostnames, delimited by commas or whitespace (space, tab, or newline)
Example: momctl -f hosts.txt -d
  Print diagnostic information for the MOMs running on the hosts specified in hosts.txt
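
The host file format described above can be checked before it is passed to momctl. The following is a minimal sketch, not part of TORQUE: normalize_hostfile is a hypothetical helper that prints one hostname per line, so that stray delimiters or unexpected entries are easy to spot.

```shell
# Hypothetical helper: print one hostname per line from a host file that
# uses commas, spaces, tabs, or newlines as delimiters.
normalize_hostfile() {
    # tr maps each comma, space, and tab to a newline;
    # sed then deletes the empty lines left by repeated delimiters.
    tr ', \t' '\n\n\n' < "$1" | sed '/^$/d'
}
```

For example, normalize_hostfile hosts.txt lists the hosts that momctl -f hosts.txt would be expected to contact.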
   
-h — Host List
Format: <HOST>[,<HOST>]...
Default: localhost
Description: A comma-separated list of hosts
Example: momctl -h node1,node2,node3 -d
  Print diagnostic information for the MOMs running on node1, node2, and node3
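
When the hostnames come from a script rather than being typed by hand, the comma-separated value for -h has to be assembled first. A minimal sketch, assuming a hypothetical helper named join_hosts (not part of TORQUE):

```shell
# Hypothetical helper: join the arguments into a comma-separated list
# suitable for the -h option.
join_hosts() {
    # Inside the subshell, "$*" joins the arguments with the first
    # character of IFS, which is set to a comma here.
    ( IFS=,; printf '%s\n' "$*" )
}
```

A call such as momctl -h "$(join_hosts node1 node2 node3)" -d 2 would then pass node1,node2,node3 as the host list.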
   
-p — Port
Format: <PORT_NUMBER>
Default: TORQUE's default port number
Description: The port number for the specified MOM(s)
Example: momctl -p 5455 -h node1 -d
  Request diagnostic information over port 5455 on node1
   
-q — Query
Format: <ATTRIBUTE>
Default: ---
Description: Query <ATTRIBUTE> on specified MOM (where <ATTRIBUTE> is a property listed by pbsnodes -a)
Example: momctl -q physmem
  Print the amount of physmem on localhost
   
-r — Reconfigure
Format: { <FILE> | LOCAL:<FILE> }
Default: ---
Description: Reconfigure MOM(s) with the remote or local config file <FILE>. This works only if $remote_reconfig is set to true when the MOM is started.
Example: momctl -r /home/user1/new.config -h node1
  Reconfigure the MOM on node1 with /home/user1/new.config on node1
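
Because -r depends on $remote_reconfig, it can help to inspect the MOM config file before attempting a reconfigure. The sketch below is an illustration only: remote_reconfig_enabled is a hypothetical helper, the config file path is supplied by the caller, and treating true, yes, and 1 (in any letter case) as enabled is an assumption of this sketch.

```shell
# Hypothetical pre-flight check: succeed only if the given MOM config file
# contains an uncommented "$remote_reconfig" line whose value looks true.
remote_reconfig_enabled() {
    # Accepting true/yes/1 in any letter case is an assumption of this sketch.
    grep -Eiq '^[[:space:]]*\$remote_reconfig[[:space:]]+(true|yes|1)[[:space:]]*$' "$1"
}
```

The check only reads the file. It cannot tell whether the running pbs_mom was started with that setting, so a positive result is a hint rather than a guarantee.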
   
-s — Shutdown
 
Format: ---
Default: ---
Description: Shut down pbs_mom
Example: momctl -s
  Terminate the pbs_mom process on localhost
   

Query Attributes

  • arch — node hardware architecture
  • availmem — available RAM
  • loadave — 1 minute load average
  • ncpus — number of CPUs available on the system
  • netload — total number of bytes transferred over all network interfaces
  • nsessions — number of sessions active
  • nusers — number of users active
  • physmem — configured RAM
  • sessions — list of active sessions
  • totmem — configured RAM plus configured swap
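
To query every attribute listed above on one host, the individual -q invocations can be generated in a loop. A minimal sketch, assuming a hypothetical helper named print_query_commands; it only prints the command lines and does not run momctl itself:

```shell
# Hypothetical helper: print one "momctl -h HOST -q ATTRIBUTE" command line
# for each query attribute listed in this section.
print_query_commands() {
    host=$1
    for attr in arch availmem loadave ncpus netload nsessions nusers physmem sessions totmem; do
        printf 'momctl -h %s -q %s\n' "$host" "$attr"
    done
}
```

Printing the commands first makes it possible to review them; on a system where momctl is installed, the output of print_query_commands node1 could then be piped to sh.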

Diagnose Detail

Level  Description
0      Display the following information:
         • Local hostname
         • Expected server hostname
         • Execution version
         • MOM home directory
         • MOM config file version (if specified)
         • Duration MOM has been executing
         • Duration since last request from pbs_server daemon
         • Duration since last request to pbs_server daemon
         • RM failure messages (if any)
         • Log verbosity level
         • Local job list
1      All information for level 0 plus the following:
         • Interval between updates sent to server
         • Number of initialization messages sent to pbs_server daemon
         • Number of initialization messages received from pbs_server daemon
         • Prolog/epilog alarm time
         • List of trusted clients
2      All information from level 1 plus the following:
         • PID
         • Event alarm status

Example 1:  MOM Diagnostics

> momctl -d 1

Host: nsrc/nsrc.fllcl.com   Server: 10.10.10.113   Version: torque_1.1.0p4
HomeDirectory:          /usr/spool/PBS/mom_priv
ConfigVersion:          147
MOM active:             7390 seconds
Last Msg From Server:   7389 seconds (CLUSTER_ADDRS)
Server Update Interval: 20 seconds
Server Update Interval: 20 seconds
Init Msgs Received:     0 hellos/1 cluster-addrs
Init Msgs Sent:         1 hellos
LOGLEVEL:               0 (use SIGUSR1/SIGUSR2 to adjust)
Prolog Alarm Time:      300 seconds
Trusted Client List:    12.14.213.113,127.0.0.1
JobList:                NONE

diagnostics complete
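
Output in the "Label: value" layout shown above can be picked apart with standard tools. The following is a minimal sketch, assuming the layout matches this example: diag_field is a hypothetical helper that reads the diagnostics from standard input and prints the value of the first line that starts with the given label. It is meant for rows that hold a single field, not for the combined Host/Server/Version line.

```shell
# Hypothetical helper: read momctl -d output on stdin and print the value
# of the first line that starts with "LABEL:".
diag_field() {
    awk -v label="$1" '
        index($0, label ":") == 1 {
            # Drop the label and the colon, then the padding after them.
            value = substr($0, length(label) + 2)
            sub(/^[ \t]+/, "", value)
            print value
            exit
        }'
}
```

For example, momctl -d 1 | diag_field 'MOM active' would print the uptime line's value for the local MOM.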

Example 2:  System Shutdown

> momctl -s -f /opt/clusterhostfile

shutdown request successful on node001
shutdown request successful on node002
shutdown request successful on node003
shutdown request successful on node004
shutdown request successful on node005
shutdown request successful on node006
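
After a cluster-wide shutdown it is useful to know which hosts did not confirm. The sketch below relies only on the success line shown in this example; the wording of any failure message is not covered here, so hosts are reported simply because no success line was seen for them. unconfirmed_hosts is a hypothetical helper, not part of TORQUE.

```shell
# Hypothetical helper: read momctl -s output on stdin and print every
# expected host (given as arguments) that has no success line.
unconfirmed_hosts() {
    # Keep only the hostnames from "shutdown request successful on HOST".
    confirmed=$(sed -n 's/^shutdown request successful on \(.*\)$/\1/p')
    for host in "$@"; do
        # -x matches whole lines, -F treats the hostname as a literal string.
        printf '%s\n' "$confirmed" | grep -qxF "$host" || printf '%s\n' "$host"
    done
}
```

For example, momctl -s -f /opt/clusterhostfile | unconfirmed_hosts node001 node002 node003 would print nothing when all three hosts report success.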