TORQUE Resource Manager
1.8 TORQUE Multi-MOM

Starting in TORQUE version 3.0, users can run multiple MOMs on a single node. The multiple MOM capability was initially developed for testing purposes: a small cluster can be made to look larger because each MOM instance is treated as a separate node.

When running multiple MOMs on a node, each MOM must have its own service and manager ports assigned. The default ports used by the MOM are 15002 and 15003. With multi-mom, alternate ports can be used without changing the default ports for pbs_server, even when running only a single instance of the MOM.

1.8.1 Configuration

There are three steps to setting up multi-mom capability:

  1. Configure server_priv/nodes file
  2. Edit /etc/hosts file
  3. Start pbs_mom with multi-mom options

1.8.1.1 Configure server_priv/nodes

The attributes mom_service_port and mom_manager_port were added to the nodes file syntax to accommodate multiple MOMs on a single node. By default, pbs_mom opens ports 15002 and 15003 for the service and manager ports, respectively. For multiple MOMs to run on the same IP address, each MOM needs its own port values so that the MOMs can be distinguished from each other. pbs_server learns the ports of the different MOMs from entries in the server_priv/nodes file. The following is an example of a nodes file configured for multiple MOMs:

	hosta   np=2
	hosta-1 np=2 mom_service_port=30001 mom_manager_port=30002
	hosta-2 np=2 mom_service_port=31001 mom_manager_port=31002
	hosta-3 np=2 mom_service_port=32001 mom_manager_port=32002

Note that all entries have a unique host name and that all port values are also unique. The entry hosta does not have a mom_service_port or mom_manager_port given. If unspecified, then the MOM defaults to ports 15002 and 15003.
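The uniqueness rule above can be checked mechanically before restarting pbs_server. The following is a minimal sketch, not a TORQUE tool: check_nodes_file is a hypothetical helper name, and it assumes every entry in the file refers to the same IP address, as in the example above. Entries without explicit port attributes are counted as using 15002 and 15003.

```shell
# Sketch: report duplicate host names or port values in a nodes file.
# check_nodes_file is a hypothetical helper, not part of TORQUE.
# Assumes all entries share one IP address, as in the example above.
check_nodes_file() {
    awk '
        # Skip blank lines and comment lines.
        NF == 0 || $1 ~ /^#/ { next }
        {
            if (seen_host[$1]++) { print "duplicate host: " $1; bad = 1 }
            # The default ports apply when an attribute is not given.
            svc = 15002; mgr = 15003
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == "mom_service_port") svc = kv[2]
                if (kv[1] == "mom_manager_port") mgr = kv[2]
            }
            if (seen_port[svc]++) { print "duplicate port: " svc; bad = 1 }
            if (seen_port[mgr]++) { print "duplicate port: " mgr; bad = 1 }
        }
        # Exit status is 1 if any duplicate was reported, otherwise 0.
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_nodes_file on the path of the nodes file prints nothing and returns 0 when every host name and port value is unique.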

1.8.1.2 /etc/hosts file

Host names in the server_priv/nodes file must be resolvable. Creating an alias for each host enables the server to find the IP address for each MOM; the server uses the port values from the server_priv/nodes file to contact the correct MOM. An example /etc/hosts entry for the previous server_priv/nodes example might look like the following:

	192.65.73.10 hosta hosta-1 hosta-2 hosta-3

Even though the host name and all the aliases resolve to the same IP address, each MOM instance can still be distinguished from the others because of the unique port value assigned in the server_priv/nodes file.
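A quick way to catch a missing alias is to compare the names in the nodes file against a hosts file. The sketch below is illustrative only: missing_hosts is a hypothetical helper name, and it inspects only the hosts file it is given (by default /etc/hosts), so names that resolve through DNS or another name service are reported as missing even though they are resolvable.

```shell
# Sketch: list node names from a nodes file that have no entry in a hosts file.
# missing_hosts is a hypothetical helper, not part of TORQUE. It checks only
# the given hosts file, not DNS or any other name service.
# Usage: missing_hosts <nodes file> [hosts file]
missing_hosts() {
    awk '
        # Hosts file: remember every name and alias that follows the address.
        FILENAME == ARGV[1] {
            sub(/#.*/, "")
            for (i = 2; i <= NF; i++) known[$i] = 1
            next
        }
        # Nodes file: print each node name that was never seen.
        NF > 0 && $1 !~ /^#/ && !($1 in known) { print $1 }
    ' "${2:-/etc/hosts}" "$1"
}
```

An empty result means every node name has a matching name or alias in the hosts file.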

1.8.1.3 Starting pbs_mom with multi-mom options

To start multiple instances of pbs_mom on the same node, use the following syntax:

	pbs_mom -m -M <port value of mom_service_port> -R <port value of mom_manager_port>

Continuing the earlier example, to start four MOMs on hosta, type the following at the command line:

	# pbs_mom -m -M 30001 -R 30002
	# pbs_mom -m -M 31001 -R 31002
	# pbs_mom -m -M 32001 -R 32002
	# pbs_mom

Notice that the last call to pbs_mom uses no arguments. By default, pbs_mom opens ports 15002 and 15003. No arguments are necessary because none of the other MOM instances use those ports, so there are no conflicts.
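Because the port values already live in the server_priv/nodes file, the start commands can be derived from it instead of being typed by hand. The following is a rough sketch under stated assumptions: print_mom_start_commands is a hypothetical helper name, every entry in the file is assumed to belong to the local node, and the function only prints the commands so they can be reviewed before being run.

```shell
# Sketch: print (do not run) one pbs_mom command per nodes-file entry.
# print_mom_start_commands is a hypothetical helper, not a TORQUE tool.
# Assumes every entry in the file belongs to the local node.
print_mom_start_commands() {
    awk '
        # Skip blank lines and comment lines.
        NF == 0 || $1 ~ /^#/ { next }
        {
            svc = ""; mgr = ""
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == "mom_service_port") svc = kv[2]
                if (kv[1] == "mom_manager_port") mgr = kv[2]
            }
            if (svc != "" && mgr != "") {
                print "pbs_mom -m -M " svc " -R " mgr
            } else {
                # No explicit ports: the MOM uses the defaults 15002 and 15003.
                print "pbs_mom"
            }
        }
    ' "$1"
}
```

For the example nodes file shown earlier, this prints the same four commands as above, with the argument-free pbs_mom call first because hosta is the first entry.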

1.8.2 Stopping pbs_mom in multi-mom mode

Terminate pbs_mom by using the momctl -s command. For any MOM using the default manager port 15003, the momctl -s command stops the MOM. However, to terminate MOMs with a manager port value not equal to 15003, you must use the following syntax:

	momctl -s -p <port value of mom_manager_port>

The -p option sends the terminating signal to the MOM manager port and the MOM is terminated.
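For a node running several MOMs, the stop commands follow directly from the manager ports. The sketch below assumes that -p takes the manager port value as its argument. print_mom_stop_commands is a hypothetical helper name, and it only prints the commands rather than running them.

```shell
# Sketch: print (do not run) the momctl command that stops the MOM on each
# manager port given as an argument.
# print_mom_stop_commands is a hypothetical helper, not a TORQUE tool.
print_mom_stop_commands() {
    for port in "$@"; do
        if [ "$port" = 15003 ]; then
            # Default manager port: no -p option is needed.
            echo "momctl -s"
        else
            echo "momctl -s -p $port"
        fi
    done
}
```

With the ports from the earlier example, print_mom_stop_commands 30002 31002 32002 15003 prints three momctl -s -p lines followed by a plain momctl -s.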