The directory TORQUE_HOME/server_priv/ contains configuration and other information needed for pbs_server. One of the files in this directory is serverdb. The serverdb file contains configuration parameters for pbs_server and its queues. For pbs_server to run, serverdb must be initialized.
You can initialize serverdb in two different ways, but the recommended way is to use the ./torque.setup script:
Restart pbs_server after initializing serverdb.
> qterm
> pbs_server
The torque.setup script uses pbs_server -t create to initialize serverdb and then adds a user as a manager and operator of TORQUE, along with other commonly used server attributes. The syntax is ./torque.setup <username>. For example:
> ./torque.setup ken
> qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = kmn
set server managers = ken@kmn
set server operators = ken@kmn
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
The -t create option instructs pbs_server to create the serverdb file and initialize it with a minimum configuration to run pbs_server. To see the configuration, use qmgr:
> pbs_server -t create
> qmgr -c 'p s'
#
# Set server attributes.
#
set server acl_hosts = kmn
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
A single queue named batch and a few needed server attributes are created.
The environment variable TORQUE_HOME is where configuration files are stored. For TORQUE 2.1 and later, TORQUE_HOME is /var/spool/torque/. For earlier versions, TORQUE_HOME is /usr/spool/PBS/.
The pbs_server must recognize which systems on the network are its compute nodes. Specify each node on a line in the server's nodes file. This file is located at TORQUE_HOME/server_priv/nodes. In most cases, it is sufficient to specify just the names of the nodes on individual lines; however, various properties can be applied to each node.
Syntax of nodes file:
node-name[:ts] [np=] [gpus=] [properties]
The [:ts] option marks the node as timeshared. Timeshared nodes are listed by the server in the node status report, but the server does not allocate jobs to them.
The [np=] option specifies the number of virtual processors for a given node. The value can be less than, equal to, or greater than the number of physical processors on any given node.
The [gpus=] option specifies the number of GPUs for a given node. The value can be less than, equal to, or greater than the number of physical GPUs on any given node.
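For example, a hypothetical node entry declaring both virtual processors and GPUs could look like the following (the node name and counts are illustrative only):

node100 np=16 gpus=2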
The node processor count can be automatically detected by the TORQUE server if auto_node_np is set to TRUE. This can be set using the following command:

> qmgr -c "set server auto_node_np = True"

Setting auto_node_np to TRUE overwrites the value of np set in TORQUE_HOME/server_priv/nodes.
The [properties] option allows you to specify arbitrary strings to identify the node. Property strings are alphanumeric characters only and must begin with an alphabetic character.
Comment lines are allowed in the nodes file if the first non-white space character is the pound sign (#).
The following example shows a possible node file listing.
TORQUE_HOME/server_priv/nodes:

# Nodes 001 and 003-005 are cluster nodes
#
node001 np=2 cluster01 rackNumber22
#
# node002 will be replaced soon
node002:ts waitingToBeReplaced
#
node003 np=4 cluster01 rackNumber24
node004 cluster01 rackNumber25
node005 np=2 cluster01 rackNumber26 RAM16GB
node006
node007 np=2
node008:ts np=4
...
If you are using TORQUE self-extracting packages with the default compute node configuration, no additional steps are required and you can skip this section.
If installing manually, or if advanced compute node configuration is needed, edit the TORQUE_HOME/mom_priv/config file on each node. The recommended settings follow.
TORQUE_HOME/mom_priv/config:

$pbsserver headnode    # note: hostname running pbs_server
$logevent 255          # bitmap of which events to log
This file is identical for all compute nodes and can be created on the head node and distributed in parallel to all systems.
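As a minimal sketch of such a distribution, assuming passwordless SSH from the head node and a file named nodelist containing one compute node host name per line (both assumptions, not part of a default TORQUE install), the file could be pushed out with a loop such as the following (substitute the actual TORQUE_HOME path, e.g. /var/spool/torque):

> for host in $(cat nodelist); do scp /var/spool/torque/mom_priv/config ${host}:/var/spool/torque/mom_priv/; done

Parallel copy tools such as pdcp or pdsh can do the same more efficiently on large clusters.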
After configuring the serverdb and the server_priv/nodes files, and after ensuring minimal MOM configuration, restart the pbs_server on the server node and the pbs_mom on the compute nodes.
Compute Nodes:
> pbs_mom

Server Node:
> qterm -t quick
> pbs_server
After waiting several seconds, the pbsnodes -a command should list all nodes in state free.
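The output should resemble the following sketch (node names come from the nodes file; the exact set of fields varies by TORQUE version):

> pbsnodes -a
node001
     state = free
     np = 2
     properties = cluster01,rackNumber22
     ntype = cluster
...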