Download the TORQUE distribution file from http://clusterresources.com/downloads/torque.
Extract and build the distribution on the machine that will act as the "TORQUE server" - the machine that will monitor and control all compute nodes by running the pbs_server daemon. See the example below:
> tar -xzvf torque.tar.gz
> cd torque
> ./configure
> make
> make install
OSX 10.4 users need to change the #define __TDARWIN in src/include/pbs_config.h to #define __TDARWIN_8.
After installation, verify you have PATH environment variables configured for /usr/local/bin/ and /usr/local/sbin/. Client commands are installed to /usr/local/bin and server binaries are installed to /usr/local/sbin.
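The PATH check described above can be sketched as a small shell helper. This is only an illustration: the function name path_contains is hypothetical and is not part of TORQUE. It tests whether a directory appears as an entry in a PATH-style string, with or without a trailing slash.

```shell
# Return success if directory $2 is an entry of the colon-separated list $1.
# (path_contains is a hypothetical helper, not a TORQUE command.)
path_contains() {
    case ":$1:" in
        *":$2:"*|*":$2/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the two directories TORQUE installs into by default.
for dir in /usr/local/bin /usr/local/sbin; do
    if path_contains "$PATH" "$dir"; then
        echo "$dir is in PATH"
    else
        echo "$dir is missing from PATH"
    fi
done
```

If a directory is reported missing, it can be added for the current session with a line such as export PATH="$PATH:/usr/local/sbin".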
In this document, TORQUE_HOME corresponds to where TORQUE stores its configuration files. The default is /var/spool/torque.
Initialize/Configure TORQUE on the server (pbs_server)
Install TORQUE on the compute nodes
To configure a compute node, do the following on each machine (see page 19, Section 3.2.1 of the PBS Administrator's Manual for full details):
Configure TORQUE on the compute nodes
Configure data management on the compute nodes
Data management allows job data to be staged in and out between the server and the compute nodes.
(Example: $usecp gridmaster.tmx.com:/home /home)
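The $usecp example above maps a path on a remote host to a local path. A minimal sketch follows, assuming the directive belongs in TORQUE_HOME/mom_priv/config on each compute node (this section does not state the file, so treat the location as an assumption). The helper name usecp_line is hypothetical.

```shell
# Print a $usecp directive mapping <host>:<remote_path> to <local_path>.
# (usecp_line is a hypothetical helper, not a TORQUE command.)
usecp_line() {
    printf '$usecp %s:%s %s\n' "$1" "$2" "$3"
}

# Reproduce the example given in the text.
usecp_line gridmaster.tmx.com /home /home

# To apply it on a compute node (assumed file location; run as root):
# usecp_line gridmaster.tmx.com /home /home >> "${TORQUE_HOME:-/var/spool/torque}/mom_priv/config"
```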
Update TORQUE server configuration
On the TORQUE server, append the list of newly configured compute nodes to the TORQUE_HOME/server_priv/nodes file:
server_priv/nodes
computenode001.cluster.org
computenode002.cluster.org
computenode003.cluster.org
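Appending hosts to the nodes file by hand can produce duplicate entries if the step is repeated. The sketch below is one way to avoid that. The add_nodes function is hypothetical and assumes one node per line, with the hostname as the first field.

```shell
# Append each hostname to the nodes file unless it is already listed.
# (add_nodes is a hypothetical helper, not a TORQUE command.)
add_nodes() {
    nodes_file=$1
    shift
    touch "$nodes_file"
    for host in "$@"; do
        # A host counts as present if it is the first field of any line.
        if ! awk -v h="$host" '$1 == h { found = 1 } END { exit !found }' "$nodes_file"; then
            printf '%s\n' "$host" >> "$nodes_file"
        fi
    done
}

# Example (run as root on the TORQUE server; default TORQUE_HOME assumed):
# add_nodes "${TORQUE_HOME:-/var/spool/torque}/server_priv/nodes" \
#     computenode001.cluster.org computenode002.cluster.org computenode003.cluster.org
```

Running the function twice with the same hostnames leaves the file unchanged the second time.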
Start the pbs_mom daemons on compute nodes
Start the trqauthd daemon, which is required for running client commands (see Configuring trqauthd for client commands).
Verifying correct TORQUE installation
The pbs_server daemon was started on the TORQUE server when the torque.setup file was executed or when it was manually configured. It must now be restarted so that it loads the updated configuration.
# shutdown server
> qterm
# start server
> pbs_server
# verify all queues are properly configured
> qstat -q
# view additional server configuration
> qmgr -c 'p s'
# verify all nodes are correctly reporting
> pbsnodes -a
# submit a basic job
> echo "sleep 30" | qsub
# verify jobs display
> qstat
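On a large cluster, the output of pbsnodes -a is long to read by eye. The sketch below filters it down to nodes that are not reporting correctly. It assumes the usual output layout, in which each node name appears alone on a line and is followed by indented "key = value" attribute lines that include a state entry. The function name unhealthy_nodes is hypothetical.

```shell
# Read `pbsnodes -a` style output on stdin and print the names of nodes
# whose state contains "down", "offline", or "unknown".
# (unhealthy_nodes is a hypothetical helper; the input layout is an assumption.)
unhealthy_nodes() {
    awk '
        NF == 1 { node = $1 }                 # a lone word is taken as a node name
        $1 == "state" && $2 == "=" {
            if ($3 ~ /down|offline|unknown/) print node
        }
    '
}

# Example usage on the TORQUE server:
# pbsnodes -a | unhealthy_nodes
```

No output means that every node reported a state other than down, offline, or unknown.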
At this point, the job will not start because no scheduler is running. The scheduler is enabled in the next step.
Selecting the cluster scheduler is an important decision that significantly affects cluster utilization, responsiveness, availability, and intelligence. The default TORQUE scheduler, pbs_sched, is very basic and will provide poor utilization of your cluster's resources. Other options, such as Maui Scheduler or Moab Workload Manager, are highly recommended. If using Maui/Moab, refer to the Moab-PBS Integration Guide. If using pbs_sched, start this daemon now.
If you are installing ClusterSuite, TORQUE and Moab were configured at installation for interoperability and no further action is required.
Startup/Shutdown service script for TORQUE/Moab (OPTIONAL)
Optional startup/shutdown service scripts are provided as an example of how to run TORQUE as an OS service that starts at bootup. The scripts are located in the contrib/init.d/ directory of the TORQUE tarball you downloaded. In order to use the script you must:
© 2012 Adaptive Computing