A TORQUE cluster consists of one head node and many compute nodes. The head node runs the pbs_server daemon and the compute nodes run the pbs_mom daemon. Client commands for submitting and managing jobs can be installed on any host (including hosts not running pbs_server or pbs_mom).
The head node also runs a scheduler daemon. The scheduler interacts with pbs_server to make local policy decisions for resource usage and to allocate nodes to jobs. A simple FIFO scheduler and code for constructing more advanced schedulers are provided in the TORQUE source distribution. Most TORQUE users choose a packaged, advanced scheduler such as Maui or Moab.
Users submit jobs to pbs_server using the qsub command. When pbs_server receives a new job, it informs the scheduler. When the scheduler finds nodes for the job, it sends instructions to run the job with the node list to pbs_server. Then, pbs_server sends the new job to the first node in the node list and instructs it to launch the job. This node is designated the execution host and is called Mother Superior. Other nodes in a job are called sister moms.
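As a rough illustration of the submission step described above, the following sketch writes a minimal job script and submits it with qsub when that command is available. The helper name write_job_script, the job name, the resource request, and the file location are all assumptions made for this example, not values taken from this guide.

```shell
#!/bin/sh
# write_job_script FILE: write a minimal job script to FILE.
# The job name and resource request are illustrative assumptions.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -N hello_job
#PBS -l nodes=1
echo "Running on $(hostname)"
EOF
}

job="${TMPDIR:-/tmp}/hello_job.sh"
write_job_script "$job"

# Submit only if the qsub client command is installed on this host.
if command -v qsub >/dev/null 2>&1; then
    qsub "$job"
else
    echo "qsub not found; job script left at $job"
fi
```

On a host without the client commands, the script only writes the file, so it can be inspected before TORQUE is installed.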
Build the distribution on the machine that will act as the TORQUE server, that is, the machine that monitors and controls all compute nodes by running the pbs_server daemon.
The built distribution package works only on compute nodes of a similar architecture. Nodes with a different architecture must have the installation package built on them individually.
> tar -xzvf torque-2.3.4.tar.gz
> cd torque-2.3.4/
By default, make install installs all files in /usr/local/bin, /usr/local/lib, /usr/local/sbin, /usr/local/include, and /usr/local/man. You can specify an installation prefix other than /usr/local by passing --prefix as an argument to ./configure. Note that TORQUE cannot be installed into a directory path that contains a space.
> ./configure --prefix=$HOME
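Because a prefix containing a space is not supported, it can help to check the path before running configure. The sketch below assumes a hypothetical helper named prefix_ok and an illustrative prefix of $HOME/torque. It only prints the configure command rather than running it.

```shell
#!/bin/sh
# prefix_ok PATH: succeed only if PATH contains no space character.
prefix_ok() {
    case "$1" in
        *" "*) return 1 ;;
        *)     return 0 ;;
    esac
}

prefix="$HOME/torque"   # illustrative prefix, an assumption
if prefix_ok "$prefix"; then
    echo "./configure --prefix=$prefix"
else
    echo "prefix '$prefix' contains a space; choose another path" >&2
fi
```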
Verify you have environment variables configured so your system can find the shared libraries and binary files for TORQUE.
To set the library path, add the directory where the TORQUE libraries will be installed. For example, if your TORQUE libraries are installed in /opt/torque/lib, execute the following:
> export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/torque/lib
> ldconfig
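When LD_LIBRARY_PATH is empty or unset, appending with a colon leaves an empty leading entry, which the dynamic loader may treat as the current directory. A minimal sketch of a safer append follows. The helper name append_lib_dir is an assumption made for this example, and /opt/torque/lib is the example path used above.

```shell
#!/bin/sh
# append_lib_dir DIR: print an LD_LIBRARY_PATH value with DIR appended,
# without a leading colon when the variable is empty or unset.
append_lib_dir() {
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        printf '%s:%s\n' "$LD_LIBRARY_PATH" "$1"
    else
        printf '%s\n' "$1"
    fi
}

LD_LIBRARY_PATH=$(append_lib_dir /opt/torque/lib)
export LD_LIBRARY_PATH
```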
Cluster Resources recommends that the TORQUE administrator be root.
For information on customizing the build at configure time, see the configure options list.
> ./configure
TORQUE must be installed by a root user.
> make
> sudo make install
OS X 10.4 users need to change #define __TDARWIN in src/include/pbs_config.h to #define __TDARWIN_8.
After installation, verify that the PATH environment variable includes /usr/local/bin/ and /usr/local/sbin/.
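One way to perform the PATH check is sketched below. The helper name in_path is hypothetical, and the two directories are the defaults named above. Adjust them if a different --prefix was used.

```shell
#!/bin/sh
# in_path DIR: succeed if DIR (with or without a trailing slash) is in PATH.
in_path() {
    dir=${1%/}
    case ":$PATH:" in
        *":$dir:"*|*":$dir/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

for d in /usr/local/bin /usr/local/sbin; do
    if in_path "$d"; then
        echo "$d is in PATH"
    else
        echo "$d is missing from PATH"
    fi
done
```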
By default, make install creates a directory at /var/spool/torque. This directory is referred to as TORQUE_HOME. TORQUE_HOME has several sub-directories, including server_priv/, server_logs/, mom_priv/, mom_logs/, and other directories used in the configuration and running of TORQUE.
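A quick sanity check of the directory layout can be sketched as follows. The helper name missing_torque_dirs is an assumption, and it checks only the four sub-directories named above. Which of them are expected on a given host may depend on what was installed there.

```shell
#!/bin/sh
# missing_torque_dirs HOME_DIR: print each expected sub-directory that is absent.
missing_torque_dirs() {
    for sub in server_priv server_logs mom_priv mom_logs; do
        [ -d "$1/$sub" ] || echo "$sub"
    done
}

# /var/spool/torque is the default TORQUE_HOME named above.
missing_torque_dirs "${TORQUE_HOME:-/var/spool/torque}"
```

No output means all four sub-directories were found.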
TORQUE 2.0.2 and later includes a torque.spec file for building your own RPMs. You can also use the checkinstall program to create your own RPM, tgz, or deb package.
While Adaptive Computing distributes the RPM files as part of the build, it does not support those files. Not every Linux distribution uses RPM. Adaptive Computing provides a single solution using make and make install that works across all Linux distributions and most UNIX systems. We recognize that the RPM format provides many advantages for deployment, but it is up to the individual site to repackage the TORQUE installation to match its needs.
Use the Cluster Resources tpackage system to create self-extracting tarballs which can be distributed and installed on compute nodes. The tpackages are customizable. See the INSTALL file for additional options and features.
To create tpackages
> make packages
Building ./torque-package-clients-linux-i686.sh ...
Building ./torque-package-mom-linux-i686.sh ...
Building ./torque-package-server-linux-i686.sh ...
Building ./torque-package-gui-linux-i686.sh ...
Building ./torque-package-devel-linux-i686.sh ...
Done.
The package files are self-extracting packages that can be copied and executed on your production machines. Use --help for options.
> cp torque-package-mom-linux-i686.sh /shared/storage/
> cp torque-package-clients-linux-i686.sh /shared/storage/
Cluster Resources recommends that you use a remote shell, such as SSH, to install tpackages on remote systems. Set up shared SSH keys if you do not want to supply a password for each host.
The only required package for the compute nodes is mom-linux. Additional packages are recommended so you can use client commands and submit jobs from compute nodes.
The following is an example of how to copy and install mom-linux in a distributed fashion:
> for i in node01 node02 node03 node04 ; do scp torque-package-mom-linux-i686.sh ${i}:/tmp/. ; done
> for i in node01 node02 node03 node04 ; do scp torque-package-clients-linux-i686.sh ${i}:/tmp/. ; done
> for i in node01 node02 node03 node04 ; do ssh ${i} /tmp/torque-package-mom-linux-i686.sh --install ; done
> for i in node01 node02 node03 node04 ; do ssh ${i} /tmp/torque-package-clients-linux-i686.sh --install ; done
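The loops above do not report which nodes failed. A minimal sketch of a wrapper that records failures and supports a dry run is shown below. The function name install_on_nodes and the DRY_RUN variable are assumptions made for this example. The scp and ssh invocations mirror the commands above.

```shell
#!/bin/sh
# install_on_nodes PACKAGE NODE...: copy a tpackage to each node and run it.
# Set DRY_RUN=1 to print the commands instead of executing them.
install_on_nodes() {
    pkg=$1
    shift
    name=$(basename "$pkg")
    failed=""
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp $pkg $node:/tmp/."
            echo "ssh $node /tmp/$name --install"
        elif scp "$pkg" "$node:/tmp/." && ssh "$node" "/tmp/$name --install"; then
            echo "$node: installed"
        else
            failed="$failed $node"
        fi
    done
    [ -z "$failed" ] || { echo "failed:$failed" >&2; return 1; }
}

# Preview the commands for two nodes without contacting them.
DRY_RUN=1
install_on_nodes torque-package-mom-linux-i686.sh node01 node02
```

Unset DRY_RUN (or set it to 0) to perform the copy and install for real.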
Alternatively, you can use a tool like xCAT instead of dsh.
Copy the tpackage to the nodes.
> prcp torque-package-linux-i686.sh noderange:/destinationdirectory/
> psh noderange /tmp/torque-package-linux-i686.sh --install
Alternatively, users with RPM-based Linux distributions can build RPMs from the source tarball in two ways.
> rpmbuild -ta torque-2.3.4.tar.gz
Optionally, the TORQUE server can also act as a compute node by running pbs_mom alongside the pbs_server daemon.
The method for enabling TORQUE as a service is dependent on the Linux variant you are using. Startup scripts are provided in the contrib/init.d/ directory of the source package.
Red Hat (as root)
> cp contrib/init.d/pbs_mom /etc/init.d/pbs_mom
> chkconfig --add pbs_mom
SuSE (as root)
> cp contrib/init.d/suse.pbs_mom /etc/init.d/pbs_mom
> insserv -d pbs_mom
Debian (as root)
> cp contrib/init.d/debian.pbs_mom /etc/init.d/pbs_mom
> update-rc.d pbs_mom defaults
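To choose among the variants above on an unfamiliar host, one option is to look for the registration tool that is installed. The sketch below assumes a hypothetical helper named service_tool and an arbitrary probe order. On hosts that ship more than one of these tools it may pick the wrong variant, so treat the printed command as a suggestion only.

```shell
#!/bin/sh
# service_tool: print the first service-registration tool found on this host.
# The probe order is an assumption, not a documented precedence.
service_tool() {
    for tool in chkconfig insserv update-rc.d; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Print (do not run) the matching commands from the variants above.
case "$(service_tool)" in
    chkconfig)   echo "cp contrib/init.d/pbs_mom /etc/init.d/pbs_mom && chkconfig --add pbs_mom" ;;
    insserv)     echo "cp contrib/init.d/suse.pbs_mom /etc/init.d/pbs_mom && insserv -d pbs_mom" ;;
    update-rc.d) echo "cp contrib/init.d/debian.pbs_mom /etc/init.d/pbs_mom && update-rc.d pbs_mom defaults" ;;
    *)           echo "no known service tool found; see contrib/init.d/" >&2 ;;
esac
```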