1.1 TORQUE Installation

1.1.1 TORQUE Architecture

A TORQUE cluster consists of one head node and many compute nodes. The head node runs the pbs_server daemon and the compute nodes run the pbs_mom daemon. Client commands for submitting and managing jobs can be installed on any host (including hosts not running pbs_server or pbs_mom).

The head node also runs a scheduler daemon. The scheduler interacts with pbs_server to make local policy decisions for resource usage and to allocate nodes to jobs. A simple FIFO scheduler and code to construct more advanced schedulers are provided in the TORQUE source distribution. Most TORQUE users choose a packaged, advanced scheduler such as Maui or Moab.

Users submit jobs to pbs_server using the qsub command. When pbs_server receives a new job, it informs the scheduler. When the scheduler finds nodes for the job, it sends instructions to run the job with the node list to pbs_server. Then, pbs_server sends the new job to the first node in the node list and instructs it to launch the job. This node is designated the execution host and is called Mother Superior. Other nodes in a job are called sister moms.
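The submission flow described above can be tried with a small job script. The following is a minimal sketch, not part of the TORQUE distribution: the function name write_job_script, the job name, and the resource request are illustrative assumptions. It relies only on standard #PBS directives and the PBS_NODEFILE variable that TORQUE sets for a running job.

```shell
# Write a minimal job script to the path given as the first argument.
# The resource request (2 nodes, 1 processor each, 5 minutes) is illustrative.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -N hello_nodes
#PBS -l nodes=2:ppn=1
#PBS -l walltime=00:05:00
# The job script itself runs on the execution host (Mother Superior).
echo "Running on: $(hostname)"
# PBS_NODEFILE names a file listing the nodes allocated to the job.
echo "Allocated nodes:"
cat "$PBS_NODEFILE"
EOF
}

# Usage (requires a running pbs_server, scheduler, and at least two pbs_mom nodes):
#   write_job_script hello_nodes.sh
#   qsub hello_nodes.sh
#   qstat
```

When the job runs, the host name printed first should normally match the first entry of the node list, since that node acts as Mother Superior.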

1.1.2 Installing TORQUE

Build the distribution on the machine that will act as the TORQUE server, that is, the machine that monitors and controls all compute nodes by running the pbs_server daemon.

Note The built distribution package works only on compute nodes of a similar architecture. Nodes with a different architecture must have the installation package built on them individually.

  1. Download the TORQUE distribution file from http://clusterresources.com/downloads/torque. Source code can also be downloaded using Subversion from the repository at svn://clusterresources.com/torque/. Use the command svn list svn://clusterresources.com/torque/ to display all branches.
  2. Extract the packaged file and navigate to the unpackaged directory.

    > tar -xzvf torque-2.3.4.tar.gz
    > cd torque-2.3.4/

  3. Configure the package.

    By default, make install installs all files in /usr/local/bin, /usr/local/lib, /usr/local/sbin, /usr/local/include, and /usr/local/man. You can specify an installation prefix other than /usr/local using --prefix as an argument to ./configure. Note that TORQUE cannot be installed into a directory path that contains a space.

    > ./configure --prefix=$HOME

    Verify that your environment variables are configured so your system can find the shared libraries and binary files for TORQUE.

    To set the library path, add the directory where the TORQUE libraries will be installed. For example, if your TORQUE libraries are installed in /opt/torque/lib, execute the following:

    > export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/torque/lib
    > ldconfig

    Note Cluster Resources recommends that the TORQUE administrator be root.

    Note For information on customizing the build at configure time, see the configure options list.

    > ./configure

  4. Run make and make install.

    Note TORQUE must be installed by a root user.

    > make
    > sudo make install

Note OSX 10.4 users need to change #define __TDARWIN in src/include/pbs_config.h to #define __TDARWIN_8.

Note After installation, verify that your PATH environment variable includes /usr/local/bin/ and /usr/local/sbin/.
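One way to set the search paths without adding duplicate entries is sketched below. The helper path_append is a hypothetical name introduced here, and the directories assume the default /usr/local prefix; substitute your own prefix if you passed --prefix to ./configure.

```shell
# Append a directory to a colon-separated list unless it is already present.
# Prints the resulting list; it does not change the environment by itself.
path_append() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*)
            # Already present: return the list unchanged.
            printf '%s\n' "$list"
            ;;
        *)
            if [ -n "$list" ]; then
                printf '%s:%s\n' "$list" "$dir"
            else
                printf '%s\n' "$dir"
            fi
            ;;
    esac
}

# Assumed default install prefix (/usr/local).
PATH=$(path_append "$PATH" /usr/local/bin)
PATH=$(path_append "$PATH" /usr/local/sbin)
LD_LIBRARY_PATH=$(path_append "${LD_LIBRARY_PATH:-}" /usr/local/lib)
export PATH LD_LIBRARY_PATH
```

Placing these lines in a shell startup file makes the settings persistent for that user.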

By default, make install creates a directory at /var/spool/torque. This directory is referred to as TORQUE_HOME. TORQUE_HOME has several sub-directories, including server_priv/, server_logs/, mom_priv/, mom_logs/, and other directories used in the configuration and running of TORQUE.
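A quick check that the directory layout described above exists can be sketched as follows. The function name check_torque_home is an assumption made for illustration; the directory names are the four listed in the text, and the default path is /var/spool/torque.

```shell
# Report which of the expected sub-directories are missing under TORQUE_HOME.
# Returns 0 when all four are present, 1 otherwise.
check_torque_home() {
    home=${1:-/var/spool/torque}
    rc=0
    for sub in server_priv server_logs mom_priv mom_logs; do
        if [ ! -d "$home/$sub" ]; then
            echo "missing: $home/$sub"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_torque_home                  # checks /var/spool/torque
#   check_torque_home /opt/torque/home # checks a non-default location
```

Some of these directories may be readable only by root, so run the check with sufficient privileges.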

TORQUE 2.0.2 and later includes a torque.spec file for building your own RPMs. You can also use the checkinstall program to create your own RPM, tgz, or deb package.

Note While Adaptive Computing distributes the RPM files as part of the build, it does not support those files. Not every Linux distribution uses RPM. Adaptive Computing provides a single solution using make and make install that works across all Linux distributions and most UNIX systems. We recognize that the RPM format provides many advantages for deployment, but it is up to the individual site to repackage the TORQUE installation to match its own needs.

1.1.3 Compute Nodes

Use the Cluster Resources tpackage system to create self-extracting tarballs which can be distributed and installed on compute nodes. The tpackages are customizable. See the INSTALL file for additional options and features.

To create tpackages

  1. Configure and make as normal, and then run make packages.

    > make packages
    Building ./torque-package-clients-linux-i686.sh ...
    Building ./torque-package-mom-linux-i686.sh ...
    Building ./torque-package-server-linux-i686.sh ...
    Building ./torque-package-gui-linux-i686.sh ...
    Building ./torque-package-devel-linux-i686.sh ...
    Done.

    The package files are self-extracting packages that can be copied
    and executed on your production machines.  Use --help for options.

  2. Copy the desired packages to a shared location.

    > cp torque-package-mom-linux-i686.sh /shared/storage/
    > cp torque-package-clients-linux-i686.sh /shared/storage/

  3. Install the tpackages on the compute nodes.

    Cluster Resources recommends that you use a remote shell, such as SSH, to install tpackages on remote systems. Set up shared SSH keys if you do not want to supply a password for each host.

    Note The only required package for the compute nodes is mom-linux. Additional packages are recommended so you can use client commands and submit jobs from compute nodes.

    The following is an example of how to copy and install mom-linux in a distributed fashion:

    > for i in node01 node02 node03 node04 ; do scp torque-package-mom-linux-i686.sh ${i}:/tmp/. ; done
    > for i in node01 node02 node03 node04 ; do scp torque-package-clients-linux-i686.sh ${i}:/tmp/. ; done
    > for i in node01 node02 node03 node04 ; do ssh ${i} /tmp/torque-package-mom-linux-i686.sh --install ; done
    > for i in node01 node02 node03 node04 ; do ssh ${i} /tmp/torque-package-clients-linux-i686.sh --install ; done

    Note Alternatively, you can use a tool like xCAT instead of the SSH loops shown above.

    1. Copy the tpackage to the nodes.

      > prcp torque-package-linux-i686.sh noderange:/destinationdirectory/

    2. Install the tpackage.

      > psh noderange /tmp/torque-package-linux-i686.sh --install
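The SSH-based copy-and-install loops shown earlier can be folded into one helper that is easier to rehearse before touching real nodes. This is a sketch under assumptions: the names run and deploy_tpackages and the DRY_RUN variable are introduced here for illustration, and the node and package names are taken from the example above. Unlike the original loops, it stops at the first command that fails.

```shell
# Nodes and packages from the example above; edit to match your cluster.
NODES="node01 node02 node03 node04"
PACKAGES="torque-package-mom-linux-i686.sh torque-package-clients-linux-i686.sh"

# Print the command when DRY_RUN=1 (the default here); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Copy each package to /tmp on each node, then run it there with --install.
deploy_tpackages() {
    for node in $NODES; do
        for pkg in $PACKAGES; do
            run scp "$pkg" "$node:/tmp/" || return 1
            run ssh "$node" "/tmp/$pkg" --install || return 1
        done
    done
}

# Usage:
#   deploy_tpackages              # prints the scp/ssh commands only
#   DRY_RUN=0; deploy_tpackages   # actually copies and installs
```

Reviewing the dry-run output first makes it easy to confirm the node list and package names before anything is installed.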

Alternatively, users with RPM-based Linux distributions can build RPMs from the source tarball in two ways.

Optionally, the TORQUE server can also act as a compute node by running a pbs_mom alongside the pbs_server daemon.

1.1.4 Enabling TORQUE as a service (optional)

The method for enabling TORQUE as a service is dependent on the Linux variant you are using. Startup scripts are provided in the contrib/init.d/ directory of the source package.
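As a rough illustration of how the steps differ between Linux variants, the sketch below prints the commands one might run as root for two common families. Everything specific here is an assumption to verify against your own system: the function name service_commands is hypothetical, the script file names under contrib/init.d/ should be checked in your source tree, and chkconfig and update-rc.d must be available on the target host.

```shell
# Print the commands to register a daemon's init script for a distribution family.
# File names under contrib/init.d/ are assumptions; list that directory to confirm.
service_commands() {
    family=$1   # "redhat" or "debian"
    daemon=$2   # for example pbs_mom or pbs_server
    case "$family" in
        redhat)
            echo "cp contrib/init.d/$daemon /etc/init.d/$daemon"
            echo "chkconfig --add $daemon"
            ;;
        debian)
            echo "cp contrib/init.d/debian.$daemon /etc/init.d/$daemon"
            echo "update-rc.d $daemon defaults"
            ;;
        *)
            echo "unsupported distribution family: $family" >&2
            return 1
            ;;
    esac
}

# Usage (run from the top of the source tree, then execute the output as root):
#   service_commands redhat pbs_mom
#   service_commands debian pbs_server
```

Printing the commands rather than executing them keeps the sketch safe to run on any host and leaves the final decision to the administrator.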

These options can be added to the self-extracting packages. For more details, see the INSTALL file.

See Also