This topic contains information and instructions to configure your server.
Also see Setting Up the MOM Hierarchy (Optional)
4.250.1 Server Configuration Overview
Several steps are needed to ensure that the server and the nodes are completely aware of each other and able to communicate directly. Some of this configuration takes place within Torque itself, using the qmgr command. Other settings are managed through the pbs_server nodes file, name-resolution files such as /etc/hosts, and the /etc/hosts.equiv file.
4.250.2 Name Service Configuration
Each node, as well as the server, must be able to resolve the name of every node with which it will interact. This can be accomplished using /etc/hosts, DNS, NIS, or other mechanisms. In the case of /etc/hosts, the file can be shared across systems in most cases.
A simple method of checking proper name service configuration is to verify that the server and the nodes can "ping" each other.
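As a complement to ping, name resolution alone can be checked from a script. The following is a minimal sketch, assuming a Linux system where getent is available; the function name check_names and the node names in the usage comment are hypothetical. It tests resolution only, not network reachability.

```shell
# Report whether each host name resolves through the system resolver
# (/etc/hosts, DNS, NIS, or whatever the host is configured to use).
# Assumption: getent is installed, as on most Linux systems.
check_names() {
  for host in "$@"; do
    if getent hosts "$host" >/dev/null 2>&1; then
      echo "OK   $host"
    else
      echo "FAIL $host"
    fi
  done
}

# Example (hypothetical node names):
#   check_names node01 node02 node03
```

Run the same check on the server and on each node, since every machine must be able to resolve the names of the others.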
4.250.3 Configuring Job Submission Hosts
When jobs can be submitted from several different hosts, these hosts should be trusted via the R* commands (such as rsh and rcp). This can be enabled by adding the hosts to the /etc/hosts.equiv file of the machine executing the pbs_server daemon or using other R* command authorization methods. The exact specification can vary from OS to OS (see the man page for ruserok to find out how your OS validates remote users). In most cases, configuring this file is as simple as adding a line to your /etc/hosts.equiv file, as in the following:
/etc/hosts.equiv:
#[+ | -] [hostname] [username]
mynode.myorganization.com
.....
Either the hostname or the username field may be replaced with a wildcard symbol (+). The (+) may be used only as a stand-alone wildcard; it cannot be attached to a username or hostname, as in +node01 or +user01. However, a (-) may be attached in that manner to specifically exclude a user.
Following the Linux man page instructions for hosts.equiv may result in a failure. You cannot precede the user or hostname with a (+). To clarify, node1 +user1 will not work and user1 will not be able to submit jobs.
For example, the following lines will not work or will not have the desired effect:
+node02 user1
node02 +user1
These lines will work:
node03 +
+ jsmith
node04 -tjones
The most restrictive rules must precede more permissive rules. For example, to restrict user tsmith but allow all others, follow this format:
node01 -tsmith
node01 +
Please note that when a hostname is specified, it must be the fully qualified domain name (FQDN) of the host. Job submission can be further secured using the server or queue acl_hosts and acl_host_enabled parameters (for details, see Queue Attributes).
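The wildcard rule described above (a + must stand alone and cannot be attached to a name) can be checked mechanically. The following is a rough sketch, not a reimplementation of how ruserok parses the file; the function name lint_hosts_equiv is hypothetical. It prints the line number and text of every entry in which a + is attached to a hostname or username.

```shell
# Flag hosts.equiv entries that attach "+" to a name (for example +node02),
# which this topic says will not work. Reads the named file, or stdin.
lint_hosts_equiv() {
  awk '
    NF == 0 || $1 ~ /^#/ { next }          # skip blank lines and comments
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^[+]./) {                # "+" followed by more characters
          printf "%d: %s\n", NR, $0
          break
        }
      }
    }
  ' "$@"
}

# Example:
#   lint_hosts_equiv /etc/hosts.equiv
```

No output means no entry of that kind was found. The sketch does not check rule ordering or whether hostnames are fully qualified.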
Using the "submit_hosts" server parameter
Trusted submit host access may be directly specified without using RCmd authentication by setting the server submit_hosts parameter via qmgr as in the following example:
> qmgr -c 'set server submit_hosts = host1'
> qmgr -c 'set server submit_hosts += host2'
> qmgr -c 'set server submit_hosts += host3'
Use of submit_hosts is potentially subject to DNS spoofing and should not be used outside of controlled and trusted environments.
Allowing job submission from compute hosts
If preferred, all compute nodes can be enabled as job submit hosts without setting .rhosts or hosts.equiv by setting the allow_node_submit parameter to true.
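As with the other server parameters in this topic, allow_node_submit can be set through qmgr. The sketch below assumes the same set server syntax shown for submit_hosts. The QMGR variable and the function name are illustrative only: by default the script prints the command rather than running it, so it can be reviewed first.

```shell
# Dry run by default: QMGR expands to "echo qmgr", so the command is only
# printed. Set QMGR=qmgr in the environment to apply it on the pbs_server host.
QMGR="${QMGR:-echo qmgr}"

enable_node_submit() {
  # $QMGR is intentionally unquoted so "echo qmgr" splits into two words.
  $QMGR -c 'set server allow_node_submit = true'
}

enable_node_submit
```

After applying the change, the resulting server settings can be reviewed with qmgr -c 'print server'.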
4.250.4 Configuring Torque on a Multi-Homed Server
If the pbs_server daemon is to be run on a multi-homed host (a host possessing multiple network interfaces), the interface to be used can be explicitly set using the SERVERHOST parameter.
4.250.5 Architecture Specific Notes
With some versions of Mac OS X, you must add the line $restricted *.<DOMAIN> to the pbs_mom configuration file. This works around some socket bind bugs in the OS.
4.250.6 Specifying Non-Root Administrators
By default, only root is allowed to start, configure and manage the pbs_server daemon. Additional trusted users can be authorized using the parameters managers and operators. To configure these parameters use the qmgr command, as in the following example:
> qmgr
Qmgr: set server managers += josh@*.fsc.com
Qmgr: set server operators += josh@*.fsc.com
All manager and operator specifications must include a user name and either a fully qualified domain name or a host expression.
To enable all users to be trusted as both operators and managers, place the + (plus) character on its own line in the server_priv/acl_svr/operators and server_priv/acl_svr/managers files.
4.250.7 Setting Up Email
Moab relies on emails from Torque about job events. To set up email, do the following:
> ./configure --with-sendmail=<path_to_executable>
> qmgr -c 'set server mail_domain=clusterresources.com'
> qmgr -c 'set server mail_body_fmt=Job: %i \n Name: %j \n On host: %h \n \n %m \n \n %d'
> qmgr -c 'set server mail_subject_fmt=Job %i - %r'
By default, users receive emails on job aborts. Each user can select which kinds of emails to receive by using the qsub -m option when submitting the job. If you want to dictate when each user should receive emails, use a submit filter (for details, see Job Submission Filter ("qsub Wrapper")).
4.250.8 Using MUNGE Authentication
MUNGE is an authentication service that creates and validates user credentials. It was developed by Lawrence Livermore National Laboratory (LLNL) to be highly scalable so it can be used in large environments such as HPC clusters. To learn more about MUNGE and how to install it, see http://code.google.com/p/munge/.
Configuring Torque to use MUNGE is a compile-time operation. When you are building Torque, pass --enable-munge-auth as a command line option to ./configure.
> ./configure --enable-munge-auth
You can use only one authorization method at a time. If --enable-munge-auth is configured, the privileged port ruserok method is disabled.
Torque does not link any part of the MUNGE library into its executables. Instead, it calls the munge and unmunge utilities, which are provided with MUNGE. The MUNGE daemon must be running on the server and all submission hosts. The Torque client utilities call munge and then deliver the encrypted credential to pbs_server, where the credential is unmunged and the server verifies the user and host against the authorized users configured in serverdb.
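Because Torque depends on these utilities and on a running MUNGE daemon, it can help to confirm that a credential can be created and decoded on a host before testing job submission. The following is a minimal sketch using the munge -n and unmunge commands shipped with MUNGE; the function name check_munge and its messages are hypothetical.

```shell
# Encode an empty credential and decode it again on the local host.
# A failure usually means the utilities are missing or munged is not running.
check_munge() {
  if ! command -v munge >/dev/null 2>&1 || ! command -v unmunge >/dev/null 2>&1; then
    echo "munge or unmunge not found in PATH"
    return 1
  fi
  if munge -n | unmunge >/dev/null 2>&1; then
    echo "MUNGE credential round trip succeeded"
  else
    echo "MUNGE credential round trip failed; check that munged is running"
    return 1
  fi
}

# Example:
#   check_munge
```

This verifies MUNGE on one host only. Run it on the server and on each submission host.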
Authorized users are added to serverdb using qmgr and the authorized_users parameter. The syntax for authorized_users is authorized_users=<user>@<host>. To add an authorized user to the server you can use the following qmgr command:
> qmgr -c 'set server authorized_users=user1@hosta'
> qmgr -c 'set server authorized_users+=user2@hosta'
The previous example adds user1 and user2 from hosta to the list of authorized users on the server. Users can be removed from the list of authorized users by using the -= syntax as follows:
> qmgr -c 'set server authorized_users-=user1@hosta'
Users must be added with the <user>@<host> syntax. The user and the host portion can use the '*' wildcard to allow multiple names to be accepted with a single entry. A range of user or host names can be specified using an [a-b] syntax, where a is the beginning of the range and b is the end.
> qmgr -c 'set server authorized_users=user[1-10]@hosta'
This allows user1 through user10 on hosta to run client commands on the server.