Launch scripts have been provided for Torque, SLURM, LSF, and Cray resource managers or environments. The launch scripts can be found in the /opt/nitro/scripts directory. If you use another resource manager, you may need to build a new script. Also, if you want to change the basic configuration or paths used with your installation, you may want to customize the launch script that the users of your system will use to start Nitro.
There are several basic points that a launch script needs to cover to interface the resource manager with Nitro.
Nitro uses the job ID to customize its output files so that several copies of Nitro running at the same time do not corrupt each other's information. Torque, for example, sets the $PBS_JOBID environment variable to the job ID it assigned to the job.
The provided launch scripts add the "--job-id <job id>" parameter to the Nitro command line for both the workers and the coordinator. If the resource manager supplies a job ID, the scripts use it; otherwise they default to a job ID in the format "YYYYMMDDHHMMSS", built from the date and time at which the Nitro launch script runs.
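The fallback described above can be sketched in a few lines of shell. This is a minimal illustration, not the text of the shipped launch scripts: the helper name nitro_job_id is hypothetical, and SLURM_JOB_ID and LSB_JOBID are assumed here as the SLURM and LSF counterparts of $PBS_JOBID.

```shell
#!/bin/sh
# Sketch only: nitro_job_id is a hypothetical helper, not part of Nitro.
# PBS_JOBID is the Torque variable named in the text; SLURM_JOB_ID and
# LSB_JOBID are assumed equivalents for SLURM and LSF.
nitro_job_id() {
    id="${PBS_JOBID:-${SLURM_JOB_ID:-${LSB_JOBID:-}}}"
    if [ -z "$id" ]; then
        # No resource manager job ID found: fall back to YYYYMMDDHHMMSS.
        id="$(date +%Y%m%d%H%M%S)"
    fi
    printf '%s\n' "$id"
}

NITROJOBID="$(nitro_job_id)"
NITRO_OPTIONS="--job-id ${NITROJOBID} ${NITRO_OPTIONS:-}"
echo "$NITRO_OPTIONS"
```

Because the timestamp has one-second resolution, two launches started in the same second without a resource manager job ID would share an ID.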
5.2.2 Location of the Nitro Binary
The launch script assumes the Nitro binary will be found in the /opt/nitro/bin directory. This requires that Nitro has been installed to each node, or that the /opt/nitro directory has been mapped to a remote file system.
If you have installed Nitro to a directory other than the default, you need to customize the Nitro launch script with this location. For example, if you installed Nitro to /mysharedfs/nitro, you need to change the line of the Nitro launch script
from
NITRO=/opt/nitro/bin/nitro
to
NITRO=/mysharedfs/nitro/bin/nitro
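After changing the path, it may be worth having the launch script confirm that the binary is actually reachable on the node where the script runs. The following is a hedged sketch; nitro_binary_ok is a hypothetical helper and is not part of the provided scripts.

```shell
#!/bin/sh
# Sketch only: nitro_binary_ok is a hypothetical helper.
# NITRO is the variable set in the launch script; the default below is the
# documented install location.
NITRO="${NITRO:-/opt/nitro/bin/nitro}"

nitro_binary_ok() {
    # True only when the path names a regular file the current user can execute.
    [ -f "$1" ] && [ -x "$1" ]
}

if nitro_binary_ok "$NITRO"; then
    echo "using Nitro binary: $NITRO"
else
    echo "Nitro binary not found or not executable: $NITRO" >&2
fi
```

A production script would likely exit with a non-zero status in the failure branch so that the resource manager marks the job as failed. Note that this check covers only the local node, not the remote nodes in the allocation.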
Resource managers typically send the job to one node in the allocated set of nodes. The job script running on that node can then execute work on any of the other nodes within the job allocation. The Nitro launch script uses this to launch a Nitro worker on each of the remote nodes before launching the coordinator.
The coordinator accepts connections only from the workers listed via the --workers or --workers-file parameter or, when the --key command line option is used to specify a session key "<keyvalue>", from workers that use that key.
Resource managers typically set an environment variable that identifies the nodes allocated to the job. For example, in Torque this environment variable is "$PBS_NODEFILE"; it holds the path of a file, accessible to the job, that lists the nodes allocated to the job.
The node list file typically contains a single node name per line, such as:
node01
node02
node03
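A custom launch script usually needs to read this file into a list of distinct node names. The sketch below assumes a file with one node name per line and also tolerates repeated names, since some resource managers list a node once per allocated processor. The helper name unique_nodes is hypothetical, and the temporary file stands in for the file named by $PBS_NODEFILE.

```shell
#!/bin/sh
# Sketch only: unique_nodes is a hypothetical helper.
unique_nodes() {
    # Print each node name once, in first-seen order, skipping blank lines.
    awk 'NF && !seen[$1]++ { print $1 }' "$1"
}

# Demonstration with a temporary file standing in for the node file
# that the resource manager would provide.
demo_file="$(mktemp)"
printf 'node01\nnode01\nnode02\nnode03\nnode03\n' > "$demo_file"
unique_nodes "$demo_file"
rm -f "$demo_file"
```

In a Torque job, the call would be unique_nodes "$PBS_NODEFILE".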
5.2.4 Launch Nitro Workers and Coordinator
The Nitro launch script needs to start the Nitro workers remotely in a way that returns control to the script once each worker has been started.
Please refer to your resource manager's documentation for instructions and options to run the remote command.
The workers are started first in the Nitro launch script so that the coordinator can be executed last using "exec". Because "exec" replaces the script's shell process with the coordinator, the coordinator gains control of the process. The Nitro launch script currently uses (and assumes) the first node in the nodes list as the coordinator and all other nodes as workers.
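The ordering described above can be illustrated with a dry run that only prints what would be started where. This is a sketch under stated assumptions: plan_launch is a hypothetical helper, and the real remote-execution command and the Nitro worker and coordinator arguments are deliberately left out because they depend on your resource manager and installation.

```shell
#!/bin/sh
# Sketch only: plan_launch is a hypothetical helper that prints the launch
# order instead of starting anything.
plan_launch() {
    coord=""
    for node in "$@"; do
        if [ -z "$coord" ]; then
            coord="$node"        # first node in the list hosts the coordinator
            continue
        fi
        # A real script would start a worker on this node here, using the
        # resource manager's remote-execution command, and then continue.
        echo "worker ${node}"
    done
    # A real script would end with: exec "$NITRO" <coordinator arguments>
    echo "coordinator ${coord}"
}

plan_launch node01 node02 node03
```

The workers are listed before the coordinator because nothing after an "exec" line in a shell script runs once the exec succeeds.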
5.2.5 Customize Command Line Parameters
If all your users' jobs are expected to use fewer than 20 nodes, you may want to add the option to the script to run a local worker on the coordinator node to maximize task throughput. Users can already do this by adding the "--run-local-worker" flag to the NITRO_COORD_OPTIONS environment variable at job submission or in the user job script, but you may elect to add it to the Nitro launch script to reduce the number of parameters a user has to configure. See 4.2 Submit a Nitro Job to a Scheduler.
If you want to add command line parameters to Nitro, the best way is to prepend the option to the NITRO_OPTIONS, NITRO_COORD_OPTIONS, or NITRO_WORKER_OPTIONS environment variable, as appropriate. For example, to add the "--run-local-worker" flag to the coordinator command line, add the NITRO_COORD_OPTIONS assignment shown below to the Nitro launch script, after the existing line that sets NITRO_OPTIONS.
NITRO_OPTIONS="--job-id ${NITROJOBID} ${NITRO_OPTIONS}"
NITRO_COORD_OPTIONS="--run-local-worker ${NITRO_COORD_OPTIONS}"
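Since users can also set "--run-local-worker" themselves through NITRO_COORD_OPTIONS, a customized script may want to avoid adding the flag a second time. The following is one possible sketch; prepend_once is a hypothetical helper, and whether Nitro tolerates a repeated flag is not stated here, so the guard is a precaution rather than a requirement.

```shell
#!/bin/sh
# Sketch only: prepend_once FLAG CURRENT is a hypothetical helper that prints
# CURRENT with FLAG prepended, unless FLAG is already one of its
# space-separated words.
prepend_once() {
    case " $2 " in
        *" $1 "*) printf '%s\n' "$2" ;;            # already present: unchanged
        *)        printf '%s\n' "$1${2:+ $2}" ;;   # prepend, keeping user options
    esac
}

NITRO_COORD_OPTIONS="$(prepend_once --run-local-worker "${NITRO_COORD_OPTIONS:-}")"
echo "$NITRO_COORD_OPTIONS"
```

Prepending rather than appending keeps the user's own options at the end of the variable, after the options the script adds.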