
5.5 Launch Scripts

Nitro comes packaged with a launch_nitro.sh and a launch_worker.sh script for the Torque, SLURM, LSF, and Cray resource managers or environments. If you use another resource manager, you may need to create these scripts yourself.

There are several basic points that each launch script needs to cover to interface the resource manager with Nitro.

  1. Getting the resource manager job ID and passing it to Nitro. See 5.5.1 Resource Manager Job ID.
  2. Specifying the location of the Nitro binary. See 5.5.2 Location of the Nitro Binary.
  3. Getting the list of nodes that Nitro is to run on for the current job. See 5.5.3 List of Job Nodes.
  4. Launching the Nitro workers and coordinator. See 5.5.4 Launch Nitro Workers and Coordinator.
  5. Customizing the command line parameters of the workers and/or coordinator. See 5.5.5 Customize Command Line Parameters.

5.5.1 Resource Manager Job ID

Nitro uses the job ID to customize its output files so that several copies of Nitro running at the same time do not corrupt each other's information. Torque, for example, sets the $PBS_JOBID environment variable, which contains the job ID that Torque assigned to the job.

If a job ID is available, the provided launch scripts add the "--job-id <job id>" parameter to Nitro's command line parameters (for both the workers and the coordinator). If no job ID is found, the scripts default to a job ID in the format "YYYYMMDDHHMMSS", containing the date and time the Nitro launch script runs.
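
For illustration, this logic might be sketched as follows for Torque; $PBS_JOBID, the timestamp format, and the NITROJOBID/NITRO_OPTIONS names come from this section and from the sample line shown in 5.5.5, but the exact code in the shipped scripts may differ.

# Use the resource manager's job ID if it is set; otherwise fall back
# to the date and time the script runs, in YYYYMMDDHHMMSS format.
if [ -n "$PBS_JOBID" ]; then
    NITROJOBID="$PBS_JOBID"
else
    NITROJOBID=$(date +%Y%m%d%H%M%S)
fi
NITRO_OPTIONS="--job-id ${NITROJOBID} ${NITRO_OPTIONS}"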

5.5.2 Location of the Nitro Binary

The launch scripts assume the Nitro binary will be found in the /opt/nitro/bin directory. This requires that Nitro has been installed on all of the nodes, or that the /opt/nitro directory has been mapped to a remote file system.

If you have installed Nitro to a directory other than the default, you need to customize the launch scripts with this location. For example, if you installed Nitro to /mysharedfs/nitro, you need to change the following line in the launch scripts

from

NITRO=/opt/nitro/bin/nitro

to

NITRO=/mysharedfs/nitro/bin/nitro

5.5.3 List of Job Nodes

Resource managers typically set an environment variable that identifies the nodes allocated to the job. In Torque, this environment variable is "$PBS_NODEFILE"; it contains the path to a file, accessible to the job, that lists the nodes allocated to the job.

The file containing the list of nodes typically has a single node name per line, such as:

node01
node02
node03
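
Under Torque, for example, a launch script might read this file along the following lines. This is a sketch only; the NODES variable name is illustrative, and the shipped scripts may handle duplicate entries differently.

# Read the node list, removing duplicate entries (a node may appear
# once per allocated core) while preserving the original order.
NODES=$(awk '!seen[$0]++' "$PBS_NODEFILE")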

5.5.4 Launch Nitro Workers and Coordinator

The launch scripts need to include the resource manager's remote command in order to set up the workers and the coordinator.

Please refer to your resource manager's documentation for instructions and options to run the remote command.

For both static and dynamic jobs, when the launch_nitro.sh script executes, the workers are started first and the coordinator is executed last using "exec" so that the coordinator takes control of the script's process. The launch_nitro.sh script currently assumes that the first node in the node list is the coordinator and all other nodes are workers.

In addition, for dynamic jobs, the launch_worker.sh script is executed to add one or more workers to the coordinator started by the launch_nitro.sh script.
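
Putting the pieces together, the launch order can be sketched as follows, continuing the NITRO and NODES variables from the earlier snippets. This sketch assumes ssh as the remote command and assumes launch_nitro.sh itself runs on the first allocated node; the shipped scripts use the resource manager's own remote command, and the options workers need to locate the coordinator are omitted here.

# All nodes after the first run workers; the coordinator is started
# locally on the first node, where this script is assumed to run.
WORKER_NODES=$(echo "$NODES" | tail -n +2)

# Start a worker on each worker node in the background.
for node in $WORKER_NODES; do
    ssh "$node" "$NITRO $NITRO_OPTIONS $NITRO_WORKER_OPTIONS" &
done

# Execute the coordinator last with "exec" so it takes control of
# this process.
exec $NITRO $NITRO_OPTIONS $NITRO_COORD_OPTIONS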

5.5.5 Customize Command Line Parameters

The system administrator may customize the Nitro launch scripts to suit the needs of the system. The launch scripts may examine and modify, or simply pass through, the command line options specified by the environment variables set by the nitro_job.sh and/or worker_job.sh scripts. The command line options are then passed via these environment variables to the Nitro workers and coordinator started by the launch scripts.

If you want to add command line parameters to Nitro, the best way is to prepend the option to the NITRO_OPTIONS, NITRO_COORD_OPTIONS, or NITRO_WORKER_OPTIONS environment variable, as appropriate. See 5.2 Command Line Flags, or Options, and Positional Parameters for more information.

For example, if all of your users' jobs are expected to use fewer than 20 nodes, you may want to add the option to run a local worker on the coordinator node to maximize task throughput. If you add this option to the launch_nitro.sh script, it reduces the parameters necessary to configure the nitro_job.sh script. Alternatively, you can add the "--run-local-worker" flag to the NITRO_COORD_OPTIONS environment variable in the nitro_job.sh script (if using the nitrosub command). If your configuration allows users to submit jobs using the resource manager's job submission command (such as Torque's qsub), those users can either add this flag to their customized nitro_job.sh script (saved in their work directory) or add the flag at job submission.
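
With Torque, for example, a user might pass the flag through at submission time by exporting the variable with qsub's -v option. This is a sketch only, and it assumes the site's nitro_job.sh script passes NITRO_COORD_OPTIONS through unchanged.

# Export the coordinator option into the job's environment at submission.
qsub -v NITRO_COORD_OPTIONS=--run-local-worker nitro_job.sh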

To add the "--run-local-worker" flag to the coordinator command line, add the following line to the launch_nitro.sh script after the line that sets NITRO_OPTIONS.

NITRO_OPTIONS="--job-id ${NITROJOBID} ${NITRO_OPTIONS}"
NITRO_COORD_OPTIONS="--run-local-worker ${NITRO_COORD_OPTIONS}"

The launch_worker.sh script must be kept consistent with the launch_nitro.sh script.
