Moab Workload Manager

14.8 Diagnostic Scripts

Moab Workload Manager provides diagnostic scripts that help monitor the state of the scheduler, resource managers, and other important components of the cluster software stack. These scripts can also help diagnose issues that may need to be resolved with the help of Cluster Resources support staff. This section introduces the available diagnostic scripts.

14.8.1 The support.diag.pl Script

The tools/moab/support.diag.pl script has a two-fold purpose. First, it can be used by a Moab trigger or cron job to create a regular snapshot of the state of Moab. The script captures the output of several Moab diagnostic commands (such as showq, mdiag -n, and mdiag -S), gathers configuration/log files, and records pertinent operating system information. This data is then compressed in a time-stamped tarball for easy long-term storage.
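As a rough illustration of the cron use case, the sketch below builds the command line that a small wrapper script might run for an unattended snapshot. The install directory /opt/moab/tools/moab, the wrapper path in the sample crontab entry, and the default of 1000 log lines are assumptions made for this example and are not taken from the Moab documentation; the --include-log-lines and --no-upload flags are the ones described later in this section. Whether the ticket number prompt also needs to be suppressed for a fully non-interactive run should be checked on your installation.

```shell
#!/bin/sh
# Sketch of a snapshot wrapper for cron (not an official Moab tool).
# MOAB_TOOLS_DIR is an ASSUMED install location; point it at the directory
# that holds support.diag.pl on your system.
MOAB_TOOLS_DIR="${MOAB_TOOLS_DIR:-/opt/moab/tools/moab}"

# Print the command line for an unattended snapshot.
#   $1 = number of trailing moab.log lines to capture
#        (defaults to 1000, an arbitrary value chosen for this example)
snapshot_command() {
    printf '%s --include-log-lines=%s --no-upload\n' \
        "$MOAB_TOOLS_DIR/support.diag.pl" "${1:-1000}"
}

# Hypothetical crontab entry that runs a wrapper nightly at 02:00:
#   0 2 * * * /usr/local/sbin/moab-snapshot.sh
# Inside that wrapper, the snapshot itself would be started with:
#   $(snapshot_command 1000)
```

Keeping the command in one function makes it easy to review exactly what cron will execute before the job is enabled.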

The second purpose of the support.diag.pl script is to provide Cluster Resources support personnel with a complete package of information for diagnosing configuration issues or system bugs. After the script captures the state of Moab, you can send the resulting tarball to your Cluster Resources support contact for further diagnosis.

The support.diag.pl script asks for the trouble ticket number and then guides you through uploading the data to Adaptive Computing Customer Support. To suppress the upload prompt, use the --no-upload flag; to suppress the ticket number prompt, use the --support-ticket=<SUPPORT_TICKET_ID> flag. Both flags are detailed below.

Synopsis

support.diag.pl [--include-log-lines=<NUM>] [--diag-torque] [--no-upload] [--support-ticket=<SUPPORT_TICKET_ID>]

Arguments

Argument                               Description
--include-log-lines=<NUM>              Instead of including the entire moab.log file, only the last <NUM> lines are captured in the diagnostics.
--diag-torque                          Diagnostic commands pertinent to the TORQUE resource manager are included.
--no-upload                            Prevents the system from asking the user if they want to upload the tarball to Adaptive Computing Customer Support.
--support-ticket=<SUPPORT_TICKET_ID>   Prevents the system from asking the user for a support ticket number.
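
To show how the four arguments combine, here is a minimal sketch of a shell helper that assembles an argument list for support.diag.pl. The helper name diag_args and its positional parameters are hypothetical and exist only for this example, as is the ticket ID 12345 used in the usage line; only the flags themselves come from this section.

```shell
#!/bin/sh
# Assemble a support.diag.pl argument list from the four documented flags.
# diag_args is a HYPOTHETICAL helper, not part of Moab. Its parameters:
#   $1 = number of moab.log lines to keep, or "" to include the whole log
#   $2 = "yes" to include TORQUE diagnostics
#   $3 = support ticket ID, or "" to let the script prompt for one
#   $4 = "yes" to suppress the upload prompt
diag_args() {
    args=""
    [ -n "$1" ] && args="$args --include-log-lines=$1"
    [ "$2" = "yes" ] && args="$args --diag-torque"
    [ -n "$3" ] && args="$args --support-ticket=$3"
    [ "$4" = "yes" ] && args="$args --no-upload"
    # Strip the leading space left by the first append.
    printf '%s\n' "${args# }"
}

# Usage (12345 is a placeholder ticket ID):
#   support.diag.pl $(diag_args 500 yes 12345 no)
```

Passing the ticket ID on the command line while leaving the upload prompt enabled suits a support case, whereas a routine snapshot would typically pass only --no-upload and a log line limit.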