5.88 Integrating With Slurm

Moab Accounting Manager can be configured to interact with Slurm to track and charge for resources utilized by jobs. The integration uses a slurmctld epilog script to charge for jobs. If you enforce allocations, it also requires a Slurm patch and a slurmctld prolog script.

5.88.1 Copy MAM's Slurm Contrib Scripts

If you installed MAM from tarball, the Slurm integration scripts can be found in the directory where you unpacked the tarball. If you installed from RPM, the Slurm integration scripts can be found in /usr/share/moab-accounting-manager/contrib. Copy MAM's Slurm contrib scripts to /opt/slurm/etc and ensure that they are owned and executable by the Slurm user.

Example 5-27: Copying the Slurm Contrib Scripts

[root]# cp /software/mam-<version>/contrib/slurm/mam.*.slurm.pl /opt/slurm/etc
[root]# chown slurm:slurm /opt/slurm/etc/mam.*.slurm.pl
[root]# chmod +x /opt/slurm/etc/mam.*.slurm.pl
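
The ownership and permission requirement above can be checked after copying. The following is a minimal sketch, not part of MAM: the function name check_scripts and the wildcard pattern are illustrative assumptions, and the check only looks at the owning user ID and the owner execute bit.

```python
import stat
from pathlib import Path


def check_scripts(directory, pattern, owner_uid):
    """Return (path, problem) tuples for matching scripts that fail a check.

    Hypothetical helper for illustration; it is not shipped with MAM.
    """
    problems = []
    for path in sorted(Path(directory).glob(pattern)):
        info = path.stat()
        # The file should belong to the user that runs slurmctld.
        if info.st_uid != owner_uid:
            problems.append((str(path), "wrong owner"))
        # The owner execute bit must be set so that user can run the script.
        if not info.st_mode & stat.S_IXUSR:
            problems.append((str(path), "not executable by owner"))
    return problems
```

On a controller host this might be called as check_scripts("/opt/slurm/etc", "mam*.slurm.pl", pwd.getpwnam("slurm").pw_uid), assuming the Slurm user is named slurm. An empty list means both checks passed for every matching file.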

5.88.2 Configure the Controller Epilog to Call the MAM Charge Script

If you do not intend to use the slurmctld epilog for any purpose other than integration with MAM, you can configure Slurm to call the charge script directly: edit the Slurm configuration file, set EpilogSlurmctld to point to the mam.charge.slurm.pl file, and reconfigure slurmctld.

Example 5-28: Setting the Controller Epilog to Call the Charge Script Directly

[root]# vi /opt/slurm/etc/slurm.conf

EpilogSlurmctld=/opt/slurm/etc/mam.charge.slurm.pl

[root]# scontrol reconfigure
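
Before reconfiguring, it can help to confirm what the configuration file actually sets. The sketch below is an illustration under stated assumptions: the function name slurm_conf_values is hypothetical, parameter names are matched case-insensitively, and everything after a # is treated as a comment, which simplifies the real slurm.conf syntax.

```python
def slurm_conf_values(conf_text, name):
    """Return every value assigned to parameter `name` in slurm.conf text.

    Hypothetical helper for illustration; a simplified reading of the format.
    """
    values = []
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        key, sep, val = line.partition("=")
        # Compare parameter names without regard to case.
        if sep and key.strip().lower() == name.lower():
            values.append(val.strip())
    return values
```

Reading /opt/slurm/etc/slurm.conf and calling slurm_conf_values(text, "EpilogSlurmctld") should return exactly one path. An empty list or more than one entry suggests the file needs another look. The same check applies to PrologSlurmctld.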

If you already have a slurmctld epilog configured, the charge script may be called within your existing epilog script. Edit your slurmctld epilog script and add a section at the end of the epilog that calls the charge script and exits with the status returned by the charge script. The exit in this case is optional and may be excluded if desired, as its only use is for logging purposes.

Example 5-29: Editing the Existing Epilog Script to Call the Charge Script

[root]# vi <slurmctld_epilog_script>

If you are using a bash script for your slurmctld epilog, include an excerpt similar to the following:

/opt/slurm/etc/mam.charge.slurm.pl
exit $?

If you are using a Perl script, include an excerpt similar to the following:

my $cmd = "/opt/slurm/etc/mam.charge.slurm.pl";
my $output = `$cmd 2>&1` || `sh -c "$cmd 2>&1"`;
exit $? >> 8;

If you are using a Python script, include an excerpt similar to the following:

import subprocess
import sys
cmd = '/opt/slurm/etc/mam.charge.slurm.pl'
# Run the charge script and exit with its return code
rc = subprocess.Popen(cmd).wait()
sys.exit(rc)

5.88.3 Patch Slurm

If you intend to use the strict allocation accounting mode in MAM, you will need to patch Slurm in order for Slurm to enforce your configured failure action when unable to obtain a lien with MAM. This patch will need to be reapplied each time Slurm is upgraded.

Example 5-30: Patching Slurm

[root]# scontrol shutdown slurmctld
[root]# cd /software/slurm-<version>
[root]# patch -p 0 < /software/mam-<version>/contrib/slurm/slurm-mam.patch
[root]# make
[root]# make install
[root]# su - slurm -c "slurmctld"

5.88.4 Configure the Controller Prolog to Call the MAM Reserve Script

If you intend to use the strict allocation accounting mode in MAM, you will need to configure Slurm to call the reserve script from the slurmctld prolog.

If you do not intend to use the slurmctld prolog for any purpose other than integration with MAM, you can configure Slurm to call the reserve script directly: edit the Slurm configuration file, set PrologSlurmctld to point to the mam.reserve.slurm.pl file, and reconfigure slurmctld.

Example 5-31: Setting the Controller Prolog to Call the Reserve Script Directly

[root]# vi /opt/slurm/etc/slurm.conf

PrologSlurmctld=/opt/slurm/etc/mam.reserve.slurm.pl

[root]# scontrol reconfigure

If you already have a slurmctld prolog configured, the reserve script may be called within your existing prolog script. Edit your slurmctld prolog script and add a section in the prolog that calls the reserve script and exits with an appropriate exit code.

Example 5-32: Editing the Existing Prolog Script to Call the Reserve Script

[root]# vi <slurmctld_prolog_script>

If you are using a bash script for your slurmctld prolog, include an excerpt similar to the following:

/opt/slurm/etc/mam.reserve.slurm.pl
rc=$?
if (( $rc >= 78 && $rc <= 103 )); then
   exit $rc
fi

If you are using a Perl script, include an excerpt similar to the following:

my $cmd = "/opt/slurm/etc/mam.reserve.slurm.pl";
my $output = `$cmd 2>&1` || `sh -c "$cmd 2>&1"`;
my $rc =  $? >> 8;
exit $rc if ($rc >= 78 && $rc <= 103);

If you are using a Python script, include an excerpt similar to the following:

import subprocess
import sys
cmd = '/opt/slurm/etc/mam.reserve.slurm.pl'
# Run the reserve script; stop the prolog only for return codes 78 through 103
rc = subprocess.Popen(cmd).wait()
if 78 <= rc <= 103:
   sys.exit(rc)
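
To exercise the exit-code handling without a MAM server, the range check can be factored into a function and run against a stand-in command. This is a sketch only: the name reserve_exit_code is hypothetical, and the 78 to 103 range is copied from the excerpts above without this section defining what the individual codes mean.

```python
import subprocess
import sys

# Range taken from the prolog excerpts; its meaning is not defined in this section.
RESERVE_RC_MIN = 78
RESERVE_RC_MAX = 103


def reserve_exit_code(cmd):
    """Run cmd and return its exit code if the prolog should stop with it.

    Returns None when the code falls outside the range, meaning the
    prolog would continue with any remaining steps.
    """
    rc = subprocess.call(cmd)
    if RESERVE_RC_MIN <= rc <= RESERVE_RC_MAX:
        return rc
    return None


if __name__ == "__main__":
    # Stand-in for the reserve script: a child process that exits with 80.
    fake = [sys.executable, "-c", "import sys; sys.exit(80)"]
    print(reserve_exit_code(fake))
```

In a real prolog, the command would be the reserve script path, and a non-None result would be passed to sys.exit.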

5.88.5 Customize the Reserve Script

If you intend to use the strict allocation accounting mode in MAM, edit the mam.reserve.slurm.pl script and set the connection failure action, funds failure action, and general failure action values according to your desired policy.

Before starting a job, the prolog calls MAM to create a lien, which verifies and protects the funds required for the job run. If the lien fails, one of four failure actions can be applied; Example 5-33 uses three of them (DEFER, HOLD, and CANCEL).

A separate failure action can be configured for each of three different situations: a connection failure, a funds failure, and a general failure.

Example 5-33: Configuring the Failure Action Policies in the Reserve Script

[root]# vi /opt/slurm/etc/mam.reserve.slurm.pl

my $connectionFailureAction = 'DEFER';
my $fundsFailureAction = 'HOLD';
my $generalFailureAction = 'CANCEL';
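
After editing, the configured policy can be read back from the script to confirm that all three settings are present. The following sketch assumes the settings appear as single-quoted Perl assignments in the form shown above. The function name read_failure_actions is hypothetical.

```python
import re

# Matches uncommented lines such as: my $fundsFailureAction = 'HOLD';
ACTION_RE = re.compile(
    r"^\s*my\s+\$(\w+FailureAction)\s*=\s*'([^']*)'\s*;",
    re.MULTILINE,
)


def read_failure_actions(script_text):
    """Return a dict mapping each failure-action variable name to its value."""
    return dict(ACTION_RE.findall(script_text))
```

Applied to the contents of /opt/slurm/etc/mam.reserve.slurm.pl, the result should contain the keys connectionFailureAction, fundsFailureAction, and generalFailureAction. A missing key suggests the assignment was commented out or written in a different form.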

When an accounting failure occurs in the prolog, the MAM response message and the resulting failure action are recorded in the job's comment field.

© 2017 Adaptive Computing