This topic provides instructions for enabling NUMA-aware support, including cgroups, and requires Torque 6.0 or later. For instructions on NUMA-support configurations, see 2.24 Torque NUMA-Support Configuration. This topic assumes you have a basic understanding of cgroups. See the RedHat Resource Management Guide or cgroups on kernel.org for basic information on cgroups.
Torque uses cgroups to better manage CPU and memory accounting, memory enforcement, cpuset management, and binding jobs to devices such as MICs and GPUs.
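For example, the -L NUMA resource request syntax available in Torque 6.0 and later lets a job specify per-task processor and memory limits that are then enforced through cgroups. The request below is illustrative only; the task, lprocs, and memory values are placeholders:
$ qsub -L tasks=2:lprocs=4:memory=2gb myjob.sh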
Be aware of the following:
Prerequisites
Torque with cgroups enabled requires hwloc. If a suitable version is not provided by your distribution, build and install it from source (hwloc 1.9 is used in this example):
$ tar -xzvf hwloc-1.9.tar.gz
$ cd hwloc-1.9
$ ./configure
$ make
$ sudo make install
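If hwloc was installed from source to the default /usr/local prefix (assumed here), refresh the shared library cache and confirm the installed version:
$ sudo ldconfig
$ lstopo --version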
Installation Instructions
Do the following:
Red Hat-based systems must use libcgroup version 0.40.rc1-16.el6 or later; SUSE-based systems must use a comparable libcgroup version.
Red Hat-based systems:
yum install libcgroup-tools libcgroup
SUSE-based systems:
zypper install libcgroup-tools
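To confirm the installed libcgroup version on a Red Hat-based system, one option is an rpm query (an example only; available package names differ between Red Hat 6 and Red Hat 7):
rpm -q libcgroup libcgroup-tools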
When building Torque, add the --enable-cgroups option when you run configure:
$ ./configure --enable-cgroups
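After configure completes, build and install Torque as usual (a sketch assuming a standard autotools build; your site may pass additional configure options):
$ make
$ sudo make install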
Run lssubsys -am to determine whether the cgroup subsystems are mounted. If they are not mounted, only the subsystem names are listed:
$ lssubsys -am
ns
perf_event
net_prio
cpuset
cpu
cpuacct
memory
devices
freezer
net_cls
blkio
If the subsystems are already mounted, each subsystem is listed with its mount point, similar to the following:
$ lssubsys -am
ns
perf_event
net_prio
cpuset,cpu,cpuacct /cgroup/cpu
memory /cgroup/memory
devices /cgroup/devices
freezer /cgroup/freezer
net_cls /cgroup/net_cls
blkio /cgroup/blkio
If the subsystems Torque requires are not mounted, mount them with the following command, where the name parameter is the name of the hierarchy:
mount -t cgroup -o <subsystem>[,<subsystem>,...] name <dir path>/name
The following commands create five hierarchies, one for each subsystem.
mount -t cgroup -o cpuset cpuset /var/spool/torque/cgroup/cpuset
mount -t cgroup -o cpu cpu /var/spool/torque/cgroup/cpu
mount -t cgroup -o cpuacct cpuacct /var/spool/torque/cgroup/cpuacct
mount -t cgroup -o memory memory /var/spool/torque/cgroup/memory
mount -t cgroup -o devices devices /var/spool/torque/cgroup/devices
Once you have mounted the cgroups, run lssubsys -am again. You should now see:
cpuset /var/spool/torque/cgroup/cpuset
cpu /var/spool/torque/cgroup/cpu
cpuacct /var/spool/torque/cgroup/cpuacct
memory /var/spool/torque/cgroup/memory
devices /var/spool/torque/cgroup/devices
freezer
blkio
perf_event
2.23.1 Multiple cgroup Directory Configuration
If your system has more than one cgroup directory configured, you must create the trq-cgroup-paths file in the $TORQUE_HOME directory. This file lists each cgroup subsystem and its mount point, one per line, in the format <subsystem> <mount point>.
All five subsystems used by pbs_mom must be present in the trq-cgroup-paths file. Torque checks this file first to determine where to look for cgroups. In the example that follows, a directory exists at /cgroup with subdirectories for each subsystem:
cpuset /cgroup/cpuset
cpuacct /cgroup/cpuacct
cpu /cgroup/cpu
memory /cgroup/memory
devices /cgroup/devices
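For example, assuming $TORQUE_HOME is /var/spool/torque (adjust to your installation), the file could be created as follows, using the /cgroup mount points from the example above:
cat > /var/spool/torque/trq-cgroup-paths <<'EOF'
cpuset /cgroup/cpuset
cpuacct /cgroup/cpuacct
cpu /cgroup/cpu
memory /cgroup/memory
devices /cgroup/devices
EOF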
2.23.2 Change Considerations for pbs_mom
To improve performance when removing cgroup hierarchies and job files, Torque 6.0.0 added a new MOM configuration parameter, $thread_unlink_calls. This parameter moves job file cleanup onto its own thread, which improves MOM performance. However, the additional threads also increase the size of pbs_mom from around 50 MB to 100 MB.
$thread_unlink_calls is true by default, which threads job deletion. If pbs_mom is too large for your configuration, set $thread_unlink_calls to false and jobs will be deleted within the main pbs_mom thread.
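For example, to disable threaded deletion, add the parameter to the MOM configuration file (commonly $TORQUE_HOME/mom_priv/config; this path is an assumption for illustration) and restart pbs_mom:
# in $TORQUE_HOME/mom_priv/config (assumed location)
$thread_unlink_calls false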