Intel Many Integrated Core (MIC) architecture-based device (e.g., Intel Xeon Phi™) metrics can be collected for nodes that:
MIC-based device metric tracking must be enabled in moab.cfg:
RMCFG[torque] flags=RECORDMICMETRICS
There are 11 metrics for each MIC-based device within a node. When you enable MIC-based device metric recording, you must increase the MAXGMETRIC value in moab.cfg by (maxmicdevices x micmetrics), where maxmicdevices is the largest number of MIC-based devices in any single node and micmetrics is 11. For example, if no node has more than 4 MIC-based devices, the increase is (4 x 11) = 44, so whatever the current MAXGMETRIC value is, it must be raised by 44. This ensures that Moab has enough GMETRIC types to accommodate the additional metrics.
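The MAXGMETRIC arithmetic can be sketched in a few lines of Python. This is only an illustration of the formula above: the function name `required_maxgmetric` and the starting value of 10 are assumptions made for the example, not values taken from Moab or from your configuration.

```python
# Number of metrics recorded per MIC-based device (stated in the text above).
MIC_METRICS_PER_DEVICE = 11


def required_maxgmetric(current_maxgmetric: int, max_mic_devices: int) -> int:
    """Return the MAXGMETRIC value needed once MIC metric recording is enabled.

    current_maxgmetric: the MAXGMETRIC value in use before enabling MIC metrics.
    max_mic_devices:    the largest number of MIC-based devices in any one node.
    """
    if current_maxgmetric < 0 or max_mic_devices < 0:
        raise ValueError("values must be non-negative")
    return current_maxgmetric + max_mic_devices * MIC_METRICS_PER_DEVICE


# Hypothetical starting value of 10 (an assumption for this example only).
print(required_maxgmetric(10, 4))  # 10 + (4 x 11) = 54
```

Substitute the MAXGMETRIC value currently in effect at your site for the first argument; the result is the value to set in moab.cfg.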
The MIC-based metric names map is as follows (where X is the MIC-based device number):
Metric name as returned by pbsnodes | GMETRIC name as stored in Moab | Metric output |
---|---|---|
mic_id | micX_mic_id | The ID of the MIC-based device |
num_cores | micX_num_cores | The number of cores in the MIC-based device |
num_threads | micX_num_threads | The number of hardware threads on the MIC-based device |
physmem | micX_physmem | The total physical memory in the MIC-based device |
free_physmem | micX_free_physmem | The available physical memory in the MIC-based device |
swap | micX_swap | The total swap space on the MIC-based device |
free_swap | micX_free_swap | The unused swap space on the MIC-based device |
max_frequency | micX_max_frequency | The maximum clock frequency of any core in the MIC-based device |
isa | micX_isa | The hardware interface type of the MIC-based device |
load | micX_load | The total current load of the MIC-based device |
normalized_load | micX_normalized_load | The normalized load of the MIC-based device (total load divided by number of cores in the MIC-based device) |
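The naming rule in the table (prefix each pbsnodes metric name with `micX_`, where X is the device number) can be sketched as follows. The helper names are hypothetical, and the example assumes device numbers start at 0, which the table does not state; adjust `first_device` if your devices are numbered differently.

```python
# The 11 metric names as returned by pbsnodes, taken from the table above.
MIC_METRIC_NAMES = [
    "mic_id", "num_cores", "num_threads", "physmem", "free_physmem",
    "swap", "free_swap", "max_frequency", "isa", "load", "normalized_load",
]


def gmetric_name(device_number: int, metric: str) -> str:
    """Build the GMETRIC name for one metric on one MIC-based device."""
    if metric not in MIC_METRIC_NAMES:
        raise ValueError(f"unknown MIC metric: {metric}")
    return f"mic{device_number}_{metric}"


def gmetric_names_for_node(num_devices: int, first_device: int = 0) -> list:
    """List every GMETRIC name for a node with num_devices MIC-based devices.

    first_device=0 is an assumption about how devices are numbered.
    """
    return [
        gmetric_name(dev, metric)
        for dev in range(first_device, first_device + num_devices)
        for metric in MIC_METRIC_NAMES
    ]


names = gmetric_names_for_node(4)
print(len(names))  # 4 devices x 11 metrics = 44
print(names[0])    # mic0_mic_id
```

The length of the generated list matches the (maxmicdevices x micmetrics) increase described for MAXGMETRIC, which is a quick way to check that enough GMETRIC types have been reserved.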