N.158.1 Synopsis
checknode options nodeID
ALL
N.158.2 Overview
This command shows detailed state information and statistics for nodes that run jobs.
The following information is returned by this command:
Name | Description |
---|---|
ACL | Node Access Control List (displayed only if set) |
ActiveTime | Total time node has been busy (allocated to active jobs) since statistics initialization expressed in HH:MM:SS notation (percent of time busy: ActiveTime/TotalTime) |
Adapters | Network adapters available |
Arch | Architecture |
Classes | Classes available |
Disk | Disk space available |
Downtime | Displayed only if downtime is scheduled |
EffNodeAccessPolicy | Configured effective node access policy |
Features | Features available |
Load | CPU Load (Berkeley one-minute load average) |
Memory | Memory available |
Opsys | Operating system |
RequestID | Dynamic Node RequestID set by the RM (displayed only if set) |
State | Node state |
StateTime | Time node has been in current state in HH:MM:SS notation |
Swap | Swap space available |
TotalTime | Total time node has been detected since statistics initialization expressed in HH:MM:SS notation |
TTL | Dynamic Node Time To Live set by the RM (expiration date, displayed only if set) |
UpTime | Total time node has been in an available (Non-Down) state since statistics initialization expressed in HH:MM:SS notation (percent of time up: UpTime/TotalTime) |
After displaying this information, some analysis is performed and any unusual conditions are reported.
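The percentages shown next to the Up and Active times can be reproduced from the time fields themselves. The following is a minimal sketch, not part of Moab: the helper names `parse_duration` and `percent_of_total` are hypothetical, and it assumes that a four-field value such as 5:23:28:36 (as seen in the examples on this page) carries a leading days field.

```python
def parse_duration(text: str) -> int:
    """Convert 'HH:MM:SS' or 'D:HH:MM:SS' to a number of seconds.

    Assumption: a four-field value has a leading days field.
    Negative offsets (as seen in reservation listings) are not handled.
    """
    parts = [int(p) for p in text.split(":")]
    if len(parts) == 3:
        parts = [0] + parts  # no days field present
    if len(parts) != 4:
        raise ValueError(f"unexpected duration format: {text!r}")
    days, hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def percent_of_total(part: str, total: str) -> float:
    """Return part/total as a percentage, e.g. UpTime/TotalTime."""
    total_seconds = parse_duration(total)
    if total_seconds == 0:
        return 0.0
    return 100.0 * parse_duration(part) / total_seconds


# Values taken from the first example on this page.
print(f"{percent_of_total('5:19:44:22', '5:23:28:36'):.2f}%")
```

With the Active and Total Time values from the first example, this yields the same 97.40% that checknode reports.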
N.158.3 Access
By default, this command can be run by any Moab Administrator (see ADMINCFG).
N.158.4 Parameters
Name | Description |
---|---|
NODE | Node name you want to check. Moab uses regular expressions to return any node that contains the provided argument. For example, if you ran checknode node1, Moab would return information about node1, node10, node100, etc. If you want to limit the results to node1 only, you would run checknode "^node1$". |
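The difference between an unanchored and an anchored pattern can be illustrated outside of Moab. The sketch below uses Python's `re` module purely as a stand-in; Moab's own regular expression engine may differ in details, and the node list and the `matching_nodes` helper are hypothetical.

```python
import re

# Hypothetical node names, chosen to mirror the example above.
node_names = ["node1", "node10", "node100", "node2"]


def matching_nodes(pattern: str, names: list[str]) -> list[str]:
    """Return the names in which the pattern matches anywhere."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Unanchored: matches every name that contains "node1".
unanchored = matching_nodes("node1", node_names)

# Anchored with ^ and $: matches only the exact name "node1".
anchored = matching_nodes("^node1$", node_names)

print(unanchored)
print(anchored)
```

When typing an anchored pattern in a shell, quote it (as in checknode "^node1$") so that the shell does not interpret the special characters.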
N.158.5 Flags
Name | Description |
---|---|
ALL | Returns checknode output on all nodes in the cluster. |
-h | Help for this command. |
-v | Returns verbose output. |
--xml | Output in XML format. Same as mdiag -n --xml. |
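When checknode is called from a script, the flags above can be assembled into an argument list. This is a minimal sketch under stated assumptions: the function `build_checknode_args` is hypothetical, it places options before the node argument as the synopsis suggests, and whether -v and --xml can be combined in a single call is not confirmed by this page.

```python
def build_checknode_args(node: str = "ALL",
                         verbose: bool = False,
                         xml: bool = False) -> list[str]:
    """Build an argument list for checknode (options first, then the node).

    `node` may be a node name, a regular expression, or ALL.
    """
    args = ["checknode"]
    if verbose:
        args.append("-v")      # verbose output
    if xml:
        args.append("--xml")   # XML output
    args.append(node)
    return args


print(build_checknode_args("^node1$", verbose=True))
```

Passing the resulting list to `subprocess.run` without a shell avoids the need to quote patterns such as ^node1$.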
Example N-143: checknode
> checknode P690-032
node P690-032
State: Busy (in current state for 11:31:10)
Configured Resources: PROCS: 1 MEM: 16G SWAP: 2000M DISK: 500G
Utilized Resources: PROCS: 1
Dedicated Resources: PROCS: 1
Opsys: AIX Arch: P690
Speed: 1.00 CPULoad: 1.000
Network: InfiniBand,Myrinet
Features: Myrinet
Attributes: [Batch]
Classes: [batch]
Total Time: 5:23:28:36 Up: 5:23:28:36 (100.00%) Active: 5:19:44:22 (97.40%)
Reservations:
  Job '13678'(x1) 10:16:12:22 -> 12:16:12:22 (2:00:00:00)
  Job '13186'(x1) -11:31:10 -> 1:12:28:50 (2:00:00:00)
Jobs: 13186
Example N-144: checknode ALL
> checknode ALL
node ahe
State: Idle (in current state for 00:00:30)
Configured Resources: PROCS: 12 MEM: 8004M SWAP: 26G DISK: 1M
Utilized Resources: PROCS: 1 SWAP: 4106M
Dedicated Resources: ---
MTBF(longterm): INFINITY MTBF(24h): INFINITY
Opsys: linux Arch: ---
Speed: 1.00 CPULoad: 1.400
Flags: rmdetected
Classes: [batch]
RM[ahe]*: TYPE=PBS
EffNodeAccessPolicy: SHARED
Total Time: 00:01:44 Up: 00:01:44 (100.00%) Active: 00:00:00 (0.00%)
Reservations: ---

node ahe-ubuntu32
State: Running (in current state for 00:00:05)
Configured Resources: PROCS: 12 MEM: 2013M SWAP: 3405M DISK: 1M
Utilized Resources: PROCS: 6 SWAP: 55M
Dedicated Resources: PROCS: 6
MTBF(longterm): INFINITY MTBF(24h): INFINITY
Opsys: linux Arch: ---
Speed: 1.00 CPULoad: 2.000
Flags: rmdetected
Classes: [batch]
RM[ahe]*: TYPE=PBS
EffNodeAccessPolicy: SHARED
Total Time: 00:01:44 Up: 00:01:44 (100.00%) Active: 00:00:02 (1.92%)
Reservations:
  6x2 Job:Running -00:00:07 -> 00:01:53 (00:02:00)
  7x2 Job:Running -00:00:06 -> 00:01:54 (00:02:00)
  8x2 Job:Running -00:00:05 -> 00:01:55 (00:02:00)
Jobs: 6,7,8

node ahe-ubuntu64
State: Busy (in current state for 00:00:06)
Configured Resources: PROCS: 12 MEM: 2008M SWAP: 3317M DISK: 1M
Utilized Resources: PROCS: 12 SWAP: 359M
Dedicated Resources: PROCS: 12
MTBF(longterm): INFINITY MTBF(24h): INFINITY
Opsys: linux Arch: ---
Speed: 1.00 CPULoad: 0.000
Flags: rmdetected
Classes: [batch]
RM[ahe]*: TYPE=PBS
EffNodeAccessPolicy: SHARED
Total Time: 00:01:44 Up: 00:01:44 (100.00%) Active: 00:00:55 (52.88%)
Reservations:
  0x2 Job:Running -00:01:10 -> 00:00:50 (00:02:00)
  1x2 Job:Running -00:00:20 -> 00:01:40 (00:02:00)
  2x2 Job:Running -00:00:20 -> 00:01:40 (00:02:00)
  3x2 Job:Running -00:00:17 -> 00:01:43 (00:02:00)
  4x2 Job:Running -00:00:13 -> 00:01:47 (00:02:00)
  5x2 Job:Running -00:00:07 -> 00:01:53 (00:02:00)
Jobs: 0,1,2,3,4,5
ALERT: node is in state Busy but load is low (0.000)
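Output from checknode ALL covers several nodes, so a script may want a per-node summary. The following is a rough sketch only: the `summarize` function is hypothetical, the sample text is abbreviated from the example above, and it assumes each ALERT message appears on its own line. The text layout is not a documented interface, so for anything robust the --xml output is likely a better starting point.

```python
import re

# Abbreviated sample based on the "checknode ALL" example; layout is assumed.
SAMPLE = """\
node ahe
State:      Idle  (in current state for 00:00:30)
node ahe-ubuntu32
State:   Running  (in current state for 00:00:05)
node ahe-ubuntu64
State:      Busy  (in current state for 00:00:06)
ALERT:  node is in state Busy but load is low (0.000)
"""

# A "node <name>" header followed by its State line.
NODE_STATE = re.compile(
    r"node\s+(\S+)\s+State:\s+(\w+)\s+\(in current state for ([\d:]+)\)"
)
# Assumption: each alert occupies a single line starting with "ALERT:".
ALERT = re.compile(r"^ALERT:\s*(.+)$", re.MULTILINE)


def summarize(output: str):
    """Return ({node: (state, time_in_state)}, [alert messages])."""
    states = {
        name: (state, elapsed)
        for name, state, elapsed in NODE_STATE.findall(output)
    }
    alerts = [message.strip() for message in ALERT.findall(output)]
    return states, alerts


states, alerts = summarize(SAMPLE)
for name, (state, elapsed) in states.items():
    print(f"{name}: {state} for {elapsed}")
for message in alerts:
    print("alert:", message)
```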
Example N-145: checknode node001 (Dynamic Node)
> checknode node001
node node001
State: Idle (in current state for 00:13:50)
Configured Resources: PROCS: 2 MEM: 4096M
Utilized Resources: PROCS: 2
Dedicated Resources: ---
ACL: USER==FRED+:==BOB+ GROUP==DEV+
MTBF(longterm): INFINITY MTBF(24h): INFINITY
Opsys: --- Arch: ---
Speed: 1.00 CPULoad: 2.000
Partition: local Rack/Slot: --- NodeIndex: 1
RM[local]*: TYPE=NATIVE:AGFULL
EffNodeAccessPolicy: SHARED
RequestID: 1234
TTL: Tue Nov 10 00:00:00 2015
Total Time: 2:21:19:05 Up: 2:21:19:05 (100.00%) Active: 00:00:00 (0.00%)
Reservations:
  node001-TTL-1234x1 User 441days -> INFINITY ( INFINITY)
    Blocked Resources@ 441days Procs: 2/2 (100.00%) Mem: 4096/4096 (100.00%) Swap: 1/1 (100.00%) Disk: 1/1 (100.00%)
ALERT: node is in state Idle but load is high (2.000)