TORQUE can report on the status of completed jobs for a configurable duration after each job finishes. To enable this, set the keep_completed attribute on the job execution queue or the keep_completed parameter on the server. The value is the number of seconds that completed jobs remain in the queue. If you set keep_completed on the job execution queue, completed jobs are reported in the C state and the exit status appears in the exit_status job attribute.
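The settings described above might be applied and checked roughly as follows. This is a minimal sketch, not a definitive procedure: the queue name batch and the helper function names are hypothetical, and the parsing assumes that qstat -f prints attributes as "name = value" lines.

```shell
# Hypothetical execution queue name; substitute your own.
QUEUE="batch"

# Keep completed jobs in the queue for the given number of seconds.
# Usage: enable_keep_completed 300
enable_keep_completed() {
    qmgr -c "set queue ${QUEUE} keep_completed = $1"
}

# Read `qstat -f <job_id>` output on stdin and print the exit_status value.
# Prints nothing if the attribute is absent (for example, the job is not complete).
job_exit_status() {
    awk -F' = ' '/^[[:space:]]*exit_status = /{print $2}'
}
```

For example, run enable_keep_completed 300 and then, once a job has finished, qstat -f <job_id> | job_exit_status while the job is still in the C state. The server-wide equivalent would be qmgr -c "set server keep_completed = 300".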
If the Mother Superior and TORQUE server are on the same server, expect the following behavior:
By maintaining status information about completed (or canceled, failed, etc.) jobs, administrators can better track failures and improve system performance. The retained status also lets TORQUE communicate job outcomes to Moab Workload Manager, so that Moab can track specific failures and schedule the workload around possible hazards. (See NODEFAILURERESERVETIME in "Appendix F: Parameters" of the Moab Workload Manager Administrator Guide for more information.)