5.649 Server Parameters

Torque server parameters are specified using the qmgr command. The set subcommand is used to modify the server object. For example:

> qmgr -c 'set server default_queue=batch'

5.649.1 Parameters

acl_group_hosts

acl_hosts

acl_host_enable

acl_logic_or

acl_user_hosts

allow_node_submit

allow_proxy_user

auto_node_np

automatic_requeue_exit_code

cgroup_per_task

checkpoint_defaults

clone_batch_delay

clone_batch_size

copy_on_rerun

cray_enabled

default_gpu_mode

default_queue

disable_automatic_requeue

disable_server_id_check

display_job_server_suffix

dont_write_nodes_file

down_on_error

email_batch_seconds

exit_code_canceled_job

ghost_array_recovery

gres_modifiers

idle_slot_limit

interactive_jobs_can_roam

job_exclusive_on_use

job_force_cancel_time

job_full_report_time

job_log_file_max_size

job_log_file_roll_depth

job_log_keep_days

job_nanny

job_stat_rate

job_start_timeout

job_suffix_alias

job_sync_timeout

keep_completed

kill_delay

legacy_vmem

lock_file

lock_file_update_time

lock_file_check_time

log_events

log_file_max_size

log_file_roll_depth

log_keep_days

log_level

mail_body_fmt

mail_domain

mail_from

mail_subject_fmt

managers

max_job_array_size

max_slot_limit

max_threads

max_user_queuable

max_user_run

min_threads

moab_array_compatible

mom_job_sync

next_job_number

node_check_rate

node_pack

node_ping_rate

node_submit_exceptions

no_mail_force

np_default

operators

pass_cpuclock

poll_jobs

query_other_jobs

record_job_info

record_job_script

resources_available

scheduling

submit_hosts

tcp_incoming_timeout

tcp_timeout

thread_idle_seconds

timeout_for_job_delete

timeout_for_job_requeue

use_jobs_subdirs

acl_group_hosts
Format group@host[,group@host]...
Default ---
Description Users who are members of the specified groups will be able to submit jobs from these otherwise untrusted hosts. Users who aren't members of the specified groups will not be able to submit jobs unless they are specified in acl_user_hosts.
acl_hosts
Format <HOST>[,<HOST>]... or <HOST>[range] or <HOST*> where the asterisk (*) can appear anywhere in the host name
Default Not set.
Description

Specifies a list of hosts which can have access to pbs_server when acl_host_enable is set to TRUE. This does not enable a node to submit jobs. To enable a node to submit jobs use submit_hosts.

Hosts that are listed in the TORQUE_HOME/server_priv/nodes file do not need to be added to this list.

Qmgr: set queue batch acl_hosts="hostA,hostB"
Qmgr: set queue batch acl_hosts+=hostC
Qmgr: set server acl_hosts="hostA,hostB"
Qmgr: set server acl_hosts+=hostC

 

In version 2.5 and later, the wildcard (*) character can appear anywhere in the host name, and ranges are supported; these specifications also work for managers and operators.

Qmgr: set server acl_hosts = "galaxy*.tom.org"
Qmgr: set server acl_hosts += "galaxy[0-50].tom.org"
acl_host_enable
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, hosts not in the pbs_server nodes file must be added to the acl_hosts list in order to get access to pbs_server.
acl_logic_or
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, the user and group queue ACLs are logically OR'd. When set to FALSE, they are AND'd.
acl_user_hosts
Format user@host[,user@host]...
Default ---
Description The specified users are allowed to submit jobs from otherwise untrusted hosts. By setting this parameter, other users at these hosts will not be allowed to submit jobs unless they are members of specified groups in acl_group_hosts.
allow_node_submit
Format <BOOLEAN>
Default FALSE
Description

When set to TRUE, allows all hosts in the PBSHOME/server_priv/nodes file (MOM nodes) to submit jobs to pbs_server.

To only allow qsub from a subset of all MOMs, use submit_hosts.

allow_proxy_user
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, specifies that users can proxy from one user to another. Proxy requests will be either validated by ruserok() or by the scheduler (see Job Submission).
auto_node_np
Format <BOOLEAN>
Default DISABLED
Description When set to TRUE, automatically configures a node's np (number of processors) value based on the ncpus value from the status update. Requires full manager privilege to set or alter.
automatic_requeue_exit_code
Format <LONG>
Default ---
Description This is an exit code, defined by the admin, that tells pbs_server to requeue the job instead of considering it as completed. This allows the user to add some additional checks that the job can run meaningfully, and if not, then the job script exits with the specified code to be requeued.
cgroup_per_task
Format <BOOLEAN>
Default FALSE
Description

When set to FALSE, jobs submitted with the -L syntax will have one cgroup created per host unless they specify otherwise at submission time. This behavior is similar to the pre-6.0 cpuset implementation.

When set to TRUE, jobs submitted with the -L syntax will have one cgroup created per task unless they specify otherwise at submission time.

Some MPI implementations are not compatible with using one cgroup per task.

See -L NUMA Resource Request for more information.

checkpoint_defaults
Format <STRING>
Default ---
Description

Specifies for a queue the default checkpoint values for a job that does not have checkpointing specified. The checkpoint_defaults parameter only takes effect on execution queues.

set queue batch checkpoint_defaults="enabled, periodic, interval=5"

clone_batch_delay
Format <INTEGER>
Default 1
Description Specifies the delay (in seconds) between clone batches (see clone_batch_size).
clone_batch_size
Format <INTEGER>
Default 256
Description Job arrays are created in batches of size X. X jobs are created, and after the clone_batch_delay, X more are created. This repeats until all are created.
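As a rough illustration of the batching arithmetic, the sketch below estimates how many batches a job array needs and how much delay accumulates between them. The array size of 1000 is a made-up value; the batch size and delay are the documented defaults. It counts only the delays between batches and ignores the time spent creating the jobs themselves.

```shell
# Hypothetical array size; batch size and delay are the documented defaults.
array_size=1000
clone_batch_size=256
clone_batch_delay=1

# Number of batches, rounded up.
batches=$(( (array_size + clone_batch_size - 1) / clone_batch_size ))
# One delay elapses between each pair of consecutive batches.
total_delay=$(( (batches - 1) * clone_batch_delay ))

echo "batches=$batches total_delay=${total_delay}s"
```

With these values the array is created in 4 batches separated by 3 seconds of delay in total.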
copy_on_rerun
Format <BOOLEAN>
Default FALSE
Description

When set to TRUE, Torque will copy the output and error files over to the user-specified directory when the qrerun command is executed (i.e. a job preemption). Output and error files are only created when a job is in running state before the preemption occurs.

pbs_server and pbs_mom need to be on the same version.

When you change the value, you must restart pbs_server for the change to take effect.

cray_enabled
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, specifies that this instance of pbs_server has Cray hardware that reports to it. See Installation Notes for Moab and Torque for Cray in the Moab Workload Manager Administrator Guide.
default_queue
Format <STRING>
Default ---
Description Indicates the queue to assign to a job if no queue is explicitly specified by the submitter.
default_gpu_mode
Format <STRING>
Default exclusive_thread
Description

Determines what GPU mode will be used for jobs that request GPUs but do not request a GPU mode. Valid entries are exclusive_thread, exclusive, exclusive_process, default, and shared.

If you are using CUDA 8, the default of exclusive_thread is no longer supported. We recommend that you set the default to exclusive_process.

disable_automatic_requeue
Format <BOOLEAN>
Default FALSE
Description

Normally, if a job cannot start due to a transient error, the MOM returns a special exit code to the server so that the job is requeued instead of completed. When this parameter is set, the special exit code is ignored and the job is completed.

disable_server_id_check
Format <BOOLEAN>
Default FALSE
Description

When set to TRUE, the user who owns the job does not have to exist on the server. The user must still exist on all the compute nodes or the job will fail when it tries to execute.

If you have disable_server_id_check set to TRUE, a user could request a group to which they do not belong. Setting VALIDATEGROUP to TRUE in the torque.cfg file prevents such a scenario (see "torque.cfg" Configuration File).

display_job_server_suffix
Format <BOOLEAN>
Default TRUE
Description

When set to TRUE, Torque will display both the job ID and the host name. When set to FALSE, only the job ID will be displayed.

If set to FALSE, the environment variable NO_SERVER_SUFFIX must be set to TRUE for pbs_track to work as expected.

display_job_server_suffix should not be set unless the server has no queued jobs. If it is set while the server has queued jobs, the server will have problems correctly identifying the job IDs of the existing jobs.

dont_write_nodes_file
Format <BOOLEAN>
Default FALSE
Description

When set to TRUE, the nodes file cannot be overwritten for any reason; qmgr commands to edit nodes will be rejected.

down_on_error
Format <BOOLEAN>
Default TRUE
Description

When set to TRUE, nodes that report an error from their node health check to pbs_server will be marked down and unavailable to run jobs.

email_batch_seconds
Format <INTEGER>
Default 0
Description

If set to a number greater than 0, emails will be sent in a batch every specified number of seconds, per addressee. For example, if this is set to 300, then each user will only receive emails every 5 minutes in the most frequent scenario. The addressee would then receive one email that contains all of the information which would've been sent out individually before. If it is unset or set to 0, then emails will be sent for every email event.

exit_code_canceled_job
Format <INTEGER>
Default ---
Description

When set, the exit code provided by the user is given to any job that is canceled, regardless of the job's state at the time of cancellation.

ghost_array_recovery
Format <BOOLEAN>
Default TRUE
Description

When TRUE, array subjobs will be recovered regardless of whether the .AR file was correctly recovered. This prevents the loss of running and queued jobs. However, it may no longer enforce a per-job slot limit or handle array dependencies correctly, as some historical information will be lost. When FALSE, array subjobs will not be recovered if the .AR file is invalid or non-existent.

gres_modifiers
Format <INTEGER>
Default ---
Description

List of users granted permission to modify the gres resource of their own running jobs. Note that users do not need special permission to modify the gres resource of their own queued jobs.

idle_slot_limit
Format <INTEGER>
Default 300
Description

Sets a default idle slot limit that will be applied to all arrays submitted after it is set.

The idle slot limit is the maximum number of sub jobs from an array that will be instantiated at once. For example, if this is set to 2, and an array with 1000 sub jobs is submitted, then only two will ever be idle (queued) at a time. Whenever an idle sub job runs or is deleted, then a new sub job will be instantiated until the array no longer has remaining sub jobs.

If this parameter is set and a user requests an idle slot limit at job submission (using qsub -i) that exceeds this setting, that array will be rejected. See also the qsub -i option.

Example
qmgr -c 'set server idle_slot_limit = 50'
interactive_jobs_can_roam
Format <BOOLEAN>
Default FALSE
Description By default, interactive jobs run from the login node that they submitted from. When TRUE, interactive jobs may run on login nodes other than the one where the jobs were submitted from. See Installation Notes for Moab and Torque for Cray in the Moab Workload Manager Administrator Guide.

With interactive_jobs_can_roam enabled, jobs will only go to nodes with the alps_login property set in the nodes file.

job_exclusive_on_use
Format <BOOLEAN>
Default FALSE
Description When job_exclusive_on_use is set to TRUE, pbsnodes will show job-exclusive on a node when at least one of its processors is running a job. This differs from the default behavior, which is to show job-exclusive on a node only when all of its processors are running a job.
Example
set server job_exclusive_on_use=TRUE
job_force_cancel_time
Format <INTEGER>
Default Disabled
Description If a job has been deleted and is still in the system after the specified number of seconds, the job will be purged from the system. This is mostly useful when a job is running on a large number of nodes and one node goes down. The job cannot be deleted because the MOM cannot be contacted, so the qdel fails and none of the other nodes can be reused. This parameter can be used to remedy such situations.
job_full_report_time
Format <INTEGER>
Default 300 seconds
Description Sets the time in seconds that a job should be fully reported after any kind of change to the job, even if condensed output was requested.
job_log_file_max_size
Format <INTEGER>
Default ---
Description This specifies a soft limit (in kilobytes) for the job log's maximum size. The file size is checked every five minutes and if the current day file size is greater than or equal to this value, it is rolled from <filename> to <filename.1> and a new empty log is opened. If the current day file size exceeds the maximum size a second time, the <filename.1> log file is rolled to <filename.2>, the current log is rolled to <filename.1>, and a new empty log is opened. Each new log causes all other logs to roll to an extension that is one greater than its current number. Any value less than 0 is ignored by pbs_server (meaning the log will not be rolled).
job_log_file_roll_depth
Format <INTEGER>
Default ---
Description This sets the maximum number of new log files that are kept in a day if the job_log_file_max_size parameter is set. For example, if the roll depth is set to 3, no file can roll higher than <filename.3>. If a file is already at the specified depth, such as <filename.3>, the file is deleted so it can be replaced by the incoming file roll, <filename.2>.
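The rolling scheme described for job_log_file_max_size and job_log_file_roll_depth can be sketched as follows. This is only an illustration of the documented behavior, not the pbs_server implementation: roll_log is a hypothetical helper, and the demonstration runs in a scratch directory rather than against real job logs.

```shell
# roll_log FILE DEPTH: hypothetical helper that mimics the rolling scheme described above.
roll_log() {
    file=$1
    depth=$2
    # A file already at the maximum depth is deleted to make room.
    rm -f "$file.$depth"
    # Shift the remaining rolled files up by one, highest number first.
    i=$(( depth - 1 ))
    while [ "$i" -ge 1 ]; do
        if [ -e "$file.$i" ]; then
            mv "$file.$i" "$file.$(( i + 1 ))"
        fi
        i=$(( i - 1 ))
    done
    # The current log becomes FILE.1 and a new empty log is opened.
    mv "$file" "$file.1"
    : > "$file"
}

# Demonstration in a scratch directory with a roll depth of 3.
scratch=$(mktemp -d)
log="$scratch/joblog"
for n in 1 2 3 4; do
    echo "contents $n" > "$log"
    roll_log "$log" 3
done
ls "$scratch"
```

After four rolls with a depth of 3, the oldest contents have been discarded and no file is numbered higher than joblog.3.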
job_log_keep_days
Format <INTEGER>
Default ---
Description This maintains logs for the number of days designated. If set to 4, any log file more than 4 days old is deleted.
job_nanny
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, enables the experimental "job deletion nanny" feature. All job cancels will create a repeating task that will resend KILL signals if the initial job cancel failed. Further job cancels will be rejected with the message "job cancel in progress." This is useful for temporary failures with a job's execution node during a job delete request.
job_stat_rate
Format <INTEGER>
Default 300 (30 in Torque 1.2.0p5 and earlier)
Description

If the mother superior has not sent an update within the specified time, pbs_server requests a job status update from the mother superior.

job_start_timeout
Format <INTEGER>
Default ---
Description Specifies the pbs_server to pbs_mom TCP socket timeout in seconds that is used when the pbs_server sends a job start to the pbs_mom. It is useful when the MOM has extra overhead involved in starting jobs. If not specified, then the tcp_timeout parameter is used.
job_suffix_alias
Format <STRING>
Default ---
Description

Allows the job suffix to be defined by the user.

job_suffix_alias should not be set unless the server has no queued jobs. If it is set while the server has queued jobs, the server will have problems correctly identifying the job IDs of the existing jobs.

Example
qmgr -c 'set server job_suffix_alias = biology'

When a job is submitted after this, its jobid will have .biology on the end: 14.napali.biology. If display_job_server_suffix is set to false, it would be named 14.biology.

job_sync_timeout
Format <INTEGER>
Default 60
Description When a stray job is reported on multiple nodes, the server sends a kill signal to one node at a time. This timeout determines how long the server waits between kills if the job is still being reported on any nodes.
keep_completed
Format <INTEGER>
Default 300
Description

The amount of time (in seconds) a job will be kept in the queue after it has entered the completed state. keep_completed must be set for job dependencies to work.

For more information, see Keeping Completed Jobs.

kill_delay
Format <INTEGER>
Default

If using qdel, 2 seconds

If using qrerun, 0 (no wait)

Description

Specifies the number of seconds between sending a SIGTERM and a SIGKILL to a job you want to cancel. It is possible that the job script, and any child processes it spawns, can receive several SIGTERM signals before the SIGKILL signal is received.

All MOMs must be configured with $exec_with_exec true in order for kill_delay to work, even when relying on default kill_delay settings.

If kill_delay is set for a queue, the queue setting overrides the server setting. See kill_delay in 5.645 Queue Attributes.

Example

qmgr -c "set server kill_delay=30"

legacy_vmem
Format <BOOLEAN>
Default FALSE
Description

When set to TRUE, the vmem request will be the amount of memory requested for each node of the job. When it is unset or FALSE, vmem will be the amount of memory for the entire job and will be divided among the nodes accordingly.

lock_file
Format <STRING>
Default torque/server_priv/server.lock
Description

Specifies the name and location of the lock file used to determine which high availability server should be active.

If a full path is specified, it is used verbatim by Torque. If a relative path is specified, Torque will prefix it with torque/server_priv.

lock_file_update_time
Format <INTEGER>
Default 3
Description Specifies how often (in seconds) the thread will update the lock file.
lock_file_check_time
Format <INTEGER>
Default 9
Description Specifies how often (in seconds) a high availability server will check to see if it should become active.
log_events
Format Bitmap
Default ---
Description

By default, all events are logged. However, you can customize things so that only certain events show up in the log file. These are the bitmaps for the different kinds of logs:

#define PBSEVENT_ERROR 0x0001 /* internal errors */
#define PBSEVENT_SYSTEM 0x0002 /* system (server) events */
#define PBSEVENT_ADMIN 0x0004 /* admin events */
#define PBSEVENT_JOB 0x0008 /* job related events */
#define PBSEVENT_JOB_USAGE 0x0010 /* End of Job accounting */
#define PBSEVENT_SECURITY 0x0020 /* security violation events */
#define PBSEVENT_SCHED 0x0040 /* scheduler events */
#define PBSEVENT_DEBUG 0x0080 /* common debug messages */
#define PBSEVENT_DEBUG2 0x0100 /* less needed debug messages */
#define PBSEVENT_FORCE 0x8000 /* set to force a message */

 

If you want to log only error, system, and job information, use qmgr to set log_events to 11:

set server log_events = 11
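The value 11 is the bitwise OR of the selected event bits. A minimal sketch of that calculation, using only the bit values listed above, prints the resulting directive instead of running it so that it can be reviewed first:

```shell
# Event bits copied from the list above.
PBSEVENT_ERROR=0x0001
PBSEVENT_SYSTEM=0x0002
PBSEVENT_JOB=0x0008

# OR the selected bits together; the shell reports the result in decimal.
log_events=$(( $PBSEVENT_ERROR | $PBSEVENT_SYSTEM | $PBSEVENT_JOB ))

# Print the directive rather than applying it.
echo "set server log_events = $log_events"
```

The same pattern works for any other combination of bits: add or remove variables in the OR expression.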

log_file_max_size
Format <INTEGER>
Default 0
Description Specifies a soft limit, in kilobytes, for the server's log file. The file size is checked every 5 minutes, and if the current day file size is greater than or equal to this value then it will be rolled from X to X.1 and a new empty log will be opened. Any value less than or equal to 0 will be ignored by pbs_server (the log will not be rolled).
log_file_roll_depth
Format <INTEGER>
Default 1
Description Controls how deep the current day log files will be rolled, if log_file_max_size is set, before they are deleted.
log_keep_days
Format <INTEGER>
Default 0
Description Specifies how long (in days) a server or MOM log should be kept.
log_level
Format <INTEGER>
Default 0
Description Specifies the pbs_server logging verbosity. Maximum value is 7.
mail_body_fmt
Format A printf-like format string
Default PBS Job Id: %i Job Name: %j Exec host: %h %m %d
Description

Override the default format for the body of outgoing mail messages. A number of printf-like format specifiers and escape sequences can be used:

\n new line
\t tab
\\ backslash
\' single quote
\" double quote
%d details concerning the message
%h PBS host name
%i PBS job identifier
%j PBS job name
%m long reason for message
%r short reason for message
%% a single %

mail_domain
Format <STRING>
Default ---
Description Override the default domain for outgoing mail messages. If set, emails will be addressed to <user>@<hostdomain>. If unset, the job's Job_Owner attribute will be used. If set to never, Torque will never send emails.
mail_from
Format <STRING>
Default adm
Description Specify the name of the sender when Torque sends emails.
mail_subject_fmt
Format A printf-like format string
Default PBS JOB %i
Description

Override the default format for the subject of outgoing mail messages. A number of printf-like format specifiers and escape sequences can be used:

\n new line
\t tab
\\ backslash
\' single quote
\" double quote
%d details concerning the message
%h PBS host name
%i PBS job identifier
%j PBS job name
%m long reason for message
%r short reason for message
%% a single %

managers
Format <user>@<host.sub.domain>[,<user>@<host.sub.domain>...]
Default root@localhost
Description List of users granted batch administrator privileges. The host, sub-domain, or domain name may be wildcarded by the use of an asterisk character (*). Requires full manager privilege to set or alter.
max_job_array_size
Format <INTEGER>
Default Unlimited
Description Sets the maximum number of jobs that can be in a single job array.
max_slot_limit
Format <INTEGER>
Default Unlimited
Description

This is the maximum number of jobs that can run concurrently in any job array. Slot limits can be applied at submission time with qsub, or they can be modified with qalter.

qmgr -c 'set server max_slot_limit=10'

No array can request a slot limit greater than 10. Any array that does not request a slot limit receives a slot limit of 10. Using the example above, slot requests greater than 10 are rejected with the message: "Requested slot limit is too large, limit is 10."

max_user_run
Format <INTEGER>
Default unlimited
Description

This limits the maximum number of jobs a user can have running for the given server.

Example
qmgr -c "set server max_user_run=5"
max_threads
Format <INTEGER>
Default The value of min_threads, which is (2 * the number of procs listed in /proc/cpuinfo) + 1, multiplied by 20
Description This is the maximum number of threads that should exist in the thread pool at any time. See Setting min_threads and max_threads for more information.
max_user_queuable
Format <INTEGER>
Default Unlimited
Description

When set, max_user_queuable places a system-wide limit on the number of jobs that an individual user can queue.

qmgr -c 'set server max_user_queuable=500'

min_threads
Format <INTEGER>
Default (2 * the number of procs listed in /proc/cpuinfo) + 1. If Torque is unable to read /proc/cpuinfo, the default is 10.
Description This is the minimum number of threads that should exist in the thread pool at any time. See Setting min_threads and max_threads for more information.
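To see what the documented defaults for min_threads and max_threads work out to on a given host, the formulas can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and counting lines that begin with "processor" is an assumption about how the processors in /proc/cpuinfo are counted.

```shell
# default_min_threads NPROCS: documented default, (2 * procs) + 1.
default_min_threads() {
    echo $(( 2 * $1 + 1 ))
}

# default_max_threads NPROCS: documented default, min_threads * 20.
default_max_threads() {
    echo $(( $(default_min_threads "$1") * 20 ))
}

# Assumption: each processor appears as a line starting with "processor".
# The fallback of 10 applies when /proc/cpuinfo cannot be read.
if [ -r /proc/cpuinfo ]; then
    procs=$(grep -c '^processor' /proc/cpuinfo || true)
    min_threads=$(default_min_threads "$procs")
else
    min_threads=10
fi
echo "min_threads=$min_threads max_threads=$(( min_threads * 20 ))"
```

For example, a host with 8 processors would default to 17 minimum threads and 340 maximum threads.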
moab_array_compatible
Format <BOOLEAN>
Default TRUE
Description This parameter places a hold on jobs that exceed the slot limit in a job array. When one of the active jobs is completed or deleted, one of the held jobs goes to a queued state.
mom_job_sync
Format <BOOLEAN>
Default TRUE
Description

When set to TRUE, specifies that the pbs_server will synchronize its view of the job queue and resource allocation with compute nodes as they come online. If a job exists on a compute node, it will be automatically cleaned up and purged. (Enabled by default in Torque 2.2.0 and higher.)

Jobs that are no longer reported by the mother superior are automatically purged by pbs_server. Jobs that pbs_server instructs the MOM to cancel have their processes killed in addition to being deleted (instead of leaving them running as in versions of Torque prior to 4.1.1).

next_job_number
Format <INTEGER>
Default ---
Description

Specifies the ID number of the next job. If you set your job number too low and Torque repeats a job number that it has already used, the job will fail. Before setting next_job_number to a number lower than any number that Torque has already used, you must clear out your .e and .o files.

If you use Moab Workload Manager (and have configured it to synchronize job IDs with Torque), then Moab will generate the job ID and next_job_number will have no effect on the job ID. See Resource Manager Configuration in the Moab Workload Manager Administrator Guide for more information.

node_check_rate
Format <INTEGER>
Default 600
Description Specifies the minimum duration (in seconds) that a node can fail to send a status update before being marked down by the pbs_server daemon.
node_pack
Description This is deprecated.
node_ping_rate
Format <INTEGER>
Default 300
Description Specifies the maximum interval (in seconds) between successive "pings" sent from the pbs_server daemon to the pbs_mom daemon to determine node/daemon health.
node_submit_exceptions
Format <STRING>
Default ---
Description When set in conjunction with allow_node_submit, these nodes will not be allowed to submit jobs.
no_mail_force
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, eliminates all e-mails when mail_options (see qsub) is set to "n". The job owner won't receive e-mails when a job is deleted by a different user or a job failure occurs. If no_mail_force is unset or is FALSE, then the job owner receives e-mails when a job is deleted by a different user or a job failure occurs.
np_default
Format <INTEGER>
Default ---
Description

Allows the administrator to unify the number of processors (np) on all nodes. The value can be dynamically changed. A value of 0 tells pbs_server to use the value of np found in the nodes file. The maximum value is 32767.

np_default sets a minimum number of np per node. Nodes with fewer np than np_default get additional execution slots.

operators
Format <user>@<host.sub.domain>[,<user>@<host.sub.domain>...]
Default root@localhost
Description List of users granted batch operator privileges. Requires full manager privilege to set or alter.
pass_cpuclock
Format <BOOLEAN>
Default TRUE
Description

If set to TRUE, the pbs_server daemon passes the option and its value to the pbs_mom daemons for direct implementation by the daemons, making the CPU frequency adjustable as part of a resource request by a job submission.

If set to FALSE, the pbs_server daemon creates and passes a PBS_CPUCLOCK job environment variable to the pbs_mom daemons that contains the value of the cpuclock attribute used as part of a resource request by a job submission. The CPU frequencies on the MOMs are not adjusted. The environment variable is for use by prologue and epilogue scripts, enabling administrators to log and research when users are making cpuclock requests, as well as researchers and developers to perform CPU clock frequency changes using a method outside of that employed by the Torque pbs_mom daemons.

poll_jobs
Format <BOOLEAN>
Default TRUE (FALSE in Torque 1.2.0p5 and earlier)
Description

If set to TRUE, pbs_server will poll job info from MOMs over time and will not block on handling requests which require this job information.

If set to FALSE, no polling will occur and if requested job information is stale, pbs_server may block while it attempts to update this information. For large systems, this value should be set to TRUE.

query_other_jobs
Format <BOOLEAN>
Default FALSE
Description When set to TRUE, non-admin users may view jobs they do not own.
record_job_info
Format <BOOLEAN>
Default FALSE
Description This must be set to TRUE in order for job logging to be enabled.
record_job_script
Format <BOOLEAN>
Default FALSE
Description

If set to TRUE, this adds the contents of the script executed by a job to the log.

For record_job_script to take effect, record_job_info must be set to TRUE.
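Because record_job_script depends on record_job_info, the two are usually set together. The sketch below only prints the directives; job_logging_directives is a hypothetical helper name, and the output is meant to be reviewed and then piped into qmgr on the server host.

```shell
# Emit both directives: job logging first, then script capture, which depends on it.
job_logging_directives() {
    cat <<'EOF'
set server record_job_info = TRUE
set server record_job_script = TRUE
EOF
}

# Print the directives for review; nothing is applied to a server here.
job_logging_directives
```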

resources_available
Format <STRING>
Default ---
Description Allows overriding of detected resource quantities (see Assigning Queue Resource Limits). pbs_server must be restarted for changes to take effect. Also, resources_available is constrained by the smaller of queue.resources_available and server.resources_available.
scheduling
Format <BOOLEAN>
Default ---
Description Allows pbs_server to be scheduled. When FALSE, pbs_server is a resource manager that works on its own. When TRUE, Torque allows a scheduler, such as Moab or Maui, to dictate what pbs_server should do.
submit_hosts
Format <HOSTNAME>[,<HOSTNAME>]...
Default Not set.
Description

Hosts in this list are able to submit jobs. This applies to any node whether within the cluster or outside of the cluster.

If acl_host_enable is set to TRUE and the host is not in the PBSHOME/server_priv/nodes file, then the host must also be in the acl_hosts list.

To allow qsub from all compute nodes instead of just a subset of nodes, use allow_node_submit.

tcp_incoming_timeout
Format <INTEGER>
Default 600
Description

Specifies the timeout for incoming TCP connections to pbs_server. Functions exactly the same as tcp_timeout, but governs incoming connections while tcp_timeout governs only outgoing connections (or connections initiated by pbs_server).

If you use Moab Workload Manager, prevent communication errors by giving tcp_incoming_timeout at least twice the value of the Moab RMPOLLINTERVAL. See RMPOLLINTERVAL for more information.
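The "at least twice RMPOLLINTERVAL" rule can be checked with simple arithmetic. In the sketch below the poll interval of 400 seconds is a made-up value standing in for your Moab setting, and 600 is the documented default for tcp_incoming_timeout. The directive is printed rather than applied.

```shell
# Hypothetical Moab poll interval in seconds; use the value from your Moab configuration.
rm_poll_interval=400
# Documented default for tcp_incoming_timeout.
tcp_incoming_timeout=600

# Raise the timeout only if it is below twice the poll interval.
minimum=$(( 2 * rm_poll_interval ))
if [ "$tcp_incoming_timeout" -lt "$minimum" ]; then
    tcp_incoming_timeout=$minimum
fi
echo "set server tcp_incoming_timeout = $tcp_incoming_timeout"
```

With these example values the default of 600 is too low, so the sketch raises it to 800.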

tcp_timeout
Format <INTEGER>
Default 300
Description

Specifies the timeout for idle outbound TCP connections. If no communication is received by the server on the connection after the timeout, the server closes the connection. There is an exception for connections made to the server on port 15001 (default); timeout events are ignored on the server for such connections established by a client utility or scheduler. Responsibility rests with the client to close the connection first. See Large Cluster Considerations for additional information.

Use tcp_incoming_timeout to specify the timeout for idle inbound TCP connections.

thread_idle_seconds
Format <INTEGER>
Default 300
Description This is the number of seconds a thread can be idle in the thread pool before it is deleted. If threads should not be deleted, set to -1. Torque will always maintain at least min_threads number of threads, even if all are idle.
timeout_for_job_delete
Format <INTEGER> (seconds)
Default 120
Description The specific timeout used when deleting jobs because the node they are executing on is being deleted.
timeout_for_job_requeue
Format <INTEGER> (seconds)
Default 120
Description The specific timeout used when requeuing jobs because the node they are executing on is being deleted.
use_jobs_subdirs
Format <BOOLEAN>
Default Not set (FALSE).
Description

Lets an administrator direct the way pbs_server will store its job-related files.

  • When use_jobs_subdirs is unset (or set to FALSE), job and job array files will be stored directly under $PBS_HOME/server_priv/jobs and $PBS_HOME/server_priv/arrays.
  • When use_jobs_subdirs is set to TRUE, job and job array files will be distributed over 10 subdirectories under their respective parent directories. This method helps to keep a smaller number of files in a given directory.

    This setting does not automatically move existing job and job array files into the respective subdirectories. If you choose to use this setting (TRUE), you must complete the following steps in order:

    • Set use_jobs_subdirs to TRUE.
    • Shut down the Torque server daemon.
    • In the contrib directory, run the "use_jobs_subdirs_setup" Python script with the -m option.
    • Start the Torque server daemon.
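The steps above can be sketched as a dry-run plan that prints each command without running it. Several details are assumptions rather than documented facts: the contrib directory path is a placeholder, the script is assumed to be invoked through python under the name given above, and qterm and pbs_server are assumed to be how the daemon is stopped and started at your site (many installations use a service manager instead).

```shell
# Hypothetical location of the Torque contrib directory; adjust for your installation.
CONTRIB_DIR=${CONTRIB_DIR:-/path/to/torque/contrib}

# Print each step instead of running it; nothing here touches a live server.
migration_plan() {
    cat <<EOF
qmgr -c 'set server use_jobs_subdirs = TRUE'
qterm
python $CONTRIB_DIR/use_jobs_subdirs_setup -m
pbs_server
EOF
}

migration_plan
```

Review the printed commands and adapt them to your site before running any of them by hand.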

© 2016 Adaptive Computing