20.5 Intel® Xeon Phi™ Coprocessor Configuration

20.5-A Intel Many-Integrated Cores (MIC) architecture configuration

If your cluster uses an Intel Many-Integrated Cores (MIC) architecture-based product (for example, Intel Xeon Phi™) for parallel processing, you must configure TORQUE to detect the MIC-based devices.

Prerequisites

Setup Options

There are two ways to configure MIC-based devices with TORQUE: (1) manually and (2) by auto-detection.

Manual configuration

napali np=12 mics=2
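
The entry above has the shape "hostname key=value ...", here declaring 12 processors (np) and 2 MIC devices (mics) for the host napali. The source does not name the file this line belongs to, so the following is only a minimal sketch for sanity-checking such an entry before using it; the helper name parse_node_entry is hypothetical and is not part of TORQUE.

```python
def parse_node_entry(line):
    """Split a 'host key=value ...' entry into a host name and integer counts."""
    host, *fields = line.split()
    counts = {}
    for field in fields:
        key, sep, value = field.partition("=")
        # Keep only key=value pairs whose value is a plain non-negative integer.
        if sep and value.isdigit():
            counts[key] = int(value)
    return host, counts


host, counts = parse_node_entry("napali np=12 mics=2")
print(host, counts.get("np"), counts.get("mics"))
```

Tokens that are not integer-valued key=value pairs are ignored by this sketch rather than rejected.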

Auto-detect

When you use auto-detection, pbs_mom discovers the MIC-based devices and reports them to pbs_server.

./configure --enable-mics <other configure options>

20.5-B Validating the configuration

TORQUE

pbsnodes

Example 20-2: pbsnodes output

slesmic
	state = free
	np = 100
	ntype = cluster
	status = rectime=1347634381,varattr=,jobs=,state=free,netload=7442004852,gres=,loadave=0.00,ncpus=32,physmem=65925692kb,availmem=66531344kb,totmem=68028984kb,idletime=59059,nusers=2,nsessions=8,sessions=4387 4391 4392 4436 4439 4443 4459 100395,uname=Linux slesmic 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64,opsys=linux
	mom_service_port = 15002
	mom_manager_port = 15003
	mics = 2
	mic_status = mic[1]=mic_id=8796;num_cores=61;num_threads=244;physmem=8065748992;free_physmem=7854972928;swap=0;free_swap=0;max_frequency=1090;isa=COI_ISA_KNC;load=0.000000;normalized_load=0.000000;,mic[0]=mic_id=8796;num_cores=61;num_threads=244;physmem=8065748992;free_physmem=7872712704;swap=0;free_swap=0;max_frequency=1090;isa=COI_ISA_KNC;load=0.540000;normalized_load=0.008852;

rhmic.ac
	state = free
	np = 100
	ntype = cluster
	status = rectime=1347634381,varattr=,jobs=,state=free,netload=3006171583,gres=,loadave=0.00,ncpus=32,physmem=65918268kb,availmem=66901588kb,totmem=67982644kb,idletime=59477,nusers=2,nsessions=2,sessions=3401 29320,uname=Linux rhmic.ac 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:22 GMT 2011 x86_64,opsys=linux
	mom_service_port = 15002
	mom_manager_port = 15003
	mics = 1
	mic_status = mic[0]=mic_id=8796;num_cores=61;num_threads=244;physmem=8065748992;free_physmem=7872032768;swap=0;free_swap=0;max_frequency=1090;isa=COI_ISA_KNC;load=0.540000;normalized_load=0.008852;<mic_status>;
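
In the output above, each mic_status value packs one record per device: records are separated by commas, and the fields inside a record are separated by semicolons. As an illustration only, the sketch below splits such a value into per-device dictionaries. The function name parse_mic_status is hypothetical, and the layout it assumes is inferred solely from Example 20-2, not from a format specification.

```python
import re


def parse_mic_status(value):
    """Return {device_index: {field_name: raw_text}} for a mic_status value."""
    devices = {}
    # Each device appears as mic[N]=key=value;key=value;... and devices
    # are separated by commas.
    for match in re.finditer(r"mic\[(\d+)\]=([^,]*)", value):
        fields = {}
        for item in match.group(2).split(";"):
            key, sep, text = item.partition("=")
            if sep:  # skip empty pieces and tokens that carry no '='
                fields[key] = text
        devices[int(match.group(1))] = fields
    return devices


# Abbreviated from the slesmic entry in Example 20-2.
sample = (
    "mic[1]=mic_id=8796;num_cores=61;num_threads=244;"
    "physmem=8065748992;free_physmem=7854972928;,"
    "mic[0]=mic_id=8796;num_cores=61;num_threads=244;"
    "physmem=8065748992;free_physmem=7872712704;"
)

for index, fields in sorted(parse_mic_status(sample).items()):
    used = int(fields["physmem"]) - int(fields["free_physmem"])
    print(f"mic[{index}]: {fields['num_cores']} cores, {used} bytes in use")
```

Values are kept as text so that the caller decides which fields to convert to numbers.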

Moab

mdiag -n -v

Example 20-3: mdiag -n -v output

$ mdiag -n -v 
compute node summary
Name                    State   Procs      Memory         Disk          Swap      Speed   Opsys   Arch Par   Load Classes                        Features              

hola                     Idle    4:4      8002:8002        1:1       10236:13723   1.00   linux      - hol   0.24 [batch]                       -                    GRES=MICS:2,
-----                     ---    4:4      8002:8002        1:1       10236:13723  

Total Nodes: 1  (Active: 0  Idle: 1  Down: 0)

checknode -v

Example 20-4: checknode output

$ checknode slesmic
node slesmic

State:      Idle  (in current state for 00:00:16)
Configured Resources: PROCS: 100  MEM: 62G  SWAP: 64G  DISK: 1M  MICS: 2
Utilized   Resources: SWAP: 1581M
Dedicated  Resources: ---
Generic Metrics:    mic1_mic_id=8796.00,mic1_num_cores=61.00,mic1_num_threads=244.00,mic1_physmem=8065748992.00,mic1_free_physmem=7854972928.00,mic1_swap=0.00,mic1_free_swap=0.00,mic1_max_frequency=1090.00,mic1_load=0.12,mic1_normalized_load=0.00,mic0_mic_id=8796.00,mic0_num_cores=61.00,mic0_num_threads=244.00,mic0_physmem=8065748992.00,mic0_free_physmem=7872679936.00,mic0_swap=0.00,mic0_free_swap=0.00,mic0_max_frequency=1090.00
  MTBF(longterm):   INFINITY  MTBF(24h):   INFINITY
Opsys:      linux     Arch:      ---   
Speed:      1.00      CPULoad:   0.000
Classes:    [batch]
RM[napali]* TYPE=PBS
EffNodeAccessPolicy: SHARED

Total Time: 3:45:43  Up: 3:45:43 (100.00%)  Active: 00:00:00 (0.00%)

Reservations:
  ---
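
The Generic Metrics line in the checknode output flattens the per-device values into names of the form micN_field. If you want to compare devices, it can help to regroup them by device index. The sketch below is one way to do that; group_mic_metrics is a hypothetical helper, and the naming pattern is assumed from Example 20-4 only.

```python
import re


def group_mic_metrics(metrics):
    """Group 'micN_field=value' pairs into {N: {field: float_value}}."""
    devices = {}
    for item in metrics.split(","):
        name, sep, value = item.strip().partition("=")
        match = re.fullmatch(r"mic(\d+)_(\w+)", name)
        if sep and match:  # ignore metrics that do not follow the micN_ pattern
            index, field = int(match.group(1)), match.group(2)
            devices.setdefault(index, {})[field] = float(value)
    return devices


# Abbreviated from the Generic Metrics line in Example 20-4.
sample = (
    "mic1_num_cores=61.00,mic1_load=0.12,"
    "mic0_num_cores=61.00,mic0_max_frequency=1090.00"
)

for index, fields in sorted(group_mic_metrics(sample).items()):
    print(index, fields)
```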

20.5-C Job submission

Syntax

Example 20-5: Request MIC-based device(s) in qsub

qsub .... -l nodes=X:mics=Y

Because the resources are delimited with a colon, this command requests a job with X nodes and Y MICs per task. If you instead delimit the resources with a comma (qsub .... -l nodes=X,mics=Y), the job requests X nodes and Y MICs per job.
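
Since the only difference between the two forms is the delimiter, a submission script can build the resource string from a single flag. The following is a minimal sketch of that idea; mic_resource_request is a hypothetical helper that only builds the string and does not call qsub.

```python
def mic_resource_request(nodes, mics, per_task=True):
    """Build the value for qsub -l: a colon for per-task MICs, a comma for per-job MICs."""
    delimiter = ":" if per_task else ","
    return f"nodes={nodes}{delimiter}mics={mics}"


print(mic_resource_request(2, 1))                  # per-task form
print(mic_resource_request(2, 1, per_task=False))  # per-job form
```

The returned string is meant to be passed as the argument that follows -l on the qsub command line.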

qstat -f

Example 20-6: qstat -f output

Job Id: 5271.napali
Job_Name = STDIN
Job_Owner = dbeer@napali
job_state = Q
queue = batch
server = napali
Checkpoint = u
ctime = Fri Sep 14 08:56:33 2012
Error_Path = napali:/home/dbeer/dev/private-torque/trunk/STDIN.e5271
Hold_Types = n
Join_Path = oe
Keep_Files = n
Mail_Points = a
mtime = Fri Sep 14 08:56:33 2012
Output_Path = napali:/home/dbeer/dev/private-torque/trunk/STDIN.o5271
Priority = 0
qtime = Fri Sep 14 08:56:33 2012
Rerunable = True
Resource_List.neednodes = 1:mics=1
Resource_List.nodect = 1
Resource_List.nodes = 1:mics=1
substate = 10
Variable_List = PBS_O_QUEUE=batch,PBS_O_HOME=/home/dbeer,
	PBS_O_LOGNAME=dbeer,
	PBS_O_PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/b
	in:/usr/games,PBS_O_MAIL=/var/mail/dbeer,PBS_O_SHELL=/bin/bash,
	PBS_O_LANG=en_US.UTF-8,
	PBS_O_SUBMIT_FILTER=/usr/local/sbin/torque_submitfilter,
	PBS_O_WORKDIR=/home/dbeer/dev/private-torque/trunk,PBS_O_HOST=napali,
	PBS_O_SERVER=napali
euser = dbeer
egroup = company
queue_rank = 3
queue_type = E
etime = Fri Sep 14 08:56:33 2012
submit_args = -l nodes=1:mics=1
fault_tolerant = False
job_radix = 0
submit_host = napali

checkjob -v

Example 20-7: checkjob -v output

dthompson@mahalo:~/dev/moab-test/trunk$ checkjob -v 2
job 2 (RM job '2.mahalo')

AName: STDIN
State: Idle 
Creds:  user:dthompson  group:dthompson  class:batch
WallTime:   00:00:00 of 1:00:00
SubmitTime: Thu Sep 13 17:06:06
(Time Queued  Total: 00:00:24  Eligible: 00:00:02)

TemplateSets:  DEFAULT
Total Requested Tasks: 1

Req[0]  TaskCount: 1  Partition: ALL
Dedicated Resources Per Task: PROCS: 1  MICS: 1

...