You can stage data in an environment where multiple Moab instances run in a grid configuration. In this arrangement, each cluster shares a file system with all of its compute nodes, and data staging makes data available to a job that runs on a set of nodes in one of the grid's clusters. You must specify where the remote data can be obtained (a username, host name, and path to a file or directory) and where on the shared storage Moab should place it. The remote data source location is known at job submission time, but because the cluster on which the job will be scheduled is not, you must use the $CLUSTERHOST placeholder for the host name of the destination data transfer server. After the job runs, you can also copy data from the cluster's shared file system back to a remote file system.
Note that you cannot stage data to or from a local compute node with its own local storage in a grid environment.
Image 24-2: Data staging in a grid
To stage data to or from a shared file system in a grid
Create your job templates for data staging jobs in moab.cfg. The templates in the example below create a compute job that stages data in before it starts and stages data out when it completes. For more information about creating job templates, see About Job Templates.
Add FLAGS=GRESONLY to indicate that this data staging job does not require any compute resources.
If you use the rsync protocol, you can configure your data staging jobs to report the actual number of bytes transferred and the total data size to be transferred. To do so, set the Sets attribute to ^BYTES_IN.^DATA_SIZE_IN for stage in jobs and ^BYTES_OUT.^DATA_SIZE_OUT for stage out jobs. For example, a stage in trigger would look like the following:
JOBCFG[dsin] TRIGGER=EType=start,AType=exec,Action="/opt/moab/tools/data-staging/ds_move_rsync --stagein",Flags=objectxmlstdin:user:attacherror,Sets=^BYTES_IN.^DATA_SIZE_IN
A stage out trigger would look like the following:
JOBCFG[dsout] TRIGGER=EType=start,AType=exec,Action="/opt/moab/tools/data-staging/ds_move_rsync --stageout",Flags=objectxmlstdin:user:attacherror,Sets=^BYTES_OUT.^DATA_SIZE_OUT
These variables show up as events if you set your WIKIEVENTS parameter to TRUE.
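For example, assuming WIKIEVENTS is not already enabled elsewhere in your configuration, you could turn it on in moab.cfg so the transfer-size variables appear in the event logs:

```
WIKIEVENTS TRUE
```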
JOBCFG[ds] TEMPLATEDEPEND=AFTEROK:dsin TEMPLATEDEPEND=BEFORE:dsout SELECT=TRUE
JOBCFG[dsin] DATASTAGINGSYSJOB=TRUE
JOBCFG[dsin] GRES=bandwidth:2
JOBCFG[dsin] FLAGS=GRESONLY
JOBCFG[dsin] TRIGGER=EType=start,AType=exec,Action="/opt/moab/tools/data-staging/ds_move_rsync --stagein",Flags=attacherror:objectxmlstdin:user
JOBCFG[dsout] DATASTAGINGSYSJOB=TRUE
JOBCFG[dsout] GRES=bandwidth:2
JOBCFG[dsout] FLAGS=GRESONLY
JOBCFG[dsout] TRIGGER=EType=start,AType=exec,Action="/opt/moab/tools/data-staging/ds_move_rsync --stageout",Flags=attacherror:objectxmlstdin:user
Note that if you do not know the cluster where the job will run but want the data staged to the same location, you can use the $CLUSTERHOST variable in place of a host. If you choose to use the $CLUSTERHOST variable, you must first customize the ds_config.py file. For more information, see Configuring the $CLUSTERHOST variable.
If the destination partition is down or does not have configured resources, the data staging workflow submission will fail.
> msub ... --stagein=annasmith@labs:/patient-022678/%\$CLUSTERHOST:/davidharris/research/patientrecords <jobScript>
Moab copies the /patient-022678 directory from the hospital's labs server to the cluster where the job will run prior to job start.
The --stageinfile option lets you specify the <path>/<fileName> of the file. The file must contain at least one line with this format: [<user>@]<host>:/<path>[<fileName>]. See Staging multiple files or directories for more information. If the destination partition is down or does not have configured resources, the data staging workflow submission will fail.
> msub ... --stageinfile=/davidharris/research/recordlist <jobScript>
Moab copies all files specified in the /davidharris/research/recordlist file to the cluster where the job will run prior to job start.
/davidharris/research/recordlist:
annasmith@labs:/patient-022678/tests/blood02282014%$CLUSTERHOST:/davidharris/research/patientrecords/blood02282014
annasmith@labs:/patient-022678/visits/stats02032014%$CLUSTERHOST:/davidharris/research/patientrecords/stats02032014
annasmith@labs:/patient-022678/visits/stats02142014%$CLUSTERHOST:/davidharris/research/patientrecords/stats02142014
annasmith@labs:/patient-022678/visits/stats02282014%$CLUSTERHOST:/davidharris/research/patientrecords/stats02282014
annasmith@labs:/patient-022678/visits/stats03032014%$CLUSTERHOST:/davidharris/research/patientrecords/stats03032014
annasmith@labs:/patient-022678/visits/stats03142014%$CLUSTERHOST:/davidharris/research/patientrecords/stats03142014
annasmith@labs:/patient-022678/visits/stats03282014%$CLUSTERHOST:/davidharris/research/patientrecords/stats03282014
Moab copies the seven patient record files from the hospital's labs server to the cluster where the job will run prior to job start.
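Before submitting, you may want to sanity-check a recordlist like the one above. The following is an illustrative sketch only (not part of Moab, and the file contents are a shortened copy of the example); it counts the source%destination pairs in a temporary copy of the list:

```shell
# Illustrative only: count source%destination entries in a recordlist.
recordlist=$(mktemp)
cat > "$recordlist" <<'EOF'
annasmith@labs:/patient-022678/tests/blood02282014%$CLUSTERHOST:/davidharris/research/patientrecords/blood02282014
annasmith@labs:/patient-022678/visits/stats02032014%$CLUSTERHOST:/davidharris/research/patientrecords/stats02032014
EOF
entries=$(grep -c '%' "$recordlist")   # each valid line has one % separator
echo "$entries entries"
rm -f "$recordlist"
```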
The --stageinsize/--stageoutsize option lets you specify the estimated size of the files and/or directories so that Moab can more quickly and accurately calculate how long the data staging will take, and therefore schedule your job correctly. If you used the $CLUSTERHOST variable to stage in, setting --stageinsize is required. --stageoutsize is always required for staging data out. If you provide a bare integer, Moab assumes the number is in megabytes; to use a different unit, append a unit suffix. See Stage in or out file size for more information.
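One way to avoid guessing the size is to measure it on the submit host before submitting. This is a sketch, not from the Moab documentation, and it assumes the stage-in data is visible from where you run msub; the paths and sample data are illustrative:

```shell
# Sketch: derive a --stageinsize value (in megabytes) with du.
dir=$(mktemp -d)
dd if=/dev/zero of="$dir/sample" bs=1024 count=2048 2>/dev/null  # ~2 MiB of test data
size_mb=$(du -sm "$dir" | cut -f1)   # du -sm reports whole megabytes, rounded up
echo "stageinsize=$size_mb"
# Then pass the measured value on the submit line, e.g.:
# msub ... --stageinfile=/davidharris/research/recordlist --stageinsize=$size_mb <jobScript>
rm -rf "$dir"
```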
> msub ... --stageinfile=/davidharris/research/recordlist --stageinsize=100 <jobScript>
Moab copies the files listed in /davidharris/research/recordlist, which total approximately 100 megabytes, from the labs server to the cluster where the job will run prior to job start.
To see the status, errors, and other details associated with your data staging job, run checkjob -v. See "checkjob" for details.
Related Topics