Using the Verification Cluster¶
The Verification Cluster
is a cluster of servers run by NCSA for LSST DM development work.
To get an account, see the Onboarding Checklist.
This page is designed help you get started on the Verification Cluster
:
- Overview of the Verification Cluster
- Account Password
- GPFS Directory Spaces
- Shared Software Stack in GPFS
- SLURM Job Submission
Overview of the Verification Cluster¶
lsst-dev01
is a system with 24 physical cores, 256 GB RAM, running the latest CentOS 7.x that serves as the front end of the
Verification Cluster
. lsst-dev01
is described in further detail at Using the lsst-dev Server and on the
page of available development servers .
The Verification Cluster
consists of 48 Dell C6320 nodes with 24 physical cores (2 sockets, 12 cores per processor) and 128 GB RAM. As such, the system provides a total of 1152 physical cores. Due to hyperthreading, each physical core appears as 2 Linux cores, so users can access 48 logical or virtual cores per node.
The Verification Cluster
runs the Simple Linux Utility for Resource Management (SLURM) cluster management and job scheduling system. lsst-dev01
runs the SLURM controller and serves as the login or head node , enabling LSST DM users to submit SLURM jobs to the Verification Cluster
.
lsst-dev01
and the Verification Cluster
utilize the General Parallel File System (GPFS) to provided shared-disk across all of the nodes. The GPFS will have spaces for archived datasets and scratch space per user to support computation/analysis.
The legacy NFS /home directories are available on the front end lsst-dev01
(serving as the current
home directories), but are not mounted on the compute nodes of the Verification Cluster
.
Report system issues by filing a JIRA ticket in the IT Helpdesk Support (IHS) project.
Account Password¶
You can log into LSST development servers at NCSA such as lsst-dev01
with your NCSA account and password. You can reset your NCSA password at the following URL:
GPFS Directory Spaces¶
GPFS is available on the login node lsst-dev01
and on all of the compute nodes of the Verification Cluster
. For convenience the bind mounts /scratch
, /project
, /datasets
, and /software
have been created to provide views into corresponding spaces in GPFS. Users will find directories
/scratch/<username>
ready and available for use. The per user /scratch
space is volatile with a 180 day purge policy
and is not backed up.
The /project
space is volatile, not backed up, and has no purge policy; cooperative self-management of this space for longer-lived but non-permanent files is expected.
As examples, /project/<username>
directories could be created, or, for shared data, /project/<projectname>
directories might be appropriate.
Project managed datasets will be stored within the /datasets
space. The population of
/datasets
with reference data collections is still in the early stages; a first
example is the SDSS DR7 Stripe82 data, which can be found at
/datasets/stripe82/dr7/runs
To add/change/delete datasets, see Common Dataset Organization and Policy.
SLURM Job Submission¶
Documentation on using SLURM client commands and submitting jobs may be found
at standard locations (e.g., a quickstart guide).
In addition to the basic SLURM client commands, there are higher level tools
that can serve to distribute jobs to a SLURM cluster, with one example being
the combination of pipe_drivers and
ctrl_pool within LSST DM.
For exhaustive documentation and specific use cases, we refer the user
to such resources. On this page we display some simple examples for
getting started with submitting jobs to the Verification Cluster
.
The Verification Cluster
SLURM is configured with 2 queues (partitions):
- normal: 45 nodes, no run time limit. For runs after your code is debugged. Default.
- debug: 3 nodes, 30 min run time limit. For short testing & debugging runs.
The normal
queue is the default, so any debug
jobs will need to be told to run in the debug queue. This can be done by adding -p debug
to your sbatch command line, or adding the following to your job’s batch file:
#SBATCH -p debug
To examine the current state and availability of the nodes in the Verification Cluster
,
one can use the SLURM command sinfo
:
% sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up infinite 15 alloc lsst-verify-worker[02-16]
normal* up infinite 30 idle lsst-verify-worker[01,17-45]
debug up 30:00 1 drain* lsst-verify-worker48
debug up 30:00 2 idle lsst-verify-worker[46-47]
% sinfo -N -l --states="idle"
Wed Jan 31 10:53:52 2018
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
lsst-verify-worker01 1 normal* idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker17 1 normal* idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker18 1 normal* idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker19 1 normal* idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker20 1 normal* idle 48 48:1:1 1 0 1 (null) none
...
lsst-verify-worker44 1 normal* idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker45 1 normal* idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker46 1 debug idle 48 48:1:1 1 0 1 (null) none
lsst-verify-worker47 1 debug idle 48 48:1:1 1 0 1 (null) none
In this view sinfo
shows the nodes to reside within a single partition debug
, and the worker nodes show 48 possible hyperthreads on a node (in the future this may be reduced to reflect the actual 24 physical cores per node). At the time of this sinfo
invocation there were 42 verification nodes available, shown by the “idle” state. The SLURM configuration currently does not perform accounting, and places no quotas on users’ total time usage.
Simple SLURM jobs¶
In submitting SLURM jobs to the Verification Cluster
it is advisable to have the
software stack, data, and any utilities stored on the GPFS /scratch
, /datasets
, and/or /software
spaces so that all are reachable from lsst-dev01
and each of the worker nodes. Some simple SLURM job description files that make use of the srun
command
are shown in this section. These are submitted to the queue from a standard login shell on the front end lsst-dev01
using the SLURM client command sbatch
, and their status can be checked with the
command squeue
:
For a single task on a single node:
% cat test1.sl
#!/bin/bash -l
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:10:00
#SBATCH -J job1
srun sleep.sh
% cat sleep.sh
#!/bin/bash
hostname -f
echo "Sleeping for 30 ... "
sleep 30
Submit with :
% sbatch test1.sl
Check status :
% squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
109 debug job1 daues R 0:02 1 lsst-verify-worker11
This example job was assigned jobid 109 by the SLURM scheduler, and consequently the standard output and error of the job were written to a default file slurm-109.out
in the current working directory.
% cat slurm-109.out
lsst-verify-worker11.ncsa.illinois.edu
Sleeping for 30 ...
To distribute this script for execution to 6 nodes by 24 tasks per node (total 144 tasks), the form of the job description is:
% cat test144.sl
#!/bin/bash -l
#SBATCH -p debug
#SBATCH -N 6
#SBATCH -n 144
#SBATCH -t 00:10:00
#SBATCH -J job2
srun sleep.sh
Submit with :
% sbatch test144.sl
For these test submissions a user might submit from a working directory
in the /scratch/<username>
space with the executable script sleep.sh
and the job description file located in the current working directory.
Interactive SLURM jobs¶
A user can schedule and gain interactive access to Verification Cluster
compute nodes
using the SLURM salloc
command. Example usage is:
For a single node:
% salloc -N 1 -p debug -t 00:30:00 /bin/bash
salloc: Granted job allocation 108
% squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
108 debug bash daues R 0:58 1 lsst-verify-worker46
% hostname -f
lsst-dev01.ncsa.illinois.edu
% srun hostname -f
lsst-verify-worker46.ncsa.illinois.edu
One can observe that after the job resources have been granted, the user shell is still on
the login node lsst-dev01
. The command srun
can be utilized to run commands on the job’s allocated
compute nodes. Commands issued without srun
will still be executed locally on lsst-dev01
.
You can also use srun
without first being allocated resources (via salloc
).
For example, to immediately obtain a command-line prompt on a compute node:
% srun -I --pty bash
SLURM Example Executing Tasks with Different Arguments¶
In order to submit multiple tasks that each have distinct command line arguments (e.g., data ids),
one can utilize the srun
command with the --multi-prog
option. With this option, rather than
specifying a single script or binary for srun
to execute, a filename is provided as the argument
of the --multi-prog
option. In this scenario an example job description file is:
% cat test1_24.sl
#!/bin/bash -l
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -t 00:10:00
#SBATCH -J sdss24
srun --output job%j-%2t.out --ntasks=24 --multi-prog cmds.24.conf
This description specifies that 24 tasks will be executed on a single node,
and the standard output/error from each of the tasks will be written to a unique filename with format specified by the argument to --output
. The 24 tasks to be executed are specified in the file
cmds.24.conf
provided as the argument to the --multi-prog
option. This
commands file will have a format that maps SLURM process ids (SLURM_PROCID) to programs to execute
and their commands line arguments. An example command file has the form :
% cat cmds.24.conf
0 /scratch/daues/exec_sdss_i.sh run=4192 filter=r camcol=1 field=300
1 /scratch/daues/exec_sdss_i.sh run=4192 filter=r camcol=4 field=300
2 /scratch/daues/exec_sdss_i.sh run=4192 filter=g camcol=4 field=297
3 /scratch/daues/exec_sdss_i.sh run=4192 filter=z camcol=4 field=299
4 /scratch/daues/exec_sdss_i.sh run=4192 filter=u camcol=4 field=300
...
22 /scratch/daues/exec_sdss_i.sh run=4192 filter=u camcol=4 field=303
23 /scratch/daues/exec_sdss_i.sh run=4192 filter=i camcol=4 field=298
The wrapper script exec_sdss_i.sh
used in this example could serve to
“set up the stack” and place the data ids on the command line of processCcd.py
:
% cat exec_sdss_i.sh
#!/bin/bash
# Source an environment setup script that holds the resulting env vars from e.g.,
# . ${STACK_PATH}/loadLSST.bash
# setup lsst_distrib
source /software/daues/envDir/env_lsststack.sh
inputdir="/scratch/daues/data/stripe82/dr7/runs/"
outdir="/scratch/daues/output/"
processCcd.py ${inputdir} --id $1 $2 $3 $4 --output ${outdir}/${SLURM_JOB_ID}/${SLURM_PROCID}