Quick Start (using Unix command line)

This is a "quick start" introduction to using the HPC clusters at the University of Maryland with the Linux/Unix command line. It covers the general activities most users will deal with when using the clusters.

NOTE: This document covers the command line interface. Users without experience with the Unix/Linux command line might prefer the quick start for the OnDemand web portal, which they will likely find easier to use. Also, users of Matlab Parallel Server (formerly known as Matlab Distributed Computing Server (MDCS)) can access some aspects of the cluster from Matlab running on their workstation, avoiding some of the complexities of the Linux command line.

Prerequisites
Logging into one of the login nodes
Creating a job script
Submitting a job
Monitoring job status
Monitoring your allocation

Prerequisites for the Command Line Quick Start

This quick start assumes that you already

have a TerpConnect/Glue account (REQUIRED for the Zaratan cluster, advisable for the others)
have an account on/access to the cluster (i.e., have an allocation or have been granted access to someone's allocation)
know how to use ssh
have at least a basic familiarity with Unix

If not, follow the above links before proceeding with this quick start.

Logging into one of the login nodes

All of the clusters have at least 2 nodes available for users to log into. From these nodes you can submit and monitor your jobs, look at results of the jobs, etc.

DO NOT RUN computationally intensive processes on the login nodes!!! Such processes violate policy, interfere with other users of the clusters, and will be killed without warning. Repeated offenses can lead to suspension of your privilege to use the clusters.

For most tasks you will wish to accomplish, you will start by logging into one of the login nodes for the appropriate cluster. These are:

Cluster	Login Node	Examples
Zaratan	login.zaratan.umd.edu	ssh johndoe@login.zaratan.umd.edu or ssh -l johndoe login.zaratan.umd.edu

See the section on logging into the clusters for more information.

Creating a job script

Next, you'll need to create a job script. This is just a simple shell script that will specify the necessary job parameters and then run your program.

Here's an example of a simple script, which we'll call test.sh:

#!/bin/bash
#SBATCH -t 1
#SBATCH -n 4
#SBATCH --mem-per-cpu=128
#SBATCH --oversubscribe

. ~/.bashrc
module load python

hostname
date

The first line, the shebang, specifies the shell to be used to run the script. Note that you must have a shebang specifying a valid shell in order for Slurm to accept and run your job; this differs from Moab/PBS/Torque, which ignore the shebang and run the job in your default shell unless you give qsub an option for a different shell.

The next four lines specify parameters to the scheduler.

The first, -t, specifies the maximum amount of time you expect your job to run. It can take various forms, but usually you will want to give minutes, hours:minutes:seconds, or days-hours. You should always set a reasonable wall time limit; this will help improve utilization of the cluster and reduce the amount of time your job will wait in the queue. To encourage this, the default wall time limit is rather short. In this example, we specified a wall time limit of 1 minute; normally this would be much longer, but this is a trivial job.
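The forms mentioned above can be sketched as follows. This is only an illustration: the durations are arbitrary, a real script needs just one -t line, and the timing code at the end is our own suggestion for measuring a short test run so that you can pick a sensible limit for the real job.

```shell
#!/bin/bash
# Active directive: a 30 minute limit given as plain minutes.
#SBATCH -t 30
# The same option in the other forms (plain comments here, not directives):
#   -t 0:30:00   hours:minutes:seconds (also 30 minutes)
#   -t 1-12      days-hours (1 day and 12 hours)

# Time the work so the next submission can use a realistic limit.
start_time=$(date +%s)
date
end_time=$(date +%s)
elapsed=$(( end_time - start_time ))
echo "Elapsed: ${elapsed} s"
```

The elapsed time ends up in the job's output file, where it can guide the wall time limit for later, longer runs.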

See the section on specifying the walltime limit for more information.

The second line, -n, tells the scheduler how many tasks/cores your job will have (by default Slurm assigns a distinct core to each task). We do not specify how Slurm should distribute these cores across machines, so Slurm can distribute them however it sees fit. That is usually sufficient for many MPI jobs, but there are other options that allow for very detailed specification of how the cores should be distributed, as briefly described here and in the examples page.

In this example, we are requesting 4 cores (which is way more than needed for this trivial example). Most likely we will get all 4 cores on a single node, but that is NOT guaranteed. We could possibly get one core on each of 4 nodes, or some allocation of 4 cores on 2 or 3 nodes.
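If the job really does need all of its cores on one node, Slurm's -N (--nodes) option can be added. The following is a minimal sketch of that variation, not a recommendation for every job; the fallback value only applies when the script is run by hand outside of Slurm.

```shell
#!/bin/bash
#SBATCH -t 1
#SBATCH -n 4
#SBATCH -N 1
#SBATCH --mem-per-cpu=128

# Slurm sets SLURM_JOB_NUM_NODES inside a job; default to 1 elsewhere.
num_nodes="${SLURM_JOB_NUM_NODES:-1}"
echo "Job is spread over ${num_nodes} node(s)"
```

Requesting a single node can make a job wait longer in the queue, since Slurm has fewer ways to place it.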

See the section on specifying the node/core requirements for more information.

The third line, --mem-per-cpu, tells the scheduler how much memory to allocate for your job. This particular form, --mem-per-cpu=N, reserves the requested amount of memory (N MB) per CPU core assigned. A similar form, --mem=N, reserves the requested amount of memory (N MB) on each node assigned to the job, which is the total for the job when it runs on a single node. The --mem-per-cpu form is usually more convenient. Nodes on the Zaratan cluster should have at least 4 GB/core.

In this example, we are requesting 128 MB per core, for a total of 512 MB for our 4 core job. If we used --mem=128 we would get a total of 128 MB (or effectively 32 MB per core), which for this trivial job is still way more than is actually needed.
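The arithmetic above can be checked with a few lines of shell. This sketch just reuses the numbers from the sample script and assumes all 4 cores land on a single node.

```shell
ntasks=4             # from "#SBATCH -n 4"
mem_per_cpu_mb=128   # from "#SBATCH --mem-per-cpu=128"

# --mem-per-cpu scales with the number of cores.
total_mb=$(( ntasks * mem_per_cpu_mb ))
echo "--mem-per-cpu=${mem_per_cpu_mb}: ${total_mb} MB for the whole job"

# --mem is a fixed amount, so each core's share shrinks as cores are added.
mem_mb=128
share_mb=$(( mem_mb / ntasks ))
echo "--mem=${mem_mb}: about ${share_mb} MB per core"
```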

See the section on specifying the memory requirements for more information.

The fourth line, --oversubscribe, is the default for the Zaratan HPC cluster, and states that we are willing to share a node with other jobs. E.g., on Zaratan, all nodes have 128 cores; by using --oversubscribe mode, if all of our cores are assigned to one such node, Slurm will reserve 4 cores for us, but can assign the other 124 cores to other jobs while our job is running. The opposite is --exclusive, which prevents other jobs from running on the same node(s) as the exclusive job. If our sample job was --exclusive and assigned to a 128 core node, the other 124 cores would be unassigned and idle while the job ran.

NOTE: exclusive jobs get charged for both the cores they use AND for the cores they prevent from being used by anyone else due to the exclusive status. E.g., if the example job was --exclusive and assigned a 128 core node, it would accrue charges for 128 cores for as long as it ran.
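To make the difference concrete, here is a small sketch comparing the two cases. It assumes usage is counted in core-hours and uses a made-up run time of 2 hours; both are illustrative assumptions, not values taken from the cluster's accounting.

```shell
cores_requested=4     # from "#SBATCH -n 4"
cores_per_node=128    # Zaratan node size quoted above
runtime_hours=2       # hypothetical run time, for illustration only

# Shared node: only the requested cores are charged.
shared_core_hours=$(( cores_requested * runtime_hours ))
# Exclusive node: every core on the node is charged.
exclusive_core_hours=$(( cores_per_node * runtime_hours ))

echo "Shared node:    ${shared_core_hours} core-hours"
echo "Exclusive node: ${exclusive_core_hours} core-hours"
```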

See the section on specifying whether other jobs can be on the same node for more information.

Users of the Zaratan HPC cluster do not need to specify a partition when using the standard partition. However a partition will need to be specified when using GPUs or large memory nodes, or the debug or scavenger partitions.
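As a sketch, a partition is requested with the -p (--partition) option. The example below assumes the debug partition is named exactly "debug" and that a 5 minute, single core job fits within its limits; check the partition documentation for the actual names and limits.

```shell
#!/bin/bash
#SBATCH -t 5
#SBATCH -n 1
#SBATCH --mem-per-cpu=128
#SBATCH -p debug

# Slurm sets SLURM_JOB_PARTITION inside a job; it is unset elsewhere.
partition="${SLURM_JOB_PARTITION:-unknown}"
echo "Running in partition: ${partition}"
```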

It is advisable to include at least these four options (wall time limit, number of cores, memory, and either exclusivity or partition depending on the cluster) for all jobs, either in the job script as shown or on the sbatch command line (see the section on providing options to the sbatch command for general information). There are many other possible arguments to the sbatch command; the more commonly used ones are described here.
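The same options can be given on the command line instead of in the script. The helper below is a hypothetical wrapper of our own, not part of Slurm: it passes its arguments to sbatch when sbatch is available and otherwise only prints the command it would have run.

```shell
submit_test_job() {
    # "$@" holds the sbatch options supplied by the caller.
    if command -v sbatch >/dev/null 2>&1; then
        sbatch "$@" test.sh
    else
        echo "Would run: sbatch $* test.sh"
    fi
}

submit_test_job -t 1 -n 4 --mem-per-cpu=128 --oversubscribe
```

Options given on the command line take precedence over matching #SBATCH lines in the script, which is handy for one-off changes such as a longer wall time.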

The remaining lines in the file are just standard commands; you will replace them with whatever your job requires. In this case, once the job runs, it will print the hostname and the time to the output file. The script will be run in whatever shell is specified by the shebang on the first line of the script. NOTE: unlike with the Moab scheduler, you MUST provide a valid shebang on the first line.

Note that when your job starts, your job script is executed on the first node assigned to your job. The list of nodes assigned to your job and similar details are available in Slurm environment variables, but Slurm does not do anything to parallelize your job. Your script is responsible for farming out tasks to the different cores/nodes that are part of the job. Normally, a parallel application will handle that, or you launch your MPI-aware code with mpirun, which handles that.

See the section on running MPI jobs for more information.
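A minimal sketch of reading the Slurm environment variables mentioned above is shown here. The fallback values only exist so that the lines also run outside of a job, and the program name in the final comment is hypothetical.

```shell
# Slurm sets these variables inside a job; the defaults apply elsewhere.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
node_list="${SLURM_JOB_NODELIST:-$(uname -n)}"
num_tasks="${SLURM_NTASKS:-1}"

echo "Job ${job_id}: ${num_tasks} task(s) on ${node_list}"

# An MPI-aware program would then be launched along the lines of:
#   mpirun ./my_mpi_program    (hypothetical program name)
```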

In particular, note that the example given is BAD. Although it requests 4 cores, all the commands listed (hostname, date) are single core commands, so 3 of the requested cores will actually be idle while the job is running. Since this job is just a simple example and will finish in seconds, that is not a big issue in this case. But in general, simply submitting serial code as an sbatch job requesting more than one core DOES NOT parallelize a job.
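One way to see the difference is to launch a command with srun inside the job, which starts one copy per task. The following is only a sketch: it falls back to a single plain call when it is not running inside a Slurm job, and running hostname several times is of course still not useful work.

```shell
#!/bin/bash
#SBATCH -t 1
#SBATCH -n 4
#SBATCH --mem-per-cpu=128

if [ -n "${SLURM_JOB_ID:-}" ] && command -v srun >/dev/null 2>&1; then
    # Inside a job: srun starts one copy of the command per task (4 here).
    task_output=$(srun hostname)
else
    # Outside of Slurm: a plain call runs the command exactly once.
    task_output=$(hostname)
fi

echo "$task_output"
line_count=$(printf '%s\n' "$task_output" | wc -l)
echo "Command produced ${line_count} line(s) of output"
```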

For users of the Zaratan HPC cluster: If your job script uses bash and that is NOT your default shell, you should begin the code section of your script with

. ~/.bashrc

to set up your environment properly. In particular, this sets up the module command.

Generally, this should be followed by module loads of whatever modules your job requires.

See the section on using the module command for more information.

It is recommended that you include the relevant module commands for a job in the actual job script, as opposed to relying on modules loaded by your dot files.

Submitting a job

Now that you have a job script, you need to submit the job to the cluster with the sbatch command. For example,

login-1:~: sbatch test.sh 
Submitted batch job 13222

The number that is returned to you is the identifier for the job, and you should use that anytime you want to find out more information about your job, and you should include this number if you are opening a help ticket about a job.
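If you want to keep the job identifier in a shell variable for later commands, the message printed by sbatch can be parsed. This sketch defines a small helper of our own (parse_job_id is not a Slurm command) and tries it on the sample message shown above.

```shell
# Print the fourth word of a "Submitted batch job NNNN" line.
parse_job_id() {
    awk '/^Submitted batch job/ { print $4 }'
}

sample_output="Submitted batch job 13222"
job_id=$(printf '%s\n' "$sample_output" | parse_job_id)
echo "Job ID: ${job_id}"

# On a login node the helper would be used as:
#   job_id=$(sbatch test.sh | parse_job_id)
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which avoids the parsing step.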

At this point, your job has been placed in the queue, and will wait its turn for resources to be available. Depending on how heavily used the cluster is at that time, and how many resources you are requesting, your job might start within minutes or it might wait for hours or even days. (And this is assuming that there are sufficient funds in the allocation, etc.) See the FAQ for tips on how to reduce the amount of time your job spends waiting in the queue.

Once resources become available, Slurm will assign resources to your job, including one or more cores on one or more nodes. A shell process will start on the first core of the first node assigned, and your script will run. Normally, your script will start any other tasks on the same or on other nodes as needed.

The standard output and standard error streams will be directed to a file, by default slurm-NNNN.out in the directory where you started the job, where NNNN is the job number as described above. See the section on specifying output options for more information.

Do NOT start jobs from your home directory. It is NOT optimized for high I/O. Use scratch space instead. See the section on storage for more information on storage options.

Output from your job can be viewed in the file described above shortly after the job starts running (assuming it has produced any output). This can be used to check the status of your job, although if your code generates a lot of output it is advisable to redirect it to another file.
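A quick way to peek at the output is sketched below. The job number is the one from the sbatch example earlier and should be replaced with your own; the file is looked for in the current directory.

```shell
job_id=13222                      # job number reported by sbatch
out_file="slurm-${job_id}.out"    # default output file name

if [ -f "$out_file" ]; then
    # Show the last 20 lines written so far.
    tail -n 20 "$out_file"
else
    echo "${out_file} does not exist yet; the job may still be waiting in the queue"
fi
```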

For our trivial example from the last section, when the job completes we should see something like

l:~: cat slurm-13222.out
compute-b6-39.zaratan.umd.edu
Wed May 21 18:38:06 EDT 2022

As you can see in the output file above, the script ran and printed the hostname and date as specified by the job script.

Monitoring job status

The basic command for monitoring the status of your jobs is the squeue command. Because normally you are only interested in your own jobs, it is advisable to add the -u USERNAME flag, which speeds up the command and shows only your jobs. Replace USERNAME with your username.
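As a sketch, the command can be wrapped in a small function that defaults to the current user. The function name is our own choice, and the fallback message only appears on machines where squeue is not installed.

```shell
show_my_jobs() {
    # Use the first argument if given, otherwise the current user name.
    squeue_user="${1:-${USER:-$(id -un)}}"
    if command -v squeue >/dev/null 2>&1; then
        squeue -u "$squeue_user"
    else
        echo "squeue not found; run this on a login node (user: ${squeue_user})"
    fi
}

show_my_jobs
```

Calling show_my_jobs johndoe would list the jobs of the user johndoe instead.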

For more information on monitoring jobs than is suitable for a quick start document, follow the links below.

Monitoring and managing your jobs in general
The squeue command
Getting the estimated start time of your job
Obtaining detailed information about a job

Monitoring your allocation

It is often useful to be able to see the status of the cluster as a whole, including information about how busy the cluster is at a given point in time.

The squeue command without any arguments will list all jobs in the queue. This can be overwhelming, however, as there are often many, many jobs.

The sinfo -N command can show you information about the nodes in the cluster. Again, this is a dense text output, so can be difficult to process.

The smap command uses ASCII graphics to present this information in a more graphical and hopefully more digestible fashion.

The sview command uses X11 graphics for an even prettier overview of the cluster.

More information about the commands above can be found in the section on monitoring the cluster.
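For a less overwhelming view, the text commands above can be combined into a short summary. This is only a sketch with a function name of our own; it relies on the sinfo -s (summary) option and on squeue's -h (no header) and -t (state filter) options.

```shell
cluster_summary() {
    if command -v sinfo >/dev/null 2>&1 && command -v squeue >/dev/null 2>&1; then
        # One line per partition instead of one line per node.
        sinfo -s
        # Count jobs by state rather than listing every job.
        pending=$(squeue -h -t PENDING | wc -l)
        running=$(squeue -h -t RUNNING | wc -l)
        echo "Jobs pending: ${pending}, running: ${running}"
    else
        echo "Slurm commands not found; run this on a login node"
    fi
}

cluster_summary
```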