Files, Storage, and Securing your Data
On the cluster, you have several options available to you regarding where files are stored. This page discusses the various options, and the differences between them in terms of performance, backup, policies, and use for archiving of data.
- Your home directory
- Data directories
- Scratch space
- Using lustre
- Archival storage
- Securing your data
- Policies regarding usage of Disk Space on the Deepthought HPCCs
Your home directory
Your home directory is private to you, and should be used as little as possible for data storage. In particular, you should NOT run jobs out of your home directory --- run your jobs from the lustre filesystem; this is optimized to provide better read and write performance to improve the speed of your job. After the job is finished, you might wish to copy the more critical results files back to your home directory, which gets backed up nightly. (The /data and lustre filesystems are NOT backed up.)
Do not run jobs out of your home directory, or run jobs doing extensive I/O from your home directory, as it is NOT optimized for that.
Your home directory is the ONLY directory that gets backed up by the Division of IT. You should copy your precious, irreplaceable files (custom codes, summarized results, etc) here.
Home directories on the Deepthought clusters are limited by a 10 GB "soft quota" policy. Because the need for storage can vary dramatically over the span of a few days, we have adopted a policy with some flexibility. There is no hard limit (short of available disk space) on how much you can store in your home directory, but your usage is monitored daily. If it exceeds 10 GB, you will receive an email informing you of this and asking you to rectify it. You are given 7 days to bring your usage back under 10 GB. If you need to use more than 10 GB in your home directory for a few days, feel free to do so; the policy allows for that. After 7 days, however, the email will be stronger in tone and will also go to systems staff. At that point, you must bring your home directory usage below 10 GB as soon as possible or be in violation of our policy. Failure to comply in a timely manner can lead to the loss of privileges to use the cluster.
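To check your own usage against the soft quota before the daily monitoring does, something along the following lines could be used. This is only a sketch: check_quota is a hypothetical helper name, not a command provided on the cluster, and it treats 1 GB as 1024*1024 KiB, which may differ slightly from how the monitoring counts.

```shell
# check_quota DIR [LIMIT_GB]
# Hypothetical helper: report whether DIR uses more than LIMIT_GB (default 10).
# Assumption: 1 GB is counted as 1024*1024 KiB.
check_quota() {
    local dir="$1" limit_gb="${2:-10}"
    local used_kb limit_kb
    # du -sk prints the total usage in 1 KiB blocks, followed by the path
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    limit_kb=$((limit_gb * 1024 * 1024))
    if [ "$used_kb" -gt "$limit_kb" ]; then
        echo "OVER: $dir uses ${used_kb} KiB (limit ${limit_gb} GB)"
        return 1
    fi
    echo "OK: $dir uses ${used_kb} KiB (limit ${limit_gb} GB)"
}
```

For example, check_quota "$HOME" would check your home directory against the default 10 GB limit. On a large directory the du scan can take a while.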
On the MARCC/Bluecrab cluster, there is a 20 GB quota on home directories.
Data directories
The Deepthought cluster has a large (over 100 TB) amount of disk space available
for the use of active jobs. The Deepthought2 cluster has over 1 PB of disk
space available for the use of active jobs.
The lustre filesystems and DIT provided NFS data volumes are for the temporary storage of files supporting active research on the cluster only. They are NOT for archival storage. Files more than 6 months old on lustre or the data volumes are subject to deletion without notice by systems staff.
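To find files that may fall under the 6 month rule, a sketch like the following could be used. It assumes that "6 months" can be approximated as 180 days and that age is judged by modification time; the actual criteria used by systems staff are not stated here. list_stale_files is a hypothetical helper name.

```shell
# list_stale_files DIR [DAYS]
# Hypothetical helper: print regular files under DIR whose contents were last
# modified more than DAYS days ago (default 180, an approximation of 6 months).
list_stale_files() {
    local dir="$1" days="${2:-180}"
    # -mtime +N matches files modified more than N whole days ago
    find "$dir" -type f -mtime +"$days" -print
}
```

Review the list before deleting anything, and copy whatever must be kept to your home directory or off the cluster first.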
There are two types of data storage available on the original Deepthought cluster: about 6 TB of NFS storage, and over 100 TB (and growing) of lustre storage. All of these are network filesystems accessible from all of the compute nodes. On the Deepthought2 cluster, only home directories are kept on NFS; everything else is in lustre.
Because much of the data generated on the cluster is transient, and because of its size, data stored in the /data and Lustre partitions is not backed up. This data resides on RAID-protected filesystems; however, there is always a small chance of loss or corruption. If you have critical data that must be saved, be sure to copy it elsewhere.
There are several general purpose areas that are intended for storage of computational data. These areas are accessible to all users of the cluster and as such you should be sure to protect any files or directories you create there. See Securing Your Data for more information.
The data volumes and lustre storage listed below are NOT BACKED UP. Any valuable data should be copied elsewhere (home directory or off cluster) to prevent loss of critical data due to hardware issues. You are responsible for backing up any valuable data.
|Path||Filesystem Type||Approximate Size||Comments|
|The following filesystems are available to all Deepthought2 users|
|/lustre||Lustre||1.1 PB||Not backed up. For temporary storage of data for active jobs. Good place for running jobs to read/write from.|
|The following filesystems are available on the Bluecrab cluster|
|~/scratch||Lustre, private to the user||quotas vary||Not backed up. Please use for I/O for running jobs.|
|~/work||Lustre, shared by the group||quotas vary||Not backed up. Please use for I/O for running jobs.|
|~/data||ZFS, shared by group||quotas vary||DO NOT have running jobs read/write from here.|
|The following filesystems are available to all Deepthought users|
|/export/lustre_1||Lustre||137 TB||Not backed up. For temporary storage of data for active jobs. Good place for running jobs to read/write from.|
The paths above are for the root of the filesystem. You should
use a subdirectory named after your username beneath the listed
directories. This subdirectory should already exist for you in lustre, but you might
need to create it (with the
mkdir command) on some of the other
data volumes. E.g., if your username is
johndoe, your lustre
directory on Deepthought2 would be
The NFS on RAID5 filesystems on the Deepthought HPC cluster are DEPRECATED. The disks and related hardware are old, out of warranty/support, and likely to fail. Data is NOT backed up. Use at your own risk.
Please remember that you are sharing these filesystems with other
researchers and other groups. If you have data residing there that
you don't need, please remove it promptly. If you know you are going
to create large files, make sure there is sufficient space available
in the filesystem you are using. You can check this yourself with the df command:
login-1:~: df -h /lustre
Filesystem                                        Size  Used Avail Use% Mounted on
192.168.64.64@o2ib0:192.168.64.65@o2ib0:/scratch  1.1P  928T  120T  89% /lustre
This output shows that there is currently 120 TB of free space on /lustre.
To see how much space is currently being used by a particular directory, use the du command:
login-1:~: du -sh /lustre/bob
1.5T    /lustre/bob
This output shows that the directory /lustre/bob is
currently using 1.5 TB of space.
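The free-space check can also be scripted, for example at the start of a job that is about to write a large file. The following is a minimal sketch; has_space is a hypothetical helper name, and the size you pass in is your own estimate of what the job will write.

```shell
# has_space DIR NEEDED_KB
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# NEEDED_KB KiB available, fail otherwise.
has_space() {
    local dir="$1" needed_kb="$2" avail_kb
    # df -Pk: POSIX one-line-per-filesystem output in 1 KiB blocks;
    # on the second line, column 4 is the available space
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$needed_kb" ]
}
```

For example, has_space /lustre/bob 1073741824 || echo "not enough room" would warn if less than 1 TiB is available. Because the filesystem is shared, the answer can change between the check and the write.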
There are no pre-set limits on how much data you can store on lustre. However, to ensure there is adequate space for everyone using the cluster, this space is only to be used for temporarily storing files in active use for jobs and projects currently running on the system. I.e., when your job is finished, remove all the data files, etc. that are no longer needed. See the section on archival storage for a discussion of some of the archival storage options available to you if you need to retain the data for longer periods of time.
Although there are no hard and fast limits on disk usage on the lustre filesystems, when the disks fill up, we will send emails to the people consuming the most space on the disks in question requesting that they clean up, removing any unneeded files and moving files off the disks as appropriate. Timely compliance with such requests is required to ensure the cluster remains usable for everyone; failure to do so is in violation of cluster policies and can result in loss of privileges to use the cluster.
These emails will be sent to your @umd.edu email address; you are required to receive and respond to emails sent to that address. If you prefer using a different email address, be sure to have your @umd.edu address forward to that address. Contact the Division of IT helpdesk if you need assistance with that.
If you have a Glue account and you want to share your data back and
forth with that account, you can access it at
/glue_homes/<username>. Note that you cannot have
jobs read or write directly from your Glue directory; you will need to
copy data back and forth by hand as needed.
Scratch space
It is not uncommon for jobs to require a fair amount of temporary storage.
All of the nodes on the original Deepthought cluster
have between 1 GB and 250 GB of local scratch space available,
with most nodes having at least 30 GB.
For Deepthought2, all nodes should have over 750 GB of scratch space available.
This space is mounted as
/tmp, and is accessible by all processes running on that node.
It is NOT available to processes running on different nodes.
Scratch space is temporary. It will be deleted once your jobs complete --- if there is anything you need to save, you must copy it out of scratch space to a data directory, etc. before the job completes. It is NOT backed up.
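The pattern of working in scratch and copying results out before the job ends might look like the sketch below. The function name run_in_scratch, the file name output.txt, and the echo standing in for the real computation are all placeholders; the results directory would be one you own on lustre or a data volume.

```shell
# run_in_scratch RESULTS_DIR
# Hypothetical job body: work in node-local scratch, then save the results.
run_in_scratch() {
    local results_dir="$1" scratch
    # Private working directory under /tmp, the node-local scratch space
    scratch=$(mktemp -d /tmp/job.XXXXXX) || return 1
    mkdir -p "$results_dir" || return 1
    # Placeholder for the real computation: write output into scratch
    echo "result data" > "$scratch/output.txt"
    # Copy what must be kept out of scratch; on failure, leave scratch in place
    cp -p "$scratch/output.txt" "$results_dir/" || return 1
    # Clean up our own scratch directory
    rm -rf "$scratch"
}
```

The copy happens before the function returns, so the results are saved while the job is still running and the scratch space still exists.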
Because scratch space is local to the system, it is usually quite fast. Lustre storage in theory can be faster, but because that is shared by many users and jobs, scratch space is usually faster than lustre in practice, and typically has rather consistent performance (it can be affected by other jobs running on the system, though these should only be your jobs).
See for information on how to specify the amount of scratch space needed by your job.
Using lustre
Lustre is a high performance distributed file system designed for HPC clusters. Files are distributed among multiple servers; in some cases, different parts of the same file are even on different servers. Spreading the load across multiple file servers allows for the faster responses to file requests needed to handle the heavy load that some parallel codes generate.
The lustre filesystems are NOT BACKED UP. Any valuable data should be copied elsewhere (home directory or off cluster) to prevent loss of critical data due to hardware issues. You are responsible for backing up any valuable data.
Every user is provided a personal lustre directory when their account on the cluster is created. The location of this directory varies a bit from cluster to cluster. For a user with username username, the personal lustre directory is located at:
- On the Deepthought2 cluster:
- On the Bluecrab cluster:
/scratch/users/username. Remember that your username will include
@umd.edu. You should also have a symlink
scratch in your home directory pointing to that directory.
- On the Juggernaut cluster:
On the Bluecrab cluster, you might also have a lustre directory shared by
all members of your allocation. This will typically have a name like
allocation_name is typically the UMD directory ID of the principal
investigator for the allocation (i.e. their
address with the
@umd.edu omitted). This is also typically
the target of the symlink
work in your home directory.
Your lustre directory is visible from the login nodes, data transfer nodes, AND from all of the compute nodes. Note that the lustre filesystems on the two Deepthought clusters are distinct, and files on one are not available on the other unless you manually copy them.
For the most part, you can use lustre as you would any other filesystem; the standard unix commands work, and you should just notice better performance in IO heavy codes.
Normally, lustre will keep the data for an individual file on the same
fileserver, but will distribute your files across the available servers.
The lfs getstripe and
lfs setstripe commands
can be used to examine and control the striping. More information can be found
in the section on Lustre and striping.
Lustre stores the "metadata" about a file (its name, path, etc) separately from the data. Normally, the IO intensive applications contact the metadata server (MDS) once when opening the file, and then contact the object storage servers (OSSes) as they do the heavy IO. This generally improves performance for these IO heavy applications.
Certain common interactive tasks, e.g.
ls -l, require data
from both the MDS and the OSSes, and take a bit longer on lustre. Again,
these are not the operations lustre is optimized for, as they are rarely
performed in IO heavy codes.
The lfs find command is a version of find optimized for
lustre. It tries to avoid system calls that require information from the OSSes
in addition to the MDS, and so generally will run much faster than the standard
find command. Its usage is, by design, similar to that of find.
If you want to see how much space you are currently using in any of the Lustre
filesystems, run the command
lustre_usage. This will show you
total usage for yourself and for any groups you belong to. Note that this
will only show you Lustre usage, and will not include any files outside
of Lustre.
login-1:~: lustre_usage
Usage for /export/lustre_1:
======================================================================
Username        Space Used    Num Files    Avg Filesize
------------------------------------------------------------
rubble          2.3T          4134684      607.7K

Group           Space Used    Num Files    Avg Filesize
------------------------------------------------------------
flint           4.6T          6181607      795.4K
Lustre and striping
As mentioned previously, lustre gets its speed by "striping" files over multiple Object Storage Targets (OSTs); basically multiple fileserver nodes each of which holds a part of the file. This is mostly transparent to the user, so you would not normally know if/that your file is split over multiple OSTs.
By default on the Deepthought clusters, every file is kept on a single
OST, and this striping just means that different files are more or less randomly
spread across different file servers/OSTs. This is fine for files of moderate
size, but might need adjustment if dealing with files of size 10 or 100 GB or
more. The lfs getstripe and
lfs setstripe commands
exist for this.
The getstripe subcommand is the simplest, and just gives
information about the striping of a file or directory. Usage is just
lfs getstripe FILEPATH and it prints out information about the
named file's striping. E.g.:
login-1> lfs getstripe test.tar
test.tar
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  9
    obdidx       objid        objid    group
         9     2549120     0x26e580        0
login-1>
The above example shows a file created using default settings. The
file in this case is on a single OST (the number of stripes for the
file, given by lmm_stripe_count, is 1). The lmm_stripe_offset gives the
index of the starting OST, in this case 9, and the lines below it show all the
stripes (in this case, just the single one). One can use the command
lfs osts to correlate the index to the name of an actual OST.
The lmm_stripe_size value is the size of the stripe, in bytes, in this case
1048576 bytes or 1 MiB.
While examining a file's striping parameters is nice, it is not particularly
useful unless one can also change it, which can be done with the
setstripe subcommand. Actually, the striping for a file is
NOT MUTABLE, and is set in stone at the time of file creation. So one needs
to use the
setstripe subcommand before the file is
created. E.g., to create our
test.tar file again, this time
striped over 20 OSTs and using a stripe size of 10 MiB, we could do something
like:
login-1> rm test.tar
login-1> lfs setstripe -c 20 -S 10m test.tar
login-1> ls -l test.tar
-rw-r--r-- 1 payerle glue-staff 0 Sep 18 17:02 test.tar
login-1> tar -cf test.tar ./test
login-1> ls -l test.tar
-rw-r--r-- 1 payerle glue-staff 8147281920 Sep 18 17:04 test.tar
login-1> lfs getstripe test.tar
test.tar
lmm_stripe_count:   20
lmm_stripe_size:    10485760
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  55
    obdidx       objid        objid    group
        55   419995932   0x1908a11c        0
        63   468577296   0x1bedec10        0
        45   419403761   0x18ff97f1        0
        68   435440970   0x19f44d4a        0
        57   409176967   0x18638b87        0
        44   377767950   0x1684480e        0
        61   419414421   0x18ffc195        0
        65   356701609   0x1542d5a9        0
        31   408705898   0x185c5b6a        0
        12   429746020   0x199d6764        0
        50   379985276   0x16a61d7c        0
        16   372211487   0x162f7f1f        0
        46   468289628   0x1be9885c        0
        10   402610097   0x17ff57b1        0
        30   425031271   0x19557667        0
        60   423186185   0x19394f09        0
        69   496205056   0x1d937d00        0
        35   409685517   0x186b4e0d        0
        70   415859549   0x18c9835d        0
        15   449399811   0x1ac94c03        0
We start by deleting the previously created test.tar; this
is necessary because one cannot use
lfs setstripe on an existing
file. We then use the -c option to
setstripe to set the stripe
count, and the -S option to set the stripe size, in this case 10 MiB. One
can also use the suffixes 'k' for KiB, or 'g' for GiB. The
setstripe subcommand creates an empty file with the desired striping
parameters. We then issue the tar command to put content in the file, and
then run the
getstripe subcommand to confirm the file has the
desired striping.
As mentioned before, one cannot use the setstripe subcommand
on an existing file. So what if we want to change the striping of an existing
file? E.g., what if we decide now we want test.tar to have 5 stripes of
size 1 GiB? Because we cannot directly change the striping of an existing file,
we need to use
setstripe to create a new file with the desired
striping, and copy the old file to the new file (you can then delete the
old file and rename the new file to the old name if desired). E.g.
login-1> lfs getstripe test.tar
test.tar
lmm_stripe_count:   20
lmm_stripe_size:    10485760
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  55
    obdidx       objid        objid    group
        55   419995932   0x1908a11c        0
        63   468577296   0x1bedec10        0
...
login-1> ls -l test2.tar
ls: cannot access test2.tar: No such file or directory
login-1> lfs setstripe -c 5 -S 1g test2.tar
login-1> ls -l test2.tar
-rw-r--r-- 1 payerle glue-staff 0 Sep 18 17:16 test2.tar
login-1> cp test.tar test2.tar
login-1> ls -l test2.tar
-rw-r--r-- 1 payerle glue-staff 8147281920 Sep 18 17:17 test2.tar
login-1> diff test.tar test2.tar; echo >/dev/null Make sure they are the same
login-1> lfs getstripe test2.tar; echo >/dev/null Verify striping
test2.tar
lmm_stripe_count:   5
lmm_stripe_size:    1073741824
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  61
    obdidx       objid        objid    group
        61   419416513   0x18ffc9c1        0
        31   408708503   0x185c6597        0
        66   422684037   0x1931a585        0
        49   429032715   0x1992850b        0
        16   372213361   0x162f8671        0
login-1> rm test.tar; mv test2.tar test.tar
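The create, copy, verify, and rename steps above can be wrapped in a small shell function. This is a sketch only: restripe is a hypothetical helper name, it uses just the lfs setstripe -c and -S options shown in the transcript, and it needs to run on a Lustre filesystem where the lfs command is available.

```shell
# restripe FILE COUNT SIZE
# Hypothetical helper: rewrite FILE with a new stripe count and stripe size
# by copying it into a freshly created file that has the desired striping.
restripe() {
    local file="$1" count="$2" size="$3"
    local tmp="${file}.restripe.$$"
    # Create an empty file with the desired striping parameters
    lfs setstripe -c "$count" -S "$size" "$tmp" || return 1
    # Copy the data, verify the copy byte for byte, then replace the original
    if cp "$file" "$tmp" && cmp -s "$file" "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For example, restripe test.tar 5 1g would reproduce the transcript above. The copy temporarily needs as much free space as the file itself, and the resulting file gets the permissions of a newly created file, not those of the original.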
This only scratches the surface of what can be done with striping in lustre; for additional information, see:
- National Institute for Computational Sciences Lustre Striping Guide
- Lustre Wiki page on striping
- Intel's Overview of Lustre Striping
Archival storage
The lustre and data filesystems on the various HPC resources are intended for ongoing research on the cluster. These storage resources are limited and are not intended for long term or archival storage, as they are needed for people to run jobs on the cluster. You are required to delete data that is no longer needed, and to move data that needs to be retained elsewhere once it is not needed to support queued or running jobs.
Campus provides users the ability to store large amounts of data via on-campus and cloud-based services, namely:
Using the Campus Isilon Network Storage service for Archival Storage
Campus maintains a networked file storage system which can be accessed either by the NFS protocol (suitable for access by Unix-like systems) or the CIFS/SMB protocol (suitable for access by Windows-like systems).
Pricing and other information, along with links to the forms to request such service, can be found at the Networked Storage service catalog.
Using Google G Suite Drive for Archival Storage
Campus provides the ability to store large amounts of data on Google's G Suite drive. Please see the Google drive service catalog entry for more information, including restrictions on what data can be stored there and how to set up your account if you have not done so already.
The recommended utility for accessing Google Drive from the HPC cluster is the rclone command.
In addition to supporting many cloud storage providers, it has features to prevent exceeding Google's data rate limits.
The gdrive command is also available, but it tends to exceed Google's rate limits when moving large amounts of data.
Using Box for Archival Storage
Campus also provides the ability to store large amounts of data on the Box cloud-based storage platform. Please see the UMD Box service catalog entry for more information, including restrictions on what data can be stored there and how to set up your account if you have not done so already.
The UMD Box service can be accessed from the cluster in several ways. We recommend using the rclone command.
Alternatively, one could use an ftps client.
NOTE: although similar in name and function,
ftps is not the same as
sftp. They are different protocols, and Box
does NOT support sftp at this time. Probably the best command line ftps
utility is the
lftp command.
Securing Your Data
Your home directory as configured is private and only you have access to it. Any directories you create outside your home directory are your responsibility to secure appropriately. If you are unsure of how to do so, please submit a help ticket requesting assistance.
If you're a member of a group, you'll want to make sure that you give
your group access to these directories, and you may want to consider
setting your umask so that any files you create automatically have
group read and write access. To do so, add the line
umask 002 to your shell startup file.
If your jobs process sensitive data, it is strongly recommended that you submit all such jobs in exclusive mode to prevent other jobs/users from running on the same node(s) as your job.
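The group-access advice in this section could be applied roughly as follows. This is a sketch under assumptions: share_with_group is a hypothetical helper name, and mode 770 (full access for you and your group, none for anyone else) is one reasonable choice, not a cluster requirement.

```shell
# share_with_group DIR
# Hypothetical helper: create DIR if needed and give the owner and group
# full access while denying access to everyone else.
share_with_group() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    chmod 770 "$dir"
}

# With a umask of 002, new regular files are created with mode 664,
# i.e. readable and writable by both the owner and the group
umask 002
```

The directory must belong to the right group for this to help your collaborators; ls -ld on the directory shows which group that is.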