
Allocations and Job Accounting

  1. The Basics
  2. Choosing the Account to Use
  3. Managing Allocations
    1. For PIs
    2. For College/Dept Pool Managers
  4. Monitoring Usage
    1. How many SUs are left?
    2. General information about an allocation
    3. Seeing job history
    4. Monitoring for excessive usage
    5. Usage reports

The Basics of Allocations and Job Accounting

As a user of the cluster you belong to at least one project, and each project contains one or more allocations, at least one of which you must also be a member of. Each allocation represents an allotment of resources on an HPC cluster. These resources include compute time, measured in SUs (or more commonly kSUs), as well as storage space, measured in TB, on both the scratch and the SHELL tiers.

The reason that there may be multiple allocations within a project is that the allocations can come from different resource pools and have different expiration dates and replenishment schedules. Allocations from the Allocations and Advisory Committee (AAC) typically have a duration of one year, although they may be renewed via an application showing reasonable past use. In most cases such allocations are awarded a fixed amount of resources for the duration of the allocation, i.e. for the year. Allocations coming from college or departmental pools are subject to the policies of the college/department granting the allocation; usually these are also for one-year terms, but allocated and replenished quarterly. Allocations purchased from DIT are governed by the MOU signed at the time of purchase, but typically are also for one year, allocated and replenished quarterly.

The storage allotments for all of the different allocations within a project are typically summed, individually for each tier, to get the effective storage limits for the entire project (the group of members in the project) on that storage tier. I.e., typically the storage limits apply across all allocations, so you do not have to assign specific files to specific allocations. Storage allotments are for the duration of the allocation; they do not increase automatically with time. Note that if an allocation expires, the effective storage limits for the members of the project may be reduced, which could potentially lead to the disk usage exceeding the limit. In this case, members will be notified of the issue and given a week to resolve the situation, either by reducing the amount of disk storage used (deleting unneeded files, moving files off the cluster, etc.) or by increasing the storage limit (by renewing the expired allocation, obtaining an increase in the storage allotment on a remaining allocation, or obtaining (perhaps purchasing) a new allocation with additional storage).

Compute allotments are distinct across allocations. Each allocation with a compute allotment has its own Slurm allocation account, and when submitting a job you can specify which allocation account the job should charge against. Allocations awarded by the AAC typically award a fixed amount of compute resources for the duration of the allocation (e.g. one year). Allocations from college or departmental pools will typically have their SUs allotted quarterly; e.g. if an allocation is granted 800 kSU/year, this will be meted out at 200 kSU/quarter for each of four quarters. All jobs that are submitted are associated with an account; this can be specified with the -A flag when the job is submitted, or the submitter's default account will be used. See the section on choosing the account to use for more information on specifying the account or changing your default account.
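For illustration, a minimal job script specifying the account via an #SBATCH directive. The account name smith-prj-aac here is purely an example; substitute one of your own accounts (as listed by sbalance):

```shell
#!/bin/bash
# Hypothetical account name; replace with one of your own allocation accounts.
#SBATCH --account=smith-prj-aac    # same as -A smith-prj-aac
#SBATCH --time=02:00:00            # requested walltime
#SBATCH --ntasks=1

echo "Job $SLURM_JOB_ID charging against account $SLURM_JOB_ACCOUNT"
```

The account can also be chosen at submission time, e.g. `sbatch -A smith-prj-eng myjob.sh`; a command-line flag overrides both your default account and any --account directive in the script.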

Generally, the allocation account will be charged a number of SUs for the resources used while the job is running. This SU cost is based on the amount of time the job ran in hours (the walltime of the job) multiplied by an hourly rate. The hourly SU cost for a job is the maximum of:

NOTE: The SU costs above are based on the amount of resources allocated to a job, not what is actually used by the job, as the requested and allocated resources are not available to any other jobs while your job is running. So if you request 100 GB of RAM for a job that only used 40 GB of RAM, you are wasting both cluster resources and the SUs in your allocation. Similarly, requesting nodes in exclusive mode will cause you to be charged for all of the resources on the allocated nodes.

However, you are charged based on the actual time the job ran, not the requested walltime. So if you submit a job with a requested walltime of 1 day and it terminates after only 30 minutes, your job is only charged for 0.5 hours. You should therefore set the walltime to the longest time you expect the job to run, perhaps with a little padding. But you should not set the requested walltime excessively long, as that could penalize your job in scheduling (plus there can be situations wherein the job stops running usefully but does not terminate --- in those situations, you will be charged for the full walltime until the job terminates).

Jobs in the scavenger partition are an exception --- jobs in this low-priority partition are not charged, but are subject to preemption.

You are charged for cores allocated, not cores used. I.e., if you request 1 core on a node but also request that no other jobs be run on the node, you will be charged for ALL cores on the assigned node, since no one else can use them while your job is running. See the discussion of exclusive mode above for more information.

The scheduler keeps track of all jobs running against a given account, and keeps track of how many SUs are required to complete these jobs (using the walltime requirements requested when the job was submitted). Before a new job charging against that account is started, the scheduler makes sure that there are sufficient funds to complete it AND all currently running jobs charging against that account. If there are, the job can be started; otherwise, it is left in the pending state with a reason code AssociationJobLimit.

Research groups can get allocations in one of several ways:

Paid and Departmental Allocations

This section discusses some general concepts related to allocations purchased from DIT, including allocations awarded by departments and/or colleges from pools of HPC resources which they have purchased from DIT. Allocations purchased directly from DIT are governed by the MOU between the purchaser and DIT signed at the time of purchase. Allocations granted from departmental and/or college pools are subject to whatever policies the department and/or college wish to impose. The information below generally holds for such allocations, but the aforementioned MOUs and departmental/college policies take precedence.

These allocations have an expiration date, typically one year from the start date, but this is negotiable in the MOU, and shorter terms (down to even a single quarter) are available on request. For departmental/college allocations, the expiration date is still nominally set to one year, but the allocation will persist until the contact for the department/college tells us otherwise.

SUs are meted out quarterly; if your allocation is for 800 kSU per year, you will normally get 200 kSU per quarter for 4 quarters. This can be modified beforehand if needed; if you know you will need more compute time in the first two quarters you could set up the same 800 kSU per year as e.g. 250 kSU/quarter for the first two quarters and 150 kSU/quarter for the next two quarters. SUs which are not used in one quarter do not roll over to future quarters --- at the end of a quarter, all unused SUs simply disappear, and (if the allocation did not expire), the allocation will be replenished with the SUs for the next quarter.
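The no-rollover rule above can be sketched as follows (the leftover amount is an assumed example):

```shell
# Hypothetical quarterly replenishment: unused SUs do not roll over.
QUARTERLY_KSU=200   # nominal quarterly allotment (800 kSU/year)
UNUSED_KSU=45       # assumed leftover at the end of the quarter

# At the quarter boundary the leftover vanishes and the balance resets;
# the new balance is NOT $((QUARTERLY_KSU + UNUSED_KSU)).
NEW_BALANCE=$QUARTERLY_KSU
echo "new quarter balance: ${NEW_BALANCE} kSU (the ${UNUSED_KSU} kSU leftover is lost)"
```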

Storage (on both the scratch and SHELL tiers) is also allocated quarterly, but as files generally persist across quarters, this is usually not as noticeable. If the storage quota for your project decreases at some point (e.g. an allocation expires or is revoked, causing the loss of that allocation's contribution to your storage quotas), resulting in your project being over quota on one or more storage tiers, the PI of the allocation will receive email from HPC staff informing them of the issue and requesting that the storage be brought under quota in a timely fashion. This can be done either by reducing the storage footprint or by increasing the storage quota (e.g. getting more quota from the AAC or department/college pools, and/or purchasing from DIT). If this is not done in a reasonable time period, we may be forced to bill the PI for the excess storage used.

AAC-Granted Allocations

The HPC Allocations and Advisory Committee (AAC) can grant one-time unpaid allocations to faculty and students for small projects, classes, feasibility tests, etc. These allocations are granted out of computing resources purchased by funds from the Provost's Office and the Division of Information Technology.

Faculty members can apply for such allocations by submitting an application to the AAC. This can be submitted by the faculty member themselves, or by a post-doc or student on behalf of the faculty member (in the latter case, the faculty member will be required to consent to assuming "ownership" of any resulting allocation). The review of the application by the AAC becomes more rigorous as more resources are requested; faculty members are eligible for up to 50 kSU per year with very little information required. With proper justification and benchmarking, and approval by the AAC, faculty members can get up to 550 kSU per year (including the aforementioned 50 kSU) for free from the AAC.

Generally, the AAC will not grant more than 50 kSU/year to a faculty member unless the faculty member has run jobs on the cluster to (and has summarized such in the application):

If the above criteria have not been met, the AAC will generally limit a researcher to an initial grant of 50 kSU/year. That allocation can be used to start the research, and while doing so collect the aforementioned benchmarks which can be used in a renewal request for additional resources (which can be submitted anytime; you do not have to wait for the initial allocation to expire).

The 50 kSU/year allocation size is also useful if you are new to HPC or the cluster. While the form is the same for all allocation sizes, the review of the application for the first 50 kSU/year for a faculty member is relaxed.

In general, when applying, answer all fields to the best of your ability, and HPC staff will get back to you with questions if more information is needed.

Choosing the Account to Use

If you only have a single account (check with the sbalance command), you can skip this section. You only have the one account, so there is nothing to choose.

If you have multiple accounts due to your membership in multiple research groups and/or projects, you may wish to choose which account to use based on the job. I.e., if the job is doing something for group A, you should probably submit it using one of the group A accounts, even if you also have access to group B accounts. If the research areas of the two groups overlap, you will need to follow whatever group-specific policies may exist (contact your colleagues).

If you have access to multiple allocation accounts within the same research group/project, then there is a choice to be made. If your research group has group-specific policies about which allocation to use, follow those. Otherwise, you will normally see an allocation from the Allocations and Advisory Committee (AAC) plus one or more allocations from college and/or departmental resource pools, and maybe an allocation purchased from the Division of IT. Again, the sbalance command is an easy way to see this, e.g.

login.zaratan.umd.edu:~$ sbalance
Account: smith-prj-aac (DEFAULT)
Limit: 	   250.00 kSU
Unused:    126.50 kSU
Used:  	   123.50 (49.4% of limit)

Account: smith-prj-eng
Limit: 	   200.00 kSU
Unused:    110.00 kSU 
Used:  	   90.00 kSU (45.0 % of limit)

Account: smith-prj-ipst
Limit: 	   175.00 kSU
Unused:    160.00 kSU 
Used:  	   15.00 kSU (8.7 % of limit)

Account: smith-prj-paid
Limit: 	   275.00 kSU
Unused:    180.40 kSU 
Used:  	   94.60 kSU (34.4 % of limit)


In the above example, the user is a member of 4 allocations in the smith-prj research group/project: the first awarded by the AAC, the second from the School of Engineering, the third from the IPST department, and the last purchased from DIT.

In such a case, we generally encourage users to treat the AAC allocation as a last resort. Paid and college/departmental allocations are typically awarded quarterly, meaning that at the start of each new quarter (1 Jan, 1 Apr, 1 Jul, 1 Oct) any unused SUs in the allocation disappear, and the allocation is replenished at its nominal quarterly level. Since the SUs in these allocations typically have the shortest lifespan, you generally want to use those SUs first.

Allocations granted by the AAC, on the other hand, represent a one-time grant of resources, and although these SUs will also expire and vanish at the end of the term of the AAC allocation, this is typically on a timescale of about one year. Also, AAC allocations do not automatically renew --- you (or your advisor) must apply to the AAC for any renewals.

Thus, we normally recommend that you set your default allocation to a paid and/or departmental allocation, and normally charge jobs against those allocations. If and when you encounter a situation wherein your workload for a given quarter is exceeding the quarterly allotment from your paid and/or college/departmental allocations, then you can dip into the AAC allocations to make up the difference. Although exceptions can arise, we find that this type of arrangement is likely to maximize your benefit from the allocations.

Another consideration in this decision is the number of SUs left in the allocation and the number of running and pending jobs charging against it. The scheduler will not start a job unless it determines that there are sufficient SUs in the allocation to complete the job in question along with all currently running jobs charging against that allocation. To estimate the number of SUs needed to complete a job, the scheduler uses the maximum walltime requested for the job.

For example, if you have 5 kSU unused in your allocA allocation, and the job you wish to submit has a walltime of 50 hours and an SU billing rate of 96 SU/hour, the scheduler will assume the job needs 4.8 kSU to complete. If there are no other running jobs charging against the allocA allocation (this includes jobs by other people in your group), the scheduler will consider the job able to start if sufficient compute resources are available. However, if there are 5 jobs already running and charging against the allocA allocation which have an SU billing rate of 50 SU/hour and are halfway through their requested 6 hours of walltime, the scheduler will estimate that each job will run for 3 more hours and so consume 150 SU each, or 0.75 kSU for all 5 such jobs. In that case, the new job will not start because 0.75 kSU for the running jobs plus 4.8 kSU for the new job will exceed the 5 kSU unused in the allocation. This calculation will change over time; e.g. if 4 of those jobs finish right after this calculation (so effectively do not consume any of the 5 kSU unused), the next time the scheduler looks at the job it will see 0.15 kSU needed to finish the remaining job, and 4.8 kSU to finish the new job, or 4.95 kSU total, and the job will be able to start.
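The arithmetic of the worked example above can be re-computed in a few lines (all numbers taken from the text; this is a sketch of the scheduler's check, not its actual implementation):

```shell
# Numbers from the example above.
UNUSED=5000                  # 5 kSU unused in allocA, expressed in SU
NEW_JOB=$((50 * 96))         # new job: 50 h walltime at 96 SU/hour = 4800 SU
RUNNING=$((5 * 3 * 50))      # 5 running jobs, ~3 h left each, 50 SU/hour = 750 SU

# The scheduler only starts the new job if the allocation can cover it
# AND everything already running against the same account:
if [ $((NEW_JOB + RUNNING)) -le "$UNUSED" ]; then
  echo "job may start"
else
  echo "job pends with reason AssociationJobLimit"
fi
```

Here 4800 + 750 = 5550 SU exceeds the 5000 SU available, so the new job pends; once most of the running jobs finish, the committed total drops below the balance and the job becomes eligible to start.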

Unfortunately, the scheduler cannot handle instructions like "charge this job to allocA, unless there are not enough SUs, in which case charge it against allocB". So once an allocation is nearing depletion, you will need to monitor usage more closely and decide which allocation to charge each job against. But this is generally only an issue when the allocation is close to being depleted.

Note that the queuing system will NOT automatically select another account if there are insufficient funds in the account specified for the job. E.g., if you have access to both allocA and allocB and you specify that a job should charge against allocA (either explicitly or via the default account), the scheduler will not change that to allocB if allocA is depleted. The job will just wait in the queue until such time that additional SUs are available in allocA.

Note also that others in your group may have access to the same account, so just because funds were there when you submitted a job, someone else's jobs may have started since then and may reduce the funds in the account.

See here for more information about specifying the account to be charged when submitting a job.

It is recommended that you generally use up your high-priority funds first, instead of using normal-priority funds. If you do not use them, they go away (or effectively get converted to normal priority) at the end of the month.

Managing Allocations

This section discusses various topics related to the management of allocations. It is broken into two subsections depending on the level of management:

  1. For PIs managing their own research projects
  2. For managers of College or Departmental pools of HPC Resources

Allocation Management for PIs

Still under construction.

Allocation Management for College/Department Pool Managers

Certain units may have pools of resources on the cluster that they can allocate to researchers within their units. These pools are typically granted in return for contributions of hardware to the cluster. Whereas on the Deepthought2 cluster these pools were often arranged as one large allocation account, that proved to have several problems. Such an arrangement made it difficult for different members of the same research group to share files without sharing them with the entire department or college. It also made it difficult for system administrators to contact faculty members regarding student accounts, forcing the departmental/college contact to act as the "middleman" in all such communications. It also meant that the departmental contacts had to handle all requests from users in the department wishing to get access to the cluster, and conversely to handle the removal of all users who should no longer have access (and unfortunately, the latter was typically ignored). This is all made more complicated because the departmental contact usually needs to contact the actual PI/faculty advisor to determine eligibility.

On Zaratan, the addition of storage tiers as allocated resources makes that even more problematic, especially when usage exceeds the quota. We are hoping to convert such large departmental/college allocations into pools of resources which can be suballocated to researchers in the unit. Individual PIs in the unit can be granted allocations of resources from the pool. The PI can then manage which users have access to the allocation, which removes some of the burden from the departmental manager and places it with the research group, which has more knowledge of the situation.

From the PI's perspective, they will typically already have a project containing allocations for the research group. This will typically include an allocation from the campus Allocations and Advisory Committee, and if you grant them an allocation from your pool, that will appear as another allocation under the project. There could be additional allocations as well, if the PI has allocations from another department or unit, or if they have purchased resources. The PI could also have multiple projects --- this could be because they have multiple research groups, or more typically because they have a project for a class they are teaching. If they are also a pool manager, the pool will also appear as a separate project. Generally, all users belonging to allocations within the same project belong to the same group, and scratch and SHELL storage are organized by project.

The compute resources for each allocation in the project will appear as distinct Slurm allocation accounts , and when a job is submitted it will need to specify which account to charge against (or charge against the default allocation account). Storage resources are handled differently --- because it is difficult to classify a file as belonging to one allocation or another, and even messier to have to assign it in such a fashion, we sum up the storage allotments (separately for each storage tier) for all allocations in a project, and use that to set the storage limit for the project's storage directory on that tier.

Pool managers are responsible for allocating the resources in the pool to the individual researchers. Unfortunately, system administrators are not at this time (or in the foreseeable future) able to delegate the actual ability to create/modify allocations to the pool managers, so pool managers will need to send email to hpcc-help@umd.edu to request changes. These requests are handled by people, so you can send multiple requests in a single email.

You are allowed to oversubscribe the compute and/or storage resources in your pool; that means it is permissible for the sum of the compute and/or storage limits (separately for compute and for each tier of storage) allotted to the suballocations from this pool to exceed the size of those resources in the pool. This was not initially allowed on Deepthought2, which is one reason many units adopted the large departmental/etc. pools --- some units had a fair number of modest HPC users with a modest average quarterly usage of compute resources who might need double their average usage in occasional quarters. Without oversubscription, one would need to set each suballocation size to double the average usage, which means on average the allocation would only be half utilized, and (since there is no oversubscription) the other half of the allocation could not be used by anyone else. With oversubscription, the unused half of that suballocation can be doubly (or more) allocated, increasing the effective utilization.

While this is certainly advantageous, it needs to be used carefully, because, to be fair to other users on the cluster, the total usage from suballocations from your pool is restricted to the available compute resources in the pool (on a quarterly basis). E.g., if you have a pool of 100 kSU/quarter and two suballocations A and B to which you assign 75 kSU/quarter each, then if A uses 75 kSU in a given quarter, B is limited to 25 kSU. This will work if both A and B only use 50 kSU/quarter on average, and when one uses more than average, the other uses correspondingly less. But clearly there will be complaints if there is a quarter in which users of both the A and B suballocations want/need to use more than 50 kSU.
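The pool arithmetic above, re-computed with the numbers from the text:

```shell
# Numbers from the example above (kSU per quarter).
POOL=100       # total pool
SUB_A=75       # assigned to suballocation A
SUB_B=75       # assigned to suballocation B
A_USED=75      # A uses its full assignment this quarter

OVERSUBSCRIBED=$((SUB_A + SUB_B - POOL))   # amount by which the pool is oversubscribed
B_EFFECTIVE=$((POOL - A_USED))             # what B can actually use this quarter
echo "oversubscribed by ${OVERSUBSCRIBED} kSU; B effectively limited to ${B_EFFECTIVE} kSU"
```

Although B's nominal limit is 75 kSU, its effective limit this quarter is only 25 kSU, which is exactly the kind of surprise that careful pool management is meant to avoid.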

You can also oversubscribe storage. This is more problematic: while SUs are somewhat ephemeral (every quarter the quarterly SU usage is reset), files tend to be more permanent --- once created, they remain until someone deletes them. Furthermore, there are various mechanisms by which individual projects can end up going over their filesystem limits. For instance, the project limit is contributed from various allocations, which can expire. E.g., consider a project with a 3 TB limit, with 1 TB coming from an AAC allocation, one from your pool, and one from a project-level purchase agreement with DIT, and assume that the project is using 2.9 TB of storage. If one of these allocations expires, suddenly the project is consuming 2.9 TB but only has a 2 TB limit. Assuming the allocation from your pool has not expired, your pool will now be considered to be consuming 1.45 TB from that suballocation (despite your only authorizing 1 TB to that suballocation).

It is even more complicated than that, as the enforcement of storage limits has some technical limitations. The Zaratan scratch storage has some delays in quota enforcement, so users can in some cases continue to write data over the limit for a fraction of an hour after exceeding it. And the SHELL storage for each project is divided into multiple volumes (at least one per user, and perhaps more), each of which has a size limit but does not have quotas as such; the sum of these maximum volume sizes can exceed the SHELL storage limit for the project. So it is very possible for individual projects to exceed their storage limits, at which point they will be notified and instructed to rectify the situation, but such situations can also impact the usage of your pool.

It is recommended that pool managers use care if/when oversubscribing, especially for storage. Ideally, you should avoid oversubscribing initially if possible, and wait until you have several quarters worth of data showing actual utilization of the resources in the pool. If there is a consistent history of under-utilization, then it might be reasonable to allow for some oversubscription if needed, but even then you probably should be a bit conservative and allow some room for fluctuations in the average usage. For storage, you should also remember that storage usage, especially on the SHELL tier, is likely to monotonically increase.

There are several things that pool managers can do in ColdFront.

Monitoring Allocations

You and your research group are responsible for ensuring proper rationing of the funds in your account(s). Excessive use of funds for a "co-op" type of project in the first month of a quarter could result in no funds at all for the next two months in either the high-priority or standard-priority allocation.

This can be deliberate and beneficial, e.g. if you have important deadlines at the end of the first month of the quarter and are willing to "borrow ahead" to complete computations for that before the deadlines. This is an advantage of the model used by the Deepthought HPC clusters; you can use nearly 3 times the power of the computers you purchased in a single month to rush out computations, at the cost of having very limited usage in the following two months (but since that is after the deadlines, it might not be important).

But if this occurs because some junior member of the group is sending an excessive number of very expensive jobs, this can be quite problematic, especially as you might not notice the impact of the errant user until too late.

The Division of Information Technology cannot tell which jobs are important and which are not, nor what is good usage of your allocation funds and what is not. If we notice seriously problematic usage (e.g. a job reserving 10 nodes but only running processes on 1 node), we will do our best to notify and instruct the relevant users. But you are responsible for monitoring your own jobs, and it behooves you to monitor the jobs of other users of your allocations. We provide the necessary tools to do so, and we strongly advise all research groups to have at least one person regularly monitor the usage of their allocations' funds to ensure there are no problems, or at least to catch any problems early.

How many SUs are left in my allocation?

The first level of monitoring of your allocations is with the sbalance command. E.g.

Account: test-hi (dt)
Limit:     163.52 kSU
Available: 163.47 kSU 
Used:      0.05 kSU (0.0 % of limit)

Account: test (dt)
Limit:     327.04 kSU
Available: 325.33 kSU 
Used:      1.71 kSU (0.5 % of limit)

Without any arguments, it will list usage metrics for all accounts to which you have access. The above listing is from early in the quarter for a co-op type project; note that both accounts are nearly full, and that the test account has nearly double the limit of the test-hi account. The line starting with "Used" not only gives the number of kSU used, but also the usage as a percentage of the limit. If this percentage is significantly higher than the percentage of the month that has elapsed (for your high-priority account), or of the quarter (for normal-priority accounts), you might need to be concerned. I.e., if at one week into the month you see that the usage on your high-priority account is over 30% of the limit, your group is burning SUs faster than they will be renewed, and you might have some time at the end of the month with nothing left in your high-priority account.
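The burn-rate comparison above can be sketched as a simple check (the percentages are the hypothetical numbers from the text, assuming a 30-day month):

```shell
# Hypothetical burn-rate check: one week into a 30-day month.
USED_PCT=30                      # percent of the limit used (from sbalance)
ELAPSED_PCT=$((7 * 100 / 30))    # percent of the month elapsed (integer math: 23)

if [ "$USED_PCT" -gt "$ELAPSED_PCT" ]; then
  echo "usage (${USED_PCT}%) is ahead of the month (${ELAPSED_PCT}%): burning SUs faster than they replenish"
fi
```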

For AAC grant type accounts, there is no monthly or quarterly replenishment. The "Limit" should reflect the amount of compute time the AAC granted you, and the percentage is how much of that you have used. If the percentage used is significantly greater than the percent of your work which is complete, you should consider working on an update to your proposal to request more time.

If you are tasked with monitoring the usage of the accounts by your colleagues in the project (or have taken said task upon yourself), you can use the -all flag to sbalance to see who is using the funds in the account. You might also wish to use the -account flag to limit the output to a single account, e.g.:

login-1: sbalance -account test-hi -all
Account: test-hi (dt)
Limit:     163.52 kSU
Available: 102.07 kSU 
Used:      61.45 kSU (37.6 % of limit)
        User jtl used 17.6044 kSU (28.6 % of total usage)
        User kevin used 13.3456 kSU (21.7 % of total usage)
        User payerle used 30.5000 kSU (49.6 % of total usage)

This lists the same information as before, with the addition of showing every user who has used the account in the time period, showing not only the number of kSU they consumed, but what percentage of the total usage for the account. E.g., in the example above, you can see that user payerle is using almost as much as users kevin and jtl combined. You can add the flag --nosuppress0 if you want to also see lines for everyone with access to the allocation but who did not consume any time since the last refresh.

The --help option to sbalance will display usage options, most of which were discussed above.

The time period for the usage statistics depends on the type of account and project. For co-op (replenishing) projects, it is from the start of the month. For AAC grant accounts: from the start of the project/grant.

General information about an allocation

General information about allocations you belong to can be obtained with the my_projects command. This command can only be run from the login nodes (i.e. it will not work on the compute nodes), and provides basic information regarding allocations you belong to.

Usage is basically my_projects to display information for all allocations that you are a member of, or my_projects ALLOCATION_NAME to display information for a specific allocation (you can give multiple ALLOCATION_NAMEs to list information for multiple allocations). You may also wish to include one or two --verbose (or -v for short) flags to include more information. You can also give the --help flag for a full description of all the flags the command accepts.

Without any verbose flags, it will display the name of the allocation project, the name of the parent project (if any), and the department and college associated with the project.

With one verbose flag, it will also display the "points-of-contact" for the project and the members of the project. The points-of-contact are the people who are authorized to add/remove members from the allocation. It will also display the base kSU level and indicate whether the project auto-replenishes each quarter.

The information with two verbose flags is probably not very useful; it is basically a description of the project (which is usually not informative) and the over/under-usage alert thresholds which determine if/when the points-of-contact are emailed regarding excessive (or other anomalous) usage of their allocation (if no value is listed, a global default is used). The over/under-usage thresholds are explained a bit more in the section on checking for excessive usage.

NOTE: the names listed are allocation project names. Some projects have both a standard and a high-priority allocation account; however, they are still one project, and only one listing will be shown by the my_projects command. The base kSU level is the total of the standard and high-priority kSUs at the start of the quarter.

Seeing job history

The sacct command can be used to view the accounting records of jobs, both past and currently running. It takes some time to run, and can display a fair amount of information (documented in its man page). You will almost always wish to restrict it to a time range; for example, to see the usage of account foo for the month of November 2014, one could use

login-1> sacct --format=JobID,User,Account,ReqCPUs,AllocCPUS,Elapsed,CPUTime \
	-a  -X  -S  2014-11-01 -E 2014-11-30 -A foo

       JobID      User    Account  ReqCPUS  AllocCPUS    Elapsed    CPUTime 
------------ --------- ---------- -------- ---------- ---------- ---------- 

2717747       payerle  foo             16         20 1-00:00:09 20-00:03:00 
2717748       payerle  foo             16         20 1-00:00:09 20-00:03:00 
2717749       payerle  foo             16         20 1-00:00:09 20-00:03:00 
2717750       payerle  foo             16         20 1-00:00:08 20-00:02:40 
2717751       payerle  foo             16         20 1-00:00:08 20-00:02:40 
2717752       payerle  foo             16         20 1-00:00:08 20-00:02:40 
2717753       payerle  foo             16         20 1-00:00:17 20-00:05:40 
2717754       payerle  foo             16         20 1-00:00:17 20-00:05:40 
2717755       payerle  foo             16         20 1-00:00:17 20-00:05:40 
2717756       payerle  foo             16         20 1-00:00:12 20-00:04:00 
2718384       payerle  foo             10          0   00:00:00   00:00:00 
2718385       payerle  foo             10          0   00:00:00   00:00:00 
2718386       payerle  foo             10          0   00:00:00   00:00:00 
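
As a sanity check on reading this output, the CPUTime column is simply Elapsed multiplied by AllocCPUS. A minimal Python sketch reproducing the arithmetic of the first row above (the parsing helpers here are our own illustration, not part of Slurm):

```python
from datetime import timedelta

def parse_elapsed(s):
    """Parse Slurm's [D-]HH:MM:SS elapsed format into a timedelta."""
    days = 0
    if "-" in s:
        d, s = s.split("-")
        days = int(d)
    h, m, sec = (int(x) for x in s.split(":"))
    return timedelta(days=days, hours=h, minutes=m, seconds=sec)

def format_elapsed(td):
    """Format a timedelta back into Slurm's D-HH:MM:SS style."""
    total = int(td.total_seconds())
    days, rem = divmod(total, 86400)
    h, rem = divmod(rem, 3600)
    m, s = divmod(rem, 60)
    return f"{days}-{h:02d}:{m:02d}:{s:02d}"

# First row of the sacct output above: 20 allocated CPUs for 1-00:00:09
cputime = parse_elapsed("1-00:00:09") * 20
print(format_elapsed(cputime))  # 20-00:03:00
```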


Monitoring for excessive usage

An important aspect of managing the usage of an allocation is ensuring that SUs are being consumed at a reasonable rate. The system intentionally allows flexibility in the rate at which SUs are consumed; e.g. if you have a major conference in the middle of a quarter, you might wish to (and can) use up most or all of your allocated funds for a quarterly replenishing allocation in the first month of the quarter, leaving (almost) nothing for the remaining two months. If that is your intent and desire (and the rest of the users of the allocation agree with you), all is well. However, if a few profligate users consume most of the quarterly allocation in the first month without the consent of the rest of the users of the allocation, there is a major problem.

From the system's point of view, the two examples above look the same --- the SUs were consumed at an excessive rate in the first month of the quarter. We cannot tell whether that was done for a good reason or by mistake by inexperienced users --- that is a judgement call which the points-of-contact (PoCs) of the allocation will need to make. What we can do is try to alert the PoCs when something like that appears to be happening, hopefully early enough that, if it is happening improperly, behaviors can be adjusted before it leads to serious problems.

NOTE: the following only applies to quarterly auto-replenishing allocations. Non-replenishing allocations (e.g. allocations granted by the AAC on the Deepthought HPC clusters and Engineering Startup Allocations (i.e. allocations whose names start with "esu-")) are not currently supported by the tools described below. Since they do not auto-replenish, you can use the sbalance command described previously to see how much of the total allocation has been consumed, and compare that to your estimates of the amount of work needed to complete the project.

The command check_project_usage compares the fraction of the allocation's quarterly allotment that has been consumed in the current quarter to the fraction of the quarter that has gone by. If the fraction of SUs used exceeds the point in the quarter by more than a certain threshold, it flags that allocation as having unsustainable usage. (It similarly checks for significant underusage, but the default threshold for that is set so as to never flag underusage.) The global default overusage threshold is 15 percentage points; PoCs can request different default thresholds for a specific allocation (just send email to hpcc-help@umd.edu requesting such; this will change the defaults used in the automated mail as well), and anyone can specify thresholds on the command line as well. E.g., if we are one third of the way into the quarter (i.e. one month in) and 50% of the allocation has been used, an alert will be raised using the global default threshold (as 33% + 15% = 48% < 50%). If a threshold of 20% were used, no alert would be raised (as 33% + 20% = 53% > 50%).
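
The arithmetic behind this check is simple enough to sketch. The following is our own illustration of the logic described above, not the actual check_project_usage source:

```python
def over_usage_alert(pct_used, pct_into_period, threshold=15.0):
    """Flag unsustainable usage: alert when the percentage of SUs consumed
    exceeds the percentage of the period elapsed by more than the threshold
    (all values in percentage points; 15 is the global default)."""
    return pct_used > pct_into_period + threshold

# One month (one third) into the quarter, 50% of the allocation used:
print(over_usage_alert(50.0, 33.3))                  # True:  33.3 + 15 = 48.3 < 50
print(over_usage_alert(50.0, 33.3, threshold=20.0))  # False: 33.3 + 20 = 53.3 > 50
```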

By default, the check_project_usage command will check all allocations of which you are a member for excessive usage. If one or more allocations appear to be consumed at an unsustainable rate, it will print usage information and warnings for those allocations. If no excessive usage is detected, it normally prints nothing. (NOTE: if you are a member of non-replenishing allocations as well, you will get a brief warning stating that the code is skipping the non-replenishing allocation.)

You can provide the --help or -h flags to get full usage information. You can specify allocation project names on the command line to only check the named allocations (NOTE: these are allocation project names, so should not include the -hi suffix; because the standard and high-priority balances are linked, it checks both simultaneously.) You can also give the --verbose or -v flag, which will cause usage information to be displayed even if no over/underusage condition was flagged.

login-1> check_project_usage
Project: testproj1
Time: 2016 Oct 14
Overquota Threshold: 15.0%
Underquota Threshold: 100.0%
------          TimePeriod (percent into) Allocation   Available    PctUsed
HiPriority      month      (  43.7% into) 67.500 kSU   15.858 kSU   76.5%  
Total           quarter    (  14.7% into) 202.500 kSU  125.970 kSU  37.8%  

*** Excessive rate of consumption for HiPriority!
*** Excessive rate of consumption for Total!
login-1> check_project_usage -v 
Project: testproj1
Time: 2016 Oct 14
Overquota Threshold: 15.0%
Underquota Threshold: 100.0%
------          TimePeriod (percent into) Allocation   Available    PctUsed
HiPriority      month      (  43.7% into) 67.500 kSU   15.858 kSU   76.5%  
Total           quarter    (  14.7% into) 202.500 kSU  125.970 kSU  37.8%  

*** Excessive rate of consumption for HiPriority!
*** Excessive rate of consumption for Total!
Project: testproj2
Time: 2016 Oct 14
Overquota Threshold: 15.0%
Underquota Threshold: 100.0%
------          TimePeriod (percent into) Allocation   Available    PctUsed
HiPriority      month      (  43.7% into) 60.181 kSU   53.543 kSU   11.0%  
Total           quarter    (  14.7% into) 180.544 kSU  162.452 kSU  10.0%  
login-1> check_project_usage testproj2

The first time we execute check_project_usage above, it displays the usage for testproj1 with warnings of excessive usage for both the high-priority allocation account (as 76% > 43% + 15%) and the total allocation (as 37% > 14% + 15% ). The second run has the verbose flag, and so in addition to showing the excessive usage for testproj1, it also displays the usage for testproj2 even though it is not problematic (PctUsed is less than the "percent into" the month/quarter, respectively). The final invocation does not have the verbose flag, but specifies to only check testproj2; this produces no output as there is no excessive usage condition.

If you wish to include this command in your dot files to alert you to overusage issues whenever you log in, be sure to run it only for interactive sessions --- not only will it needlessly slow down non-interactive shells, but if it produces output it can mess up file transfers with scp, etc. E.g., for csh or tcsh users, something like:

if ( $?prompt ) then
	check_project_usage
	... other interactive only commands if desired ...
endif

If your default shell is sh or bash, something like:

if [ ! "x$PS1" = "x" ]; then
	check_project_usage
	... other interactive only commands if desired ...
fi

The Division of Information Technology actually runs a similar script every few hours on every auto-replenishing allocation, and will send email to the points-of-contact for the allocation if it is flagged as being consumed at an unsustainable rate. To avoid "spamming" the points-of-contact, we will not send email to a given user more than once every three days. In this automated case, the project-specific overusage threshold is used (or the global default if no project-specific threshold was set). A point-of-contact can request a change to the threshold for any of their allocations by sending an email request to hpcc-help@umd.edu. They can similarly request a change in the minimum number of days between emails sent to them. NOTE: the thresholds are per-project/allocation, and affect alerts to all points-of-contact for that allocation. The minimum number of days between emails is per person, and affects alerts for all allocation projects that person is a point-of-contact for. Also note that the limiting of the frequency of emails is applied separately to each project you are a point-of-contact for, so if you receive an alert about allocation A today, you may still receive an alert about allocation B tomorrow, but should not receive another alert about allocation A for several days.
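
The per-person, per-project rate limiting described above can be sketched as follows. This is an illustration of the policy, not DIT's actual implementation:

```python
from datetime import date, timedelta

def should_email(last_alerted, today, min_days=3):
    """Send another alert to this point-of-contact for this allocation only
    if at least min_days have passed since the last alert for it (3 is the
    default minimum described above)."""
    if last_alerted is None:
        return True  # never alerted before
    return (today - last_alerted) >= timedelta(days=min_days)

print(should_email(date(2016, 10, 12), date(2016, 10, 14)))  # False: only 2 days
print(should_email(date(2016, 10, 11), date(2016, 10, 14)))  # True: 3 days elapsed
```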

Usage reports

For non-replenishing allocations, the sbalance command returns information about the usage of the allocation over the allocation's lifetime. For replenishing allocations, however, most of the tools mentioned above only return data about usage for the current quarter. While this is probably what most users are concerned with most of the time (e.g., to figure out whether there are enough kSUs to run a job now, usage from previous quarters is irrelevant), sometimes one needs information about usage over longer time scales. This is especially useful for people who manage "super-allocations".

There are a couple of tools available to get more historic information regarding allocation use:

The Deepthought XDMoD website is a web page running the Open XDMoD (Open XD Metrics on Demand) web application. It can present in graphical form many metrics pertaining to the Deepthought clusters. One can see, for example, how many kSUs were consumed by a given allocation as a function of time, or what the average job length was for an allocation over the past year. Some of the more advanced filtering and reporting features require you to register for a "login account" on the XDMoD website (unfortunately, there is no easy way to tie this into our existing authentication system); you can do so from the website.

The slurm-usage-report command runs from the login nodes of either Deepthought cluster. This command examines all the job records related to the allocation account(s) specified and provides summaries (as opposed to the sacct command, which lists details for each job but does not summarize). Because it has to go through all the job records, it tends to be a bit slow.

We only discuss some of the more commonly used options below; the command supports a --help or -h option which provides more information on its usage (including some options to provide even more usage information). The commonly used options are:

The slurm_jobstats_for_alloc command also prints information about the usage of allocations on the Deepthought clusters, but is generally geared toward helping managers of "super-allocations" determine which sub-allocations have and have not been using the cluster. By default, it will print the following information for each allocation account (for the specified time period):

We only discuss some of the more commonly used options below; the command supports a --help or -h option which provides more information on its usage (including some options to provide even more usage information). The commonly used options are:
