Globus

  1. Overview of Globus
  2. Globus Connect Personal
  3. Using the Globus Web Application to transfer files

Overview

Globus is a cloud-based software-as-a-service offering that provides file transfer, sharing, and data publication. It is supported by most High Performance Computing (HPC) clusters in the world and is designed to move very large amounts of data (many TBs) efficiently. Globus automatically uses multiple streams to speed up transfers, and it restarts failed transfers without manual intervention.
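
To make the ideas of "multiple streams" and "automatic restart" concrete, here is a toy Python sketch that copies a local file in several byte ranges at once and retries a range if it fails. It is only an illustration of the general technique; it is not how Globus itself is implemented, and the function names (copy_range, parallel_copy) and the retry count are made up for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def copy_range(src, dst, offset, length, max_retries=3):
    """Copy one byte range from src to dst, retrying on I/O errors."""
    for attempt in range(1, max_retries + 1):
        try:
            with open(src, "rb") as fin, open(dst, "r+b") as fout:
                fin.seek(offset)
                fout.seek(offset)
                fout.write(fin.read(length))
            return
        except OSError:
            # Give up only after the last allowed attempt.
            if attempt == max_retries:
                raise


def parallel_copy(src, dst, streams=4):
    """Copy src to dst using several concurrent 'streams' (threads)."""
    size = os.path.getsize(src)
    # Pre-allocate the destination so each stream can write at its own offset.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    chunk = max(1, -(-size // streams))  # ceiling division
    ranges = [(off, min(chunk, size - off)) for off in range(0, size, chunk)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(copy_range, src, dst, off, n) for off, n in ranges]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Because each range is retried independently, a failure in one range does not force the other ranges to be copied again.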

Globus transfers files and data between any pair of "endpoints". As noted above, most HPC clusters have Globus endpoints, so you can transfer data efficiently between HPC clusters. You can also install an application (Globus Connect Personal, described below) to make your desktop/workstation into an endpoint temporarily, so that you can transfer files to and from it as well.

Globus Connect Personal

Globus Connect Personal lets you set up your workstation as a Globus endpoint, which you can then use to transfer files to and from the workstation. If you only want to use Globus to transfer files between existing endpoints (e.g. HPC clusters), you do not need it.

To set up your workstation as an endpoint, you will need to install Globus Connect Personal on it. This need only be done once. To do this, open a web browser on your workstation and:

  1. Visit https://www.globus.org/globus-connect-personal.
  2. There should be a blue shaded region to the right with links to instructions for installing Globus Connect Personal on various common operating systems. Click on the appropriate link and follow the instructions.
  3. During the installation process, the installation script will prompt for where Globus should be installed. The default location usually requires administrative access to the workstation; if you do not have administrative access, you can install to a location where you have write access (e.g. beneath your home directory).
  4. You will need to configure Globus Connect Personal before using it. This is also covered in detail in the installation instructions mentioned above. In particular:
    1. Under the "Access" tab you can add folders that you want Globus to be able to access. The "Writable" checkbox means that Globus will be able to transfer files into the folder. The "Shareable" checkbox allows the folder to be shared with other Globus users. It is strongly recommended that you leave the "Shareable" checkbox unchecked for security reasons. (Note that the "Shareable" option requires you to be a Globus Plus user, and that at this time the University does not provide Globus Plus.)
    2. Under the "General" tab, there are also checkboxes related to "Run when Windows starts" and "Automatically check for updates". It is recommended to leave the first ("Run when Windows starts") unchecked, and the latter ("Automatically check for updates") checked.
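
The steps above describe the graphical configuration. On Linux, Globus Connect Personal is understood to read the same folder settings from a text file, ~/.globusonline/lta/config-paths, with one line per folder in the form path,sharing flag,read/write flag (1 for on, 0 for off). The Python sketch below only formats such lines. The file location and line format are assumptions to check against the Globus installation instructions, the folder names are hypothetical, and rejecting paths that contain commas is a simplification made for this example.

```python
def config_paths_line(path, writable=False, shareable=False):
    """Format one line of the (assumed) config-paths file: path,sharing,rw."""
    if "," in path or "\n" in path:
        # Simplification for this sketch: refuse paths that would need escaping.
        raise ValueError("path must not contain commas or newlines")
    return f"{path},{int(shareable)},{int(writable)}"


# Hypothetical folders: one that Globus may write into, one that is read-only.
# Sharing is left off for both, following the recommendation above.
lines = [
    config_paths_line("~/globus-inbox", writable=True),
    config_paths_line("~/results"),
]
print("\n".join(lines))
```

Leaving the sharing flag at 0 corresponds to leaving the "Shareable" checkbox unchecked.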

Using the Globus Web Application to transfer files

This section discusses how to transfer files with the Globus Web Application. If you wish to transfer files to or from your workstation, you will need to configure Globus Connect Personal first; that is not needed if you just wish to transfer between existing endpoints (e.g. most HPC clusters).

  1. Open a web browser on your workstation and go to https://globus.org/login
  2. You will be requested to log in. There will be a dropdown listing various organizations. There are multiple ways UMD users might log into Globus:
    1. Select University of Maryland College Park from the dropdown. You will be redirected to the campus login page, where you can use your UMD directory ID and password to log in. Most users will likely want to use this option; it should work for faculty, staff, and graduate research assistants.
    2. Select Google from the dropdown list, and enter your @umd.edu or @terpmail.umd.edu email address and corresponding password. Because those email addresses use Gmail as the underlying service, this should work even for students without research assistantships.
    3. Select Google from the dropdown list, and log in using your personal Gmail address and password. While this should work, we strongly recommend using one of the above two mechanisms if possible.
    4. If you are a student who needs or wishes to log in directly using your UMD directory ID and password, but cannot do so because you do not currently hold a research assistantship, you can ask your faculty advisor to request that the campus data stewards release your specific account for direct authentication to Globus. Please see:
    Screenshot of Globus organizational login
  3. After logging in, you should reach the file transfer page. On this page you will get two file browser windows, one on the left and one on the right. Each window has an "Endpoint" field; together, the two windows represent the two systems between which you wish to transfer files (once set up, you can transfer files in either direction). Select your endpoints (you can generally just start typing any part of the name to search, e.g. "Deepthought" for Deepthought2 or "MARCC" for Bluecrab):
    • For the Deepthought2 cluster, choose University of Maryland - Deepthought2
    • For MARCC and/or the Bluecrab cluster, choose marcc#dtn
    • If you want to transfer data to/from your workstation, you will need to have previously configured your workstation for Globus Connect Personal. The Globus Connect Personal application will also need to be running on your workstation. Enter the name you gave your personal endpoint when you set it up. You should also be able to find it under the "Administered by Me" tab.

    Most endpoints will request a username and password before you can access them. Log in, and you should get a list of files and directories on that system.

    Screenshot of Globus endpoint login
  4. Once you have selected and logged into both endpoints, the file browser windows will display the filesystems on the two endpoints. To transfer, select the files and/or directories on one system and then click the appropriate arrow to initiate the transfer.
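
For repeated transfers, the same steps can be scripted rather than clicked through. The Python sketch below builds command lines for the separate Globus command-line client (the globus command from the globus-cli package), which this page does not otherwise cover. It assumes that client is installed and that you have already authenticated with "globus login". The endpoint IDs, paths, and label shown are placeholders, and the helper names are made up for this example. By default the sketch only prints the commands instead of running them.

```python
import shlex
import subprocess


def build_search_command(text):
    """Command to search for endpoints by name, like typing in the Endpoint field."""
    return ["globus", "endpoint", "search", text]


def build_transfer_command(src_id, src_path, dst_id, dst_path,
                           recursive=False, label=None):
    """Command to submit a transfer from src_id:src_path to dst_id:dst_path."""
    cmd = ["globus", "transfer"]
    if recursive:
        cmd.append("--recursive")  # needed when transferring a directory
    if label:
        cmd += ["--label", label]
    cmd += [f"{src_id}:{src_path}", f"{dst_id}:{dst_path}"]
    return cmd


def run(cmd, dry_run=True):
    """Print the command (dry run) or execute it and return its output."""
    if dry_run:
        print(shlex.join(cmd))
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Placeholder IDs: replace with the IDs reported by the endpoint search.
run(build_search_command("Deepthought"))
run(build_transfer_command("SRC-ENDPOINT-ID", "/~/results/",
                           "DST-ENDPOINT-ID", "/~/results-copy/",
                           recursive=True, label="example transfer"))
```

Pass dry_run=False to run() once the printed commands look right. As with the web application, the transfer is then carried out by Globus in the background.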