*** DEPRECATED ***
NOTE: The authors of Python have stopped maintenance of all Python2 versions as of 1 Jan 2020. While the UMD Division of Information Technology continues to provide access to the existing Python2 installs for now, all Python2 installations are ***DEPRECATED*** and will not be upgraded or have new extensions installed. It is likely that Python2 will not be made available when the Zaratan cluster is stood up. All Python users are strongly encouraged to migrate to Python3.
Package: | python |
---|---|
Description: | Python scripting language |
For more information: | https://www.python.org |
Categories: | |
License: | OpenSource (Python Software Foundation) |
Python is a high-level scripting language.
This module will add the python and related commands to your path.
In case you need to link against this library in your code, several environment variables have been defined by the module.
You will probably wish to use these by adding the following flags to your compilation command (e.g. to CFLAGS in your Makefile):
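As a purely hypothetical sketch of what such usage looks like, assuming variable names PYTHON_INCDIR and PYTHON_LIBDIR (these names are illustrative only; running module show python/X.Y.Z will list the variables the module actually defines):

```shell
# PYTHON_INCDIR and PYTHON_LIBDIR are hypothetical names for illustration;
# "module show python/X.Y.Z" lists the variables actually defined.
CFLAGS="$CFLAGS -I$PYTHON_INCDIR"
LDFLAGS="$LDFLAGS -L$PYTHON_LIBDIR"
```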
This section lists the available versions of the package python on the different clusters.
Version | Module tags | CPU(s) optimized for | GPU ready? |
---|---|---|---|
3.10.5 | python/3.10.5 | zen2 | Y |
3.7.7 | python/3.7.7 | zen | Y |
3.8.12 | python/3.8.12 | zen2 | Y |
Version | Module tags | CPU(s) optimized for | GPU ready? |
---|---|---|---|
3.7.7 | python/3.7.7 | skylake_avx512, x86_64, zen | Y |
When using python in conjunction with your own code, you might wish to
note the compiler and MPI libraries used when the python binaries and
packages were built. MPI in particular can be fussy and generate strange
errors if the different parts of the code are linked against different
MPI libraries (even different versions of OpenMPI or the same version of
OpenMPI built with a different compiler), or if the mpirun
command used to start the code is from a different MPI version or was built
with a different compiler. In general, it is best to ensure everything is
built with the same compiler and, if used, the same MPI library.
Python's capabilities can be significantly enhanced through the addition
of modules. Code can import
a module to enable its functionality.
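For instance, importing a module from the standard library makes its functions available to your code:

```python
# Importing a module makes its functionality available; here the
# standard-library math module provides sqrt, among other functions.
import math

print(math.sqrt(16.0))  # 4.0
```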
The supported python interpreters on the system have a selection of modules preinstalled. If a module you are interested in is not in that list, you can either install a personal copy of the module for yourself, or request that it be installed site-wide. We will make reasonable efforts to accommodate such requests as staffing resources allow.
The mechanism for installing a module is of course dependent on the module being installed, but most modern python modules support the setup.py mechanism described below. Many packages also support installation via pip and virtual environments, which is typically easier.
Note: Users might wish to look at Installing python modules using virtual environments first, as that is often easier.
The standard procedure for installing your own copy of a module is:
1. Run module load python/X.Y.Z to select the version of python you wish to use.
2. Create a directory to hold your personal python modules; mkdir ~/.mypython will work.
3. Create lib and lib/python directories beneath it, e.g. mkdir ~/.mypython/lib ~/.mypython/lib/python .
4. Set the PYTHONPATH environment variable to include the library directory under ~/.mypython , something like setenv PYTHONPATH ~/.mypython/lib/python (bash/bourne shell users should do PYTHONPATH=~/.mypython/lib/python; export PYTHONPATH ). You probably want to add this to your .cshrc.mine or .bashrc.mine .
5. Download and unpack the module, cd to the directory containing its setup.py , and run python setup.py install --home ~/.mypython
If all goes well, the module should now be installed under ~/.mypython or wherever you specified. If there are executables associated with it, they should be in ~/.mypython/bin . You should be able to import the module in python now (this assumes that PYTHONPATH is set as indicated above).
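The PYTHONPATH mechanism in the steps above can be sketched in a self-contained way (using a temporary directory and a hypothetical module name in place of ~/.mypython and a real package):

```python
import os
import subprocess
import sys
import tempfile

# Minimal sketch of the PYTHONPATH mechanism above: a module placed under
# the lib/python directory (a temp dir here, standing in for ~/.mypython)
# becomes importable in a new interpreter once that directory is on
# PYTHONPATH. "mymod" is a hypothetical module name.
with tempfile.TemporaryDirectory() as tmp:
    libdir = os.path.join(tmp, "lib", "python")
    os.makedirs(libdir)
    with open(os.path.join(libdir, "mymod.py"), "w") as f:
        f.write("GREETING = 'hello from mymod'\n")

    env = dict(os.environ, PYTHONPATH=libdir)
    result = subprocess.run(
        [sys.executable, "-c", "import mymod; print(mymod.GREETING)"],
        env=env, capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())  # hello from mymod
```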
Of course, not all modules install easily. Unfortunately, the install process can fail in more ways than can reasonably be enumerated. If you are comfortable with building modules, the error messages may give you reasonable guidance in getting the module to build, but it is probably easiest to just request that the module be installed in the system libraries.
Although the standard procedure described above works for most cases, there
are cases where more separation is required. Python3 includes a
venv
module which
allows you to create a fully independent virtual python environment,
copying the python executables and standard and system libraries to your own
directory, and allowing you to add/update/delete from there. This has the
advantage that the virtualenv is almost completely isolated; so changes made
in the system installation of python are unlikely to impact your virtualenv.
This can be important if you have a code or application which requires e.g.
version 1.6 of the foo package, but will break if it is upgraded to 1.7 (it
appears that when using standard scheme above using PYTHONPATH, the system
library directories are ALWAYS searched before PYTHONPATH, meaning that method
can be used to add modules, but not to upgrade or downgrade modules).
However, the virtualenv takes up a significant amount of diskspace, and the isolation from the system python can be a negative as well, since upgrades and/or new modules added to the system python will NOT be visible --- this is good when, as in the example above, an upgrade breaks something, but most of the time the upgrades are desirable.
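One quick way to see which copy of a module actually wins the search order on your system is to check its __file__ attribute after importing it (json here is just a stand-in for whichever module you installed a personal copy of):

```python
import json  # stand-in: substitute the module you installed a copy of

# __file__ shows which installation the interpreter actually imported,
# so you can tell whether your PYTHONPATH copy or the system copy won.
print(json.__file__)
```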
To install a package with the virtualenv mechanism, you must first create a virtual python environment:
1. Run module load python/X.Y.Z to select the version of python you wish to use in this virtual environment.
2. Choose a directory to hold the environment, e.g. a my-venv subdirectory of your home directory (~/my-venv in the examples below).
3. For python3, create the environment with one of the following commands:
   * python -mvenv --system-site-packages ~/my-venv : This variant will give the virtual environment access to system-installed python packages, e.g. numpy, scipy and matplotlib. This is the easiest version, but as it is less isolated from the system python installation it can lead to problems if there are version compatibility issues.
   * python -mvenv ~/my-venv : This variant will isolate the resulting environment from system packages. This is the safest approach, but may require you to install packages that are already available on the system, and can be trickier in some cases.
4. For python2, first create the directory yourself (mkdir ~/my-venv), and then unset the PYTHONHOME variable set by the module load command (i.e. unsetenv PYTHONHOME for the csh and tcsh shells, and unset PYTHONHOME for bash). Then issue either of the following commands:
   * virtualenv --system-site-packages ~/my-venv
   * virtualenv ~/my-venv
   The first version, with the --system-site-packages flag, behaves like the python3 version with the same flag --- the system packages are still available. The second version isolates your virtual environment from the system packages.
5. Activate the environment:
   * source ~/my-venv/bin/activate.csh for csh or tcsh shells
   * source ~/my-venv/bin/activate for bash shells
The command deactivate will deactivate the virtual environment.
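The create/activate/deactivate cycle can be sketched as follows (bash; the paths are examples, and --without-pip is used only to keep this illustration fast -- omit it in real use so that pip is available inside the environment):

```shell
# Sketch of the python3 venv workflow, using a temporary directory
# in place of ~/my-venv. --without-pip is only to keep the sketch fast.
tmp=$(mktemp -d)
python3 -m venv --without-pip "$tmp/my-venv"
. "$tmp/my-venv/bin/activate"
venv_python=$(command -v python)   # now resolves inside the environment
echo "$venv_python"
deactivate
rm -rf "$tmp"
```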
Once the virtual environment is created and activated, installation is
usually relatively simple using the pip
command. You should
just be able to do pip install NameOfPackage
. Pip
should take care of downloading the package and installing it for you.
As noted above, not all modules install easily; if pip fails and the error messages do not point you toward a fix, it is probably easiest to request that the module be installed in the system libraries.
By default, matplotlib tries to use an interactive display backend, which will fail on compute nodes with no X11 display available. In that case you should select a non-interactive backend before doing any plotting. A common choice is Agg (for Anti-Grain Geometry engine), which can produce PNG files; Cairo and Gdk are other options. Use would be something like:
import matplotlib
# This needs to be done *before* importing pyplot or pylab
matplotlib.use('Agg')
import matplotlib.pyplot as plt
#Do your plotting, e.g.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(range(10))
fig.savefig('test.png')
The most recent versions of Python installed (e.g. 3.5.1) provide a python module called "numba". Numba allows certain portions of python code to be compiled to lower-level machine code to improve performance, in many cases simply by adding the decorator "@jit" before the function to be compiled. Depending on the function, one might achieve order-of-magnitude performance gains. E.g. (example taken from Wikipedia):
from numba import jit

@jit
def sum1d(my_array):
    total = 0.0
    for i in range(my_array.shape[0]):
        total += my_array[i]
    return total
Here, the addition of the "@jit" (for just-in-time compilation) can result in code running 100-200 times faster than the original on a long Numpy array, and up to 30% faster than Numpy's builtin "sum()" function, on standard CPU cores.
Some codes can perform even better on GPUs, and Numba can make this fairly simple by importing "cuda" from numba and using "cuda.jit" in place of "jit". There are constraints imposed when using GPUs, so not every code can be easily converted for GPU use.
To use Numba with GPUs on the Deepthought clusters, you will need to
The details of using Numba, and especially using Numba with CUDA, is well beyond the scope of this document. Some useful links for more information are:
If you wish to take advantage of the multiple cores and even the many nodes available on High Performance Computing (HPC) clusters, it is useful to use the Message Passing Interface (MPI), a standard and ubiquitous programming methodology for distributed-memory parallelism, to coordinate and communicate among the various processes.
There is a package mpi4py available in all Pythons installed
system-wide on the Deepthought clusters which makes the various
MPI calls available to python code. Because mpi4py closely mimics the
function calls in the standard MPI library/API, it makes the task of
transcribing algorithms between python and C much easier.
When you have python code (e.g. my-mpi4py-script.py
)
designed to use MPI via mpi4py, you will normally
wish to execute the python code using the mpirun
command.
It is important that you use the mpirun
command from the SAME
MPI library as was used to build mpi4py
for the python version
you are running --- typically this will mean using module load
to load the correct gcc compiler and openmpi version as used in building
the python interpreter and modules, as listed in the
version information table at the top of this
document. E.g., a job submission script to launch
my-mpi4py-script.py
on 40 cores using python/3.5.1 might look
like:
#!/bin/bash
#Assume will be finished in no more than 8 hours
#SBATCH -t 8:00:00
#Launch on 40 cores distributed over as many nodes as needed
#SBATCH -n 40
#Assume need 6 GB/core (6144 MB/core)
#SBATCH --mem-per-cpu=6144
#Make sure module cmd gets defined
. ~/.profile
#Load required modules
module load python/3.5.1
#Load correct gcc (4.9.3) and mpi (openmpi/1.8.6) for python/3.5.1
module load gcc/4.9.3
module load openmpi/1.8.6
#Normally do not need to give -n 40, as openmpi will determine from Slurm
#environment variables
mpirun python my-mpi4py-script.py
Although exploring mpi4py is beyond the scope of this document, we do provide some on-line tutorials, etc., to help if you wish to explore mpi4py further: