Package: | ENVI |
---|---|
Description: | ENVI software for processing and analyzing geospatial data |
For more information: | https://www.l3harrisgeospatial.com/Software-Technology/Enterprise-Solutions |
Categories: | |
License: | Proprietary |
ENVI (ENvironment for Visualizing Images) is a software application used to process and analyze geospatial imagery. It is commonly used by remote sensing professionals and image analysts. ENVI version 5.7 includes IDL (Interactive Data Language) version 8.9. This module adds the envi, idl, and related commands to your path. NOTE: If running on an HPC cluster and requesting fewer than all the cores on a node, you should set the IDL_CPU_TPOOL_NTHREADS environment variable equal to SLURM_NTASKS for best performance; for more information see https://hpcc.umd.edu/hpcc/help/software/envi.html
This section lists the available versions of the package ENVI on the different clusters.
Version | Module tags | CPU(s) optimized for | GPU ready? |
---|---|---|---|
5.7 | envi/5.7 | x86_64 | Y |
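Assuming the cluster uses an Environment Modules or Lmod setup (as the module tag in the table suggests), a login-node session to pick up the package might look like the sketch below; the tag envi/5.7 is taken from the table above, and the which check is only there to confirm the commands landed on your PATH:

```shell
# Sketch of an interactive session on a login node (module system assumed).
# Loading the envi module also makes idl available,
# since ENVI 5.7 bundles IDL 8.9.
module load envi/5.7

# Confirm the commands are now visible on the PATH
which envi idl
```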
Although for short computations on a personal workstation the interactive mode of IDL is nice, sometimes one wishes to have IDL work in a batch-style mode. This is a requirement for using IDL on the HPC clusters, where computationally intensive processes must be submitted to the scheduler for running on the compute nodes.
The easiest method is probably to invoke idl in batch mode on a simple script file which then uses the .RUN IDL executive command to run a main program file containing the real code you wish to run. For example, the following is a simple main program to print factorials, which we will place in a file called factorial_test.pro:
; factorial_test.pro: main program that prints k and k! for k = 1..max
f = 1
max = 7
for k=1,max do begin
  f = k * f
  print, k, f
endfor
end
You should also create a simple IDL batch script to invoke this program, e.g. the batch_test.pro file below:
.run factorial_test.pro
exit
Be sure to include the exit command if you wish for IDL to terminate when the main program is finished. This is especially important if submitting to the HPC compute nodes via sbatch, as otherwise your job will not terminate when the calculation is finished, but will wait until it hits the wall time limit, wasting CPU cycles (and charging your allocation account).
You can then invoke your main program from the unix command line with something like idl batch_test.pro. (NOTE: you can omit the .pro extension in the idl and .run commands if so desired; the .pro extension will be used by default.) To use with sbatch, use a job script like:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH -t 15
#SBATCH --mem-per-cpu=1024
. ~/.profile
module load idl
echo "Starting job ..."
idl batch_test
The two separate .pro files are required in general because in IDL batch mode, which batch_test.pro runs under, each line is read and executed immediately. Control statements, e.g. the for loop in our factorial_test example, often span multiple lines; although you can use ampersands and dollar signs to continue lines, this quickly becomes messy. In main program parsing mode, such as used for factorial_test, the entire program unit is read and compiled as a single unit. Since IDL code run in batch mode or submitted to the HPC compute nodes is assumed to be complicated, it is probably best to use this two-file approach.
Recent versions of IDL support multi-threading, at least to some degree. This means that on systems with more than one CPU and/or multiple cores per CPU socket, IDL will use multiple threads to do work in parallel when the application determines it is advantageous to do so. This is automatic, and invisible to the user except for hopefully improved performance.
By default, when IDL encounters a calculation that would benefit from multi-threading, it will generate a thread for every CPU core on the system it is running on. This is probably the desired behavior when IDL is the only (or the main) program running on a system, e.g., on a desktop or a dedicated compute node.
But on some HPC systems, the number of cores per node can be fairly large (e.g. nodes on DT2 have at least 20 cores), and might exceed the optimal degree of parallelization for certain problems. On these systems, for certain problems, you might not wish to allocate all the cores on the node to your job (since you will be charged for the time on those cores). However, if you restrict the number of cores that you are requesting, you must tell IDL about this restriction as well, because otherwise IDL will just try to use all the cores anyway, which could adversely affect performance.
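To make the mismatch concrete, here is a small shell sketch (the 6-task request is made up for illustration, and nproc stands in for IDL's core detection):

```shell
# Illustration only: IDL sizes its default thread pool from the cores it
# detects on the node, not from the cores requested from Slurm.
# In a real job Slurm sets SLURM_NTASKS; we fake a 6-core request here.
SLURM_NTASKS=6

cores_on_node=$(nproc)   # what IDL would detect (and use) by default
echo "cores IDL would use by default: $cores_on_node"
echo "cores actually requested:       $SLURM_NTASKS"
```

On a 20-core node, the first line would report 20 even though only 6 cores were requested, which is exactly the situation described below.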
E.g., assume that you have determined through trial and error that for the particular type of problem you are working on, the greatest efficiency occurs at 6 cores (e.g., a 6 core job is 40% faster than a 4 core job, but an 8 core job is only a few percent faster than a 6 core job). So you submit a bunch of 6 core idl jobs, with the --share flag so that you are not charged for all the cores on the node. And suppose that 3 of your jobs end up on the same 20 core node (since you told Slurm these are 6 core jobs, three will fit on a 20 core node). If you do not tell IDL to restrict itself to 6 cores, each job will determine that there are 20 cores on the node, assume it can use all of them, and split multithreaded calculations into 20 tasks. In addition to any performance hit from using too many tasks for each job, you also have up to 60 tasks trying to run on a 20 core system, which will further degrade performance.
To tell IDL to use fewer than all the cores it finds on the system, you need to set the IDL_CPU_TPOOL_NTHREADS environment variable. By default, it is 0, which means use all cores on the system. You should set it equal to the number of cores you requested from Slurm; we recommend that you set it to SLURM_NTASKS to ensure consistency between what you requested from Slurm and what you tell IDL. E.g.,
#!/bin/tcsh
#SBATCH -n 8
#SBATCH --share
#SBATCH -t 2:00
#SBATCH --mem=8000
module load idl
setenv IDL_CPU_TPOOL_NTHREADS $SLURM_NTASKS
idl myprogram.pro
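The example above uses tcsh. For bash users, an equivalent job script might look like the following sketch; myprogram.pro is a stand-in name, and the export line plays the role of the setenv line above:

```shell
#!/bin/bash
#SBATCH -n 8
#SBATCH --share
#SBATCH -t 2:00
#SBATCH --mem=8000

module load idl

# bash equivalent of the tcsh setenv above: cap IDL's thread pool
# at the task count requested from Slurm.
export IDL_CPU_TPOOL_NTHREADS=$SLURM_NTASKS

idl myprogram.pro
```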
If you change the number of cores (the value after -n), the value of IDL_CPU_TPOOL_NTHREADS will automatically agree, since it is set from SLURM_NTASKS at run time.