Idl

Summary and Version Information
Running IDL in batch mode
Multithreading and IDL

NOTE: You might also wish to look at the page for the ENVI package.

Summary and Version Information

Package	Idl
Description	IDL interactive data language
Categories	Progamming/Development, Research

Version	Module tag	Availability^*	GPU Ready	Notes
8.3	idl/8.3	Non-HPC Glue systems Bswift HPCC Linux sun4x_510	N

Notes:
^*: Packages labelled as "available" on an HPC cluster means that it can be used on the compute nodes of that cluster. Even software not listed as available on an HPC cluster is generally available on the login nodes of the cluster (assuming it is available for the appropriate OS version; e.g. RedHat Linux 6 for the two Deepthought clusters). This is due to the fact that the compute nodes do not use AFS and so have copies of the AFS software tree, and so we only install packages as requested. Contact us if you need a version listed as not available on one of the clusters.

In general, you need to prepare your Unix environment to be able to use this software. To do this, either:

tap TAPFOO

module load MODFOO

where TAPFOO and MODFOO are one of the tags in the tap and module columns above, respectively. The tap command will print a short usage text (use -q to supress this, this is needed in startup dot files); you can get a similar text with module help MODFOO. For more information on the tap and module commands.

For packages which are libraries which other codes get built against, see the section on compiling codes for more help.

Tap/module commands listed with a version of current will set up for what we considered the most current stable and tested version of the package installed on the system. The exact version is subject to change with little if any notice, and might be platform dependent. Versions labelled new would represent a newer version of the package which is still being tested by users; if stability is not a primary concern you are encouraged to use it. Those with versions listed as old set up for an older version of the package; you should only use this if the newer versions are causing issues. Old versions may be dropped after a while. Again, the exact versions are subject to change with little if any notice.

In general, you can abbreviate the module tags. If no version is given, the default current version is used. For packages with compiler/MPI/etc dependencies, if a compiler module or MPI library was previously loaded, it will try to load the correct build of the package for those packages. If you specify the compiler/MPI dependency, it will attempt to load the compiler/MPI library for you if needed.

Running IDL in batch mode

Although for short computations on a personal workstation the interactive mode of IDL is nice, sometimes one wishes to have IDL work in a batch-style mode. This is a requirement for using IDL on the HPC clusters, where computationally intensive processes must be submitted to the scheduler for running on the compute nodes.

The easiest method is probably to invoke idl in batch mode on a simple script file which then uses the .RUN IDL executive command to run a main program file containing the real code you wish to run. I.e., create a main program file the code you wish to run. For example, the following is a simple main program to print factorials, which we will place in a file called factorial_test.pro:

f = 1
max = 7
for k=1,max do begin
        f = k * f
        print, k, f
endfor
end

You should also create a simple IDL batch script to invoke this program, e.g. the batch_test.pro file below:

.run factorial_test.pro
exit

Be sure to include the exit command if you wish for IDL to terminate when the main program is finished. This is especially important if submitting to the HPC compute nodes via sbatch, as otherwise your job will not terminate when the calculation is finished, but wait until it hits the wall time limit, wasting CPU cycles (and charging your allocation account)

You can then invoke your main program from the unix command line with something like idl batch_test.pro (NOTE: you can omit the .pro extension in the idl and .run commands if so desired; the .pro extension will be used by default.) To use with sbatch, a job script like

#!/bin/bash
#SBATCH -ntasks=1
#SBATCH -t 15
#SBATCH --mem-per-cpu=1

. ~/.profile

module load idl

echo "Starting job ..."
idl batch_test

The two separate .pro files are required in general because in IDL batch mode, which batch_test.pro runs under, each line is read and executed immediately. Control statements, e.g. the for loop in our factorial_test example, often span multiple lines --- although you can use ampersands and dollar signs to continue lines, this quickly becomes messy. In main program parsing mode, such as used for factorial_test, the entire program unit is read and compiled as a single unit. Since IDL code run in batch mode or submitted to the HPC compute nodes is assumed to be complicated, it is probably best to use this two file approach.

Multithreading and IDL

Recent versions of IDL support multi-threading, at least to some degree. This means that on systems with more than one CPU and/or multiple cores per CPU socket, IDL will use multiple threads to do work in parallel when the application determines it is advantageous to do so. This is automatic, and invisible to the user except for hopefully improved performance.

By default, when IDL encounters a calculation that would benefit from multi-threading, it will generate a thread for every CPU core on the system it is running. This is probably the desired behavior when IDL is the only (or the main) program running on a system. E.g., on a desktop or dedicated compute node.

But on some HPC systems, the available cores per node can be somewhat large (e.g. nodes on DT2 have at least 20 cores), and might be more than the optimal degree of parallelization for certain problems. On these systems, for certain problems, you might not wish to allocation all the cores on the node to the system (since you will be charged for the time on those cores). However, if you restrict the number of cores that you are requesting, you must tell this restriction to IDL as well, because otherwise IDL will just try to use all the cores anyway, which could adversely affect performance.

E.g., assume that you have determined through trial and error that for the particular type of problem you are working on, the greatest efficiency occurs for 6 cores (e.g., a 6 core job is 40% faster than a 4 core job, but an 8 core job is only a few percent faster than a 6 core job). So you submit a bunch of 6 core idl jobs, with the --share flag so that you are not charged for all the cores on the node. And suppose that 3 of your jobs end up on the same 20 core node (since you told slurm these are 6 core jobs, three will fit on a 20 core node). If you do not tell IDL to restrict itself to 6 cores, each jobs will determine that there are 20 cores on the node, and assume it can use all of them, and multithreaded calculations will be split into 20 tasks. In addition to any performance hit from using too many tasks for each job, you also have up to 60 tasks trying to run on a 20 core system, which will further degrade performance.

To tell IDL to use less than all the cores it finds on the system, you need to set the IDL_CPU_TPOOL_NTHREADS environmental variable. By default, it is 0, which means use all cores on the system. You should set it equal to the number of cores your requested from Slurm; we recommend that you set it to the SLURM_NTASKS to ensure consistency between what you requested from Slurm and what you tell IDL. E.g.,

#!/bin/tcsh
#SBATCH -n 8
#SBATCH --share
#SBATCH -t 2:00
#SBATCH --mem=8000

module load idl
setenv IDL_CPU_TPOOL_NTHREADS $SLURM_NTASKS

idl -e "myprogram.idl"

If you change the number of cores (the value after -n), the value of IDL_CPU_TPOOL_NTHREADS will automatically agree.

DIT-UMD

Idl

Contents

Summary and Version Information

Running IDL in batch mode

Multithreading and IDL