We maintain several versions of Python on the Shared Computing Cluster (SCC) and try to keep both the Python versions and their associated packages up to date. Each Python installation has many packages pre-configured and readily available, including popular libraries such as NumPy, Pandas, and SciPy, among many others. Researchers have a choice of Python versions on the SCC, but most will prefer the general purpose installations. Each installation can be loaded using the module system.

The categories of our Python installations are as follows:

General Purpose Python Modules
Intel Distribution for Python
Anaconda Distribution (conda)

The current list of Python installations on the SCC can be queried using the module system.

scc1$ module avail python

General Purpose Python Modules

For most computational use cases, the most recent version of these general purpose installations is the best choice. We offer python2 and python3 modules and maintain older versions for reproducibility. These modules are configured for general use and already contain many commonly used Python packages.

To load python3, use the following command in a terminal, which requests version 3.8.10 explicitly:

scc1$ module load python3/3.8.10

We recommend always specifying the version of the module you load. This keeps your workflow or pipeline from breaking if we change the default version of a module to a newer one.
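Pinning the module version can be complemented by a check inside the script itself. The following is a minimal sketch, assuming the script was developed against the python3/3.8.10 module loaded above. The `require_python` helper is hypothetical and is not part of any SCC tooling.

```python
import sys


def require_python(expected, actual=None):
    """Return True if the (major, minor) version matches `expected`.

    `actual` defaults to the running interpreter; it can be passed
    explicitly to make the check easy to test.
    """
    actual = tuple(sys.version_info[:2]) if actual is None else tuple(actual)
    return actual == tuple(expected)


if __name__ == "__main__":
    # Assumption: the workflow was developed against Python 3.8.
    if not require_python((3, 8)):
        print(
            "Warning: expected Python 3.8, running %d.%d" % sys.version_info[:2],
            file=sys.stderr,
        )
```

A warning rather than a hard failure is used here so that the script still runs under a newer module; whether to stop instead is a choice for each workflow.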
 

Update Frequency:

We install new Python modules approximately every 6 months. The Python libraries installed within the Python modules are updated to their latest versions when the module is installed.
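Because library versions are fixed when a module is installed, it can be useful to record which versions a given module provides. The sketch below uses the standard library `importlib.metadata` (available in Python 3.8 and later); the `installed_versions` helper and the package list are illustrative assumptions, not an SCC utility.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Example package list; adjust to the libraries your code depends on.
    for name, version in installed_versions(["numpy", "pandas", "scipy"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Saving this output alongside your results is one way to document the environment a job ran in.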

Intel Distribution for Python

The Intel Distribution for Python behaves like regular Python, but leverages Intel technologies to speed up many of the core Python libraries, including NumPy, SciPy, Pandas, Scikit-Learn, Jupyter, matplotlib, and mpi4py. This distribution also integrates the Intel Math Kernel Library (Intel MKL), the Intel Data Analytics Acceleration Library (DAAL) and pyDAAL, the Intel MPI Library, and Intel Threading Building Blocks (TBB). These modules offer significant speedups for some computational workloads at the cost of potential incompatibility with other Python packages.

To load the most recent version of Intel Python 3.7, issue the following command:

scc1$ module load python3-intel/2021.1.1

The optimizations available in the Intel Distribution for Python depend on both OpenMP and Intel multi-processing libraries. Two environment variables are used to coordinate the automatic parallel processing: OMP_NUM_THREADS and MKL_NUM_THREADS. It is important to understand the impact of these variables on your code and to define appropriate values when running it. Most commonly, but not always, you should set each value equal to the number of slots ($NSLOTS) your job requests. Please read Intel’s guidance on threaded applications.

 

Variable Name | Description | Default Value | Recommended
OMP_NUM_THREADS | Sets the number of threads for OpenMP | 1 | $NSLOTS
MKL_NUM_THREADS | Sets the number of threads for Intel Math Kernel Library | 1 | $NSLOTS
OPENBLAS_NUM_THREADS | Sets the number of threads for OpenBLAS | 1 | $NSLOTS
NUMBA_NUM_THREADS | Sets the number of threads for Numba | 1 | $NSLOTS
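The variables in the table above can also be set from within a Python script. The following is a minimal sketch, assuming the job runs in a batch environment that defines NSLOTS; the `thread_settings` helper is hypothetical. It falls back to 1, the default value listed above, when NSLOTS is not defined.

```python
import os

# The four thread-count variables listed in the table above.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "NUMBA_NUM_THREADS",
)


def thread_settings(env=None):
    """Return the thread variables, each set to NSLOTS (or "1" if NSLOTS is unset)."""
    env = os.environ if env is None else env
    nslots = env.get("NSLOTS", "1")
    return {name: nslots for name in THREAD_VARS}


if __name__ == "__main__":
    # Apply the settings before importing NumPy, SciPy, or Numba:
    # these libraries generally read the variables when they are first loaded.
    os.environ.update(thread_settings())
    for name in THREAD_VARS:
        print(name, "=", os.environ[name])
```

Setting every variable to $NSLOTS follows the recommended column of the table, but as noted above this is not always the right choice, for example when your code also starts its own worker processes.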

 

Anaconda Distribution (conda)

Anaconda is an open-source package and environment manager for Python. It has gained traction for the ease of packaging and replicating modules or entire Python environments on different systems. The distribution includes a set of core Python packages, and additional user packages can be installed from remote “channels”. Anaconda has also been known to cause confusion and package dependency issues in complex environments. If you are not working with existing software that depends on a conda environment, virtualenv is an alternative for configuring a virtual environment on the cluster.

We recommend using the miniconda module to work with Conda environments. The anaconda2 and anaconda3 modules are no longer updated and will be deprecated in the future. Note that you can load and use either a Python module or a miniconda module, but not both. You should not attempt to load miniconda and Python modules at the same time.
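A script can check for this conflict before doing any work. The sketch below assumes that the module system exports the colon-separated LOADEDMODULES environment variable, as Environment Modules and Lmod commonly do; we have not confirmed this here for the SCC. The `find_conflict` and `has_conflict` helpers are hypothetical.

```python
import os


def find_conflict(loadedmodules):
    """Split a LOADEDMODULES-style string into miniconda and Python modules."""
    names = [m for m in loadedmodules.split(":") if m]
    # Compare on the part before "/" so that any version suffix is ignored.
    conda = [m for m in names if m.split("/")[0].startswith("miniconda")]
    python = [m for m in names if m.split("/")[0].startswith("python")]
    return conda, python


def has_conflict(loadedmodules):
    """Return True if both a miniconda module and a Python module are loaded."""
    conda, python = find_conflict(loadedmodules)
    return bool(conda and python)


if __name__ == "__main__":
    loaded = os.environ.get("LOADEDMODULES", "")
    if has_conflict(loaded):
        conda, python = find_conflict(loaded)
        print("Conflicting modules loaded:", ", ".join(conda + python))
```

Matching on the name prefix also catches python3-intel modules, which we treat as Python modules for the purpose of this check.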

Instructions for using Anaconda are found on the Anaconda Python page.