Several versions of Python are installed on the SCC. We strongly recommend Anaconda Python, whose conda tool makes it easy to install new or old versions of Python packages. Below we describe how to access the latest version of Python, explain how to use the conda tool and the anaconda installation, and finally provide information about legacy and alternative installations.

Accessing Anaconda Python

Anaconda Python is standard Python bundled with a collection of useful scientific packages, for example numpy, scipy, matplotlib and pandas. To access this installation, use the module command:

scc1% module load anaconda

That’s it! Now you’ll have access to the latest and greatest that Python has to offer. You’ll also have access to conda, which we discuss in the following sections.

Submitting batch jobs

As explained on the introductory page, you should place all your calls to Python in a script and submit that script to qsub. If you intend to use the anaconda version of Python, make sure to load the anaconda module either in your batch script or your login script.
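
A minimal sketch of such a batch script follows. The file names run_python.sh and my_script.py are hypothetical, and the -l (login shell) option on the first line is an assumption, made so that the module command is defined inside the job. The commands below only write the script to disk; on the SCC you would then submit it with qsub.

```shell
# Write a minimal batch script to disk (file names are hypothetical).
cat > run_python.sh <<'EOF'
#!/bin/bash -l
# Load Anaconda Python inside the job, then run the program.
module load anaconda
python my_script.py
EOF

# On the SCC, submit the script with:
#   qsub run_python.sh
```

Loading the module inside the script keeps the job independent of whatever your login script happens to load.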

Setting up a custom installation of Python

The conda command is a tool to interact with, create, and modify a personal installation of Python (called an environment). You would want to create your own environment if

  • you needed a different version of Python or of a Python package
  • you needed to install a package (with conda or pip) that is not already installed
  • you wanted to experiment with a Python package

conda draws on a repository of pre-compiled packages, including essential third-party dependencies such as the wxWidgets graphics library needed by wxPython. Installing hard-to-compile dependencies is therefore a simple matter of downloading the pre-compiled versions with the conda tool.

Below we provide several examples of working with conda environments. The first shows how to find and activate existing environments; how you activate one depends on which shell you use. The second shows how to start a new environment from scratch, installing Python 3.4 along with numpy, scipy and matplotlib. The third shows how to clone an existing environment, namely the “root” environment, which is the default environment you access when you load the anaconda module. Cloning is the recommended way to start a new environment (see below: “Warning about mpich2, mpi4py and readline”). The last example shows how to install Python packages using either conda or pip, the standard Python installer.

Warning about mpich2, mpi4py and readline

Note: Some packages installed with anaconda are not compatible with our system, including: mpich2, mpi4py, and readline. If you intend to use these from a newly created anaconda installation, please contact us for instructions. It is preferable to clone the root environment, which already has SCC-customized versions of these packages.
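
One way to check whether an environment contains any of these packages is to scan the output of conda list. The sketch below assumes the usual listing of package name, version and build, with comment lines starting with #. flag_incompatible is a hypothetical helper, not a conda command.

```shell
# Read `conda list` output on stdin and print the name and version of any
# package named in the warning above. flag_incompatible is a hypothetical
# helper, not part of conda.
flag_incompatible() {
    awk '!/^#/ && ($1 == "mpich2" || $1 == "mpi4py" || $1 == "readline") {
        print $1, $2
    }'
}

# On the SCC, for an environment named py3:
#   conda list -n py3 | flag_incompatible
```

If the helper prints nothing, none of the three packages is present in that environment.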

Warning about virtualenv

Note: virtualenv is not compatible with anaconda. conda, described below, is the substitute for virtualenv. If you require virtualenv for one reason or another, contact us and we will help devise a solution.

Example 1: Finding and Activating conda Environments

The conda info command provides details about your anaconda installation. The -e flag (for environments) lists all the environments available to you, for example:

scc1% conda info -e
# conda environments:
#
cython_test              /usr1/scv/yannpaul/anaconda_envs/cython_test
personal                 /usr1/scv/yannpaul/anaconda_envs/personal
py3                      /usr1/scv/yannpaul/anaconda_envs/py3
custom                   /share/pkg/anaconda/2.0.0/install/envs/custom
root                  *  /share/pkg/anaconda/2.0.0/install

The * indicates which conda environment is currently active. The above example shows that the default “root” environment is active. The “root” environment is what you get when you initially load the anaconda module.
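
If you need this information in a script, the listing is easy to parse. The following is a sketch only, assuming the layout shown above (name first, path last, * marking the active environment); env_prefix and active_env are hypothetical helper names.

```shell
# Both helpers read `conda info -e` output on stdin.
# env_prefix and active_env are hypothetical names, not conda commands.

# Print the install path of the environment named in $1.
env_prefix() {
    # The path is the last field, with or without the "*" marker.
    awk -v name="$1" '!/^#/ && $1 == name { print $NF }'
}

# Print the name of the currently active environment.
active_env() {
    awk '!/^#/ && $2 == "*" { print $1 }'
}

# On the SCC:
#   conda info -e | env_prefix py3
#   conda info -e | active_env
```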

Activating an Environment in Bash

With Bash, each environment is given a script to activate that environment. In the following example, we will activate an environment named py3:

scc1% source activate py3
discarding /share/pkg/anaconda/2.0.0/install/bin from PATH
prepending /usr1/scv/yannpaul/anaconda_envs/py3/bin to PATH
(py3)scc1%

As shown above, your command prompt changes to indicate which environment is currently active. You can switch environments by activating another one, including the root environment, or you can deactivate the current environment, as we show next:

(py3)scc1% source deactivate
discarding /usr1/scv/yannpaul/anaconda_envs/py3/bin from PATH
scc1%

This is the same as activating the ‘root’ environment, except now the prompt looks as it originally did.
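
Because activation works by prepending the environment's bin directory to PATH, as the messages above show, a quick way to confirm which installation is in effect is to look at the first PATH entry. A small sketch; first_path_entry is a hypothetical helper name.

```shell
# Print the first directory on PATH. After "source activate py3" this
# should be the bin directory of the py3 environment.
# first_path_entry is a hypothetical helper name.
first_path_entry() {
    printf '%s\n' "${PATH%%:*}"
}

first_path_entry
```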

Activating an Environment in csh/tcsh

In csh the activate script is not available, so you simply add the environment’s bin directory to your PATH environment variable. Where is the environment installed? The path is printed next to the environment name in the output of ‘conda info -e’. Alternatively, when you create a new environment (more about this below), the output contains a message similar to the following:

Package plan for installation in environment /usr1/scv/yannpaul/anaconda_envs/py3

So you would want to add /usr1/scv/yannpaul/anaconda_envs/py3/bin to your PATH environment variable:

scc1% setenv PATH /usr1/scv/yannpaul/anaconda_envs/py3/bin:$PATH

Example 2: Creating a new environment with conda

In this example we use the conda create command to create a new environment named py3. This name can then be used later on to access this installation. Following the create instruction, we list the packages that should be installed into the new environment, namely python 3.4, numpy, scipy and matplotlib.

scc1% conda create -n py3 python==3.4 numpy scipy matplotlib
Fetching package metadata: ..
Solving package specifications: .............
Package plan for installation in environment /usr1/scv/yannpaul/anaconda_envs/py3:

The following NEW packages will be INSTALLED:

    dateutil:   2.1-py34_2
    freetype:   2.4.10-0
    libpng:     1.5.13-1
    matplotlib: 1.4.0-np19py34_0
    numpy:      1.9.0-py34_0
    openssl:    1.0.1h-1
    pyparsing:  2.0.1-py34_0
    pyqt:       4.10.4-py34_0
    python:     3.4.0-0
    pytz:       2014.7-py34_0
    qt:         4.8.5-0
    readline:   6.2-2
    scipy:      0.14.0-np19py34_0
    sip:        4.15.5-py34_0
    six:        1.8.0-py34_0
    sqlite:     3.8.4.1-0
    system:     5.8-1
    tk:         8.5.15-0
    zlib:       1.2.7-0

Proceed ([y]/n)? y

Linking packages ...
[      COMPLETE      ] |##################################################| 100%
#
# To activate this environment, use:
# $ source activate py3
#
# To deactivate this environment, use:
# $ source deactivate
#

The final lines of the output explain how to activate and deactivate this environment. As explained above, these instructions only work if you are using Bash; for [t]csh you must manually modify your PATH environment variable.

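Example 3: Cloning an existing environment with conda

In this example we use the --clone option of conda create to copy the “root” environment into a new environment named new_root:

scc1% conda create -n new_root --clone root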
src_prefix: '/share/pkg/anaconda/2.0.0/install'
dst_prefix: '/usr1/scv/yannpaul/anaconda_envs/new_root'
Packages: 129
Files: 31
Fetching package metadata: ..
Linking packages ...
[      COMPLETE      ] |##################################################| 100%
#
# To activate this environment, use:
# $ source activate new_root
#
# To deactivate this environment, use:
# $ source deactivate
#

As with creating new environments, how you activate this new cloned environment depends on whether you’re using bash or [t]csh. After cloning and activating your own personal environment, you can modify it by installing new packages, either with conda or pip, as described next.

Example 4: Installing Python packages: When to use conda and when to use pip

If you’ve created a custom Python environment with conda, you can install your own packages using either pip or conda. When you use conda, it downloads a pre-compiled version of the package you request. conda has access to a limited set of packages that are either essential to research computing or are hard to compile, or both. pip is a Python program to download and install Python packages from pypi.python.org, and it is available when you use the anaconda module. As opposed to conda, pip downloads, compiles and installs the source code.

Which should you use? First try conda, because it’s easier, then try pip. If you have trouble installing a package, please contact us for help.

How do you install a package? With conda, use the install command. You can also specify a version by appending ‘==version-number’ to the package name; for example, numpy==1.7 will install numpy version 1.7. Here’s an example where we install pandas version 0.14 into our custom Python 3 environment, named ‘py3’:

scc1% source activate py3
discarding /share/pkg/anaconda/2.0.0/install/bin from PATH
prepending /usr1/scv/yannpaul/anaconda_envs/py3/bin to PATH
(py3)scc1% conda install pandas==0.14
Fetching package metadata: ..
Solving package specifications: .
Package plan for installation in environment /usr1/scv/yannpaul/anaconda_envs/py3:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pandas-0.14.0              |       np18py34_0         9.6 MB
    python-3.4.1               |                4        22.6 MB
    scipy-0.14.0               |       np18py34_0        28.9 MB
    setuptools-5.8             |           py34_0         431 KB
    ------------------------------------------------------------
                                           Total:        61.5 MB

The following NEW packages will be INSTALLED:

    pandas:     0.14.0-np18py34_0
    setuptools: 5.8-py34_0
    xz:         5.0.5-0

The following packages will be UPDATED:

    python:     3.4.0-0           --> 3.4.1-4
    scipy:      0.14.0-np19py34_0 --> 0.14.0-np18py34_0

The following packages will be DOWNGRADED:

    numpy:      1.9.0-py34_0      --> 1.8.2-py34_0

Proceed ([y]/n)? y

Fetching packages ...
pandas-0.14.0- 100% |################################| Time: 0:00:01   7.11 MB/s
python-3.4.1-4 100% |################################| Time: 0:00:01  16.70 MB/s
scipy-0.14.0-n 100% |################################| Time: 0:00:00  35.84 MB/s
setuptools-5.8 100% |################################| Time: 0:00:00   1.79 MB/s
Extracting packages ...
[      COMPLETE      ] |##################################################| 100%
Unlinking packages ...
[      COMPLETE      ] |##################################################| 100%
Linking packages ...
[      COMPLETE      ] |##################################################| 100%

Note that some packages needed to be downgraded, some needed to be upgraded and others needed to be installed to get pandas 0.14 installed. Regardless, the whole process is automated.

You could do the same thing with pip, but pandas would then need to be compiled. pip is more useful for packages that are not available to conda. As an example, here we install pint, a package to manage unit conversions, using the pip program:

(py3)scc1% pip install pint
Downloading/unpacking pint
  Downloading Pint-0.5.2.zip (134kB): 134kB downloaded
  Running setup.py (path:/tmp/pip_build_yannpaul/pint/setup.py) egg_info for package pint

    no previously-included directories found matching 'docs/_build'
    no previously-included directories found matching 'docs/_themes/.git'
    warning: no previously-included files matching '*.pyc' found anywhere in distribution
    warning: no previously-included files matching '*~' found anywhere in distribution
    warning: no previously-included files matching '.DS_Store' found anywhere in distribution
    warning: no previously-included files matching '*__pycache__*' found anywhere in distribution
    warning: no previously-included files matching '*.pyo' found anywhere in distribution
Installing collected packages: pint
  Running setup.py install for pint

    no previously-included directories found matching 'docs/_build'
    no previously-included directories found matching 'docs/_themes/.git'
    warning: no previously-included files matching '*.pyc' found anywhere in distribution
    warning: no previously-included files matching '*~' found anywhere in distribution
    warning: no previously-included files matching '.DS_Store' found anywhere in distribution
    warning: no previously-included files matching '*__pycache__*' found anywhere in distribution
    warning: no previously-included files matching '*.pyo' found anywhere in distribution
Successfully installed pint
Cleaning up...
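
The “first try conda, then try pip” advice above can be wrapped in a small shell function. This is a sketch only: install_pkg is a hypothetical helper, it assumes you have already activated the environment you want to modify, and it uses conda's --yes option to skip the confirmation prompt.

```shell
# Try conda first and fall back to pip if conda cannot install the package.
# install_pkg is a hypothetical helper; run it inside an activated environment.
install_pkg() {
    if conda install --yes "$1"; then
        echo "installed $1 with conda"
    elif pip install "$1"; then
        echo "installed $1 with pip"
    else
        echo "could not install $1 with conda or pip" >&2
        return 1
    fi
}

# Example:
#   install_pkg pint
```

The function returns a non-zero status when both tools fail, so a calling script can stop at that point.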

pip: command not found

Note: Depending on how you created your own conda environment, you might not have installed pip already. To solve this, use conda to install pip, following the same instructions we used above for pandas.

You can also use pip to install packages locally in your home directory. However, because it is so easy to create a custom conda environment, where there is little confusion about where packages are installed, we do not recommend using locally installed packages with conda environments.

Alternatives

Historically several versions of Python were installed on SCC, some targeting different research groups. These can now all be replaced with the anaconda Python installation, which makes updating Python packages much easier. We have kept the original module files available, however, in case they are needed to rerun calculations exactly as they were done originally.

Here is the list of these legacy Python-specific system modules:

Module Name                   | Use
------------------------------|------------------------------------------------
python2.7/Python-2.7.3_gnu446 | Python 2.7.3 was initially installed for use by people coming from the Medical Campus and the LinGA system. Anyone can use this module, but it has been set up with LinGA users in mind.
python/2.7.5                  | Python 2.7.5 was initially installed for use by members of the Earth & Environment department. Anyone can use this module, but it has been set up with E&E users in mind.
python/canopy-1.0             | Enthought’s Canopy distribution allows you to install local copies of different Python packages/modules if you sign up for an academic account.
epd/epd-2.7.3                 | Traditional Enthought Python Distribution, which provides access to a large list of Python packages automatically.

By loading one of these modules you will have access to the corresponding set of tools.