One can create Anaconda environments on the SCC by using the conda command line utility available under the SCC miniconda module. When you load miniconda for the first time you will get a warning message, like the one below, suggesting you should create a .condarc file using the setup_scc_condarc.sh bash script to avoid exceeding the quota for your Home directory.

scc1% module load miniconda
------------------------------------------------------------------------------
WARNING:
You do not have a .condarc file in your home directory. This file is used
to configure the default location for environments and packages. By
default conda will store environments and download packages in your
home directory. Home directory quotas are limited to 10GB and this is easily
reached with the installation of a few environments.

Before using conda on the SCC run this script to create a default .condarc
file that will use your /projectnb directory for storage:

    setup_scc_condarc.sh

------------------------------------------------------------------------------

Run the script with the command below and follow the instructions in the prompt.

scc1% setup_scc_condarc.sh

After creating the .condarc you are all set to create an environment. Visit the conda documentation to learn more about how to use the conda command line utility and make sure to follow the Linux instructions.

Below are some general best practices on using conda on the SCC.

To accomplish certain tasks on the SCC using conda, you may need to take steps that are specific to the SCC. Below are examples of some tasks commonly done on the SCC that require special attention:

Avoid doing this on the SCC

You should avoid doing these things on the SCC while using conda:

  • Do not run conda update, as it changes your .bashrc file, which can cause unexpected behavior while using the SCC. If you need a newer version of miniconda, please submit the request using our software/application request form.
  • You should not add module load miniconda to your .bashrc file. This can cause unexpected behavior while using SCC OnDemand.

Increase Disk Space by Cleaning Anaconda Cached Packages

Miniconda stores an index cache, lock files, unused cached packages, and tarballs when packages are installed into environments. This is convenient for creating environments quickly when they contain packages similar to those in existing environments; however, you can delete these files to free up storage space. To remove (or clean) these cached files, run:

scc1% module load miniconda
scc1% conda clean -a

This will give you a summary of the files it will delete and ask you to confirm that you want to proceed with the deletions.
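
Before cleaning, it can help to see how much space is in use and to preview what would be removed. The following is a minimal sketch, assuming the default ~/.conda location in your Home directory; the variable name conda_home is only illustrative. The conda commands run only if conda is on your PATH (for example, after module load miniconda), and the --dry-run flag lists what would be deleted without deleting anything.

```shell
# Illustrative variable: the default conda directory inside Home.
conda_home="${HOME}/.conda"

# Report how much space that directory uses, if it exists.
if [ -d "${conda_home}" ]; then
    du -sh "${conda_home}"
fi

# Only call conda when it is available in this shell.
if command -v conda >/dev/null 2>&1; then
    # Show where packages are cached (this may be under /projectnb).
    conda config --show pkgs_dirs
    # Preview the cleanup without removing any files.
    conda clean --all --dry-run
fi
```

If the preview looks right, running conda clean -a as shown above performs the actual cleanup.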

Tips on Creating Environments

Here are some general tips on creating environments successfully.

  • Many current conda packages are available through the conda-forge channel. By default, this channel is not searched by the conda create command, but you can instruct conda to search this channel by adding the -c conda-forge flag.
    scc1% module load miniconda
    scc1% conda create -n my_env_w_spyder -c conda-forge python=3.8 spyder
    

    The same can be done for other channels, such as bioconda.

  • Installing packages after creating the environment, by using the conda install command, can cause installation failures due to incompatible packages already installed in the environment. To avoid this situation, it is best to create a new environment and list all the packages you want installed during the conda create command. This will give conda the ability to create an environment that contains all of the compatible packages.
  • You can specify how an environment should be created by providing a YAML (.yml) file. This is especially useful if your environment requires a long list of packages and you want to easily recreate it on another machine or share it with someone else. See the conda documentation on managing environments to learn more.
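
As a rough illustration of the YML approach, the sketch below writes a small environment file. The file name my_env.yml, the environment name my_env_from_yml, and the package list are all hypothetical; substitute your own.

```shell
# Write an example environment file (all names here are illustrative).
cat > my_env.yml <<'EOF'
name: my_env_from_yml
channels:
  - conda-forge
dependencies:
  - python=3.8
  - numpy
  - pandas
EOF

# Show the file that was written.
cat my_env.yml
```

With miniconda loaded, conda env create -f my_env.yml would then build the environment from this file, and conda env export > my_env.yml can capture an existing, activated environment in the same format.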

Launching Jupyter Notebook with a specific environment

The easiest way to launch Jupyter Notebook using a specific environment is to install Jupyter when you create the environment. For example:

scc1% module load miniconda
scc1% conda create -n my_env python=3.8 jupyter

After the environment is created, activate the environment and launch Jupyter Notebook by running the jupyter notebook command:

scc1% conda activate my_env
(my_env) scc1% jupyter notebook

In order to start a Jupyter Notebook Server session on OnDemand running a specific environment, you will need to specify a few additional items on the form. Make sure miniconda is listed under your List of modules to load. Under Pre-Launch Commands include the command to activate the environment that has Jupyter installed. Finally, specify the working directory where the Jupyter Notebooks you wish to run are located on the SCC. The top portion of the form should look something like this:

Launching Spyder with a specific environment

The easiest way to launch Spyder using a specific environment is to install Spyder when you create the environment. For example:

scc1% module load miniconda
scc1% conda create -n my_env_w_spyder python=3.8 spyder

After the environment is created, activate the environment and launch Spyder by running the spyder & command:

scc1% conda activate my_env_w_spyder
(my_env_w_spyder) scc1% spyder &

In order to start a Spyder session on OnDemand running a specific environment, you will need to specify a few additional items on the form. Make sure miniconda is listed under List of modules to load. Under Pre-Launch Commands include the command to activate the environment that has Spyder installed. The top portion of the form should look like this:

Running PyTorch, TensorFlow, and other Python Machine Learning Libraries on the SCC

SCC modules are provided for PyTorch, TensorFlow, and MXNet. You can see the available versions using the module command:

scc1% module avail tensorflow

----------------------- /share/module.7/machine-learning -----------------------
   tensorflow/1.12      tensorflow/2.1.0    tensorflow/2.7.0
   tensorflow/1.13.1    tensorflow/2.3.1    tensorflow/2.8.0 (D)
   tensorflow/1.15.0    tensorflow/2.4.1
   tensorflow/2.0.0     tensorflow/2.5.0

  Where:
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

scc1% module avail pytorch
# etc...

These modules will generally work with Python 3.7-3.9. We highly encourage researchers to use the Python machine learning SCC modules because they are configured to work with the SCC, and specifically are configured to work with our GPUs. A default install of these packages, via conda install, may only enable CPU computation capability and have GPU capability disabled.

To use these machine learning SCC modules in a new environment, first create a basic environment by specifying only a Python version. Activate the new environment and then load the Python machine learning SCC modules you want to use in your environment. Then install additional conda libraries into the environment. For example:

scc1% conda create -n my_ml_env -c conda-forge python=3.8
scc1% conda activate my_ml_env
(my_ml_env) scc1% module load tensorflow/2.8.0
(my_ml_env) scc1% conda install jupyter pandas  # ...etc... 

Every time you activate this environment, you will need to load the same SCC modules in order to make these tools available for your analysis.

If you are installing an environment from an environment.yml file you may need to edit the file to remove references to tensorflow, pytorch, or mxnet before creating the environment. For more help with this please email us at help@scc.bu.edu.
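
One possible way to do this edit from the command line is sketched below. It assumes the packages are listed one per line in the dependencies section; the file names example_environment.yml and environment_scc.yml and the sample contents are hypothetical. The filter only matches entries whose names start with tensorflow, pytorch, or mxnet, so related packages (for example, entries in a pip section) may still need manual review.

```shell
# Hypothetical input file standing in for your own environment.yml.
cat > example_environment.yml <<'EOF'
name: my_ml_env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pandas
  - tensorflow=2.8.0
  - pytorch
  - jupyter
EOF

# Write a filtered copy, dropping dependency lines that name packages
# provided by the SCC modules. The original file is left unchanged.
grep -v -i -E '^[[:space:]]*-[[:space:]]*(tensorflow|pytorch|mxnet)' \
    example_environment.yml > environment_scc.yml

# Show the filtered file.
cat environment_scc.yml
```

The filtered file could then be passed to conda env create -f environment_scc.yml, followed by loading the matching SCC modules after activating the environment.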

Recreating an existing environment from your Home Directory in your Project Space

First, make sure you have a .condarc file and that it is configured to save environments in your Project Disk Space. Either use our .condarc helper script described at the top of this page, or create one manually as described in the Configuring a .condarc file manually section. Then run these commands, substituting the name of the environment you wish to recreate for my_env1:

scc1% module load miniconda
scc1% conda activate my_env1
(my_env1) scc1% conda list --explicit > my_env1_pkgs.txt
(my_env1) scc1% conda deactivate
scc1% conda create --name my_env2 --file my_env1_pkgs.txt
scc1% conda activate my_env2

Once the environment has been recreated in your Project Disk Space, you can remove the environment from your home directory using the following command:

scc1% conda remove --name my_env1 --all

Activating an Environment in csh/tcsh

In csh/tcsh the activate script is not available, so you need to add the environment's bin directory to your PATH environment variable. The command conda info -e prints the path next to each environment name, with output similar to the following:

scc1% conda info -e
# conda environments:
#
gbrs   /projectnb/scv/cjahnke/.conda/envs/gbrs

In this example, you would want to add /projectnb/scv/cjahnke/.conda/envs/gbrs/bin to your PATH environment variable:

scc1% setenv PATH /projectnb/scv/cjahnke/.conda/envs/gbrs/bin:$PATH

Configuring a .condarc file manually

Conda allows you to create isolated programming environments. Each environment requires its own installation of all of the Python packages needed to run your code, so environments can take up a considerable amount of disk space and should be saved in your Project Disk Space. To do this, create the file ~/.condarc in your home directory and add the following, making sure to replace your_project and your_loginname appropriately:

envs_dirs:
    - /projectnb/your_project/your_loginname/.conda/envs
    - ~/.conda/envs
pkgs_dirs:
    - /projectnb/your_project/your_loginname/.conda/pkgs
    - ~/.conda/pkgs
env_prompt: ({name})

Also, replace /projectnb with /restricted/projectnb if that is where your project has disk space available.
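
The file above can also be generated from the command line. The sketch below is one way to do it, assuming the placeholder values your_project and your_loginname are replaced with your own; the variable names and the scratch file name condarc.example are illustrative. It writes to a scratch file rather than ~/.condarc so that an existing configuration is not overwritten.

```shell
# Placeholder values; replace with your own project and login name.
project="your_project"
loginname="your_loginname"
# Use /restricted/projectnb here if that is where your project has space.
base="/projectnb/${project}/${loginname}/.conda"

# Write to a scratch file first so an existing ~/.condarc is untouched.
cat > condarc.example <<EOF
envs_dirs:
    - ${base}/envs
    - ~/.conda/envs
pkgs_dirs:
    - ${base}/pkgs
    - ~/.conda/pkgs
env_prompt: ({name})
EOF

# Review the result before copying it into place.
cat condarc.example
```

After reviewing the output, copy the file to ~/.condarc. With miniconda loaded, conda config --show envs_dirs pkgs_dirs should then list the project directories first.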