The Research Computing Services (RCS) group within Information Services & Technology at Boston University provides computing, storage, and visualization resources and services to support research that has specialized or highly intensive computation, storage, bandwidth, or graphics requirements. Typical applications include bioinformatics, geographic information systems (GIS), statistics, data analysis, molecular modeling, scientific and engineering simulation, and visualization. Research Computing resources and services are widely used by researchers across both the Charles River and Medical campuses.
This guide covers the basics of getting an account on the Shared Computing Cluster (SCC) and using it for research computing, with links to much more detailed information. You may also want to look at the SCC Technical Summary, our Getting Started Reference Sheet (PDF), and the Shared Computing Cluster Usage Cheat Sheet (PDF).
- Get an Account on the SCC
- Login to the SCC
- File Storage
- File Transfer
- Customizing your Environment
- Find the Software you Need
- Run a Batch Job
- Additional Help
- Professors – Apply for a new project (More information.)
Your SCC account (and project, if applicable) will generally be set up within 1 business day of your application or request, and you will receive an email with login instructions when it is ready.
Connect to scc1.bu.edu using SSH.
Note: BUMC users should instead connect to scc4.bu.edu; otherwise you may not be able to access your project data.
Enter your BU Kerberos password when requested.
If you have successfully logged in, you should see something like:
Last login: Tue May 27 12:31:54 2014 from ist-ithclabnet-dhcp109.bu.edu
********************************************************************************
This machine is governed by the University policy on ethics.
http://www.bu.edu/tech/about/policies/computing-ethics/
This machine is owned and administered by Boston University.
See the Research Computing web site for more information about our facilities.
http://www.bu.edu/tech/about/research/
For Cluster specific documentation see:
http://www.bu.edu/tech/about/research/computation/scc/
Please send questions and report problems to "email@example.com".
********************************************************************************
[adftest2@scc1 ~]$
If you do not have an SSH application on your machine, or are not sure whether you have one, please consult our Get Started – Connect (SSH) page for instructions based on your operating system.
If you will be using any graphical applications, make sure you have X Forwarding enabled; doing this varies based on your local machine’s operating system and the SSH client you are using. If you find the performance of graphical applications to be poor, you may also want to try using VNC.
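As a rough sketch of the login step, the shell function below only assembles and prints an ssh command; it does not open a connection. The login name `jdoe` and the function name `scc_ssh_command` are hypothetical, the host names are the ones given in this guide, and `-X` is the standard OpenSSH option for X11 forwarding (your client may need a different setting).

```shell
# Build (but do not run) the ssh command for an SCC login node.
# Usage: scc_ssh_command USER [bumc] [x11]
# "jdoe" below is a made-up login name; substitute your own BU login.
scc_ssh_command() {
    local user="$1"
    local host="scc1.bu.edu"   # default login node named in this guide
    local opts=""
    local flag
    shift
    for flag in "$@"; do
        case "$flag" in
            bumc) host="scc4.bu.edu" ;;   # BUMC users connect here instead
            x11)  opts="-X " ;;           # OpenSSH X11 forwarding
        esac
    done
    printf 'ssh %s%s@%s\n' "$opts" "$user" "$host"
}

scc_ssh_command jdoe            # prints: ssh jdoe@scc1.bu.edu
scc_ssh_command jdoe bumc x11   # prints: ssh -X jdoe@scc4.bu.edu
```

Once the printed command looks right, you can paste it into a terminal on your local machine and enter your BU Kerberos password when prompted.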
We have a detailed page on Managing your Files. The main places you will be storing data are:
- Home Directory – This directory is entirely controlled by you, and by default nobody else can see or otherwise access your files. Home directories have a quota of 10 GB, which will generally not be increased. Files directly related to your account, such as dotfiles, belong here. It is also commonly used to store personal files, such as email or personal images. You can do work in your home directory if it fits within the 10 GB limit. Home directories are protected by Snapshots and also backed up off site.
- Backed Up Project Disk Space – Projects are by default granted 50 GB of space under /restricted/project/project_name/ (for most BUMC projects). This can be increased to a maximum of 200 GB at the request of the project leader(s), but it cannot go beyond that. This data is protected by Snapshots and also backed up off site. Depending on the workflow of the project, a reasonable approach is to keep code and files you hand-edit in /project/ and files downloaded or generated by code or applications in the Not Backed Up project space.
- Not Backed Up Project Disk Space – Projects are by default granted 50 GB of space under /restricted/projectnb/project_name/ (for most BUMC projects). This can be increased at no charge up to a total free allocation of 1000 GB; beyond that, additional Not Backed Up space can be purchased through either Buy-In or Storage-as-a-Service. Despite its name, this space is protected by both hardware RAID (protecting against disk failures) and daily Snapshots (protecting against accidental deletion of files). You will want to use this space for any large quantities of data you have. We have guidelines for what data should be stored in each partition.
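To keep an eye on the limits above, a small helper such as the following can compare a directory's size with a quota given in GB. This is only a sketch: `check_quota` is a hypothetical name, and `du` reports disk usage as seen by the file system, which may not match exactly how the cluster accounts for quotas.

```shell
# Compare a directory's disk usage with a quota given in GB.
# Usage: check_quota DIR QUOTA_GB
# Returns 0 when usage is within the quota, 1 when it is over.
check_quota() {
    local dir="$1"
    local quota_gb="$2"
    local used_kb quota_kb
    used_kb=$(du -sk "$dir" | awk '{print $1}')   # usage in kilobytes
    quota_kb=$((quota_gb * 1024 * 1024))          # GB converted to kilobytes
    if [ "$used_kb" -le "$quota_kb" ]; then
        echo "OK: $dir uses ${used_kb} KB of ${quota_kb} KB"
    else
        echo "OVER: $dir uses ${used_kb} KB of ${quota_kb} KB"
        return 1
    fi
}

# Example (not run here): check the home directory against its 10 GB quota.
#   check_quota "$HOME" 10
```

Running `du` over a large directory tree can take a while, so it is best used occasionally rather than in every login script.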
To transfer files, use a file transfer application on your local machine that supports the scp (secure copy) protocol. Which application to use depends on your operating system. We have instructions for Windows, Mac OS X, and Linux systems. Example applications are WinSCP for Windows, Fetch for Mac OS X, and the scp command for Linux.
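For the command-line case, the sketch below prints the scp command for an upload or a download without running it. The login name `jdoe`, the file `results.csv`, and the remote directory `data/` (relative to the home directory) are all hypothetical, as is the function name `scc_scp_command`.

```shell
# Print (do not run) an scp command for copying to or from the SCC.
# Usage: scc_scp_command up|down USER LOCAL_PATH REMOTE_PATH
# A remote path without a leading "/" is relative to the remote home directory.
scc_scp_command() {
    local direction="$1"
    local user="$2"
    local local_path="$3"
    local remote_path="$4"
    local remote="${user}@scc1.bu.edu:${remote_path}"
    case "$direction" in
        up)   printf 'scp %s %s\n' "$local_path" "$remote" ;;   # local -> SCC
        down) printf 'scp %s %s\n' "$remote" "$local_path" ;;   # SCC -> local
        *)    echo "direction must be 'up' or 'down'" >&2; return 2 ;;
    esac
}

scc_scp_command up jdoe results.csv data/
# prints: scp results.csv jdoe@scc1.bu.edu:data/
scc_scp_command down jdoe . data/results.csv
# prints: scp jdoe@scc1.bu.edu:data/results.csv .
```

To copy a whole directory, scp accepts the `-r` option; BUMC users would substitute scc4.bu.edu for the host.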
The default shell is bash but you can change your shell to tcsh.
You can also do various other things to customize your environment.
We have a list of all the major software applications and languages available on the system. The table on that page is both searchable and sortable. Many popular packages have their own pages with detailed instructions on running the package or using the language.
We also use the module system to load and configure many packages. This is very important to be aware of if you wish to use the most recent versions of major software packages. For various technical reasons, for many packages the default version installed on the system may be multiple years and many versions behind the newest versions available through the module system.
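A minimal sketch of working with modules follows, assuming the cluster provides the usual `module` command with its `avail`, `load`, and `list` subcommands. The wrapper name `load_if_available` is hypothetical, and the package name you pass must be one that the cluster actually lists.

```shell
# Load a module by name when the module command exists in this shell.
# Usage: load_if_available MODULE_NAME
load_if_available() {
    local name="$1"
    if command -v module >/dev/null 2>&1; then
        # Load the requested package, then show what is currently loaded.
        module load "$name" && module list
    else
        echo "module command not found; run this on the cluster" >&2
        return 1
    fi
}

# Typical interactive use on the cluster (not run here):
#   module avail                      # list the packages the module system offers
#   load_if_available package_name    # package_name is a placeholder
```

Comparing a program's reported version before and after loading its module is a quick way to see whether the system default is older than the module version.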
We have extensive documentation on submitting jobs to the batch system, which is required for jobs that need either multiple processors or more than 15 minutes of CPU time. Non-interactive batch jobs are submitted with the qsub command. The general form of the command is:
scc % qsub [options] command [arguments]
For example, to submit the printenv command to the batch system, execute:
scc % qsub -b y printenv
Your job #jobID ("printenv") has been submitted
The -b y option tells the batch system that the following command is a binary executable. The output message of the qsub command prints the job ID, which you can use to monitor the job's status within the queue. While the job is running, the batch system creates stdout and stderr files in the job's working directory. They are named after the job, with an extension ending in the job's number; for the above example, the stderr file is printenv.e#jobID. The stdout file will contain the output of the command, and the stderr file will list any errors that occurred while the job was running.
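The naming scheme can be sketched with two small helpers that pull the job ID out of the qsub message and derive the output file names from it. The job ID 12345 is made up, the helper names are hypothetical, and the `.o` name for the stdout file is an assumption based on the usual Grid Engine convention, since this guide shows only the `.e` form.

```shell
# Extract the numeric job ID from a qsub confirmation message such as
#   Your job 12345 ("printenv") has been submitted
job_id_from_message() {
    echo "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Print the expected output file names for a job name and job ID.
# The ".o" (stdout) name is assumed from the common Grid Engine convention.
job_output_files() {
    local name="$1"
    local job_id="$2"
    echo "${name}.o${job_id}"   # stdout file (assumed naming)
    echo "${name}.e${job_id}"   # stderr file, as described in this guide
}

msg='Your job 12345 ("printenv") has been submitted'   # sample message, made-up ID
id=$(job_id_from_message "$msg")
job_output_files printenv "$id"
# prints:
#   printenv.o12345
#   printenv.e12345
```

If the batch system is Grid Engine-style, as the qsub options here suggest, `qstat -j` followed by the job ID is one way to check on the job while it is queued or running.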
The commands to run an Interactive Job through the batch system are qsh (starts the job in its own window) and qlogin (uses your current shell window).
The Research Computing Support pages at http://www.bu.edu/tech/support/research/ provide links to detailed information on all of the Research Computing services. Send email to firstname.lastname@example.org for additional help.