Overview

The Project Disk Space file system provides more than a petabyte (usable) of high-performance online storage for research computing projects. Project Disk Space is allocated to individual research projects for the exclusive use of their members, facilitating collaboration.

Each project is allocated a limited amount of Free quota. Projects requiring additional space may either purchase Project Disk through the Buy-in program or rent it through the Storage-as-a-Service program.

All Project Disk Space is protected by hardware RAID (protecting against disk failures) and daily Snapshots (protecting against accidental deletion of files).

Kinds of Project Disk Partitions

There are four Project Disk Space partitions on the SCC: /project, /projectnb, /restricted/project, and /restricted/projectnb. These four partitions have identical performance characteristics. The two /restricted partitions are dbGaP compliant, for data that requires it (primarily Genomics projects). The two project partitions (/project and /restricted/project) are backed up nightly to an independent off-site system for disaster recovery; the two projectnb partitions (/projectnb and /restricted/projectnb) are not backed up. Regardless, Snapshots are implemented on all four partitions, enabling users to easily retrieve accidentally deleted files.

dbGaP Compliant

Data with dbGaP restrictions (primarily Genomics projects) must be stored on the /restricted/project and/or /restricted/projectnb partitions. These two partitions are accessible only from the SCC compute nodes and the login node scc4.bu.edu; they are not accessible from any of the other login nodes. These partitions are not suitable for HIPAA data or data classified as restricted use under the BU Data Protection Standards. Note that the scc4.bu.edu login node can only be accessed from within the BU campus network.

Allocations

Project Disk allocations come in three types: Free, Buy-in, and Storage-as-a-Service. Functionally, rented and purchased Project Disk augments, and is indistinguishable from, free storage. Regardless, the details of each of a project’s allocations are tracked.

Forms for requesting Free, Buy-in, and Storage-as-a-Service allocations can be found with the other project management web pages on TechWeb at RCS Project Management.

Free Baseline Quota

By default, new projects on the SCC are created with 50 GB on /project and 50 GB on /projectnb. The Lead Project Investigator (LPI) can specify whether or not the project’s storage should be dbGaP compliant. Additional Project Disk Space may be requested by a project’s LPI or IT/Administrative Contact. There is no charge for requests up to a total of 1000 GB, of which a maximum of 200 GB may be backed up. For LPIs with multiple projects, there is an additional limit on Free Baseline quota across all projects: a maximum of 3000 GB, of which a maximum of 600 GB may be backed up.
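As a rough illustration of the single-project limits above, the following sketch checks whether a requested split between backed-up and not-backed-up space stays within the no-charge range. The function name and arguments are hypothetical and not part of any SCC tool, and the cross-project limit for LPIs with multiple projects is not modeled.

```shell
# Sketch only: within_free_baseline is an illustrative helper, not an SCC command.
# Limits taken from the text: at most 1000 GB in total per project,
# of which at most 200 GB may be backed up.
within_free_baseline() {
    backed_up_gb=$1       # requested backed-up space, in GB
    not_backed_up_gb=$2   # requested not-backed-up space, in GB
    total_gb=$((backed_up_gb + not_backed_up_gb))
    [ "$backed_up_gb" -le 200 ] && [ "$total_gb" -le 1000 ]
}

if within_free_baseline 200 800; then
    echo "Request fits within the Free Baseline quota"
else
    echo "Request exceeds the Free Baseline quota"
fi
```

The function returns success only when both conditions hold, so it can be used directly in an if statement as shown.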

Application form: RCS Project Management

Buy-in Program

The highly successful Buy-in Program is a convenient way to acquire dedicated storage at highly subsidized rates for an extended period of time. Interested researchers should contact buyin@bu.edu or review the Buy-in options web pages.

All grant rules apply when using grant funds.

Storage-as-a-Service

The Storage-as-a-Service program offers researchers an option to acquire additional disk quota for a flexible time duration at a subsidized rate of $91/Terabyte/year. The minimum time commitment is six months. Allocations are in whole Terabyte (1000 Gigabyte) units only. To purchase an allocation through this program, the PI should fill in the request form for Storage-as-a-Service and include their Financial Contact information. The Financial Contact will receive details on how to send an Internal Service Request for transmitting payment.
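Because allocations are made in whole Terabytes, a storage need expressed in Gigabytes has to be rounded up before the rate is applied. The sketch below shows that arithmetic using the $91/Terabyte/year rate quoted above. The function name is hypothetical, and the calculation covers a full year only; how shorter commitments are billed is not specified here and is not modeled.

```shell
# Sketch only: saas_yearly_cost is an illustrative helper, not an SCC command.
saas_yearly_cost() {
    needed_gb=$1
    rate_per_tb=91                            # dollars per TB per year (from the text)
    tb=$(( (needed_gb + 999) / 1000 ))        # round up to whole TB (1 TB = 1000 GB)
    echo "$tb TB: \$$((tb * rate_per_tb)) per year"
}

saas_yearly_cost 2500    # prints: 3 TB: $273 per year
```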

All grant rules apply when using grant funds.

Application form: RCS Project Management

Accessing Project Disk Space

When a project is created on the SCC, subdirectories will be created for the project under the appropriate /project, /projectnb, /restricted/project, and/or /restricted/projectnb directories. These subdirectories will have the same name as the project and will be writable by any member of the project. The structure of, and access to, the files and subdirectories created under the project’s directory are entirely at the discretion of the project members. The Unix “group” file permission mechanism can be used to control permissions for the project’s subdirectories (see the man page for “chmod” for more details).
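The following is a minimal sketch of how group permissions might be set on subdirectories with chmod. The subdirectory names are placeholders, and a temporary directory stands in for a real project directory so the example can be run anywhere; on the SCC you would work inside your own project's directory instead.

```shell
# Sketch only: directory names are placeholders for illustration.
project_dir=$(mktemp -d)    # stands in for the project's own directory
mkdir "$project_dir/shared_results" "$project_dir/reference_data"

# Owner and group members may read, write, and enter; others have no access.
chmod u=rwx,g=rwx,o= "$project_dir/shared_results"

# Group members may read and enter but not modify; others have no access.
chmod u=rwx,g=rx,o= "$project_dir/reference_data"

# Show the resulting permission bits.
ls -ld "$project_dir/shared_results" "$project_dir/reference_data"
```

The first directory lists as drwxrwx--- and the second as drwxr-x---. The group that these permissions apply to is whichever Unix group owns each directory.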

Quota Enforcement

Project Disk Space quotas on the SCC are enforced by the file system. When a project is over its quota, daily email reminders are sent to the project’s Lead Project Investigator and all project members, with a breakdown of how much space each user is using. Projects have a soft limit equal to their granted quota and a hard limit 10% greater (up to a maximum of 100 GB over the quota, regardless of the quota’s size). Projects can never exceed their hard limit and can only remain over their soft limit for a maximum of 7 days. A project over its limit simply needs to delete enough files to get under the soft limit, and full write access is restored immediately.
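The relationship between the soft and hard limits can be worked through with a little arithmetic. The sketch below computes the hard limit from a quota in GB under the rule stated above (10% over, capped at 100 GB over). The function name is hypothetical, and integer division is an assumption about rounding that the policy text does not address.

```shell
# Sketch only: hard_limit_gb is an illustrative helper, not an SCC command.
hard_limit_gb() {
    quota_gb=$1
    extra_gb=$((quota_gb / 10))     # 10% of the quota, in whole GB
    if [ "$extra_gb" -gt 100 ]; then
        extra_gb=100                # overage is capped at 100 GB
    fi
    echo $((quota_gb + extra_gb))
}

hard_limit_gb 500     # prints 550
hard_limit_gb 5000    # prints 5100
```

For quotas of 1000 GB or less the 10% rule applies; above that, the 100 GB cap takes over.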

To help manage the project members’ Project Disk usage, PIs may specify a limit for each individual researcher. By default, each project member’s limit is set to the project’s full allocation. A PI may reassign individual quotas at any time using the Project Disk Space update form found on the RCS Project Management page. These individual quotas are enforced by the honor system, with email reminders sent daily to the Lead Project Investigator and to any user who is over their personal quota.

PIs and users may review a daily record of their project’s and their individual Project Disk usage on the RCS Project Management page.

Please note that the quota -v command will display a user’s home directory usage, not Project Disk Space usage.

Two helpful Linux commands for determining disk usage are du and df -h . (note the trailing dot, which refers to the current directory). Researchers who keep all of their files in their own subdirectory can cd to that directory and type du -sk to display their usage in kilobytes. You can see your project’s overall usage, available space, and hard limit quota by running the command df -h . anywhere inside your project’s appropriate Project Disk Space directory.
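As a small sketch of how these two commands can be combined, the function below lists the largest items directly under a directory and then shows the file system summary for it. The function name is hypothetical, and the choice to show the top ten entries is arbitrary.

```shell
# Sketch only: usage_report is an illustrative helper, not an SCC command.
usage_report() {
    dir=${1:-.}    # defaults to the current directory
    # Size in KB of each item directly under the directory, largest first.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n 10
    # Overall usage and available space for the file system holding it.
    df -h "$dir"
}

usage_report .
```

Run it from inside your project directory, or pass the directory as an argument. Hidden files (names starting with a dot) are not matched by the * pattern and so are not listed individually.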

Backed up vs. Not-backed-up Project Disk

Most computational research projects will need a combination of /project (backed-up) and /projectnb (not-backed-up) disk space. Files on the /project partitions are backed up nightly while those on the /projectnb partitions are not backed up. The /projectnb partitions are appropriate for most files used in computational research on the SCC.

Backing up files requires additional resources and expense. We ask that you use the /project partitions only for files that need to be backed up. These will be restorable in the event of catastrophic failure.

Files that should be stored in /project and backed up are those that are being actively edited (e.g., source code), files that do not have a copy elsewhere, and files that cannot be regenerated.

Files that should be stored in /projectnb:

    • Data which exists elsewhere and is copied to the Project Disk Space for high performance access during computation.
    • Data which can be easily regenerated.
    • Data which is needed for only a short time.
    • Newly generated data which will be copied to another system for storage.

If you accidentally delete or corrupt files stored in any of the Project Disk partitions, you may locate them in the Snapshots and copy them into your directory.